Long-term precipitation forecast for drought relief using atmospheric circulation factors: a study on the Maharloo Basin in Iran

Long-term precipitation forecasts can help to reduce drought risk through proper management of water resources. This study took the saline Maharloo Lake, which is located north of the Persian Gulf in southern Iran and persistently suffers from drought, as a case to investigate the relationships between climatic indices and precipitation. Cross-correlation in combination with stepwise regression was used to determine the best variables among 40 indices and to identify the proper time lag between dependent and independent variables for each month. The monthly precipitation was predicted using an artificial neural network (ANN) and multi-regression stepwise methods, and the results were compared with observed rainfall data. Initial findings indicated that climate indices such as NAO (North Atlantic Oscillation), PNA (Pacific North America) and El Niño are the main indices for forecasting drought in the study area. According to R², root mean square error (RMSE) and Nash–Sutcliffe efficiency, the ANN model performed better than the multi-regression model, which was also confirmed by classification results. Moreover, the model accuracy in forecasting the rare rainfall events in dry months (June to October) was higher than in the other months. From the findings it can be concluded that there is a relationship between monthly precipitation anomalies and climatic indices in the previous 10 months in Maharloo Basin. The highest and lowest accuracy of the ANN model were in September and March, respectively. However, these results are subject to some uncertainty due to a coarse data set and high system complexity. Therefore, more research is necessary to further elucidate the relationship between climatic indices and precipitation for drought relief. In this regard, consideration of other climatic and physiographic factors (e.g., wind and physiography) can be helpful.


Introduction
Arid and semi-arid climates cover over one-quarter of the land area of the earth and experience serious water scarcity, more than any other climate region. Drought has tremendous social and economic impacts all over the world. According to a report from the Iranian Parliament, the cost incurred by drought in Iran alone is estimated to be about USD 2.5 billion annually (IRNA Press, 2011). It is therefore essential to take a proactive approach to reduce the impacts of drought. Precipitation forecasting is an important way to support water resources management so as to mitigate the harmful effects of droughts and climate change.
Short-term weather forecasting is mostly based on radar and satellite information analysis. Long-term prediction fills the gap between short-range weather forecasting and climate prediction, and refers to timescales of more than 1 month to 1 year (Tourigny and Jones, 2009). The main source of predictability is large-scale atmospheric circulation anomalies due to tropical sea surface temperature (SST) anomalies (Teschl and Randeu, 2006). Sea surface temperature anomalies have relatively large timescales, and may be predictable at a useful level of skill up to a season or a year (Shukla et al., 2000).

S. K. Sigaroodi et al.: A study on the Maharloo Basin in Iran
Many studies have investigated the relationship between SST anomalies and variations in climatic phenomena, especially precipitation. Previous studies have demonstrated the influence of the El Niño/La Niña-Southern Oscillation (ENSO) on different climates since the 1980s (Vautard and Legras, 1988; Barnston and Livezey, 1987; Fraedrich, 1994). Ferranti et al. (1990) and Flateau et al. (1997) studied the significant impact of the Madden-Julian Oscillation (MJO) on the speed of propagation in the equatorial Indian and western Pacific oceans. Barlow et al. (2002) suggested that the prolonged, westward-concentrated La Niña was a major factor for the drought in central and southwest Asia. Palmer et al. (2004) studied precipitation forecasting in autumn over the Mediterranean coast. Bladé et al. (2012) found high uncertainty in summer rainfall forecasts using the summer North Atlantic Oscillation (SNAO) in Europe and the Mediterranean, showing that the Mediterranean region was anomalously wet during high SNAO summers, when strong anticyclonic conditions and suppressed precipitation prevailed over the United Kingdom. Guérémy et al. (2012) forecast French Mediterranean heavy precipitation events using weather regimes during autumn, and established an atmospheric link between Pacific SST anomalies and precipitation over this area.
Long-term predictions often exhibit high uncertainties. Scientists have made considerable effort to achieve greater accuracy and reliability through easily used methods and available data (Wu et al., 2011). To improve forecast accuracy, some researchers have attempted to discover nonlinear relationships between climatic phenomena and climatic index patterns (Kim and Barros, 2001; Tourigny and Jones, 2009; Guérémy et al., 2012). Li et al. (2012) applied the back-propagation method and identified five key indices out of 24 factors as the most effective variables in runoff forecasting during the flood season in the Nenjiang River basin in China.
The present study investigated Maharloo Lake in Iran to explore more accurate long-term precipitation forecasting using multi-regression analysis and artificial neural network methods. The key contributions were the establishment of a 10-month-ahead precipitation forecasting model to support drought-risk management and the demonstration of the applicability of the ANN model in long-term prediction using atmospheric circulation factors.

Study area
Maharloo Lake (Fig. 1) is a saline shallow lake located 200 km north of the Persian Gulf in southern Iran. The lake covers an area of about 250 km², and the basin area is about 31 500 km². It is so salty that some areas are mined for salt during the dry season. The area is situated in an arid and semi-arid region. Rainfall varies from 150 mm on the plains to 650 mm on the high mountains, with an average of 350 mm. The rainfall is concentrated in the cold seasons, while precipitation is very low from June to October. The lake is recharged by two seasonal rivers: the Sultanabad and Khoshk.
During winter, several migratory bird species from north of the Caspian Sea, including flamingos (Phoenicopterus roseus), common shelducks (Tadorna tadorna) and mallards (Anas platyrhynchos), spend 4 months in the area feeding on brine shrimp (Artemia franciscana). Thus, the lake has important ecological value.
Recently the lake water has decreased, especially during drought episodes. In 2008, about 90 % of Maharloo Lake dried out, which caused the number of flamingos to decrease from 150 000 to only 5000 (ISNA Press, 2008). In addition, the lake watershed is used for agricultural and industrial activities that consume a large portion of the water. Therefore, long-term prediction of precipitation can help adjust agricultural and industrial activities and account for lake sustainability based on ecological water rights.
The climate of the area is affected by different systems, including the Mediterranean and Black Sea from the west, the Caspian Sea from the north, and the Persian Gulf and Arabian Sea from the south, which add to the difficulty and uncertainty in precipitation prediction.

Precipitation data
Precipitation data from four gauges (Shiraz, Dehkade, Ali Abad and Dashtbal) located in Maharloo Basin from January 1967 to December 2009 were used. The monthly precipitation time series from these gauges were combined using the Thiessen method to calculate the mean monthly precipitation series for the entire basin.
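The Thiessen method weights each gauge by the area of the polygon of the basin that lies closest to it. The sketch below illustrates only that weighted mean. The function name, the polygon areas and the rainfall values are hypothetical placeholders, since the actual polygon areas for the four gauges are not given in the text.

```python
def thiessen_mean(gauge_values, polygon_areas):
    """Area-weighted mean of gauge values (Thiessen polygon weighting)."""
    total_area = sum(polygon_areas[name] for name in gauge_values)
    weighted_sum = sum(gauge_values[name] * polygon_areas[name] for name in gauge_values)
    return weighted_sum / total_area

# Hypothetical polygon areas (arbitrary units) and one month of rainfall in mm.
areas = {"Shiraz": 40.0, "Dehkade": 30.0, "Ali Abad": 20.0, "Dashtbal": 10.0}
rain_mm = {"Shiraz": 30.0, "Dehkade": 20.0, "Ali Abad": 10.0, "Dashtbal": 40.0}

basin_mean = thiessen_mean(rain_mm, areas)  # 24.0 for these made-up numbers
```

Applying the function month by month to the gauge records would give a basin-mean series of the kind used as the dependent variable.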

Atmospheric circulation factors
Atmospheric dynamics are influenced by solar activity through sunspot activity and radiation intensity. Sunspots are temporary phenomena on the photosphere of the sun that appear visibly as dark spots compared to surrounding regions. Solar radiation intensity is the energy source of climatic systems and is a relatively stable influence, so it is generally considered a constant factor. However, sunspot activity follows a cycle of approximately 11 years. Sunspot number is correlated with precipitation and drought. Wang et al. (1997) studied the relationship between relative sunspot number and runoff in the Yellow River basin. Friis-Christensen and Lassen (1991) and Weickmann et al. (2000) found that sunspot cycle length showed good correlation with Northern Hemisphere land temperature and drought periods.
The specific heat and mass of the water on earth are large (Li et al., 2012). The huge heat capacity of the oceans has an obvious and long-term effect on atmosphere-ocean interactions, which affects atmospheric circulation in subsequent periods (Penland and Matrosova, 1998). Atmospheric circulation factors represent the physical properties correlated with precipitation. In this study, 40 climate indices from the NOAA website (http://www.esrl.noaa.gov/gmd/dv/ftpdata.html) were used. Table 1 lists the climatic indices selected as atmospheric circulation predictors and their recorded period. The cold and warm phases of ENSO phenomena are accessible on the website www.ggweather.com/enso/oni.htm.

Description of methods
Since large differences existed in the means and variations between the parameters, the data were normalized (Trenberth, 1994; Teschl and Randeu, 2006; Guérémy et al., 2012) before they were used in the model. After normalization, all time series of monthly rainfall and climate indices were in the range between 0 and 1 (see Appendix: Eq. A1).
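Appendix Eq. A1 is not reproduced in this section. A common way to map a series onto the range 0 to 1 is min-max scaling, and the sketch below assumes that form. The function name and the sample values are illustrative only.

```python
def min_max_normalize(series):
    """Linearly rescale a series so its minimum maps to 0 and its maximum to 1.

    Assumption: Eq. A1 is the usual min-max form (x - min) / (max - min).
    """
    lowest, highest = min(series), max(series)
    if highest == lowest:
        # A constant series carries no information; map it to zeros.
        return [0.0 for _ in series]
    return [(value - lowest) / (highest - lowest) for value in series]

# Made-up monthly rainfall values in mm.
rain_mm = [0.0, 12.5, 50.0, 25.0]
normalized = min_max_normalize(rain_mm)  # [0.0, 0.25, 1.0, 0.5]
```

Scaling each index and the rainfall series separately keeps predictors with very different units on a comparable footing before regression or network training.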

Multivariate regression
The time lag between dependent and independent variables was different for each input variable. Cross-correlation was used to find the proper independent variables and identify their time lags. The final multivariate equation was determined using the stepwise method. The stepwise regression method selects the predictive variables by an automatic procedure starting with the best-correlated variable. It then adds a further variable only if the addition significantly improves model performance. This process repeats until no improvement is obtained (Prasad et al., 2010).
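The lag search described above can be sketched as follows, assuming a monthly index series that starts in January of the first year and one precipitation value per year for the target month. The function names, the 12-month search window as a default argument, and the synthetic data are assumptions made for illustration. This is not the authors' code.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def best_lag(index_series, target_precip, target_month, max_lag=12):
    """Return (lag, r) for the lag in months with the largest |r|.

    index_series: chronological monthly index values, starting in January.
    target_precip: one value per year for target_month (0 = January).
    """
    best = None
    for lag in range(1, max_lag + 1):
        pairs = []
        for year, precip in enumerate(target_precip):
            position = year * 12 + target_month - lag
            if position >= 0:  # skip years with no index value that early
                pairs.append((index_series[position], precip))
        r = pearson([a for a, _ in pairs], [b for _, b in pairs])
        if best is None or abs(r) > abs(best[1]):
            best = (lag, r)
    return best

# Synthetic demonstration: January rainfall driven by the index 5 months earlier.
rng = random.Random(1)
n_years = 43
index_series = [rng.gauss(0.0, 1.0) for _ in range(n_years * 12)]
january_precip = [0.0] + [
    index_series[year * 12 - 5] + 0.1 * rng.gauss(0.0, 1.0)
    for year in range(1, n_years)
]
lag, r = best_lag(index_series, january_precip, target_month=0)
```

With the synthetic series the search recovers the 5-month lag that was built into the data. On real indices the correlations are much weaker, so the selected lag should be read together with its significance.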

Artificial neural network
The application of ANNs in hydrological forecasting started in the early 1990s, covering rainfall-runoff modeling (Fernando and Jayawardena, 1998), streamflow forecasting (Kim and Barros, 2001; Wang et al., 2006) and groundwater level forecasting (Coulibaly et al., 2001).
An ANN usually consists of an input, a hidden and an output layer. The multi-layer perceptron (MLP) is the most widely used ANN in forecasting models, and was applied in this study. The same independent variables as in the multivariate regression model were used as the input. The data set was divided into two groups: 80 % of the data were used for model training, and 20 % were used for cross-validation. For each month, neural network training and validation were repeated 20 times, and the best result was selected according to R² and root mean square error (RMSE). In each separate run, the first and the last 10 % of the time series were selected for validation. The order of the data was randomized before dividing the data set into two groups. Both the training and validation data were plotted for comparison. Figure 2 presents the best results, where the validation data are indicated by black arrows.
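One possible reading of the splitting procedure is sketched below: the sample order is shuffled, then the first and last 10 % of the shuffled sequence are held out for validation and the middle 80 % is used for training. The function name, the seed handling and the rounding rule are assumptions. Network training itself is not shown.

```python
import random

def split_indices(n_samples, holdout_fraction=0.2, seed=0):
    """Shuffle sample indices, then hold out both ends of the shuffled order."""
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)
    n_each_end = round(n_samples * holdout_fraction / 2)
    validation = order[:n_each_end] + order[n_samples - n_each_end:]
    training = order[n_each_end:n_samples - n_each_end]
    return training, validation

# 43 yearly samples per target month (January 1967 to December 2009).
train_idx, valid_idx = split_indices(43)
```

Repeating the split and the training with different seeds, and keeping the run with the best validation R² and RMSE, would mirror the 20 repetitions described in the text.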

Evaluation criteria
The R² and RMSE between model outputs and observations were used as the primary indicators of model performance. The higher the R² value and the smaller the RMSE, the better the model results. Other criteria including Nash-Sutcliffe efficiency, accuracy percentage, Heidke skill score, trend accuracy and Taylor diagrams were used to further quantify forecasting accuracies (see Appendix: Eqs. A2 to A5). Nash-Sutcliffe efficiency ranges from −∞ to 1, and a value higher than 0 means model predictions were better than the mean of observations.
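The appendix equations are not reproduced here, so the sketch below uses the standard textbook definitions of RMSE, R² (taken as the squared Pearson correlation) and Nash-Sutcliffe efficiency. The function names and the sample values are illustrative assumptions.

```python
import math

def rmse(observed, simulated):
    """Root mean square error between observations and model output."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)

def r_squared(observed, simulated):
    """Squared Pearson correlation between observations and model output."""
    n = len(observed)
    mean_o, mean_s = sum(observed) / n, sum(simulated) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    return cov ** 2 / (var_o * var_s)

def nash_sutcliffe(observed, simulated):
    """1 minus the ratio of squared model error to variance about the observed mean."""
    mean_o = sum(observed) / len(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_o) ** 2 for o in observed)
    return 1.0 - squared_error / spread

# Made-up normalized precipitation values.
obs = [0.1, 0.4, 0.2, 0.8]
sim = [0.2, 0.3, 0.3, 0.7]
mean_obs = sum(obs) / len(obs)

model_rmse = rmse(obs, sim)
model_nse = nash_sutcliffe(obs, sim)
baseline_nse = nash_sutcliffe(obs, [mean_obs] * len(obs))  # forecast = observed mean
```

The baseline case shows why 0 is the reference value for Nash-Sutcliffe efficiency: a forecast that always returns the observed mean scores exactly 0, so any positive score beats that trivial forecast.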
Accuracy percentage shows what fraction of the forecasts is in the correct category, and it ranges between 0 and 1. To calculate this value, monthly precipitation values were categorized into five classes (very dry, dry, normal, wet and very wet) based on the SIP factor (Khalili and Bazrafshan, 2003) (see Appendix: Table A1).
Heidke skill score (HSS) indicates the fraction of correct forecasts after eliminating randomly correct forecasts since some forecasts can be correct due purely to random chance.
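A minimal sketch of the two categorical scores follows, using the standard multi-category form of the Heidke skill score, HSS = (PC − E) / (1 − E), where PC is the fraction of correct forecasts and E is the fraction expected to be correct by chance from the class frequencies. The function name and the class sequences are made up for illustration, and the appendix formula may be written differently.

```python
from collections import Counter

def accuracy_and_hss(observed_classes, forecast_classes):
    """Return (fraction correct, Heidke skill score) for categorical forecasts."""
    n = len(observed_classes)
    fraction_correct = sum(
        o == f for o, f in zip(observed_classes, forecast_classes)
    ) / n
    observed_counts = Counter(observed_classes)
    forecast_counts = Counter(forecast_classes)
    # Chance agreement: product of marginal frequencies, summed over classes.
    chance = sum(
        observed_counts[c] * forecast_counts[c] for c in observed_counts
    ) / n ** 2
    if chance == 1.0:
        return fraction_correct, 0.0  # only one class present: no skill measurable
    return fraction_correct, (fraction_correct - chance) / (1.0 - chance)

# Made-up class sequences for six years of one target month.
observed = ["dry", "dry", "normal", "wet", "normal", "dry"]
forecast = ["dry", "normal", "normal", "wet", "normal", "dry"]
accuracy, hss = accuracy_and_hss(observed, forecast)
```

In this example five of six forecasts are correct, but 13/36 of the forecasts would be expected to match by chance, so the skill score (17/23) is lower than the raw accuracy (5/6).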
Trend accuracy gives the percentage of cases in which the actual output changes in the correct direction relative to the previous desired value. It thus measures the proportion of the trend that has been correctly predicted. In this case, the trend is either "up" or "down".
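One way to compute such a score is sketched below: for each time step, the direction of the predicted value relative to the previous observation is compared with the direction of the actual change. The function name, the treatment of zero change as "down" and the sample values are assumptions, since the exact appendix formula is not shown here.

```python
def trend_accuracy(observed, predicted):
    """Fraction of steps where the predicted direction of change matches the observed one.

    Both directions are measured relative to the previous observed value.
    Assumption: a change of exactly zero counts as "down".
    """
    hits = 0
    steps = len(observed) - 1
    for t in range(1, len(observed)):
        actual_up = observed[t] - observed[t - 1] > 0
        predicted_up = predicted[t] - observed[t - 1] > 0
        if actual_up == predicted_up:
            hits += 1
    return hits / steps

# Made-up normalized values: the third prediction points the wrong way.
obs = [0.2, 0.5, 0.3, 0.6, 0.4]
pred = [0.2, 0.4, 0.6, 0.7, 0.3]
score = trend_accuracy(obs, pred)  # 0.75
```

Because every step counts equally, a single wrong direction in a short series changes the score noticeably, here from 1.0 to 0.75.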
Taylor diagrams provide a statistical summary of how well modeled patterns match observed patterns in terms of correlation, RMSE and variance.

Regression results
For each independent variable, the best time lag to the dependent variable was determined through cross-correlation. Table 2 shows the correlation between the PNA (Pacific North America) index and precipitation in different time-lag months. For example, the best time lag to predict precipitation in January using the PNA index was 5 months (the previous August). It is seen that the best time lag differs between months.
Table 3 shows the top 10 factors out of 40 indices and the corresponding best time lags. They were ranked by R², and only R² values higher than 0.05 were listed. The indices in the first line of the table are the best indices to predict monthly precipitation for univariate regression. These indices explained less than 25 % (mostly less than 20 %) of total variation, and such low values of R² implied that precipitation in the area was not affected by one particular region only with a constant interval.
To improve model performance, more independent variables were added via the stepwise regression method. Tables 4, 5 and 6 show the results of the correlation matrix, analysis of variance (ANOVA) table, and multivariate coefficients for January precipitation, respectively.
It is seen from Table 4 that the precipitation in January, P (Jan), has significant correlation with PNA, QBO (Quasi-Biennial Oscillation) and TNA (Tropical Northern Atlantic). Also, there is a strong correlation between PNA and TNA. Given that the correlation between P (Jan) and PNA is higher than the correlation between P (Jan) and TNA, TNA was ruled out while PNA and QBO were finally selected as the independent variables for predicting the precipitation in January, as confirmed by Tables 5 and 6.
Similarly, the procedures were applied to all the other months. The final selected independent variables are listed in parentheses in Table 3, and Table 7 presents the univariate and multivariate regression results for an entire year. Results showed that R² increased and the regressions explained up to 44 % (mostly more than 30 %) of total variation. For July and September, only univariate regression was selected because adding more variables did not improve the prediction.

ANN results
Since a neural network can arrive at different solutions for the same data due to the initialization of network weights, the best result from 20 repetitions was selected for each month. The results showed that the ANN model explained more than 40 % (up to 76 %) of total variation. Figure 2 presents comparisons between the ANN model results and the observations. Although the ANN model results were better than the regression model results, both methods failed to predict some extreme values.

Evaluation of results
Table 8 presents the precipitation classification results of the ANN and regression models, and Table 9 gives the R², RMSE, Nash-Sutcliffe, trend accuracy and Heidke skill score values. The values of R², RMSE, Nash-Sutcliffe, trend accuracy and Heidke skill score were higher when the time lag was 10 months. In general, the ANN model performed better than the regression model. Trend accuracy determines the accuracy of variation direction, and was almost equal in both methods. However, this criterion is sensitive to false prediction; thus even one false prediction can decrease its value seriously. For example, the false prediction of February precipitation in 1999 using the ANN method caused a decrease in trend accuracy, even though the other evaluation criteria were better.
Taylor diagrams can highlight the goodness of different models compared to observations. The diagram can be visualized as a series of points on a polar plot. The azimuth angle represents the correlation coefficient between the predicted and observed data. Radial distance from the origin represents the ratio of the standard deviation (SD) of the simulation to that of the observation, i.e., the normalized SD. The distance from the reference point (observations) is a measure of the centered RMSE (Taylor, 2001, 2005). Therefore, an ideal model (being in full agreement with the observations) is marked by the reference point with the correlation coefficient equal to 1, and the same amplitude of variations compared with the observations (Heo et al., 2014). Figure 3 displays the normalized standard deviation (SD) and correlation coefficient R² of the ANN and regression models. The ANN results were closer to the observation points than were the regression results. In the diagram, the SDs of all predicted data of both methods were less than the observations, indicating that neither method captured the fluctuation of the natural events well.
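The three quantities plotted on a Taylor diagram are tied together by the law of cosines: the squared centered RMSE equals σ_obs² + σ_sim² − 2·σ_obs·σ_sim·R. The sketch below computes them and checks that identity on made-up values. The function name and the data are assumptions, and population (divide-by-n) standard deviations are used.

```python
import math

def taylor_statistics(observed, simulated):
    """Standard deviations, correlation and centered RMSE used in a Taylor diagram."""
    n = len(observed)
    mean_o, mean_s = sum(observed) / n, sum(simulated) / n
    dev_o = [o - mean_o for o in observed]
    dev_s = [s - mean_s for s in simulated]
    sd_o = math.sqrt(sum(d * d for d in dev_o) / n)
    sd_s = math.sqrt(sum(d * d for d in dev_s) / n)
    correlation = sum(a * b for a, b in zip(dev_o, dev_s)) / (n * sd_o * sd_s)
    centered_rmse = math.sqrt(sum((b - a) ** 2 for a, b in zip(dev_o, dev_s)) / n)
    return {
        "sd_obs": sd_o,
        "sd_sim": sd_s,
        "sd_ratio": sd_s / sd_o,  # radial coordinate (normalized SD)
        "correlation": correlation,  # sets the azimuth angle
        "centered_rmse": centered_rmse,  # distance to the reference point
    }

# Made-up normalized precipitation values with a damped simulation.
obs = [0.1, 0.4, 0.2, 0.8, 0.5]
sim = [0.2, 0.35, 0.3, 0.6, 0.45]
stats = taylor_statistics(obs, sim)

law_of_cosines = (
    stats["sd_obs"] ** 2
    + stats["sd_sim"] ** 2
    - 2 * stats["sd_obs"] * stats["sd_sim"] * stats["correlation"]
)
```

An SD ratio below 1, as in this example, places the model point inside the reference arc, which is the situation reported for both models in Fig. 3.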

Discussion and conclusions
The Maharloo watershed is suffering from water scarcity, while the watershed is dominated by agricultural and industrial activities that demand a large amount of water. Therefore, long-term prediction of precipitation can help adjust agricultural and other activities and account for lake sustainability based on ecological water requirements, especially during drought periods when the ecosystem is more fragile.
The present study applied multivariate regression and ANN methods for the long-term prediction of precipitation in the Maharloo Lake basin in Iran. It used atmospheric circulation factors and cross-correlation to identify proper independent variables and time lags, among 40 indices and delays of up to 12 months for each target month. The monthly precipitation was predicted and compared with the measured data.
According to Table 3, the NAO, PNA, QBO and SWMRR indices were more frequently used than the other indices, indicating that the regional precipitation of the Maharloo Basin was mainly affected by the North Atlantic Oscillation, the Pacific North American pattern and the Southwest Monsoon Region. These results agree with Guérémy et al. (2012), who discussed the link between Pacific SST anomalies and precipitation over the Mediterranean region. Therefore, the Pacific SST anomalies can affect the Mediterranean region as well as southern Iran.
Analyses of the ENSO phenomena and the seasonal precipitation indicated that only autumn had a good match with the ENSO phenomena (Fig. 4). The high precipitation in the autumns of 1977, 1982, 1986, 1994 and 2004 is in accordance with the ENSO warm phase (El Niño), and the low precipitation in the autumns of 1970, 1973, 1983, 1998, 1999 and 2007 correlates well with the ENSO cool phase (La Niña). However, no significant relationship between the ENSO phenomena and the seasonal precipitation was found in the other seasons. These results were consistent with the findings of Barlow et al. (2002), who discovered the effects of cold phase ENSO (La Niña) on the drought in central and southwest Asia, and Nazemosadat et al. (2006), who showed the ENSO phenomenon had a significant effect on autumn precipitation in Iran.
The results revealed that monthly precipitation anomalies could be forecast about 10 months in advance using the selected indices in Table 7, although the R² is not high. The relatively low R² can be attributed to the far distance between  2014) and Li et al. (2012). In other words, the higher rates of observed precipitation variation compared with the modeled predictions implied that other regional conditions (such as temporal wind, local humidity and orographic conditions) also affected and complicated the precipitation system, which should be considered in future research. The comparison with monthly observed precipitation and the evaluation indices indicated that in general climatic indices are effective for predicting precipitation in both dry and wet seasons (overall accuracy = 64 %). However, the accuracy is not equal in different months. The accuracy (Heidke skill score) in the wet season is comparatively lower than in the dry season (Fig. 5), which may be because the study area is far from both the Pacific and Atlantic oceans.
The predicted results in dry months (June to October) were better than in the other months. Although the precipitation in these months is naturally low, the ANN model could successfully forecast the rare rainfall events that happened in some of the years (e.g., June 1979, August 1994, 1996, and September 1994, which are indicated in Table 8). It is also indicated in Fig. 2 that the rainfall peaks are mostly not well predicted, while the droughts (low rainfall) are well captured by both methods.
Detailed comparison of numerical values is usually not straightforward; thus comparing the precipitation classes makes it easier to evaluate the different techniques. As shown in Table 8, the performance of the ANN is mostly better than that of the regression method. Also in Table 9, Heidke skill scores quantify the performance difference between the ANN and regression methods more clearly than the other indicators. The Maharloo Lake basin usually lacks rainfall in summer; thus the five classes were reduced to three classes (normal, dry and very dry). Consequently, the results of trend accuracy and Heidke skill score related to these classes were significantly higher than in other months.
The better results of the ANN model compared to the regression model on the relationships between atmospheric indices and regional precipitation showed the high flexibility of the ANN method and the nonlinear nature of the relationships. These results were consistent with the findings of previous studies (Teschl and Randeu, 2006; Li et al., 2012; Wu and Chau, 2010), which showed the ability of the ANN model to determine an atmospheric link between SST anomalies and precipitation over the inland area of the Persian Gulf.
In general, due to the large spatiotemporal distance between the climatic indices and precipitation at the destination as well as the coarse resolution of the data, the accuracy of the models was relatively low, which was also pointed out by some other scientists (Ferranti et al., 1990; Fernando and Jayawardena, 1998; Palmer et al., 2004; Guérémy et al., 2012). However, without considering the Heidke skill score, which eliminates randomly correct forecasts, the results (Table 9) are optimistic, especially for the dry months. The accuracy together with RMSE and R² can give reliable model output.
A future study should determine how different indices affect regional precipitation, whether the results are extendable to a larger area, and whether global changes such as global warming are influential in source or destination areas.

Figure 2. Comparisons between observations and ANN model results as well as regression model results: the arrows indicate the chosen data for validation (normalized precipitations are shown on the vertical axis).

Figure 3. Scatterplot of the predicted data of the regression and ANN models on a Taylor diagram.

Figure 4. Seasonal precipitation in accordance with El Niño and La Niña events.

Figure 5. Monthly average precipitation and the corresponding prediction accuracy by ANN model on Maharloo Basin.

Table 1. Primary selection of atmospheric circulation predictors.

Table 2. Cross-correlation between PNA index and precipitation.

Table 3. Ranked indices and proper time lags for each month of a year.

Table 4. Correlation matrix between January precipitation and selected indices.

Table 5. ANOVAs for the regressions of precipitation in January.

Table 6. Coefficients of univariate and multivariate regression models.

Table 7. Univariate and multivariate regression models for each month of a year.