Stochastic modeling of Lake Van water level time series with jumps and multiple trends



Introduction
Closed-basin lake ecosystems have been a matter of concern worldwide for decades. A dramatic example is the Aral Sea in Central Asia, which has repeatedly filled and dried under the effect of both natural and human forces. The most recent dry period started in the early 1960s, mainly due to the expansion of irrigation, which drained the tributary rivers, reduced the lake level by 23 m, shrank the lake surface area by 74 %, decreased its volume by 90 %, and increased its salinity from 10 g L⁻¹ to more than 100 g L⁻¹, causing negative ecological changes as well as social problems for the population residing around the lake (Micklin, 2007). Another example is Lake Urmia in northwestern Iran, one of the largest hypersaline lakes in the world. Due to drought and increased demand for agricultural water in the lake's basin, the salinity of the lake has risen to more than 300 g L⁻¹ in recent years, and large areas of the lake bed have been desiccated (Eimanifar and Mohebbi, 2007). The instrumental record of water level oscillations in the Caspian Sea, which starts in the 1900s, shows a progressive and dramatic decrease during the 1940s that continued until 1977, after which the water level has risen continuously, at tremendous cost to the surrounding countries. According to Vaziri (1997), the reasons proposed for the recent increase range from the implementation of the former Soviet Union's projects to the earth's tectonic movements, and from a wet period in the region that increased river inflows to global warming and reduced surface evaporation. A further example of water level decline was observed in Lake Toba, Indonesia, due to the use of water for power generation (Acreman et al., 1993).
As water resources become more limited due to higher demand, investigating changes in water bodies becomes increasingly important. Modeling lake water level fluctuations and the storage possibilities of closed-basin water bodies has therefore gained particular attention. The water level in closed-basin lakes depends on the balance between precipitation and evaporation. Such lakes, when large, contain large volumes of water and can effectively filter climatic noise, so the change observed in their water levels over a period of time may serve as a good indicator of climate change (Rodinov, 1994). In addition, scenarios of future lake water level change can be used for long-term planning of water resources. Observations and measurements in water bodies provide important information about the hydrology of the water system of interest. However, observations and measurements are not always available to users because installing and maintaining the instruments is time consuming and costly. Modeling approaches are therefore developed for understanding the hydrology of water bodies, including lakes.
Several studies are available that model and evaluate lake water level fluctuations. Examples selected from the literature are as follows. Khavich and Ben-zvi (1995) present a model based on the water budget to forecast daily changes in the water level of Lake Kinneret, Israel, during flood periods. The water balance of the Aral Sea was simulated by Small et al. (1999) with a lake model coupled to NCAR's regional climate model (RegCM2). Multi-source satellite-driven data such as satellite-based rainfall estimates, modeled runoff, evapotranspiration, and a digital elevation dataset were used to model the water levels of Lake Turkana, East Africa, from 1998 to 2009 (Velpuri et al., 2012). Support vector machine (SVM), artificial neural network (ANN) and adaptive neuro-fuzzy inference system (ANFIS) models were developed to predict water levels for Lake Erie, North America (Khan and Coulibaly, 2006), for Lake Urmia, Iran (Talebizadeh and Moridnejad, 2011), and for Lakes Egirdir and Iznik, Turkey (Guldal and Tongal, 2010; Kisi et al., 2012).
Water level fluctuations of Lake Van have been studied as well. Because of the high salinity of its water, changes in the water level of the deep closed-basin Lake Van have been analyzed in order to investigate the mixing conditions (Kaden et al., 2010). Kadioglu et al. (1999) applied a method based on the water balance equation, with an added cumulative departures concept to show additional water level increments along the time axis. The monthly water level of the lake was studied with a stochastic model that provides a basis for assessing the expected extreme water levels at different risk percentages (Sen et al., 2000). To predict the lake water level fluctuations, models based on the triple diagram, ANNs and SVMs were developed for Lake Van (Altunkaynak et al., 2003; Cimen and Kisi, 2009).
In this study, precipitation data obtained from meteorological stations surrounding Lake Van are first analyzed. The water budget of the lake is established based on the inflow to the lake (precipitation and runoff) and the outflow from it (evaporation). A stochastic model is then generated at annual scale using the lake water level data recorded at monthly time scale at the gauging station in Tatvan. The data used are described first and the water budget is then explained, after which the developed autoregressive models are detailed. Results and conclusions follow.

Lake Van and water level data
Lake Van is a saline closed-basin lake located in Eastern Anatolia, Turkey, between 37°43′ and 39°26′ N and 42°40′ and 44°31′ E (Fig. 1a). It is the largest lake in Turkey and the largest soda lake in the world. The surface area and drainage area of the lake are 3502 and 12 956 km², respectively. The lake lies 1646 m above mean sea level, has a maximum depth of 451 m and a water volume of 576 km³. It is fed by direct precipitation on the lake, runoff from contributing rivers and snowmelt. Lake Van does not have any natural outlet; water leaves the lake by evaporation only. The lack of outflow causes salts to accumulate in the lake and increases its salinity (Thiel et al., 1997). Because of this high salinity, water in the lake cannot be used for drinking or irrigation, and only a limited number of freshwater fish species can live in it.
The water level rises after spring, when snow melts on the surrounding mountains, whereas the lake is at its lowest level during the winter months because of the continental climate. Under normal climate conditions, the lake water level fluctuates by 50-60 cm at annual scale.
Monthly lake water level data recorded from 1944 to 2007 (768 monthly values) at the Tatvan gauging station (Fig. 1b) are used in this study. As seen from Fig. 2, the water level in the lake stays almost constant until a sudden upward jump is recorded. The water level remains constant at the increased level until another upward jump, after which a gradually increasing trend is observed in the 1990s until the maximum level is reached. A gradually decreasing water level followed by an increasing trend is recorded at the end of the time series. The lake water level data in Fig. 2 are differences between the measured water level and the limnograph base level (1646.59 m). The minimum and maximum lake water levels were recorded as 1646.68 and 1650.53 m, respectively. During the observation period, the mean lake water level was 1648.31 m. The standard deviation of the annual difference in the lake water level is 92 cm.

Water budget
According to Landmann et al. (1996), rivers add 2.1 km³ yr⁻¹ and direct precipitation adds another 1.7 km³ yr⁻¹ to Lake Van, while evaporation from the lake amounts to 3.8 km³ yr⁻¹ under stationary lake water level conditions. As evaporation is the only way water leaves the lake, the sudden and gradual changes (jumps and trends) observed in the lake water level (Fig. 2) can be linked to the water budget of the lake: any imbalance between precipitation, runoff and evaporation is taken up by a change in the lake water level. For this purpose, the water budget of the lake was established at annual scale based on precipitation data obtained from 11 meteorological stations surrounding the lake (Fig. 1b). Only precipitation, runoff and evaporation were used for establishing the water budget. As inflows, direct precipitation falling on the lake surface was taken together with runoff reaching the lake through the river system. Precipitation data, available as annual totals from 1975 to 2010, were up-scaled from station (point) scale to lake (area) scale by using the Thiessen polygon method. Inflow falling as precipitation over the lake surface was calculated as the sum of the precipitation at each meteorological station weighted by the part of the lake surface area assigned to that station (see Fig. 1b). This was considered as the total precipitation over the lake. Similarly, precipitation over the lake watershed was summed after being weighted by the watershed area assigned to each meteorological station. The precipitation falling over the watershed was converted into runoff by multiplying it by the runoff coefficient, reported as 0.26 for the Lake Van closed basin (Bayazit, 1999). Evaporation data were available in Batur et al. (2008) as annual totals at a station located in Van from 1956 to 2007.
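The Thiessen weighting and runoff conversion described above can be sketched as follows. This is a minimal illustration, not the authors' computation: the function names and the way station areas are passed in are assumptions, and the only value taken from the text is the runoff coefficient of 0.26.

```python
def areal_precipitation_mm(precip_mm, areas_km2):
    """Thiessen-weighted mean precipitation depth (mm) over a region.

    precip_mm[i] is the annual total at station i and areas_km2[i] is the
    polygon area assigned to that station (hypothetical inputs).
    """
    total_area = sum(areas_km2)
    return sum(p * a for p, a in zip(precip_mm, areas_km2)) / total_area


def annual_inflow_km3(precip_mm, lake_areas_km2, shed_areas_km2,
                      runoff_coeff=0.26):
    """Return (direct precipitation, runoff) volumes in km^3 for one year."""
    mm_km2_to_km3 = 1e-6  # 1 mm of depth over 1 km^2 equals 1e-6 km^3
    direct = sum(p * a for p, a in zip(precip_mm, lake_areas_km2)) * mm_km2_to_km3
    on_shed = sum(p * a for p, a in zip(precip_mm, shed_areas_km2)) * mm_km2_to_km3
    return direct, runoff_coeff * on_shed
```

Each station contributes twice, once through the share of the lake surface inside its polygon and once through its share of the watershed, which is why the two area lists are kept separate.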
As the three components (precipitation, runoff and evaporation) are balanced by the change in the lake water level, it is meaningful to compare the lake water level calculated through the water budget with that measured at the gauging station. In Fig. 3, the change in the water level of the lake is referenced to the previous year; i.e., the change is positive or negative depending on whether an increase or a decrease is observed or calculated relative to the previous year's water level. The calculated and measured lake water levels follow each other; the maximum levels in 1988 and 1993 and the minimum level in 2000 show this match clearly. However, although the calculated values follow a pattern similar to the measured ones, they differ considerably in several years. For instance, in 2000 the difference is about 50 cm. This can be attributed to the coarseness of the analysis made with the data explained above. The analysis was nevertheless important for checking whether the gradual and sudden changes are due to hydrometeorological conditions. Only precipitation and evaporation data were considered. Surface runoff was taken into account as the product of the reported average runoff coefficient and the total precipitation falling over the lake watershed. The reported average runoff coefficient clearly cannot represent temporal changes or spatial heterogeneity within the watershed. No snow component was considered in the water budget either. For a more detailed analysis, not only surface runoff but also baseflow should be considered in order to capture the contribution of the delayed runoff recharging the lake. Even with these limitations, the water budget of Lake Van provides enough evidence that trends and jumps in the lake water level are due to changing hydrometeorological conditions in the closed basin. If more detailed data are used and the analysis is made with a time interval shorter than a year, a closer agreement between observed and calculated lake water levels can be expected.
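The conversion from an annual volume imbalance to a water level change can be sketched as below. The sketch assumes a constant lake surface area (3502 km², the value quoted earlier), which is a simplification introduced here; the function names are hypothetical.

```python
def level_change_cm(precip_km3, runoff_km3, evap_km3, lake_area_km2=3502.0):
    """Annual water level change (cm) implied by the water budget.

    Assumes the lake surface area does not change with the water level.
    """
    net_km3 = precip_km3 + runoff_km3 - evap_km3
    return net_km3 / lake_area_km2 * 1e5  # km of depth converted to cm


def mean_abs_error(calculated, observed):
    """Mean absolute difference between calculated and observed changes."""
    return sum(abs(c - o) for c, o in zip(calculated, observed)) / len(observed)
```

With the stationary budget of Landmann et al. (1996), `level_change_cm(1.7, 2.1, 3.8)` is zero to within rounding, as expected for a lake in balance.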
As a result, any sudden or gradual change (upward or downward jump, increasing or decreasing trend) in the lake water level is due to the response of the lake to the imbalance between inflow and outflow. No delay is observed in the lake response, probably because of the coarse time scale used in this study. If the analysis is made at a shorter time interval (say monthly), a delay in the lake response may become apparent. For example, Gencsoy (1997) found a 2-month delay in the lake response based on monthly inflow and outflow data.

Mono-and multiple-trend models
Autoregressive moving average (ARMA) models (Box and Jenkins, 1970) are widely used in hydrology and water resources studies (Yevjevich, 1972). Consider a stationary time series $\ldots, x_{t-1}, x_t, x_{t+1}, \ldots$, normally distributed with mean $\mu$ and variance $\sigma^2$, observed at equally spaced times $\ldots, t-1, t, t+1, \ldots$. In the model construction, the variable $x$ of the time series is standardized by

$$y_t = \frac{x_t - \mu}{\sigma} \qquad (1)$$

with which the time series is converted to another time series $\ldots, y_{t-1}, y_t, y_{t+1}, \ldots$ with zero mean and unit variance.
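As a small illustration of the standardization step, the following sketch converts a series to zero mean and unit variance. It uses the population standard deviation, which is an assumption; the paper does not say which estimator was used.

```python
import statistics


def standardize(x):
    """Return (y, mu, sigma) where y_t = (x_t - mu) / sigma."""
    mu = statistics.fmean(x)
    sigma = statistics.pstdev(x)  # population standard deviation (assumed)
    return [(v - mu) / sigma for v in x], mu, sigma
```

Keeping `mu` and `sigma` alongside the standardized values makes it possible to transform synthetic series back to water level units later.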
The ARMA-type models are well established in the literature (Salas et al., 1980) and therefore only very brief information is provided below. This type of model is applied to stationary time series free of deterministic components, i.e., trends and periodicities. Therefore, any trend and periodicity disrupting the homogeneous structure of the water level data of Lake Van should be determined and removed from the time series to obtain its stochastic component. The trend can be represented by a first-order polynomial function as

$$T_t = a + b\,t \qquad (2)$$

where $T_t$ is the trend component at time $t$ and $a$ and $b$ are the intercept and slope of the trend line, while the periodicity, which corresponds to cycles or periodic changes with time, can be fitted by a Fourier series.
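Removing the deterministic components can be sketched as a least-squares line followed by a low-order Fourier series fitted to the monthly means of the detrended values. The number of harmonics and the function names are assumptions made for illustration; the paper does not report how many harmonics were retained.

```python
import math


def fit_linear_trend(x):
    """Least-squares intercept a and slope b of a + b*t for t = 0..n-1."""
    n = len(x)
    t_mean = (n - 1) / 2
    x_mean = sum(x) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (v - x_mean) for t, v in enumerate(x))
    b = sxy / sxx
    return x_mean - b * t_mean, b


def fit_fourier(residual, period=12, n_harmonics=2):
    """Fit a Fourier series to the mean of each position within the period.

    Valid for n_harmonics < period / 2; returns a function of time t.
    """
    sums, counts = [0.0] * period, [0] * period
    for t, v in enumerate(residual):
        sums[t % period] += v
        counts[t % period] += 1
    means = [s / c for s, c in zip(sums, counts)]
    mean_level = sum(means) / period
    coeffs = []
    for j in range(1, n_harmonics + 1):
        w = 2 * math.pi * j / period
        a_j = 2 / period * sum(m * math.cos(w * k) for k, m in enumerate(means))
        b_j = 2 / period * sum(m * math.sin(w * k) for k, m in enumerate(means))
        coeffs.append((a_j, b_j))

    def periodic(t):
        return mean_level + sum(
            a_j * math.cos(2 * math.pi * j * t / period)
            + b_j * math.sin(2 * math.pi * j * t / period)
            for j, (a_j, b_j) in enumerate(coeffs, start=1)
        )

    return periodic
```

Subtracting first the trend and then `periodic(t)` from each observation leaves the residual that is passed on to the stochastic model.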
In this study, modeling the lake water level was performed in two different ways. It is obvious from Fig. 4a that the water level time series has an increasing trend which can be fitted by Eq. (2) when the whole time series is taken at once. It is clear, at the same time, that the time series can be divided into segments, each with its own statistical characteristics. Considering that the time series depicts a number of different segments, it was decided, in this study, to fit multiple trend lines, one to each segment within the time series. The lake water level time series is divided into periods, each fitted by a trend line, through use of segmentation software called SEGMENTER (Aksoy et al., 2007, 2008; Gedikli et al., 2008, 2010a, b).
The segmentation of a time series divides the observations into subseries whose statistical characteristics are similar within each subseries and different between subseries. This is called jump analysis and can also be considered a change point detection problem. The simplest case is segmentation-by-constant, which aims to determine the change points at which the mean of a segment is statistically different from the means of both the next and the previous segments. Besides segmentation-by-constant, segmentation with regression-by-lines or higher order polynomials can be used. In this study, segmentation with regression-by-lines was used because of the linear trend fit in Eq. (2). In linear segmentation, a time series can be segmented into as many linear pieces as half the number of items in the time series. Segmentation is particularly informative when the inner trends behave differently from the trend taken over the whole dataset, which is the case in Fig. 4b: the increasing mono-trend fitted to the whole time series in Fig. 4a takes on a different character when multiple inner trends are taken into account. Two models were developed in this study. In the first model, the time series was treated as a whole under the hypothesis that it has an increasing trend as in Fig. 4a. In the second model, the time series was divided into a number of segments, ranging from 2 to 30, to each of which a linear trend can be fitted (Teltik, 2008). The former was called the mono-trend model while the latter was defined as the multiple-trend model. Both the mono- and multiple-trend models were used for simulation of synthetic lake water level time series under the hypothesis that the observed mono- and multiple-trend structure of the lake water level will persist during the simulation period. The mono-trend hypothesis can be accepted, while the repeatability of the multiple-trend structure of the time series during the simulation period can be questioned. Investigations on the Lake Van water level (Landmann et al., 1996) showed that the lake had experienced such changes (increases and decreases in the water level) in its history, and that the highest lake terrace was about 70 m above its present level. As referenced by Sen et al. (2000), water level fluctuations on the order of a few meters were reported in the 19th century. Based on this information, it is reasonable to assume that a multiple-trend structure of the water level time series is likely to occur during the simulation period, although the order, length and steepness of the trends might change.
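The idea of segmentation with regression-by-lines described above can be sketched with a small dynamic program that chooses the breakpoints minimizing the total squared error of the fitted lines. This is an illustrative stand-in and not the algorithm implemented in SEGMENTER; the minimum segment length of two points mirrors the remark that a series can be split into at most half as many linear pieces as it has items.

```python
def line_sse(y, i, j):
    """Sum of squared errors of the least-squares line fitted to y[i:j]."""
    n = j - i
    t_mean = (i + j - 1) / 2
    y_mean = sum(y[i:j]) / n
    sxx = sum((t - t_mean) ** 2 for t in range(i, j))
    sxy = sum((t - t_mean) * (y[t] - y_mean) for t in range(i, j))
    b = sxy / sxx
    return sum((y[t] - y_mean - b * (t - t_mean)) ** 2 for t in range(i, j))


def segment_by_lines(y, n_segments, min_len=2):
    """Return [(start, end), ...] (end exclusive) minimizing the total SSE."""
    n = len(y)
    if n < n_segments * min_len:
        raise ValueError("series too short for the requested segments")
    inf = float("inf")
    # cost[k][j]: best SSE when the first j points form k segments
    cost = [[inf] * (n + 1) for _ in range(n_segments + 1)]
    back = [[0] * (n + 1) for _ in range(n_segments + 1)]
    cost[0][0] = 0.0
    for k in range(1, n_segments + 1):
        for j in range(k * min_len, n + 1):
            for i in range((k - 1) * min_len, j - min_len + 1):
                if cost[k - 1][i] == inf:
                    continue
                c = cost[k - 1][i] + line_sse(y, i, j)
                if c < cost[k][j]:
                    cost[k][j], back[k][j] = c, i
    bounds, j = [], n
    for k in range(n_segments, 0, -1):  # walk the stored breakpoints back
        i = back[k][j]
        bounds.append((i, j))
        j = i
    return bounds[::-1]
```

The cost grows roughly with the cube of the series length, which is acceptable for a few dozen annual values but would need a faster scheme for long monthly records.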
Among the multiple-trend models, the four-trend model was considered. This choice is based mainly on the obvious segmentation in Fig. 4b: the first trend is fitted to an almost constant segment, followed by a negative trend, after which two increasing trends steeper than the mono-trend in Fig. 4a are observed. The four-trend model was selected not only for this reason but also because it yielded more extreme water levels than the other multiple-trend models tested. The periodic component of the water level data was treated with the Fourier series in both the mono- and multiple-trend models.
For the sake of obtaining a parsimonious model, AR(1), AR(2) and ARMA(1,1) models are commonly used in the literature when water resources-related data are concerned. In this study, these models were tested as alternatives to one another. The Akaike information criterion (AIC) computed for the mono- and multiple-trend models pointed to AR(2) as the most suitable model, which is given by

$$y_t = \frac{r_1 (1 - r_2)}{1 - r_1^2}\, y_{t-1} + \frac{r_2 - r_1^2}{1 - r_1^2}\, y_{t-2} + \varepsilon_t \qquad (3)$$

where $r_1$ and $r_2$ are the lag-1 and lag-2 correlation coefficients, respectively, and $\varepsilon_t$ is the independent random component. AIC values are given together with the model parameters in Table 1 for the mono- and multiple-trend models. In both cases, the minimum AIC was obtained for the AR(2) model, demonstrating that AR(2) is more suitable than the AR(1) and ARMA(1,1) models (see Salas et al., 1980 for details of the AR(1) and ARMA(1,1) models). Correlograms of the models were also calculated, as in Fig. 5. The correlograms of the AR(2) models are in better agreement with the correlogram of the observed lake water level time series for both the mono- and multiple-trend cases. The correlogram of the AR(2) model matched the observed correlogram exactly for the first lags. This validates the developed AR(2) model and hence supports its applicability.

The maximum lake water level is of practical importance because of the inundation problem in areas surrounding the lake after a possible water level increase. Therefore, monthly synthetic lake water level time series, 1000 yr long, are generated by the mono- and multiple-trend AR(2) models. Maximum lake water levels of different return periods were calculated from the synthetic sequences. Table 2 shows maximum lake water levels for given probabilities together with their observed counterparts.
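The AR(2) fitting and generation steps can be sketched as follows for a standardized series. The parameters are obtained from the lag-1 and lag-2 correlations through the Yule-Walker relations; the AIC expression is one common form for ARMA models and is an assumption here, as are the warm-up length and the function names.

```python
import math
import random


def autocorr(y, lag):
    """Sample autocorrelation coefficient of y at the given lag."""
    n, m = len(y), sum(y) / len(y)
    c0 = sum((v - m) ** 2 for v in y)
    ck = sum((y[t] - m) * (y[t + lag] - m) for t in range(n - lag))
    return ck / c0


def ar2_parameters(r1, r2):
    """Yule-Walker AR(2) coefficients and residual variance (unit-variance y)."""
    phi1 = r1 * (1 - r2) / (1 - r1 ** 2)
    phi2 = (r2 - r1 ** 2) / (1 - r1 ** 2)
    resid_var = 1 - phi1 * r1 - phi2 * r2
    return phi1, phi2, resid_var


def aic(n, resid_var, n_params):
    """AIC in the form n*ln(residual variance) + 2*(number of parameters)."""
    return n * math.log(resid_var) + 2 * n_params


def simulate_ar2(phi1, phi2, resid_var, n, seed=0, warmup=100):
    """Generate n standardized values; the warm-up values are discarded."""
    rng = random.Random(seed)
    scale = math.sqrt(resid_var)
    y = [0.0, 0.0]
    for _ in range(n + warmup):
        y.append(phi1 * y[-1] + phi2 * y[-2] + scale * rng.gauss(0.0, 1.0))
    return y[-n:]
```

A synthetic water level series would then be obtained by rescaling the generated values with the standard deviation and mean and adding the periodic and trend components back.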
As seen from Table 2, the probable maximum lake water level for the multiple-trend model is higher than that of the mono-trend model for all return periods. As an example, the maximum lake water level for a 50 yr return period was simulated as 395.8 cm by the mono-trend model, while it was calculated as 404.8 cm by the multiple-trend model. On the one hand, the results of the mono-trend model show that the maximum lake water level never reached 400 cm, even for the 1000 yr return period. On the other hand, the multiple-trend model exceeded this threshold already for the 50 yr return period. In this context, the multiple-trend model is considered more appropriate than the mono-trend model for planners and practitioners who want to be on the safe side, particularly in the long-term planning of lakeshore infrastructure development projects.
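One way to read return-period maxima off a long synthetic sequence is sketched below: annual maxima are extracted from the monthly values and ranked, and the level for a return period T is taken at exceedance probability 1/T using the Weibull plotting position. The paper does not state how its return levels were computed, so this procedure and the function names are assumptions.

```python
import math


def annual_maxima(monthly, months_per_year=12):
    """Maximum of each complete year in a monthly series."""
    n_years = len(monthly) // months_per_year
    return [max(monthly[i * months_per_year:(i + 1) * months_per_year])
            for i in range(n_years)]


def return_level(annual_max, return_period):
    """Empirical level exceeded on average once every return_period years.

    Rank m (largest first) is assigned exceedance probability m / (n + 1).
    """
    ranked = sorted(annual_max, reverse=True)
    n = len(ranked)
    rank = (n + 1) / return_period  # possibly fractional rank
    if rank < 1:
        raise ValueError("series too short for this return period")
    lo = min(int(math.floor(rank)), n)
    hi = min(lo + 1, n)
    frac = rank - math.floor(rank)
    return ranked[lo - 1] + frac * (ranked[hi - 1] - ranked[lo - 1])
```

With a 1000 yr synthetic series the longest return period that can be read directly this way is about 1000 yr; longer ones would require fitting an extreme value distribution.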

Conclusions
In this study, changes in the water level of the closed-basin Lake Van were analyzed by establishing the water budget between the inflow to and outflow from the lake. The inflow into the lake is composed of precipitation in the form of either rainfall or snow, which either falls on the lake directly or arrives later at the lake as runoff. The only outflow from the lake is evaporation. Any imbalance between inflow and outflow was taken up by an increase or decrease in the lake water level, observed in the form of sudden or gradual changes. The water budget of Lake Van showed that any sudden or gradual change in the lake water level was due to the response of the lake to an imbalance between precipitation, runoff and evaporation.
Two stochastic models were also developed for the simulation of the water level time series of the lake. Mono-trend and multiple-trend cases were evaluated. In the mono-trend case, a linear trend line was fitted to the whole time series of the lake water level at once. In the second case, unlike in previous studies, a multiple-trend approach was used, in which the time series was divided into a number of segments, each represented by its own linear trend. Periodicity in both cases was represented by a Fourier series. After the trend and periodicity were removed from the time series, second-order autoregressive AR(2) type stochastic models were developed for both the mono-trend and multiple-trend cases.
A 1000 yr synthetic lake water level time series was generated by using the mono- and multiple-trend models. Simulations have shown that the multiple-trend models generate higher maxima than the mono-trend models. Therefore, from an engineering point of view, the synthetic lake water level time series generated by the multiple-trend model is more suitable and safer than that of the mono-trend model for planning infrastructure projects along the lakeshore.

Fig. 1. (a) Location of Lake Van in Turkey, (b) Lake water level gauging station in Tatvan and precipitation stations surrounding the lake.

Fig. 3. Observed and calculated change in the lake water level relative to previous year.

Fig. 4. Lake water level time series fitted by (a) mono-trend line, and (b) multiple-trend lines.

Table 1. AIC values and parameters calculated for each model tested.

Table 2. Observed and synthetic maximum lake water levels.