Stochastically generated streamflow time series are widely used in water resource planning and management. Such series represent sets of plausible yet unobserved streamflow realizations which should reproduce the main characteristics of observed data. These characteristics include the distribution of daily streamflow values and their temporal correlation as expressed by short- and long-range dependence. Existing streamflow generation approaches have mainly focused on the time domain, even though simulation in the frequency domain offers favourable properties, namely the simulation of both short- and long-range dependence and the extension to multiple sites. Simulation in the frequency domain is based on the randomization of the phases of the Fourier transformation. We here combine phase randomization simulation with a flexible, four-parameter kappa distribution, which allows for extrapolation to as yet unobserved low and high flows. The simulation approach consists of seven steps: (1) fitting the theoretical kappa distribution, (2) normalization and deseasonalization of the marginal distribution, (3) Fourier transformation, (4) random phase generation, (5) inverse Fourier transformation, (6) back transformation, and (7) simulation. The simulation approach is applicable to both individual and multiple sites. It was applied to and validated on a set of four catchments in Switzerland. Our results show that the stochastic streamflow generator based on phase randomization produces realistic streamflow time series with respect to distributional properties and temporal correlation. However, cross-correlation among sites was in some cases found to be underestimated. The approach can be recommended as a flexible tool for various applications such as the dimensioning of reservoirs or the assessment of drought persistence.

Stochastic simulation of streamflow time series for individual and multiple sites by combining phase randomization and the kappa distribution.

Simulated time series reproduce temporal correlation, seasonal distributions, and extremes of observed time series.

Simulation procedure suitable for use in water resource planning and management.

Stochastically generated streamflow time series are used in various applications of water resource planning and management. These applications include water and reservoir management, the determination of the dimensions of hydraulic structures such as reservoirs, and the estimation of hydrological extremes such as droughts and floods.
Stochastically generated time series mimic the characteristics of observed data and represent sets of plausible realizations of streamflow sequences

Stochastic models for the generation of synthetic streamflow time series need to fulfil certain requirements. They should reproduce both the marginal distribution of observed streamflow time series as well as their temporal dependence structure

Many different approaches have been proposed for the stochastic simulation of streamflow time series, each able to fulfil some but usually not all of the desired properties listed above. One commonly used approach is the use of a synthetic weather generator in combination with a rainfall–runoff model

A first group of models consists of parametric models such as autoregressive moving average (ARMA) models and their modifications. While these models are commonly used in stochastic hydrology, they only allow for modelling of short-range dependence because their autocorrelation decreases strongly with increasing lag time

Several alternatives to these well-established simulation procedures have been proposed, which allow for a flexible choice of marginal distributions. These include models where the temporal dependence structure is modelled with copula functions, which are, however, difficult to apply for higher orders of autocorrelation

All these previously mentioned models are based on the time domain. An alternative to time-domain models is frequency-domain models

We now turn to some theoretical background on Fourier transformation and phase randomization. For a more detailed introduction to the Fourier transformation, the reader is referred to textbooks by

The basic idea behind all surrogate methods is to randomize the Fourier phases of the underlying (hydrological) process. The Fourier transformation converts a time-domain signal into a frequency-domain signal, which is complex-valued. This transformation may be depicted as a decomposition of the time series into sine and cosine waves of different amplitude, phase, and period

The surrogate approach utilizes the property that realizations of linear Gaussian processes differ only in their Fourier phases and not their power spectrum. It preserves the autocorrelation structure of the raw series by conserving its power spectrum through phase randomization. The procedure consists of three main steps

The Fourier transformation of a given time series
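The general surrogate idea described above (transform, randomize the phases, transform back) can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the function name `phase_randomize` is hypothetical, the input is assumed to be an already normalized series, and the phases are assumed to be drawn uniformly from [-π, π].

```python
import numpy as np

def phase_randomize(x, rng):
    """Return a surrogate of x with the same amplitude spectrum but random phases.

    Sketch only: assumes x is a 1-D, approximately Gaussian series.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    spec = np.fft.rfft(x)              # one-sided Fourier transform of the real series
    amplitudes = np.abs(spec)          # amplitudes carry the power spectrum
    phases = rng.uniform(-np.pi, np.pi, size=spec.size)
    phases[0] = np.angle(spec[0])      # keep the zero-frequency term (preserves the mean)
    if n % 2 == 0:
        phases[-1] = np.angle(spec[-1])  # Nyquist term must stay real for even n
    # irfft enforces the conjugate symmetry needed for a real-valued output
    return np.fft.irfft(amplitudes * np.exp(1j * phases), n=n)
```

Because only the phases change, the surrogate keeps the amplitude spectrum, and hence the (circular) autocorrelation, of the input series.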

Map showing the four Swiss catchments: (1) Plessur, (2) Birse, (3) Thur, and (4) Cassarate.

Here, we use phase randomization to simulate stochastic streamflow time series to be used in various water resource management studies. The stochastic series generated using phase randomization are combined with a theoretical distribution to allow extrapolation to unobserved values which still realistically represent daily streamflow values. The observed streamflow time series require pre-treatment before phase randomization can be applied. First, they need to be normalized because phase randomization assumes Gaussianity
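One way to picture the normalization and the later back transformation is a rank-based normal score transform, sketched below in Python. This is an illustration under assumptions rather than the procedure used here: the helper names `to_normal_scores` and `back_transform`, the plotting position rank/(n + 1), and the rank-based back transformation through a user-supplied quantile function are all choices made for this sketch, and deseasonalization is left out.

```python
import numpy as np
from statistics import NormalDist

def to_normal_scores(x):
    """Map values to standard normal quantiles via their ranks (ties broken by order)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    ranks = np.argsort(np.argsort(x)) + 1      # ranks 1..n
    probs = ranks / (n + 1)                    # assumed plotting position, stays inside (0, 1)
    std_normal = NormalDist()
    return np.array([std_normal.inv_cdf(p) for p in probs])

def back_transform(z_sim, quantile_fn):
    """Map simulated normal values to the target distribution by rank.

    quantile_fn is any vectorized quantile function (hypothetical argument),
    for example that of a fitted theoretical distribution.
    """
    z_sim = np.asarray(z_sim, dtype=float)
    n = z_sim.size
    ranks = np.argsort(np.argsort(z_sim)) + 1
    return quantile_fn(ranks / (n + 1))
```

Both helpers preserve the rank order of their input, so the temporal ordering imposed in the normal domain carries over to the back-transformed values.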

The kappa distribution was found to be suitable for fitting observed streamflow data in US catchments
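For orientation, the quantile function and cumulative distribution function of the four-parameter kappa distribution can be written down in a few lines of Python. The sketch assumes the common parameterization with location `xi`, scale `alpha`, and shape parameters `k` and `h`, both non-zero; the limiting cases `k = 0` and `h = 0` are not handled, and the function names are hypothetical.

```python
import numpy as np

def kappa_quantile(p, xi, alpha, k, h):
    """Quantile function x(F) = xi + (alpha / k) * (1 - ((1 - F**h) / h)**k)."""
    p = np.asarray(p, dtype=float)
    return xi + (alpha / k) * (1.0 - ((1.0 - p**h) / h) ** k)

def kappa_cdf(x, xi, alpha, k, h):
    """CDF F(x) = (1 - h * (1 - k * (x - xi) / alpha)**(1 / k))**(1 / h).

    Only valid for x inside the support implied by the parameters.
    """
    x = np.asarray(x, dtype=float)
    return (1.0 - h * (1.0 - k * (x - xi) / alpha) ** (1.0 / k)) ** (1.0 / h)
```

With a fitted parameter set, `kappa_quantile` could serve as the quantile function when mapping simulated values from the normal domain back to streamflow; the parameter values themselves would have to come from a separate fitting step that is not shown.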

The method is extended to the simulation of stochastic streamflow time series at multiple sites. To model the cross-correlation between sites, the phase randomization in Step 4 of the procedure is carried out in the same way for all the stations in the data set
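A minimal Python sketch of one possible reading of this multi-site step follows. It assumes that "in the same way" means adding one shared set of random phase shifts to the Fourier phases of every station; the function name `phase_randomize_multisite` and the array layout (time along rows, stations along columns) are assumptions made for illustration.

```python
import numpy as np

def phase_randomize_multisite(data, rng):
    """Apply identical random phase shifts to all stations.

    data: array of shape (n_time, n_sites), assumed already normalized.
    """
    data = np.asarray(data, dtype=float)
    n = data.shape[0]
    spec = np.fft.rfft(data, axis=0)                      # per-station spectra
    shift = rng.uniform(-np.pi, np.pi, size=spec.shape[0])
    shift[0] = 0.0                                        # leave the mean untouched
    if n % 2 == 0:
        shift[-1] = 0.0                                   # Nyquist term must stay real
    rotated = spec * np.exp(1j * shift)[:, None]          # same shift for every station
    return np.fft.irfft(rotated, n=n, axis=0)
```

Since every station receives the same shift at each frequency, the phase differences between stations, and therefore their cross-correlation in the normal domain, are left unchanged.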

The simulation was validated on the observed streamflow time series of a set of four catchments in Switzerland (Fig.

List of catchments and catchment summary including ID, river name, gauging station, catchment area, station elevation, mean elevation, and flow regime.

The model outlined in the previous section was fitted to the observed time series over 50 years (1960–2009) for each individual catchment. The application of this approach is only recommended for records longer than 30 years to reduce uncertainty in the estimation of the parameters of the kappa distribution. The model was then run both for each catchment individually and for the four sites jointly.
In both cases, 100 sets of stochastic streamflow time series of the same length as the observed series were generated as in

Both the temporal correlation structure and seasonal streamflow statistics were used to compare observed and simulated streamflow time series in order to assess the validity of the stochastic streamflow generation model.
As in

Observed (grey) and stochastically generated (orange) annual hydrographs at daily resolution over 30 years for the Plessur catchment.

Comparison of observed and stochastically generated time series for the melt-dominated Plessur catchment (upper two rows) and the rainfall-dominated Birse catchment (lower two rows) for the following characteristics: mean hydrograph over 50 years, autocorrelation function, partial autocorrelation function, seasonal distributions, monthly means, monthly maxima, monthly minima, and monthly standard deviations. Black lines represent observations, while orange lines represent simulations.

The stochastic streamflow generator was found to produce realistic annual hydrograph realizations as illustrated in Fig.

The stochastic generator produces time series with mean regimes similar to the observed mean regime, and reproduces both the autocorrelation function (ACF) and the partial autocorrelation function (PACF). Seasonal distributions match well thanks to the good fit of the kappa distribution to the data. Monthly means and standard deviations match particularly well, while monthly maxima and minima show some deviations from the observed maxima and minima, as was intended by using a theoretical instead of an empirical distribution.
The suitability of the kappa distribution for producing realistic high and low flows is confirmed in Fig.

Low and high flows for observed (grey) and simulated (orange) time series for the four catchments Plessur, Birse, Thur, and Cassarate. The results are given for 10 simulation runs (S1–S10), and high flows are plotted with (middle column) and without (right column) outliers. Whiskers extend to the lowest/highest data point that is still within 1.5 times the interquartile range.

The stochastic streamflow generator is able to reproduce not only the streamflow distribution and the short-range dependence in the data, but also the long-range dependence over several years (Fig.

Autocorrelation (ACF) of annual streamflow sums of the observed and simulated streamflow time series for the catchments Plessur, Birse, Thur, and Cassarate.

The good performance of the stochastic streamflow generator with respect to streamflow distribution and temporal correlation – both short and long range – is not limited to these four example catchments, but generalizes to other data sets used as input.
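A check of long-range dependence on annual streamflow sums could be sketched in Python as follows. The helper names `acf` and `annual_sums` are hypothetical, and the sketch assumes daily series with 365 values per year (leap days removed beforehand) and the usual biased sample estimator of the autocorrelation.

```python
import numpy as np

def acf(x, max_lag):
    """Sample autocorrelation for lags 0..max_lag (biased estimator)."""
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()
    denom = np.dot(xc, xc)
    return np.array(
        [np.dot(xc[: x.size - lag], xc[lag:]) / denom for lag in range(max_lag + 1)]
    )

def annual_sums(daily, days_per_year=365):
    """Aggregate a daily series to annual sums, dropping any incomplete final year."""
    daily = np.asarray(daily, dtype=float)
    n_years = daily.size // days_per_year
    return daily[: n_years * days_per_year].reshape(n_years, days_per_year).sum(axis=1)
```

Applying `acf(annual_sums(series), max_lag)` to the observed series and to each simulated series gives curves that can be compared lag by lag.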

The stochastic streamflow generator can be extended from the simulation at individual sites to the joint simulation at multiple sites. In addition to reproducing distribution and temporal correlation at individual sites, it should then be able to reproduce the cross-correlation among sites, which describes the similarity of time series at two sites. Figure

Cross-correlation function (CCF) of observed (black line) and simulated (orange lines) daily streamflow for pairs of stations at Plessur, Birse, Thur, and Cassarate.

The stochastic streamflow generator based on phase randomization has been shown to produce realistic streamflow time series with respect to both distributional properties and temporal correlation. Compared to models commonly used for the stochastic generation of streamflow time series, such as autoregressive moving average models, the simulation approach presented here reproduces not only short-range, but also long-range dependence.
However, the representation of this dependence is limited to ranges within the length of the observed time series. Instead of producing one long time series, the simulation procedure allows for the simulation of multiple series of the same length as the original series. The use of ensembles of the same length as the observed time series might not be equivalent to using a long time series. Still, long-range dependence features may not be generated in either case since the model is fitted based on a limited number of years of observations. While the temporal dependence was well reproduced here, this is not necessarily the case under all conditions.

Phase randomization was here combined with the flexible four-parameter kappa distribution, which was found to effectively represent daily streamflow values. The distribution of daily flows was found to be modelled well in all seasons. However, the use of one distribution per day has the disadvantage of introducing a large number of parameters, which makes the model non-parsimonious

The generator can be used to simulate streamflow both at individual sites and jointly at multiple sites, which is not necessarily the case for other existing models. Its application to the example catchments, however, resulted in somewhat underestimated cross-correlations between stations. This underestimation can be explained by the fact that phase randomization preserves the cross-correlation in the normal domain but not necessarily in the domain of the original distribution.
This cannot be overcome even if the simulation run which best reproduces these cross-correlations is extracted from a large set of simulations. However,

The streamflow generator was here used on observed streamflow time series. The input time series, however, do not necessarily need to consist of observed values. One could also use the generator on streamflow simulated with a hydrological model. This extends its application to climate impact studies where a hydrological model is driven by meteorological time series generated with global and/or regional climate models. Alternatively, the representation of non-stationary conditions in the properties of the marginal distribution or the temporal dependence structure could also be achieved by adjusting the parameters of the marginal distribution or the frequency spectrum, respectively. Phase randomization simulation can potentially accommodate not only changing climate conditions, but also changes in land use or water extractions. The approach is not limited to the simulation of streamflow time series, but extends to other hydro-meteorological variables such as precipitation, evapotranspiration, or snowmelt. This would require testing and identifying a suitable marginal distribution. In the case of intermittent processes, mixed-type marginal distributions would need to be used

The stochastic streamflow generator presented here represents a flexible tool for streamflow simulation at individual or multiple sites. It can be used for various applications such as the design of hydropower reservoirs, the assessment of flood risk, or the assessment of drought persistence and the estimation of the risk of multi-year droughts.

The stochastic simulation procedure for a single site using the empirical, kappa, or any other distribution and some of the functions used to generate the validation plots are provided in the R package PRSim. The stable version can be found in the CRAN repository

The observational discharge data were provided by the Federal Office for the Environment (FOEN) and can be ordered from

AB and MIB jointly developed the concept and methodology of the study. MIB and RF set up the simulation approach. MIB did the data analysis, produced the figures, and wrote the first draft of the manuscript. The manuscript was revised by RF and AB and edited by MIB.

The authors declare that they have no conflict of interest.

We thank the reviewers Ashish Sharma, Demetris Koutsoyiannis, and Simon Papalexiou for their constructive comments.

This research has been supported by the Swiss Federal Office for the Environment (FOEN) (grant no. 15.0003.PJ/Q292-5096), the Deutsche Forschungsgemeinschaft (DFG) (grant no. Ba-1150/13-1), and the Swiss National Science Foundation (SNF) (grant no. 175529).

This paper was edited by Nadav Peleg and reviewed by Demetris Koutsoyiannis, Ashish Sharma, and Simon Michael Papalexiou.