A coupled stochastic rainfall – evapotranspiration model for hydrological impact analysis

A hydrological impact analysis concerns the study of the consequences of certain scenarios on one or more variables or fluxes in the hydrological cycle. In such an exercise, discharge is often considered, as floods originating from extremely high discharges often cause damage. Investigating the impact of extreme discharges generally requires long time series of precipitation and evapotranspiration to be used to force a rainfall-runoff model. However, such kinds of data may not be available and one should resort to stochastically generated time series, even though the impact of using such data on the overall discharge, and especially on the extreme discharge events, is not well studied. In this paper, stochastically generated rainfall and corresponding evapotranspiration time series, generated by means of vine copulas, are used to force a simple conceptual hydrological model. The results obtained are comparable to the modelled discharge using observed forcing data. Yet, uncertainties in the modelled discharge increase with an increasing number of stochastically generated time series used. Notwithstanding this finding, it can be concluded that using a coupled stochastic rainfall– evapotranspiration model has great potential for hydrological impact analysis.


Introduction
Precipitation is the most important variable in the terrestrial hydrological cycle that determines soil moisture and discharge from a watershed.As such, it also impacts water management where generally the occurrences of extreme events, e.g.storms or droughts which have very low frequen-cies, are of concern.Hence, very long time series of precipitation are needed.Because this kind of data is not always available, one may consider using a stochastically generated rainfall time series (Boughton and Droop, 2003).Stochastic rainfall models can be used to produce very long time series or to compensate for missing data from finite historical records (Wilks and Wilby, 1999).Several types of rainfall models have been proposed in the literature.Onof et al. (2000) grouped all continuous rainfall models into four types: (1) meteorological models, (2) stochastic multi-scale models, (3) statistical models and (4) stochastic process models.Meteorological models are able to describe the physical processes of all weather variables, including rainfall, by making use of very large and complex sets of equations.Numerical weather prediction and general circulation models are two common examples of this type of models.Stochastic multi-scale models describe the spatial evolution of the rainfall process regardless of scale factors.In general, these models involve an assumption of temporal invariance of rainfall over a range of scales (Bernardara et al., 2007).Statistical models, which can be used for simulating the precipitation trends, usually treat the occurrence and the amount of precipitation separately (Wilks and Wilby, 1999).The rainfall occurrence is represented by a sequence of dry and wet periods, usually simulated by Markov chains or alternating renewal models.The precipitation amounts can be arbitrarily generated by making use of some popular distributions, e.g. the exponential (Todorovic and Woolhiser, 1975), the gamma (Stern and Coe, 1984;Viglione et al., 2012) or the mixed exponential distribution (Woolhiser and Roldán, 1982;Wilks, 1998;Mason, 2004).Stochastic process mod-Published by Copernicus Publications on behalf of the European Geosciences Union.els use simple assumptions of physical processes to simulate the hierarchical structure of the rainfall process.In this approach, only a limited number of parameters are needed (Verhoest et al., 2010).The Bartlett-Lewis (BL;Rodriguez-Iturbe et al., 1987a) and the Neyman-Scott (Kavvas and Delleur, 1981) models are the most commonly used models of this type.In this study, we only focus on the BL models.These models have been applied successfully in different areas, such as Great Britain (Onof and Wheater, 1993;Onof et al., 1994;Cameron et al., 2000), Ireland (Khaliq and Cunnane, 1996), Belgium (Verhoest et al., 1997;Vandenberghe et al., 2010;Vanhaute et al., 2012), the United States of America (Rodriguez-Iturbe et al., 1987b;Velghe et al., 1994), New Zealand (Cowpertwait et al., 2007), Australia (Gyasi-Agyei, 1999;Heneker et al., 2001) and South Africa (Smithers et al., 2002).The BL models are chosen in this study for three main reasons: (1) they show a good performance in all recent studies, (2) they are capable of generating time series on a sufficiently fine timescale (less than 1 h), (3) their calibration is easy given the limited number of parameters and (4) they mimic well the stochastical behaviour of the historical time series at Uccle (Verhoest et al., 1997;Vanhaute et al., 2012), which is used in this study.The BL model will be employed on a monthly basis such that temporal changes in precipitation characteristics due to the annual cycle can be underpinned.Long-term changes, e.g.due to climate change, however, cannot be accounted for in this model set-up.
Besides precipitation, the water balance is also highly influenced by the amount of water that is lost due to evapotranspiration.An accurate estimation of evapotranspiration is essential for hydrological and agricultural designs, irrigation plans and for water distribution management (Droogers and Allen, 2002).The daily reference evapotranspiration is often modelled based on the Penman, Priestley-Taylor or Hargreaves equations; however, one major limitation of these models is that they require extensive input data, such as daily mean temperature, wind speed, relative humidity and solar radiation, which are not always available.Therefore, one may consider to rely on another approach based on stochastically generated time series.More importantly, in order to obtain a correct evaluation of the water balance of a catchment and its discharge, these stochastic evapotranspiration data need to be consistent with the accompanying precipitation time series data (Pham et al., 2016).In this case, we can make use of the copula-based approach introduced in the work of Pham et al. (2016) in which the statistical dependence between evapotranspiration, precipitation and temperature is described by three-and four-dimensional vine copulas.
Many modelling approaches exist for simulating catchment discharge.The simplest models are the conceptual models in which several linear (or nonlinear) reservoirs are put in series and/or parallel.Well-known examples of such conceptual models are the following: the Hydrologiska Byräns Vattenbalansavdelning model (Bergström, 1995), the NedborAfstromnings Model (Nielsen and Hansen, 1973) and the Probability Distributed Model (PDM; Moore, 2007).Alternatively, physically based models are based on scientific knowledge of different hydrological processes and their interactions.Generally, these models contain many more parameters than the conceptual ones and require more input data, such as soil type, vegetation-related information, etc. Well-known examples of such models are the Soil and Water Assessment Tool (Arnold et al., 1998), the Système Hydrologique Européen (Abbott et al., 1986) and the Common Land Model (Dai et al., 2003).In this study, we do not intend to seek the best hydrological model to assess our objective, but we opt for a model that is used in operational water management.More specifically, we will use PDM, as this model is used by the Flemish Environmental Agency (Cabus, 2008), and apply it to a catchment in Flanders, Belgium.The objective of this research is to assess whether the BL stochastically generated rainfall and consistent evapotranspiration time series can be used for hydrological impact analyses.In particular, we will evaluate different ways to apply stochastically modelled time series as forcing data to simulate the catchment's discharge.By regarding the actual observed time series as one realization of the meteorological process, the corresponding discharge can also be regarded as one realization.Actually, due to chaos occurring in the climatological system, a different time series could have been observed resulting in a discharge time series different from the actual observed one.The latter will hence provide other design values than those corresponding to the actual observed time series.In order to account for this kind of uncertainty, different cases, in which the number of stochastically generated input variables to the model is increased, are investigated.For these cases, the increase in uncertainty in modelled extremes and what portion of this increase can be attributed to the different stochastic generators, is assessed.
Section 2 describes the historical records and all models used within this study.Section 3 briefly introduces the coupled stochastic rainfall-evapotranspiration model and all the considered situations to simulate discharge from stochastic forcing data.The discharge simulations from different scenarios are then evaluated in Sect. 4 allowing for assessing the impact of stochastic data on the simulation of discharge.Finally, conclusions and recommendations are given in Sect. 5.

Historical data
This study uses observed time series measured in the climatological park of the Royal Meteorological Institute (RMI) at Uccle, near Brussels, Belgium.The data include time series of observed precipitation (mm) from 1898 to 2002, and mean daily temperature T ( • C) and daily reference evapotranspi-ration E (mm day −1 ) from 1931 to 2002.The time series of E is derived using the Penman-Monteith equation.The precipitation data have been recorded with a time resolution of 10 min from 1 January 1898 to 31 December 2002 measured by a Hellmann-Fuess pluviograph (Démarée, 2003).This data set is quite unique in hydrology due to its extraordinary length with a sampling frequency of 10 min.Its high quality is ensured by using the same method of processing and measuring at the same location since 1898 (Ntegeka and Willems, 2008).This time series has been used in several studies (Verhoest et al., 1997;Vaes and Berlamont, 2000;De Jongh et al., 2006;Ntegeka and Willems, 2008;Vandenberghe et al., 2010;Vanhaute et al., 2012;Pham et al., 2013;Willems, 2013;Pham et al., 2016) and is used to calibrate the rainfall model as explained in Sect.2.4.This time series has also been reprocessed to daily total precipitation (mm day −1 ), further referenced to as P , for the period of 1931-2002, which is then used together with the time series of T and E for the construction of different stochastic models.
In order to use the above-described data to fit copulas, the data should be independent and identically distributed (iid), indicating that the distribution of the data should not change with time.To this end, the time series is split into monthly series to which a vine copula model can be fitted.Hence, for each month a different model will be obtained.However, the data distributions can also change within the monthly series, i.e. a within-month trend may exist.Therefore, the daily distributions, each containing 72 observations, were compared within each month by means of an ANOVA test when distributions were homoscedastic, a Welch ANOVA test (Welch, 1951) when distributions were heteroscedastic, or a Kruskal-Wallis test (Kruskal and Wallis, 1952) when distributions were non-normal and heteroscedastic, at a significance level of 0.001.The results of these tests indicate that within-month trends exist for temperature and evapotranspiration, whereas no trend was found for precipitation.In order to meet the requirements of the data to be iid, temperature and evapotranspiration data were standardized as follows: with x s,d,y being the standardized value of temperature or evapotranspiration at day d of year y, x d,y the original measured value of temperature or evapotranspiration at day d of year y, and µ d and σ d being the mean value and standard deviation of x at day d, respectively.

Probability Distributed Model (PDM)
The PDM is a lumped rainfall-runoff model which basically conceptualizes the absorption capacity of soil in the catchment as a collection of three different storages (Moore, 2007;Cabus, 2008; see Fig. 1): i.e. (1) a probability distributed soil moisture storage (S 1 ) based on a Pareto distribution of soil moisture capacity to separate direct runoff Q dr and subsurface runoff Q gr , (2) a surface storage (S 2 ) to transform direct runoff into surface runoff and (3) a groundwater storage (S 3 ) to convert subsurface runoff to baseflow.The input for S 1 is the net precipitation (P − E), in which P and E are the precipitation and evapotranspiration, respectively.Further, water loss from S 1 may be due to Q dr or Q gr .The former is then converted to surface runoff Q ro through surface storage S 2 , a fast response system involving a sequence of two linear reservoirs with small storage time constants k 1 and k 2 .
The direct runoff flow only happens for those parts of S 1 that are completely filled.The recharge to the groundwater, controlled by the drainage time constant k g , is transferred into baseflow Q bf through groundwater storage S 3 , a slow nonlinear response system with a large storage time constant k b .
The sum of Q ro and Q bf equals the total discharge Q t ; note that a constant flow that presents any returns or abstractions to or from the catchment, represented by a parameter q const , can also be added.For a more detailed theoretical explanation and mathematical description of the model, we refer to Moore (2007).
In this study, PDM is calibrated for the Grote Nete catchment using the particle swarm optimization algorithm (PSO; Kennedy and Eberhart, 1995).This catchment, covering about 385 km 2 in the north of Belgium, has a maritime, temperate climate with an average precipitation of about 800 mm year −1 (Vrebos et al., 2014).Given the relatively short distance between Uccle and the Grote Nete catchment, and the fact that the meteorological conditions are nearly the same, one can assume that the statistics of the modelled discharge obtained with the forcing data observed near the catchment and those observed at Uccle are negligible.Furthermore, the rainfall-runoff model will not be used to make predictions, but rather to demonstrate the impact of different alternative realizations of precipitation (P ), temperature (T ) and evapotranspiration (E) on discharge values.Therefore, although PDM will be applied to observations from Uccle in this study, it is calibrated on the basis of a time series of more than 6 years (from 13 August 2002 to 31 December 2008) at an hourly time step (precipitation, evapotranspiration and discharge) that is available for the catchment.Observations recorded during the period of 13 August 2002 to 31 December 2006 are used for model calibration, while the remaining data (from 1 January 2007 to 31 December 2008) are used for model validation.

2.3
Copula-based stochastic simulation of evapotranspiration and temperature

Vine copulas
A copula is a multivariate function that describes the dependence structure between random variables, independently of their marginal distributions (Sklar, 1959).The theorem of Sklar (Sklar, 1959) states that if F 12 (x 1 , x 2 ) is the joint dis- tribution function of two random variables X 1 and X 2 with marginal cumulative distributions F 1 and F 2 , then there exists a bivariate copula C 12 such that with For more theoretical details, we refer to Sklar (1959), Nelsen (2006) and Joe (1997).
The use of copulas allows us to decompose the construction of a joint distribution function into two independent steps, i.e. the modelling of the dependence structure and the modelling of the marginal distribution functions (Nelsen, 2006;Salvadori and De Michele, 2007).As such, copulas allow the use of complex marginal distribution functions (Salvadori et al., 2007).Because of this advantage, the application of copulas is becoming more and more popular in hydrological and meteorological studies.However, due to the complications in the construction of the copula model for more than two variables, most research is limited to the bivariate case (Pham et al., 2016).
A flexible construction method for high-dimensional copulas, known as the vine copula construction, has been introduced in the work of Bedford and Cooke (2002), in which multivariate copulas, and hence the multivariate densities, are constructed as a product of bivariate copula densities.Vine copulas constitute two main advantages.First, they are simple and straightforward to apply.Second, they are very flexible and have the ability to model a wide range of dependence structures because the bivariate copulas can be selected from a large number of copula families (Kurowicka and Cooke, 2007;Aas et al., 2009;Czado, 2010).However, one has to be aware that the flexibility offered by vine copulas demands the estimation of a large number of parameters for which the data set should encompass sufficient information.
There is, however, a large number of possibilities for the construction of vine copulas (Aas et al., 2009); for example, there are 24 and 240 different constructions of vine copulas for the four-and five-dimensional cases, respectively (Aas et al., 2009).Examples of two regular four-dimensional vine copulas are given in Fig. 2a, b.One usually focuses on two special types of regular vine copulas: canonical vine copulas (C-vine copulas) and D-vine copulas (Kurowicka and Cooke, 2007).If all mutual dependences involve the same variable, the construction yields a C-vine copula (Fig. 2c).If all mutual dependences are considered one after the other, i.e. the first with the second, the second with the third, the third with the fourth, etc., the construction yields a D-vine copula (Fig. 2d).In this study, C-vine copulas are used for the constructions of copula-based generators of temperature and evapotranspiration.More details on the construction of and simulation from a C-vine copula are given in the work of Aas et al. (2009).

Copula-based stochastic simulation of evapotranspiration
In order to generate stochastic time series of evapotranspiration, we make use of the vine-copula-based approach proposed in the work of Pham et al. (2016) in which C-vine copulas are used to describe the dependences between evapotranspiration and other variables, such as temperature, precipitation and dry fraction within a day.The advantage of the method is that the statistical properties of the evapotranspiration time series and the dependence structures between evapotranspiration and other variables are well maintained.Furthermore, the model construction and simulation are simple to apply.After comparing the results of different vine models, Pham et al. (2016) found that the best sim- ulations of daily evapotranspiration were provided by the four-dimensional C-vine copula V T P DE relating daily temperature (T ), precipitation (P ), dry fraction (D) and evapotranspiration (E), and the three-dimensional C-vine copula V T P E relating T , P and E. As there is no major difference in performance between simulations using V T P DE and V T P E (Pham et al., 2016), for simplicity, we choose to use only temperature, precipitation and evapotranspiration data in the vine copula model for evapotranspiration.In order to avoid monthly effects, the temperature and evapotranspiration data were first standardized and a different C-vine copula model is used for each month.However, subsequent observations of the time series may not be independent, meaning that values within the time series may be autocorrelated.This is accounted for by extending the vine copula V T P E as used in Pham et al. (2016) with the evapotranspiration of the previous day (E p ).In this way, a four-dimensional C-vine copula V T P E p E is constructed for each month.The best bivariate copula families for the C-vine copulas are chosen using Akaike's information criterion (AIC; Akaike, 1973) from five one-parameter copula families, i.e. the Gaussian, the Gumbel, the Frank, the Joe and the Clayton family and one twoparameter family, the t-copula family.Table 1 lists the se-lected copula families.The empirical cumulative distribution functions are used as marginal distributions, and the final copula parameters of the one-parameter families are determined on the basis of the relationship between the copula parameter and Kendall's tau, whereas the parameters of the t-copula family are estimated through maximum likelihood estimation.
Further, the White goodness-of-fit test (Schepsmeier, 2015) is applied to check whether the dependence present in the data is captured by the C-vine copulas.For this test, p values larger than the significance level indicate that the dependence structure of the data can be described by the selected copulas.In this study, all but one p values were larger than the used significance level of 0.05.The dependence structure of the data can thus be described by the selected copula families.
The construction of V T P E p E is given as follows (see Fig. 3a).First, values (u T ,j , u P ,j , u E p ,j , u E,j ) of U T , U p , U E p and U E are derived from the marginal distributions of respectively T , P , E p and E (j = 1, . .., n and n is the number of data points), and are used to select and fit the bivariate copulas C T P , C T E p and C T E .These bivariate copulas are conditioned on U T through partial differentiation as given in  Table 1.Bivariate copula families selected by AIC for V T P E p E , where F stands for Frank, Ga for Gaussian, G for Gumbel, C for Clayton, J for Joe and t for t-copula family.

Month
V Eq. ( 3), resulting in the conditional cumulative distribution functions F P |T , F E p |T and F E|T .
Using these three conditional distributions, the conditional probabilities are calculated for all data points (u T ,j , u P ,j , u E p ,j , u E,j ).To these conditional probabilities, which are also uniformly distributed between [0,1], two bivariate copulas C P E p |T (F P |T , F E p |T ) and C P E|T (F P |T , F E|T ) are fitted, of which the partial derivatives to F P |T can be computed to obtain F E p |T P and F E|T P .Again, using these two conditional distributions, a bivariate copula C E p E|T P (F E p |T P , F E|T P ) is fitted, which can also be conditioned by calculating the partial derivative.For more detailed information about the construction of vine copulas, we refer to Aas et al. (2009).Once the C-vine copula model is fitted, a corresponding time series of evapotranspiration values can be generated, for a given time series of rainfall and temperature data, by sampling the copula (Fig. 3b).To that end, values of U E are calculated as where r is a random value drawn from a uniform distribution between [0,1].Then the corresponding evapotranspiration value e can be calculated using the inverse marginal distribution function: It is clear that the values of U E are affected by the random value r; therefore, several simulations will show some vari-Table 2. Bivariate copula families selected by AIC for V T p P T , where F stands for Frank, Ga for Gaussian, G for Gumbel, C for Clayton, J for Joe and t for the t-copula family.
ability.To account for these stochastic effects, the simulation was repeated 50 times.Figure 4 displays the comparisons between probability density functions of observed and simulated evapotranspiration obtained by V T P E p E for the different months.From these plots, it can be seen that the probability density functions of the stochastic evapotranspiration are very similar to those of the reference evapotranspiration in Uccle (red line).In order to assess whether the dependence structures between simulated evapotranspiration and other variables are maintained, for each of the 50 simulations, the mutual dependences between E and the other variables, T or P , were assessed via Kendall's tau for each month.Figure 5 shows box plots of the obtained values of Kendall's tau for E vs. T and E vs. P dependences for 50 simulations.These figures show that, in general, the observed dependences between both E vs. T and E vs. P are preserved with the stochastic simulated evapotranspiration.

Copula-based stochastic simulation of temperature
Temperature data are required for the stochastic modelling of evapotranspiration.However, in situations where no longterm time series of temperature is available, it is necessary to use a stochastically generated temperature time series.We use a similar approach as Pham et al. (2016) to develop a stochastic temperature model based on copulas.This model makes use of the dependence between the temperature and the precipitation of the same day (i.e. at day j ) and the temperature of the previous day (i.e. at day j − 1).Similarly as for the stochastic evapotranspiration model, a C-vine copula is employed in which T j −1 is chosen as the core variable.
The model is referred to as V T p P T , where T p refers to the temperature of the previous day.The construction procedure of V T p P T is similar to the one of V T P E p E with that difference that only 3 instead of 6 bivariate copulas need to be fitted (see Sect. 2.3.2).The simulation process of the temperature model is different from that of the evapotranspiration model, in the sense that it requires a modelled input from the previous time step (i.e.T p ) in order to generate a new value for T .The simulation algorithm of T can be performed as follows: Similarly as for the evapotranspiration model, the best bivariate copula families for the C-vine copulas are chosen using the AIC.Table 2 illustrates which copula families were selected.This table shows that the Frank copula family is often selected for C T p P and C P T |T p , while the Gaussian and the t-copula families are often chosen for C T p T .Further, the White goodness-of-fit test (Schepsmeier, 2015) is also applied to check whether the dependence present in the data is captured by the C-vine copulas.All p values were larger than the used significance level of 0.05, indicating that the dependence structure of the data can be described by the selected copula families.The final copula parameters of the one-parameter families are determined on the basis of the relationship between the copula parameter and Kendall's tau, whereas the parameters of the t-copula family are estimated through maximum likelihood estimation.These copulas are then used for generating temperature given the time series of precipitation.
To assess the performance of the model, the statistics of 50 stochastic time series of temperature using the observed daily precipitation from 1931 to 2002 are compared to those of the observations.The empirical probability density functions of the monthly mean temperature for each of the simulated 72 year time series are shown in Fig. 6.The statistics of the simulations seem to be relatively similar to the observations.Figure 7 shows the monthly maximum temperature of the ensemble and of the observed temperature series corresponding to their empirical return periods.This figure shows that the extremes are well modelled for all months.

Simulated precipitation by the MBL model
In situations where no long time series of precipitation is available, one can use a stochastic rainfall model.In this study, the modified Bartlett-Lewis (MBL) model (Rodriguez-Iturbe et al., 1988) is selected to generate the precipitation time series based on the results from Pham et al. (2013) in which the MBL model is considered to be the best version of the different BL models tested on the Uccle data set.The MBL model is calibrated using the generalized method of moments, i.e. the difference between the model statistics obtained by means of analytical expressions and the empirical statistics obtained from the observed time series that is to be minimized.The calibration of the MBL model  in this study is based on the mean, variance, lag-1 autocovariance and zero-depth probability (ZDP) at the aggregation levels of 24, 48 and 72 h instead of 10 min, 1 and 24 h that were used in Pham et al. (2013).As in Pham et al. (2013), the shuffled complex evolution algorithm (Duan et al., 1994) was employed to search for the optimal parameters.The rea-son for only selecting aggregation levels of at least 1 day is to consider situations where only daily precipitation data would be available.The values of the calibrated parameters are given in Table 3.Details of the MBL model and the model calibration are provided by Pham et al. (2013) and Vanhaute et al. (2012).The stochastic rainfall time series is simulated   at the same 10 min time resolution as the observations.In order to assess the performance of the model, the abilities of the model to reproduce some general historical statistics, such as mean, variance, the lag-1 autocovariance and ZDP, at aggregation levels of 10 min, 1, 12, 24 and 48 h are investigated based on an ensemble of 50 time series.In Fig. 8, some general statistics at different aggregation levels are compared for 50 time series obtained by the MBL model and the observed time series at Uccle.In order to further unveil the behaviour of the model, the general statistics are calculated at different aggregation levels for each year and presented in the form of a frequency distributions (Fig. 9).From both Figs 8 and 9, it can be seen that the mean is generally reproduced well by the model at all levels of aggregation.At the sub-hourly level, the variance and autocovariance are slightly overestimated.For higher aggregation levels, an increasing variation is found for both statistical properties.At higher levels of aggregation, the ZDP is relatively similar to that found for the observed time series, whereas for hourly and sub-hourly levels, a slight deviation in ZDP values are found with respect to the observations.
Figure 10 shows the empirical univariate return periods of the annual maximum rainfall depths of the observed and simulated series, considering five different aggregation levels.Compared to the observations, it seems that the MBL model is able to preserve the maxima at all aggregation levels.It can be seen in this study that the MBL model does not suffer from the problem of underestimation of extreme values at sub-hourly aggregation levels that were reported in the work of Verhoest et al. (1997) and Cameron et al. (2000).From the analysis, it seems that the MBL model is capable of preserving the sub-daily statistics even though the calibration procedure only included daily and multi-day statistics.Yet, further research is needed to explore this improved behaviour.
Figure 10 also shows that a large variation in extreme values is found for larger return periods.The MBL model allows for generating rainfall time series mimicking the statistics of the observed series.Due to its structure, the modelled precipitation values are not restricted to the range of rainfall values in the observations, making this model able to generate rainfall events having a return period larger than the observed time series.Yet, it can thus be expected that within the modelled time series of 72 years, events may occur having a true return period that is longer than the length of the modelled time series.If longer time series would be simulated, a better estimation of the rainfall corresponding to return periods that are shorter than the observed time series should be obtained.To demonstrate this, all 50 series generated are concatenated, resulting in one time series of 50 × 72 = 3600 years, for which the return periods are calculated empirically and plotted (only for return periods less then 100 years) as a blue line in Fig. 10.As can be seen for return periods shorter than 100 years, a good fit with the observations is obtained, showing that MBL is capable of reproducing extremes.Yet, the user should use much longer time series than the maximum return period aimed for.

Discharge simulation scenarios
The catchment discharge is calculated by the PDM that uses precipitation and evapotranspiration data as inputs.In order to assess the impact of each stochastic variable on the modelling of discharge, three cases have been developed that can be compared to a reference situation (cf.Fig. 11).The reference situation is obtained by running the PDM with the observed time series of precipitation and evapotranspiration.
In case 1, it is supposed that insufficient evapotranspiration data would be available (e.g. a shorter time series than the observed precipitation), the stochastic evapotranspiration can then be generated using the three-dimensional C-vine copula, i.e.V T P E p E , given observed rainfall and temperature.The simulation is repeated 50 times in order to account for stochastic effects.In case 2, where only a sufficiently long time series of precipitation is available, the process starts with temperature simulations, then evapotranspiration can be modelled using the observed precipitation and stochastically generated temperature using the V T P E p E copula.As presented before, temperature values will be generated by the three-dimensional C-vine copula V T p P T that relates temperature T to daily precipitation P and the daily temperature of the previous day T p .To account for the stochastic effect, 50 time series of temperature are generated.Next, each of the 50 time series of temperature, together with the observed precipitation data, are used to simulate 50 corresponding time series of evapotranspiration.Therefore, in total 2500 time series of evapotranspiration are generated.Case 3 accounts for a situation in which data would be insufficiently available for all input variables.In this case, an ensemble of 50 time series of precipitation could be generated using the MBL model.For each of these time series, 50 time series of temperature and 2500 time series of evapotranspiration can be obtained using the same approach as in case 2. In total, 125000 time series of evapotranspiration are generated in case 3.In order to construct copula models and evaluate discharge simulations in all cases, this study uses the same time series of precipitation, evapotranspiration and temperature at Uccle.In all cases, discharge is simulated using the PDM that was calibrated for the Grote Nete catchment in Belgium (see Sect. 2.2).By this approach, the uncertainty due to the PDM can be partly excluded from the study, i.e. we study the change in performance with respect to the reference situation.It makes sense because the three cases use exactly the same PDM, a similar uncertainty due to the model is assumed for all cases as for the reference situation.Therefore, the change in performance for all cases with respect to the reference situation can be attributed to the differences in inputs to the model.The discharge simulations in the three cases are denoted as Q s1 , Q s2 and Q s3 , respectively, while the reference discharge is denoted by Q rf .

Case 1
The catchment discharge can be simulated by means of the PDM that uses precipitation and evapotranspiration data.In case 1 (cf.Fig. 11), where only daily observed precipitation and temperature data are available, 50 stochastically generated evapotranspiration time series are generated using the three-dimensional C-vine copula V T P E p E .The results shown in Sect.2.3.2 and the work of Pham et al. (2016) reflect that the C-vine copula V T P E p E performs well and its simulations lie very close to the values of the observed evapotranspiration.The left column of Fig. 12 displays the comparison between the probability density functions of Q rf and Q s1 for January, April, July and October.It can be seen that the distributions of Q s1 are quite similar to those of the reference discharge for these months.Similar results are obtained for the other months.For a further analysis of mean discharges and annual extremes of Q s1 , we refer to Sect.4.3.

Case 2
In case 2 (cf.Fig. 11), only a time series of precipitation of sufficient length is available and the temperature values are simulated using the C-vine copula V T p P T .The observed precipitation and stochastically generated temperature values are then used for reproducing the evapotranspiration by means of the C-vine copula V T P E p E .Through comparing the results of this case with that of case 1, we can assess the impact of introducing a stochastic temperature model on the modelled evapotranspiration time series and the modelled discharge.
As shown in Sect.2.3.3 and Fig. 13 (left column), the stochastically generated temperature data generated by the C-vine copula V T p P T model are reliable and can be used together with the recorded precipitation to simulate 2500 time series of evapotranspiration in the next step (i.e. for each temperature series, 50 evapotranspiration series are generated).The probability density functions of the 2500 time series of the simulated evapotranspiration are shown in Fig. 14 (middle column).It can be seen from the figures that these distributions are similar to those of the observations in Uccle and those of the modelled evapotranspiration in case 1 (cf.Fig. 14, left column, for January, April, July and October.)Similar results are obtained for the other months.Figure 12 (middle column) displays a comparison between the probability density functions of the simulated discharge (Q s2 ) and the reference discharge (Q rf ).In general, the grey areas representing 2500 simulated time series are slightly wider than those in case 1 (Fig. 12, left column).We conclude that the introduction of stochastically generated temperature does not cause considerable deviations in the simulation of evapotranspiration and discharge.

Case 3
This case accounts for a situation in which no time series (of sufficient length) are available as shown in Fig. 11.The first step consists of generating 50 time series of precipitation by means of the MBL model (see Sect. 2.4) and aggregating these to the daily level.Then, each of those time series is used for modelling 50 time series of temperature, each used for generating 50 evapotranspiration series.Therefore, in total 125 000 time series of evapotranspiration are generated.Finally, 125 000 time series of the catchment discharge are simulated using the stochastically generated time series of precipitation and corresponding evapotranspiration values.This case will allow for assessing the uncertainty introduced by using the MBL model for generating precipitation values as input to a rainfall-runoff model.
First, the simulated time series of precipitation are used as inputs to the C-vine copula V T p P T to generate time series of temperature.The modelled copula-based temperature values are compared with the observed temperature in Uccle in terms of the probability density functions in Fig. 13 (right column).From these figures, it can again be seen that the distributions of the simulations follow those of the observations.With respect to the probability density functions, the simulated evapotranspiration (Fig. 14, right column) in this case is similar to the observed evapotranspiration, but more deviations can be observed in this case than in the previous cases.The modelled time series of precipitation and evapotranspiration are then used for modelling the discharge.The probability density functions of the simulated discharge values for some months are displayed in Fig. 12 (right column).Similar results are obtained for the other months.From the  different plots, it can be concluded that the simulations still follow the distribution of the reference discharge (red line).
Compared to the simulated discharge of cases 1 and 2, even higher extreme values are generated and the grey areas representing the ensemble of 125 000 time series are generally wider, indicating that mainly the stochastic generation of precipitation has introduced considerable variations into the discharge simulations.The top row of Fig. 15 illustrates this by comparing the annual extremes of the observed and the simulated discharge series for all cases.However, it should also be noted that the results for cases 2 and 3 are obtained on the basis of a wider ensemble of time series as compared to case 1 (2500 for case 2 and 125 000 for case 3).In order to also compare the variations obtained on the basis of an equal number of time series within the ensemble (i.e.50 time series), for each time series of observed (case 2) or simulated (case 3) precipitation, one corresponding time series of temperature and one corresponding time series of evapotranspiration are generated.The bottom row of Fig. 15 illustrates the extremes obtained on the basis of this ensemble of 50 time series of discharge.These results also show that most of the variation obtained in case 3 is due to the stochastic generation of precipitation.This increase in uncertainty, however, should be treated with care.As stated before, the generated rainfall series may include extremes that are larger than the ones in the observed time series.Such large precipitation values will inevitably result in a large surface runoff production causing extreme discharges.The large variability in extreme rainfall as observed in Fig. 10 will consequently lead to large variabilities in modelled extreme discharges (cf.Fig. 15).If, however, the discharge extremes from a longer time series are studied, the variation in extremes is strongly reduced.To demonstrate this, 50 rainfall time series of 3600 years and corresponding evapotranspiration time series (remark that only one series is generated per rainfall time series) are used as input to the rainfall-runoff model, and the extremes, having return periods shorter than 1000 years, are plotted for each of these 50 time series (Fig. 16).As can be seen, the large uncertainties in extremes, encountered when using 72-year time series as input, are highly reduced, showing a slight overestimation for larger return periods if compared to those modelled using the observed time series of rainfall and evapotranspiration.Yet, it is impossible to state whether true overestimations are obtained, or that, due to the stochastic nature of rainfall (and evapotranspiration), no discharge events corresponding to a 72-year return period occurred in the observed time series and therefore the maximum discharge value was wrongly assigned a too high return period (i.e. the maximum discharge based on the observed time series of precipitation and evapotranspiration corresponds to a return period of about 25 years based on the simulations using the modelled very long time series of precipitation and evaporation).Similarly as discussed for Fig. 10, this result makes a plea for using modelled discharge time series of a length that is a multiple of the maximum return period of discharge aimed at, where longer time series reduce the variation in discharge values at high return periods at the expense of runtime.Further research will be needed to seek the trade- The RMSD is plotted against the cumulative relative frequency of the discharge given by the empirical cumulative distribution (ECDF) value.
off between length of the time series and the remaining uncertainty.
In order to further investigate the quality of the simulated discharge for all cases, Fig. 17 presents the comparison between the probability density functions of the daily averages of the modelled and reference discharge for January, April, July and October.For all cases, the daily mean seems to be preserved by the modelled discharge.However, through investigating the width of the grey areas of the simulated time series for each case, as expected, we can conclude that the most certain results are observed in case 1, followed by case 2 and case 3.This also holds true for the other months.Similar situations are witnessed for the univariate return period of annual extreme discharge (Fig. 15), in which the least and largest variations between the reference and simulated discharge are noticed for Q s1 and Q s3 , respectively.An especially remarkable expansion of grey areas is witnessed in case 3. It is clear that each stochastic component, i.e. modelled precipitation, temperature or evapotranspiration, has contributed an additional amount of variation to the modelled discharge.The differences between the simulated discharge from different cases are less evident in terms of probability density functions but more pronounced for the mean and extreme discharge.
To account for the variations between the modelled and reference discharge, the simulated discharge values are further evaluated using the root mean square deviation (RMSD): where Q m (i) and Q o (i) are respectively the modelled and reference discharge value at a cumulative relative frequency i ∈ [0, 1] and n is the number of the members in the ensemble considered.
Figure 18 displays the RMSD calculated for simulated discharge in different cases.It can be seen from the figure that for all cases, larger RMSD values are found for the higher values of discharge.In other words, simulations of the higher values of discharge are generally less accurate.There are insignificant differences between the RMSD for case 1 and 2 for all months.The use of stochastically generated temperature time series seemed to contribute minor uncertainty to the discharge simulations in this study.The largest errors often are obtained in case 3, where the discharge is simulated from stochastically generated precipitation and evapotranspiration values.

Conclusions
In water management, discharge is a very important variable which can be simulated via a rainfall-runoff model using recorded precipitation and evapotranspiration data.However, in situations that suffer from data deficiency, one may consider using stochastically generated time series.In this study, the impact of using the stochastically generated precipitation and evapotranspiration on the simulation of the catchment discharge is investigated.In order to assess the influence of each stochastic variable on the discharge simulations, three different cases have been considered.In the first case, it is assumed that insufficient evapotranspiration data would be available, requiring stochastically generated evapotranspiration based on observed precipitation and temperature data by means of a copula.In the second case, where only precipitation data would be sufficiently available, the temperature and evapotranspiration are each reproduced by vine copulas.The third case addresses the situation where too short time series of observations are available.In this case, the precipitation time series could be generated using an MBL model calibrated to the limited precipitation data available and then the time series of temperature and evapotranspiration could be obtained using the copula-based models.In all cases, C-vine copulas V T P E p E and V T p P T are used for the simulations of evapotranspiration and temperature, respectively.From the comparison between the simulations with the observations, the C-vine copulas seem to reproduce the time series of evapotranspiration and temperature well.It is clear that each stochastic component has a certain impact on the discharge simulations, and each additional stochastic variable will contribute an additional variation, and thus uncertainty.As expected, the simulations of the discharge obtained for case 1 show the smallest variability, while those in case 3 result in the largest variability.In general, no major differences are observed between the simulations and observations in cases 1 and 2, the characteristics of the discharge series seem to be preserved through the process for these cases.Noticeable variations are witnessed in case 3, where the discharge is simulated using modelled time series of precipitation and evapotranspiration.
With respect to extreme discharge, it was shown that the uncertainties encountered in case 3 are partly caused by the limited length of the time series used.The uncertainties in the predictions are highly reduced when input time series are used that are much longer than the maximum return period aimed at.As in this particular case, all forcing data are generated, the modeller is not restricted to the length of an observed time series, and can hence generate time series of whatever length as input to the hydrological model, taking into account that the longer the time series used, the more the uncertainty reduces at the expense of increasing runtime.
From this study, we may conclude that in situations that suffer from a lack of observations, one can rely on the stochastically generated series of precipitation, temperature and evapotranspiration to reproduce time series of discharge for water resource management.However, care should be taken as the modelled extreme discharges may experience the largest errors.

Figure 3 .
Figure 3. Construction of C-vine copula V T P E p E (a) and simulation of E from V T P E p E (b).

Figure 4 .
Figure 4. Comparison between the probability density functions of evapotranspiration of observed and simulated values: Uccle is in red and the ensemble of 50 time series simulated using the C-vine copula V T P E p E is in grey.

Figure 5 .
Figure 5.Comparison between Kendall's tau for the relations of E vs. T (a) and E vs. P (b) of observed and simulated values: Uccle is shown by the green line and the 50 simulated time series are shown by box plots.

Figure 6 .
Figure 6.Comparison between the probability density functions of the monthly mean T of the observed and simulated values: Uccle is in red and the ensemble of 50 time series simulated using the C-vine copula V T p P T is in grey.

Figure 7 .
Figure 7.Comparison between the return periods of monthly extremes of the observed and simulated temperature values: Uccle is in red and the ensemble of 50 time series simulated using the C-vine copula V T p P T is in grey.

Figure 8 .Figure 9 .
Figure 8.Comparison between observed and simulated precipitation data for the mean, variance, autocovariance and zero-depth probability (ZDP): Uccle is shown by the blue triangles and the ensemble of 50 simulated time series by the MBL model is shown by the box plots.

Figure 10 .
Figure 10.Comparisons between the return periods of extremes for the observed and simulated precipitation data at different aggregation levels: Uccle is in red and the ensemble of 50 simulated time series by the MBL model is in grey.Calculation of the extremes for a given return period on a time series that is based on concatenating the 50 simulated time series results in the blue line.

Figure 11 .
Figure 11.Different cases for discharge simulation.P or , E or and T or refer to the observed time series.P s , E s1 , E s2 , E s3 , T s2 and T s3 refer to the simulated time series (red blocks).Red arrows indicate the simulation processes related to stochastically generated time series.

Figure 12 .
Figure12.Comparison between the probability density functions of the reference discharge Q rf (red) and the ensemble of time series of simulated discharge values (grey) using observed precipitation, observed temperature and simulated evapotranspiration values in case 1 (a), using observed precipitation and simulated temperature and evapotranspiration in case 2 (b) and using simulated precipitation, temperature and evapotranspiration in case 3 (c).

Figure 13 .
Figure 13.Comparison between the probability density functions of observed temperature time series (red) and the ensemble of simulated time series of temperature values (grey) using the C-vine copula V T p P T on the basis of observed precipitation in case 2 (a) and on the basis of simulated precipitation in case 3 (b).

Figure 14 .
Figure14.Comparison between the probability density functions of observed evapotranspiration time series (red) and the ensemble of simulated time series of evapotranspiration (grey) using the C-vine copula V T P E p E on the basis of observed precipitation, observed temperature in case 1 (a), on the basis of observed precipitation and simulated temperature in case 2 (b) and on the basis of simulated precipitation and temperature in case 3 (c).

Figure 15 .
Figure 15.Comparison between the empirical return periods of annual extremes of the observed and simulated discharge for all cases.Reference discharge Q rf is in red and the ensemble of time series of simulated discharge is in grey.The top row shows extremes obtained on the basis of unequal ensemble widths of 50 (case 1), 2500 (case 2) or 125 000 (case 3) time series.The bottom row shows extremes obtained on the basis of equal ensemble widths of 50 time series.

Figure 16 .
Figure 16.Comparison between the empirical return periods of annual extremes of the observed and simulated discharge for case 3 based on 50 time series of 3600 years of rainfall and corresponding evapotranspiration.

Figure 17 .
Figure 17.Comparison between the probability density functions of the mean of discharge of the observed and simulated values in three cases.Reference discharge Q rf is in red and the time series of simulated discharge is in grey.

Figure 18 .
Figure18.Root mean square difference (RMSD) for simulated discharge in different cases: case 1 (red), case 2 (blue) and case 3 (green).The RMSD is plotted against the cumulative relative frequency of the discharge given by the empirical cumulative distribution (ECDF) value.

Table 3 .
Optimal parameter set for the (monthly) MBL (modified Bartlett-Lewis) model.