Many multi-site stochastic models have been proposed for the generation of daily precipitation, but they generally focus on the reproduction of low to high precipitation amounts at the stations concerned. This paper proposes significant extensions to the multi-site daily precipitation model introduced by Wilks, with the aim of reproducing the statistical features of extremely rare events (in terms of frequency and magnitude) at different temporal and spatial scales. In particular, the first extended version integrates heavy-tailed distributions, spatial tail dependence, and temporal dependence in order to obtain a robust and appropriate representation of the most extreme precipitation fields. A second version enhances the first version using a disaggregation method. The performance of these models is compared at different temporal and spatial scales on a large region covering approximately half of Switzerland. While daily extremes are adequately reproduced at the stations by all models, including the benchmark Wilks version, extreme precipitation amounts at larger temporal scales (e.g., 3-day amounts) are clearly underestimated when temporal dependence is ignored.

Stochastic precipitation generators are often employed in risk assessment
studies to estimate the return periods of very rare flooding events (e.g., 10 000-year events).
The observed series of streamflows are too short to
produce reliable estimations of very rare and large floods. Typically,
extreme hydrological events can be reproduced using long series of simulated
precipitation data as input to hydrological models

In the last two decades, a number of precipitation models have been proposed
to deal with the temporal and spatial properties of daily precipitation, for
both intermittency and amount, and all have different strengths and
weaknesses. Many of these models use exogenous variables to predict the
statistical properties of precipitation using generalized linear models

Alternatively, purely stochastic precipitation models can be used. These can be broadly classified into three main types.

Light-tailed distributions such as exponential, Gamma, and Weibull distributions, which are applied in the vast majority of the existing precipitation models, often lead to an underestimation of extreme daily precipitation amounts.
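The effect of the tail behavior on extrapolated quantiles can be illustrated with a small numerical comparison. The sketch below contrasts the quantiles of an exponential distribution with those of a Generalized Pareto distribution (GPD) with a positive shape parameter. All parameter values are hypothetical and chosen only so that both distributions share the same mean; they are not estimates from this study.

```python
import math

def exponential_quantile(p, scale):
    """Quantile of an exponential distribution with the given scale (mean)."""
    return -scale * math.log(1.0 - p)

def gpd_quantile(p, scale, shape):
    """Quantile of a GPD with threshold 0 and a strictly positive shape."""
    return scale / shape * ((1.0 - p) ** (-shape) - 1.0)

# Hypothetical parameters (mm): both distributions have a mean of 10 mm.
shape = 0.2
exp_scale = 10.0
gpd_scale = exp_scale * (1.0 - shape)  # GPD mean = scale / (1 - shape) for shape < 1

for p in (0.9, 0.99, 0.999, 0.9999):
    q_exp = exponential_quantile(p, exp_scale)
    q_gpd = gpd_quantile(p, gpd_scale, shape)
    print(f"p={p}: exponential={q_exp:.1f} mm, GPD={q_gpd:.1f} mm")
```

With these illustrative values, the two distributions are close for moderate probabilities but diverge quickly in the far tail, which is where the light-tailed choice tends to underestimate extremes.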

While nonparametric densities with Gaussian kernels

Alternatively, current statistical procedures consisting of fitting a
flexible distribution to the bulk of the observations and using it for
extrapolation are highly questionable, as major assumptions are usually
violated

the application of a heavy-tailed distribution to precipitation amounts at
each station

the determination of robust estimates of the shape parameter of this
distribution, which indicates the heaviness of the tail, using a regionalization
approach, as in

Furthermore, following

We first describe the study area in Sect.

The Aare River basin covers the northern part of the Swiss
Alps and has an area of 17 700 km

Location of the 105 precipitation stations in Switzerland. Different partitions of the Aare River basin into 5 and 15 sub-basins are shown.

Figure

The proposed precipitation models are designed to simulate flood scenarios, via a conceptual
hydrological model, for the whole Aare River basin and for
its different sub-basins. For Switzerland,

As indicated above, GWEX refers to multi-site
precipitation models that rely strongly on the structure proposed by

Precipitation amount

At each location, the temporal persistence of dry and wet events is
introduced with a

At each site, the probability of having a wet day on day

The spatial dependence of the precipitation states

Let

Illustration of the relationship between

The relationship between

Given the occurrence of precipitation

marginal heavy-tailed distributions,

a tail-dependent spatial distribution,

an autocorrelated temporal process.

At a given location

In this work, the distribution representing the precipitation intensity at
each location,

This distribution can be described by a smooth transition between a
gamma-like distribution and a heavy-tailed Generalized Pareto distribution
(GPD). This transition is obtained via a transformation function,

Spatial and temporal dependence of precipitation
amounts is represented using a multivariate autoregressive model of order 1
(MAR(1)). A MAR(1) process has been used by different authors

The stochastic Gaussian process

Innovations

Following

E-GPD distributions are first fitted to precipitation amounts available at
each location

Following

The

In this work, the estimation of the

Concerning the spatial and temporal dependence of precipitation amounts,
direct estimates of

The matrix

Similarly to the occurrence process, the seasonal aspect of the precipitation intensity is taken into account by performing the parameter estimation for each month, on a 3-month moving window.
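As a minimal sketch of this seasonal pooling, the observations used to fit the parameters of a given month can be selected as follows. The sketch assumes that the 3-month window is centred on the target month (e.g., December to February for January); the function names are hypothetical.

```python
from datetime import date

def window_months(month):
    """Return the three months (1-12) centred on `month`, wrapping over the year."""
    return [((month - 2) % 12) + 1, month, (month % 12) + 1]

def select_window(dates, values, month):
    """Keep the values whose date falls in the 3-month window around `month`."""
    months = set(window_months(month))
    return [v for d, v in zip(dates, values) if d.month in months]

# Usage with a few made-up daily amounts (mm).
dates = [date(2000, 12, 31), date(2001, 1, 15), date(2001, 2, 1), date(2001, 3, 1)]
values = [1.2, 0.0, 5.4, 3.3]
january_sample = select_window(dates, values, month=1)
```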

Flowchart of the different model versions. The differences between the models are summarized inside green boxes.

Different versions of the proposed multi-site precipitation model are
considered in this paper, each corresponding to different extensions of the
Wilks model. A flowchart summarizing the increasing complexity of these
models is presented in Fig.

A first benchmark version of the multi-site model, referred to here as
“Wilks”, is considered. It closely matches the multi-site model proposed by

The at-site occurrence process is a Markov chain of order 1.

The marginal distribution for precipitation amounts is a mixture of
exponential distributions, for which the probability density function is defined as

The parameters

Precipitation amounts are not considered to be temporally correlated;
i.e., the matrix

A modified Wilks version is considered, for which the at-site occurrence
process is a Markov chain of order 4 and the mixture of exponential
distributions is replaced by the E-GPD distribution. As indicated above,

The initial GWEX model has the following characteristics.

The at-site occurrence process is a Markov chain of order 4.

The marginal distribution for precipitation amounts is the E-GPD distribution.

Precipitation amounts follow a MAR(1) process with innovations modeled by a Student copula.
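The structure of such a process can be sketched for two sites as follows. This is a simplified illustration rather than the GWEX implementation: the innovations are drawn from a bivariate Student-t distribution (a Gaussian vector divided by a shared chi-square scale), which induces tail dependence, but the margins are not transformed as a full Student copula would require. The autoregressive coefficient `a`, the correlation `rho`, and the degrees of freedom `nu` are hypothetical values.

```python
import math
import random

def simulate_mar1_student(n_steps, a, rho, nu, seed=0):
    """Simulate z_t = a * z_{t-1} + eps_t for two sites with Student-t innovations."""
    rng = random.Random(seed)
    # Cholesky factor of the 2x2 correlation matrix [[1, rho], [rho, 1]].
    l21, l22 = rho, math.sqrt(1.0 - rho * rho)
    z = [0.0, 0.0]
    path = []
    for _ in range(n_steps):
        g1, g2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        # Chi-square variable with nu degrees of freedom (gamma with scale 2).
        w = rng.gammavariate(nu / 2.0, 2.0)
        s = math.sqrt(nu / w)  # shared scale: large values hit both sites at once
        eps = [s * g1, s * (l21 * g1 + l22 * g2)]
        z = [a * z[0] + eps[0], a * z[1] + eps[1]]
        path.append((z[0], z[1]))
    return path

path = simulate_mar1_student(n_steps=2000, a=0.3, rho=0.8, nu=5.0, seed=1)
```

Because the scale factor `s` is shared by both sites, unusually large innovations tend to occur simultaneously, which is the qualitative behavior that tail dependence is meant to capture.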

In this paper, an alternative version, referred to as GWEX_Disag, is also proposed. GWEX_Disag is applied to 3-day precipitation amounts and has the same characteristics as GWEX, except the following.

The at-site occurrence process is a Markov chain of order 1.

A threshold of 0.5 mm separates dry and wet states.
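A first-order two-state occurrence process of this kind can be sketched as below. The code estimates the probability of a wet state given the previous state and then simulates a new sequence; the same logic applies whether the time step is one day or one 3-day period. Treating amounts strictly above the threshold as wet is an assumption, and the function names are hypothetical.

```python
import random

WET_THRESHOLD_MM = 0.5  # threshold separating dry and wet states

def to_states(amounts, threshold=WET_THRESHOLD_MM):
    """Convert amounts (mm) to states: 1 = wet (assumed: amount > threshold), 0 = dry."""
    return [1 if x > threshold else 0 for x in amounts]

def fit_first_order(states):
    """Estimate P(wet | previous state) from a 0/1 sequence."""
    counts = {0: [0, 0], 1: [0, 0]}
    for prev, cur in zip(states, states[1:]):
        counts[prev][cur] += 1
    probs = {}
    for prev, (n_dry, n_wet) in counts.items():
        total = n_dry + n_wet
        probs[prev] = n_wet / total if total else 0.0
    return probs

def simulate_states(p_wet_given, n_steps, seed=0, start=0):
    """Simulate a 0/1 sequence from the transition probabilities."""
    rng = random.Random(seed)
    states = [start]
    for _ in range(n_steps - 1):
        states.append(1 if rng.random() < p_wet_given[states[-1]] else 0)
    return states

amounts = [0.0, 0.6, 2.0, 0.1, 0.0, 3.0, 0.4]  # made-up amounts (mm)
probs = fit_first_order(to_states(amounts))
simulated = simulate_states(probs, n_steps=30, seed=42)
```

A higher-order chain, such as the order-4 chain used in GWEX, would condition on the last four states instead of one, at the cost of more transition probabilities to estimate.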

With GWEX_Disag, precipitation scenarios are first generated at a 3-day scale and
then disaggregated to a daily scale using a method of fragments

The 3-day precipitation amounts are directly modeled and have a better chance of being adequately reproduced.

The disaggregation of 3-day precipitation amounts creates an inherent link between the occurrence and the intensity processes. For very extreme precipitation events, we can expect these processes to be dependent (higher chance of being in a wet state over the whole Aare River basin, as well as large and persistent precipitation amounts).

The proposed stochastic models intend to preserve
the most critical properties of precipitation at different spatial and
temporal scales, especially extreme precipitation amounts. For hydrological
applications, it can be assumed that a precipitation model preserving these
properties has a better chance of adequately reproducing flood properties for
small sub-basins as well as for large basins. This statement is supported by
empirical evidence provided by

The performance of the different multi-site precipitation models is thus
assessed for multiple spatial and temporal scales. We investigate whether or
not the statistical properties of precipitation data are adequately
reproduced at the scale of the stations and for different partitions of the
Aare River basin (see Fig.

For the different evaluated statistics, performance is categorized according
to the comprehensive and systematic evaluation (CASE) framework proposed by

Performance categorization criteria from

Otherwise, we consider that performance is poor, indicating that the model fails to reproduce this particular statistical property.

In summary, good performance represents cases for which the observed metric is clearly well reproduced by the model, whereas fair performance indicates a reasonable match between the observed and the simulated metrics. The number of metrics for which poor performance is obtained is thus the first criterion indicating the overall performance of a model.
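The logic of such a categorization can be sketched as follows. The exact criteria are not reproduced in this text, so the rules below are assumptions made for illustration only: performance is "good" when the observed metric lies within the 90 % limits of the simulated values, "fair" when it lies within wider limits (hypothetically set to 99.7 % here), and "poor" otherwise.

```python
def empirical_quantile(sorted_values, p):
    """Linear-interpolation quantile of an already sorted list."""
    pos = p * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac

def categorize(observed, simulated, good_level=0.90, fair_level=0.997):
    """Classify one metric as 'good', 'fair' or 'poor' (assumed criteria)."""
    values = sorted(simulated)

    def inside(level):
        tail = (1.0 - level) / 2.0
        lower = empirical_quantile(values, tail)
        upper = empirical_quantile(values, 1.0 - tail)
        return lower <= observed <= upper

    if inside(good_level):
        return "good"
    if inside(fair_level):
        return "fair"
    return "poor"

# Usage: one metric value per simulated scenario (made-up numbers).
simulated_metric = [float(i) for i in range(101)]
category = categorize(50.0, simulated_metric)
```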

For illustration purposes, we also present the results of the evaluation for
three precipitation stations corresponding to different hydrological regimes
(see Table

Hydrological regimes and characteristics of extreme
floods in Switzerland

This section presents the results of the multi-scale
evaluation framework (see Sect.

The precipitation observations are split into two sets: (1) 45 years randomly chosen from the period 1930–2014 are used to estimate the parameters, and (2) the remaining 40 years are used to evaluate the performance of the models. This separation between an estimation set and a validation set is crucial to test the ability of the model to adequately represent the statistical properties of events which have not been used during the fitting procedure. In this study, the multi-scale evaluation is only applied to the 40-year validation set.
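A split of this kind can be sketched in a few lines. The random seed and the function name are hypothetical; the years actually drawn in the study are not given here.

```python
import random

def split_years(first_year=1930, last_year=2014, n_estimation=45, seed=0):
    """Randomly split the years into an estimation set and a validation set."""
    years = list(range(first_year, last_year + 1))
    rng = random.Random(seed)
    estimation = sorted(rng.sample(years, n_estimation))
    validation = sorted(set(years) - set(estimation))
    return estimation, validation

estimation_years, validation_years = split_years()
```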

The different model parameters are estimated with the 45-year estimation set
of observations, following the methodology described in Sect.

Regionalized

For GWEX, the estimation of the

For GWEX_Disag, the regionalization method is applied at a 3-day scale (see
Fig.

Regionalized

Figure

Empirical and fitted distributions (dashed curves for mixture of exponentials and solid curves for E-GPD) at a daily scale, for the three illustrative stations and for the months of January, April, July, and October.

For each multi-site precipitation model investigated in this paper (Wilks, Wilks_EGPD, GWEX and GWEX_Disag), we generate 100 daily precipitation scenarios with these parameter estimates, each scenario having a length of 100 years. These scenarios are compared to the precipitation observed for the 40-year validation period.

The monthly numbers of wet days obtained from observed and simulated
precipitation data are compared in Fig.

At-site number of wet days for all sites and months: inter-annual
mean and standard deviation (SD). The 90 % probability limits are shown for
the different seasons. Overall performance is represented by the indicated
percentages of good, fair, and poor performance for all sites and months
(

Distribution of dry spell lengths at the stations: the 90 % probability limits are shown. Overall performance is represented by the indicated percentages of good, fair, and poor performance for all sites. Inset plots provide a zoom for durations of 1 to 5 days.

Distribution of wet spell lengths at the stations: the 90 % probability limits are shown. Overall performance is represented by the indicated percentages of good, fair, and poor performance for all sites. Inset plots provide a zoom for durations of 1 to 5 days.
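Spell-length distributions of this kind can be computed from a sequence of dry and wet states with a run-length encoding. The sketch below is a minimal illustration; counting the spells that touch the start or the end of the series as complete spells is a simplifying assumption.

```python
from collections import Counter
from itertools import groupby

def spell_length_frequencies(states, state):
    """Relative frequency of each spell length for the given state (0 = dry, 1 = wet)."""
    lengths = [sum(1 for _ in run) for value, run in groupby(states) if value == state]
    total = len(lengths)
    if total == 0:
        return {}
    counts = Counter(lengths)
    return {length: n / total for length, n in sorted(counts.items())}

states = [0, 1, 1, 0, 0, 0, 1, 0]  # made-up sequence
wet_spells = spell_length_frequencies(states, 1)
dry_spells = spell_length_frequencies(states, 0)
```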

Figures

The frequencies of wet spell lengths are adequately reproduced by the Wilks,
Wilks_EGPD, and GWEX models, with more than 50 % of good performance. The
lower overall performance of GWEX_Disag for this metric is due to a slight
underestimation of the longest wet spells for some stations (which is however
not the case for the stations shown in Fig.

Figure

An adequate reproduction of lag-1 inter-site correlations is important for
the reproduction of persistent precipitation events. Simulated lag-1
cross-correlations are close to 0 for the Wilks and Wilks_EGPD models, as
expected given that these versions ignore the temporal dependence.
Consequently, these two model versions significantly underestimate observed
lag-1 cross-correlations, which range between 0 and 0.4. Concerning GWEX,
lag-1 serial autocorrelations at the stations (black points in the bottom
plots) are perfectly aligned along the

Comparison of unlagged inter-site correlations (

The reproduction of precipitation amounts at a daily scale is assessed in
Fig.

Daily amounts for all spatial scales and months: inter-annual mean (top) and standard deviation (SD, bottom). The 90 % probability limits are shown. Overall performance is represented by the indicated percentages of good, fair, and poor performance for all spatial scales and months.

Figures

Daily annual maxima for all spatial scales: relative differences, expressed as a percentage, between observed and simulated 10-year (top plots) and 50-year (bottom plots) return periods. The 90 % probability limits are shown. Overall performance is represented by the indicated percentages of good, fair, and poor performance for all spatial scales.

The 3-day annual maxima for all spatial scales: relative differences, expressed as a percentage, between observed and simulated 10-year (top plots) and 50-year (bottom plots) return periods. The 90 % probability limits are shown. Overall performance is represented by the indicated percentages of good, fair, and poor performance for all spatial scales.
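The quantities compared in these figures can be sketched as follows: annual maxima of 3-day running sums, an empirical return level, and a relative difference expressed as a percentage. Several choices are assumptions made for illustration: running sums do not cross year boundaries, return levels are read from Weibull plotting positions with linear interpolation, and the relative difference is taken as simulated minus observed, divided by observed.

```python
def rolling_sum_maxima(daily_by_year, window=3):
    """Annual maximum of the `window`-day running sum, one value per year."""
    maxima = []
    for daily in daily_by_year:
        sums = [sum(daily[i:i + window]) for i in range(len(daily) - window + 1)]
        maxima.append(max(sums))
    return maxima

def empirical_return_level(annual_maxima, return_period):
    """Return level for `return_period` years from sorted annual maxima."""
    values = sorted(annual_maxima)
    n = len(values)
    p = 1.0 - 1.0 / return_period  # annual non-exceedance probability
    # Weibull plotting position: the i-th smallest value has probability i / (n + 1).
    pos = max(0.0, min(p * (n + 1) - 1.0, n - 1.0))
    lo = int(pos)
    hi = min(lo + 1, n - 1)
    frac = pos - lo
    return values[lo] * (1.0 - frac) + values[hi] * frac

def relative_difference_percent(observed, simulated):
    """Relative difference (%) of the simulated value with respect to the observed one."""
    return 100.0 * (simulated - observed) / observed

# Usage with made-up daily amounts (mm) for two short "years".
maxima = rolling_sum_maxima([[1, 2, 3, 0, 5], [0, 0, 10, 0, 0]])
```

With a 40-year validation set, a 50-year return level lies beyond the largest plotting position, so this simple estimator returns the largest observed maximum; the sketch does not attempt any extrapolation.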

At the daily scale (Fig.

Comparing Wilks_EGPD and GWEX, the scores are almost identical, which suggests that the tail dependence introduced by the Student copula in GWEX does not produce a significant improvement for the reproduction of extremes. However, if we focus on the largest spatial scales (at the basins), and in particular on the entire Aare River basin (orange lines), it seems that the slight underestimation of the 50-year return periods obtained with Wilks_EGPD is reduced thanks to this tail dependence. GWEX_Disag also reproduces the largest precipitation amounts at all spatial scales adequately, even if a slight overestimation of the maxima at the largest spatial scales can be suspected. Nevertheless, this performance shows that the disaggregation process leads to an adequate reproduction of the daily maxima.

At the 3-day scale (Fig.

As the model is fitted at a 3-day scale, 3-day maxima are adequately reproduced.

As the method of fragments uses observed 3-day temporal structures to disaggregate 3-day amounts, the daily amounts resulting from a generated 3-day maximum are physically plausible. In particular, the temporal and spatial structures of large and persistent observed precipitation events are used, which ensures consistency between the generated extreme events at the daily and 3-day scales.

GWEX and GWEX_Disag both adequately reproduce extreme precipitation amounts at daily and 3-day scales, as well as at all spatial scales. As indicated above, these models will be used to generate long precipitation scenarios, which will feed a hydrological model in order to produce flood scenarios. Ultimately, the reproduction of the flood properties using GWEX and GWEX_Disag will indicate which model is the most adequate. Since they correspond to the same model version fitted at daily and 3-day scales, respectively, we can expect that resulting floods will have slightly different properties.

Precipitation models are usually developed for the purpose of risk assessment
in relation to natural hazards (e.g., droughts, floods). Most existing
precipitation models aim to reproduce a wide range of statistical
properties of precipitation, at different scales, in order to be used as a
general tool in different contexts. In this study, our main objective was to
provide a precipitation generator that could be used together with a
hydrological model for the evaluation of extreme flooding events in a region
covering approximately half of Switzerland. As a consequence, we were
especially interested in the reproduction of extreme precipitation amounts at
medium to large spatial scales. As the daily and 3-day precipitation amounts
are a major determinant of flood magnitudes in large Swiss basins

In this paper, we considered different multi-site precipitation models
targeting the reproduction of extreme amounts at multiple temporal (daily,
3-day) and spatial scales. Different extended versions of the model
introduced by Wilks

In this study, we support the use of a systematic evaluation framework. The
CASE framework proposed by

The different multi-site precipitation models have been applied to 105 stations located in Switzerland. A multi-scale evaluation led to the following conclusions.

A fourth-order Markov chain outperforms a first-order Markov chain for the transitions between dry and wet states, notably for the reproduction of dry spell lengths.

At the scale of the stations, daily amounts (average, SD, and extremes) are reasonably well reproduced by all the models.

With only three parameters, the E-GPD provides a parsimonious and flexible
representation of the whole range of precipitation amounts. Its GPD tail is in agreement
with recent results showing that extreme precipitation amounts must be modeled by
heavy-tailed distributions

At a 3-day scale, precipitation extremes are severely underestimated by Wilks and Wilks_EGPD. This underestimation can be explained by an incorrect representation of the persistence by these models.

GWEX and GWEX_Disag adequately reproduce extreme precipitation amounts at daily and 3-day scales, and at all spatial scales. These models are deemed adequate for the evaluation of extreme flood events.

Future research will investigate whether the floods simulated by a hydrological model using the generated precipitation scenarios have statistical properties in agreement with observed floods. An extensive investigation is currently underway with a distributed version of the HBV hydrological model, applied to 87 sub-basins of the whole study area and using precipitation scenarios produced by GWEX as inputs. This hydrological evaluation of our weather scenarios will be presented in future publications.

For a 3-day period

A set of observed 3-day sequences is retained as candidate periods

For each observed 3-day candidate period

This score measures the similarity between the simulated spatial field for
the period

Absolute differences between relative precipitation intensities are computed for the observed periods selected in the previous step, i.e., those corresponding to the same season and order of magnitude. The lowest scores are therefore obtained for spatial fields with similar shapes.

For each simulated period

with similar expressions for days

While the 3-day spatiotemporal consistency is generally conserved by applying the preceding steps, it can happen that the simulated 3-day amount is positive even though there is no positive precipitation among the 10 best 3-day observed fields. In this case, we seek similar observed amounts at this station only and randomly choose one 3-day period among the 10 best 3-day periods.
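The disaggregation steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation used in this study: the pre-selection of candidates by season and order of magnitude is omitted, the score is the sum over stations of absolute differences between relative intensities, each station takes its fragments from the best-ranked field (among the `n_best` lowest scores) with a positive observed amount at that station, and the fallback ranks candidates by the absolute difference in 3-day amount at the station. All function names are hypothetical.

```python
import random

def three_day_totals(period):
    """period: 3 daily lists (one value per station) -> 3-day total per station."""
    return [sum(day[i] for day in period) for i in range(len(period[0]))]

def relative_field(totals):
    """Station amounts divided by their sum (all zeros for a dry field)."""
    s = sum(totals)
    return [t / s for t in totals] if s > 0 else [0.0] * len(totals)

def disaggregate(sim_totals, candidates, n_best=10, rng=None):
    """Split simulated 3-day station amounts into 3 daily fields."""
    rng = rng or random.Random(0)
    sim_rel = relative_field(sim_totals)
    scored = []
    for period in candidates:
        obs_rel = relative_field(three_day_totals(period))
        score = sum(abs(a - b) for a, b in zip(sim_rel, obs_rel))
        scored.append((score, period))
    scored.sort(key=lambda item: item[0])  # lowest score = most similar shape
    best = [period for _, period in scored[:n_best]]

    n_stations = len(sim_totals)
    daily = [[0.0] * n_stations for _ in range(3)]
    for i, amount in enumerate(sim_totals):
        if amount <= 0:
            continue
        donors = [p for p in best if three_day_totals(p)[i] > 0]
        if donors:
            donor = donors[0]
        else:
            # Fallback: look for similar observed amounts at this station only.
            wet = [p for p in candidates if three_day_totals(p)[i] > 0]
            if not wet:
                raise ValueError(f"no wet candidate period at station {i}")
            wet.sort(key=lambda p: abs(three_day_totals(p)[i] - amount))
            donor = rng.choice(wet[:n_best])
        total = three_day_totals(donor)[i]
        for d in range(3):
            # Fragment = share of the observed 3-day amount falling on day d.
            daily[d][i] = amount * donor[d][i] / total
    return daily

# Usage with two made-up candidate periods and two stations.
candidates = [
    [[1.0, 0.0], [1.0, 0.0], [2.0, 0.0]],
    [[0.0, 3.0], [3.0, 0.0], [0.0, 3.0]],
]
daily_fields = disaggregate([8.0, 0.0], candidates)
```

By construction, the three daily amounts at each station sum to the simulated 3-day amount, so the disaggregation does not alter the 3-day totals.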

The data have been downloaded from Idaweb, a data portal which provides users in the field of teaching and research with direct access to archive data of MeteoSwiss ground-level monitoring networks. However, the acquired data may not be used for commercial purposes (e.g., by passing on the data to third parties, by publishing them on the internet). As a consequence, we cannot offer direct access to the data used in this study.

The authors declare that they have no conflict of interest.

We gratefully acknowledge financial support for this study provided by the Swiss Federal Office for Environment (FOEN), the Swiss Federal Nuclear Safety Inspectorate (ENSI), the Federal Office for Civil Protection (FOCP), and the Federal Office of Meteorology and Climatology, MeteoSwiss, through the project EXAR (“Evaluation of extreme Flooding Events within the Aare-Rhine hydrological system in Switzerland”). The authors would like to thank MeteoSwiss (the Swiss Federal Office of Meteorology and Climatology) for providing the meteorological data. We also thank the editor and two anonymous reviewers for their constructive comments, which helped us to improve the manuscript.

Edited by: Carlo De Michele

Reviewed by: Korbinian Breinl and one anonymous referee