Many climate impact assessments require high-resolution precipitation time series that have a spatio-temporal correlation structure consistent with observations, for simulating either current or future climate conditions. In this respect, weather generators (WGs) designed and calibrated for multiple sites are an appealing statistical downscaling technique to stochastically simulate multiple realisations of possible future time series consistent with the local precipitation characteristics and their expected future changes. In this study, we present the implementation and validation of a multi-site daily precipitation generator re-built after the methodology described in Wilks (1998). The generator consists of several Richardson-type WGs run with spatially correlated random number streams. This study aims at investigating the capabilities, the added value and the limitations of the precipitation generator for a typical Alpine river catchment in the Swiss Alpine region under current climate.

The calibrated multi-site WG is skilful at individual sites in representing the annual cycle of the precipitation statistics, such as mean wet day frequency and intensity as well as monthly precipitation sums. It realistically reproduces multi-day statistics such as the frequencies of dry and wet spell lengths and precipitation sums over consecutive wet days. Substantial added value is demonstrated in simulating daily areal precipitation sums in comparison to multiple WGs that lack the spatial dependency in the stochastic process. Limitations are seen in reproducing daily and multi-day extreme precipitation sums, the observed variability from year to year and long dry spell lengths. Given the performance of the presented generator, we conclude that it is a useful tool to generate precipitation series consistent with the mean climatic aspects and is likely to be helpful as a downscaling technique for climate change scenarios.

In Switzerland, precipitation is a key weather variable with high relevance for sectors such as energy production, infrastructure, tourism, agriculture and ecosystems. Owing to a complex topography, daily precipitation varies strongly in space and time (Frei and Schär, 1998; Isotta et al., 2013). The spatial distribution of daily precipitation frequency and intensity depends on the topography, with higher frequencies and intensities along the northern Alpine ridge during summer, and a strong north–south gradient with heavier intensities in southern Switzerland from spring to autumn. The most prominent weather situations causing these precipitation patterns are shallow pressure systems favouring convective precipitation, orographically induced precipitation (e.g. föhn situations), and frontal passages. Precipitation amounts and frequencies are typically largest in summer, mainly due to convective processes (Frei and Schär, 1998).

Given the expected changes in the hydrological cycle over the twenty-first century (Allen and Ingram, 2002; Held and Soden, 2006), the need for reliable and quantitative future local precipitation projections in Switzerland is continuously growing. To effectively assess the impacts related to changes in precipitation, often highly localised daily data are needed that are ideally both consistent in time and in space (e.g. Köplin et al., 2010). Currently, in Switzerland, various impact assessment reports rely on the statistically downscaled precipitation change data derived from regional climate models by the well-known and simple delta change approach, which shifts an observed time series by a model-derived change in the mean climate (BAFU, 2012; Bosshard et al., 2011; CH2014-Impacts, 2014). The delta change approach accounts for changes in the mean annual cycle, but potential changes in inter-annual variability, and changes in wet day frequency and intensity or of spell lengths are not taken into account. Hence, the data are also not suitable for the analysis of future changes in extreme events (Bosshard et al., 2011). It is our aim here to develop a statistical downscaling method for Switzerland that overcomes some of these limitations and that subsequently can be easily applied to climate model output.

Over recent years a vast number of statistical downscaling methods have been developed that go far beyond a simple delta change approach (Maraun et al., 2010). These include bias-correction methods (e.g. Themeßl et al., 2011), regression-based methods (e.g. Hertig and Jacobeit, 2013) or weather generator (WG) approaches (e.g. Chandler and Wheater, 2002; Mezghani and Hingray, 2009). For our purposes, the latter method is especially appealing, since it includes a stochastic component. This is a major improvement compared to a (deterministic) delta change approach, allowing us to investigate multiple time series and uncertainty at the local scale that are consistent with a given (current or future) mean climate. Moreover, it allows the incorporation of changes in the temporal correlation structure and consequently alterations of the dry–wet sequences. From an agricultural impact (e.g. Calanca, 2007) or water resource management (e.g. Samuels et al., 2009) perspective, this is a key aspect of future precipitation change.

A serious limitation of many WGs is that they are often calibrated to observations at single sites only, thereby lacking the spatial correlation structure that is required for many applications, particularly in the context of hydrological impact modelling in a topographically complex terrain such as the Alps. A number of sophisticated approaches in time–space precipitation simulation have been put forward in the literature to address this issue, such as K-nearest neighbour resampling approaches (e.g. Buishand and Brandsma, 2001), copula-based approaches (e.g. Bárdossy and Pegram, 2009), Poisson cluster models (e.g. Cowpertwait, 1995; Fatichi et al., 2011) or more sophisticated field generators (e.g. Paschalis et al., 2013; Peleg and Morin, 2014). Of increasing popularity are Markovian multi-site models (e.g. Baigorria and Jones, 2010; Wilks, 1998) and in particular non-homogeneous hidden Markov models (NHMMs) (e.g. Bellone et al., 2000; Hughes et al., 1999; Kioutsioukis et al., 2008; Robertson et al., 2004, 2009). The latter approach models transitions between pre-defined precipitation state patterns conditional on the synoptic-scale circulation. Each of these time–space WGs comes with method-specific benefits and limitations for the reproduction of the daily precipitation statistics and consequently for their use in impact models. For instance, some of them simulate longer-term variability more realistically (e.g. generalised linear model (GLM) based multi-site WGs, Chandler, 2014), while some are explicitly adapted to deal with extreme precipitation (e.g. Huser and Davison, 2014).

The main purpose of our precipitation generator is its use as a downscaling tool in a climate change context. It should be easily transferable to different climatological regions and time periods and its generated time series should serve several impact applications that have different needs in terms of time–space consistency. For these reasons we opt for a precipitation generator whose degree of complexity and associated calibration requirements are still sufficiently easy to handle. Mehrotra et al. (2006) inter-compared three stochastic multi-site precipitation occurrence generators over a region in Australia and found that the generator by Wilks (1998) outperforms hidden Markov models and K-nearest neighbour resampling techniques in terms of overall performance, time required for model running and simplicity of the model structure. Hence the multi-site precipitation generator proposed by Wilks (1998) serves our purposes. It is a relatively simple tool based on a Richardson-type WG (Richardson, 1981) run with spatially correlated random number streams.

It is the aim of this study to investigate the capabilities, the added value and the limitations of this multi-site generator in order to better interpret the climatic changes in the simulated time series for a future climate, which is part of an upcoming study. In particular, the actual amount of stochastically generated variability will be assessed as well as the added value of a multi-site model against multiple single-site models. The analysis is done for the Swiss Thur catchment. Although less widely known than other catchments, the Thur catchment serves as an ideal test bed for our purposes, as will be detailed in Sect. 2. In Sect. 3 we recapitulate the basic procedures for multi-site precipitation simulation after Wilks (1998) and detail how the generator was calibrated over the catchment. Results of the validation against observations and against single-site generators will be presented in Sect. 4. We end the article with a discussion (Sect. 5) and a summary and outlook (Sect. 6).

This study focuses on the hydrological catchment of the river Thur (located
in the north-eastern part of Switzerland, Fig. 1a) that is a feeder river of
the Rhine, with a length of about 135 km and a catchment area of
approximately 1696 km

In an upcoming study our generated synthetic time series over the Thur catchment will serve as input to two hydrological models to assess the runoff regime under current and future climates (similar to Jasper et al., 2004). The Thur is a well-studied and well-observed river catchment in Switzerland (e.g. Fundel et al., 2013; Kunstmann et al., 2006) providing high-quality hydrological measurement series for a robust calibration of hydrological runoff models. It further represents the largest Swiss river without a natural or artificial reservoir and therefore exhibits discharge fluctuations similar to unregulated Alpine rivers.

Owing to the complex topography over this catchment area (ranging from less than 400 m a.s.l. to more than 2500 m a.s.l.), precipitation exhibits a large variability both in space and in time (see Fig. 1b and c based on gridded observational data from Frei and Schär, 1998). Over 1961–2011 and for a winter and summer month, the data clearly show larger precipitation frequencies and intensities over higher-elevated regions compared to the lowlands. A large portion of these precipitation characteristics can be explained by a north-east to south-west lying mountain range (Alpstein) extracting precipitation from westerly flows and triggering convective storms. These spatio-temporal variations hence serve as an ideal observation basis to validate and analyse the capabilities and limitations of the WG.

The core of our multi-site WG is a Richardson-type precipitation generator
(Richardson, 1981) consisting of an occurrence and amount model. To model
occurrence at a single station we rely on a first-order two-state Markov
chain (Gabriel and Neumann, 1962; Richardson, 1981; Wilks and Wilby, 1999).
The use of a first-order model in our WG was justified by inspecting the
Akaike information criterion (AIC) (Akaike, 1974) and the Bayesian
information criterion (BIC) (Schwarz, 1978). Both criteria revealed a
substantial improvement when going from a zero-order to first-order model,
but the additional gain at a second- or higher-order model was negligible
(not shown). We used a specific wet day threshold of 1 mm day

For an estimate of the transition probabilities, we rely on their conditional
relative frequencies (Wilks, 2011). Other important precipitation indices can
be inferred. The wet day frequency (wdf,

Given a simulated wet day from the occurrence model, precipitation amounts
are set. This is done by sampling from a mixture model of two exponential
distributions (Wilks, 1999a):

The simulation process is based on Richardson (1981) at single stations with
the five above-introduced parameters, i.e. the transition probabilities
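
As a minimal sketch of such a single-site simulation, assume that the five parameters are the two transition probabilities (called `p01` and `p11` here), the mixing weight `alpha` and the two exponential means `beta1` and `beta2`. All names are illustrative and not taken from the original, and the handling of the wet day threshold is omitted for brevity. Under these assumptions, the occurrence and amount models could be combined as follows:

```python
import random


def simulate_single_site(n_days, p01, p11, alpha, beta1, beta2, seed=None):
    """Simulate daily precipitation (mm) at one site.

    Occurrence: first-order two-state Markov chain (p01 = P(wet | dry),
    p11 = P(wet | wet)). Amounts: mixture of two exponential distributions
    with means beta1 and beta2 and mixing weight alpha (hypothetical names).
    """
    rng = random.Random(seed)
    series = []
    wet_yesterday = False  # assumption: the chain is started in the dry state
    for _ in range(n_days):
        # The probability of a wet day depends only on yesterday's state.
        p_wet = p11 if wet_yesterday else p01
        wet_today = rng.random() < p_wet
        if wet_today:
            # Choose one exponential component, then draw the amount from it.
            mean = beta1 if rng.random() < alpha else beta2
            amount = rng.expovariate(1.0 / mean)
        else:
            amount = 0.0
        series.append(amount)
        wet_yesterday = wet_today
    return series


def estimate_transition_probs(series, threshold=0.0):
    """Re-estimate (p01, p11) as conditional relative frequencies."""
    wet = [x > threshold for x in series]
    n0 = n01 = n1 = n11 = 0
    for prev, cur in zip(wet, wet[1:]):
        if prev:
            n1 += 1
            n11 += cur
        else:
            n0 += 1
            n01 += cur
    p01 = n01 / n0 if n0 else float("nan")
    p11 = n11 / n1 if n1 else float("nan")
    return p01, p11
```

For a first-order chain the stationary wet day frequency is p01 / (1 + p01 − p11), which offers a quick consistency check on a long simulated series.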

The main extension to a multi-site model after Wilks (1998) is to drive several single-site WGs simultaneously with spatially correlated but serially independent random numbers. To generate correlated random number streams, we rely on a Cholesky decomposition (e.g. Higham, 2009). The latter requires matrices that are positive definite, which is not always guaranteed. When a matrix is not positive definite, a fall-back solution based on the nearest positive correlation matrix is chosen (e.g. Higham, 1989). This problem, however, occurs only a few times in our study. One of the main hurdles in simultaneously generating precipitation at multiple sites is to ensure that the spatial dependence is also preserved in the final generated time series (Wilks and Wilby, 1999; Wilks, 1998). This difficulty mainly arises because the stochastic process partly destroys the initially imposed correlation structure (Wilks, 1998). To circumvent this problem, Wilks (1998) suggested an optimisation procedure based on a bisection method (Burden and Faires, 2010) that minimises the difference between the generated spatial correlation and the target correlation of observations. In our case, the iteration is repeated until a precision of 0.005 is reached. This estimation procedure is done prior to the actual simulation and has to be done for each station pair and each month. For further details regarding the set-up of stochastic simulation and in particular the implementation of multi-site simulation, we refer the reader to the Supplement.
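
The following sketch illustrates this procedure for the occurrence process of a single station pair. It is an illustration under stated assumptions, not the implementation used in this study: the function names, the default series length, the fixed random seed (so that every bisection step sees the same random numbers) and the restriction to positive target correlations are all choices made here. The fall-back to a nearest correlation matrix is not included; the Cholesky routine simply raises an error for matrices that are not positive definite.

```python
import math
import random
from statistics import NormalDist


def cholesky(a):
    """Lower-triangular L with L L^T = a; raises if a is not positive definite."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low


def occurrence_correlation(omega, probs_a, probs_b, n_days, seed=0):
    """Correlation of two simulated wet/dry series driven by normal variates
    with correlation omega. probs_x = (p01, p11) of the Markov chain at site x."""
    rng = random.Random(seed)
    low = cholesky([[1.0, omega], [omega, 1.0]])
    phi = NormalDist().cdf
    wet_a = wet_b = False
    xs, ys = [], []
    for _ in range(n_days):
        # Serially independent draws, spatially correlated through L.
        z = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
        wa = low[0][0] * z[0]
        wb = low[1][0] * z[0] + low[1][1] * z[1]
        # A day is wet if the uniform variate Phi(w) is below the
        # transition probability selected by yesterday's state.
        wet_a = phi(wa) < probs_a[int(wet_a)]
        wet_b = phi(wb) < probs_b[int(wet_b)]
        xs.append(float(wet_a))
        ys.append(float(wet_b))
    mx, my = sum(xs) / n_days, sum(ys) / n_days
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def calibrate_omega(target, probs_a, probs_b, n_days=20000, tol=0.005, seed=0):
    """Bisection on omega until the simulated occurrence correlation is
    within tol of the target (assumes a positive target correlation)."""
    lo, hi = 0.0, 0.999
    mid = 0.5 * (lo + hi)
    for _ in range(30):
        mid = 0.5 * (lo + hi)
        diff = occurrence_correlation(mid, probs_a, probs_b, n_days, seed) - target
        if abs(diff) < tol:
            break
        if diff < 0.0:
            lo = mid
        else:
            hi = mid
    return mid
```

In such a set-up the calibrated `omega` is typically larger than the target correlation, because thresholding the normal variates into wet and dry states weakens the imposed dependence. In a full set-up the search would be repeated for each station pair and month, and the resulting matrix passed to the Cholesky step.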

The precipitation generator is calibrated on a monthly basis. First, all the
single-site input parameters (

Reproduction of average wet day frequency (wdf), mean wet day
intensity (wdi), wet–wet transition probability (

To test whether our WG is properly implemented, we evaluated the reproduction
of WG input parameters extracted from the generated time series. A correct
reproduction in parameters such as wet day intensity, frequency and
transition probabilities is a prerequisite for all the subsequent analyses
presented in Sect. 4. The evaluation was performed for four subjectively
defined climatic regimes: a very dry, a dry, a wet and a very wet climate.
The corresponding model parameters are indicated in Fig. 2 with dashed
vertical lines. For each of these precipitation regimes, 100 synthetic daily
time series were generated. To test the effect of sample size, different
sizes of time windows were used: (a) 10 000 days, (b) 1000 days, (c) 100
days and (d) 30 days. The latter corresponds to the same sample size as for
the simulation of precipitation occurrence over the Thur catchment. For each
of the generated time series, the WG parameters were re-estimated and the
95 % inter-quantile range was computed across the set of 100 realisations
(Fig. 2). Three main results can be inferred: (a) our precipitation generator
is able to correctly reproduce the key WG parameters, implying that the
chances of substantial coding errors are small. (b) As expected, the estimate
of the input parameters becomes more uncertain for smaller sample sizes; in
fact, the uncertainty range increases by a factor of 18.3 when the sample
size is reduced from 10 000 to 30. At a sample size of 1000, the uncertainty
range stays at around

Long-term mean and variability of monthly precipitation sums during the period 1961–2011 for eight stations in the Thur catchment. The black (blue) lines refer to the mean annual cycle of observed (modelled) precipitation sums. The grey (blue) shaded areas represent the inter-quartile ranges of observed (simulated) monthly precipitation sums. The simulation comprises 100 realisations, each covering 51 years. The numbers at the bottom indicate for each month the percentage of variance explained by the precipitation generator. Note that the scale of the y-axis differs between stations.

An in-depth evaluation of the generated time series with our calibrated multi-site WG is now undertaken with real observations. First, the reproduction of the daily and longer-term precipitation statistics at individual sites is analysed (Sect. 4.1). In a second step, the performance of the multi-site model is investigated regarding spatially aggregated precipitation indices in comparison to WGs without incorporating spatial dependencies (Sect. 4.2).

Based on our ensemble of synthetic time series, each containing 51 years, we analyse the reproduction of key precipitation characteristics. This validation goes beyond the reproduction of pure model parameters used to calibrate the WG (Sect. 3.3.2), as it includes precipitation statistics that are not directly used in the specification and calibration of the model. Note that we present this analysis for the same time period as used for calibrating our WG. This is justified for the study here, as long as we treat and use our WG to simulate long-term monthly precipitation statistics. In such a set-up, the stationarity of the model is given by definition. However, in a climate prediction or projection context, this stationarity assumption would have to be tested, and hence separate calibration and validation periods are needed.

In the first step of validating our WG, we focus on the reproduction of the
long-term mean in monthly precipitation sums. Figure 3 shows both the
modelled (blue) and observed (black) long-term monthly precipitation sum for
each of the eight investigated stations. In general, the annual cycle of
precipitation sums is well reproduced. Consistently, this is also true of the
long-term seasonal and annual precipitation sums (not shown).
However, the WG tends to slightly underestimate precipitation sums in June and
August, and overestimate them in October. In addition, the two stations
Bischofszell (BIZ) and Herisau (HES) show rather large positive deviations
from the observed record during the winter months. In order to explain part
of these deviations, we decomposed the long-term mean of monthly (

Observed and modelled monthly mean wet day intensity (blue) and frequency (red) at eight stations during 1961–2011. The black (coloured) lines indicate the observed (modelled) values. The blue (red) shaded areas correspond to the inter-quartile range across the set of synthetic daily time series. They comprise 100 runs, each covering 51 years.

Next we focus on the inter-annual variability of monthly precipitation sums,
which is often more difficult to realistically model than the long-term mean
(Wilks and Wilby, 1999). The shaded areas in Fig. 3 represent the
inter-quartile range of the observed (grey) and modelled (blue) monthly
precipitation sums. From Fig. 3 it is obvious that the variability of the WG
is smaller than in observations for all of the analysed stations. This
implies that the stochastic model only explains part of the observed total
variability. This reduced variability is expected, as observations are
subject to additional sources of variability, which our comparatively simple WG
is not trained for. The WG is forced with mean observed values, varying
between months but not between different years. The annual cycle is assumed
to be stationary, and hence interannual variability, e.g. related to the
North Atlantic Oscillation (Hurrell et al., 2003), is missing. Consequently,
the ratio of simulated to observed variance amounts to approximately
33 % on average. The magnitude of this result is consistent with other
studies (e.g. Gregory et al., 1993). Further insights can be gained from a
decomposition of the variance of monthly (

Cumulative distribution of the observed and simulated dry (left) and wet (right) spell length frequencies for lowland station Andelfingen (top) and mountain station Saentis (bottom). Results are for January and June during the time period of 1961–2011. The coloured area (line) represents the inter-quartile range (median) of the 100 realisations, each a 51-year-long daily time series.

The adequate reproduction of the mean wet day intensity and frequency is a necessary but not sufficient precondition for a WG to be used in subsequent (impact) studies. Because precipitation amounts vary widely, it matters how well their frequency distribution is reproduced. For this, we compared simulated and observed quantiles of the daily non-zero precipitation distribution at each station (Supplement Fig. 4). Generally, the mixture model of two exponential distributions captures the frequencies of the intensities reasonably well, even at the high-Alpine station Saentis (SAE). This is at least the case up to the 80th percentile, above which intensities are systematically underestimated at all stations. This issue could be overcome by more sophisticated amount models combining e.g. a gamma with a generalised Pareto distribution (Vrac and Naveau, 2007).

While the frequencies of precipitation amounts and the frequencies of wet and
dry days are realistically simulated, it remains unclear how the WG performs
for multi-day spells. For many application studies, this is essential
information that requires a specific analysis. Figure 5 displays observed and
modelled cumulative frequencies of dry and wet spell lengths for two example
months and two stations. The two stations Saentis and
Andelfingen are selected for display since they represent the stations with
the highest and lowest elevations in the catchment. For both stations a clear
seasonal difference in the probability of dry spells toward more short and
fewer long dry spells during summer compared to winter is found. A plausible
explanation is the more intermittent (convective) precipitation systems
during summer. In contrast to dry spells, no seasonal differences in wet
spell length probabilities can be inferred. This is likely related to the
fact that the dry–dry transition probability

Cumulative distribution functions (CDFs) of multi-day precipitation sums for the three stations Andelfingen (AFI), Appenzell (APP) and Saentis (SAE). The lines represent the CDFs of non-zero precipitation amounts over 1 day (red), over 3 consecutive wet days (green) and over 5 consecutive wet days (blue). Darker and lighter colours refer to observations and simulations, respectively. The observed CDFs have been derived from a 51-year long daily time series between 1961 and 2011, those of the weather generator from 100 realisations of 51-year long daily simulations. Note that the scaling of the horizontal axis differs between different stations.

Given that the frequency of wet spell lengths is realistically simulated, the question arises whether this also holds for multi-day precipitation sums. Multi-day periods of rain are a common phenomenon over Switzerland, especially during prevailing weather situations that favour orographic uplift. We compared observed and simulated cumulative distribution functions (CDFs) of precipitation sums over multiple consecutive wet days (Fig. 6). Overall, we found that the differences between generated and observed time series are largest for the higher quantiles and for long-lasting wet spells (5 day wet spells), where the WG tends to underestimate large multi-day sums. This reduced skill in simulating longer wet spell sums can be explained by the fact that our WG is prescribed only with the temporal structure of precipitation occurrence, not with that of precipitation amount. In other words, the WG has the memory to realistically reproduce multi-day wet spell lengths (Fig. 5), but part of this memory is lost in the combined analysis of multi-day occurrence and accumulated amount. Two further noticeable features in Fig. 6 are that intense 1 day precipitation sums are often overestimated by the model compared to the observations, while a relatively good match is obtained for 3 day sums. Although the deficiency in correctly simulating multi-day sums of consecutive wet days is to be expected by construction of the WG, it could be improved by more sophisticated precipitation models, such as multi-state Markov chains with different probability density distributions conditioned on pre-defined states, as for instance “dry”, “wet”, and “very-wet” (Boughton, 1999; Gregory et al., 1993).

Up to this point we evaluated the generator at individual sites only. One of the key issues of this study, though, is the potential added value of incorporating inter-station dependencies. As in the previous section, we analyse the performance first in terms of occurrence-related statistics and second in terms of the combined occurrence and amount statistics.

Frequencies (given in percent) of a completely wet or dry catchment together with the frequencies of its spell lengths. The observed (OBS) frequencies are calculated over 1961–2011. The multi-site simulated frequencies are given by the mean of 100 runs over 51 years (1961–2011).

Based on the eight stations in our catchment, with each being either in a wet
or dry state on a given day, theoretically 2

Those days with completely dry or wet catchment conditions were further investigated in terms of their temporal structure. Table 1 presents observed and multi-site simulated spell length statistics for the catchment. In general, remarkably good agreement between observations and the multi-site model is found. This is also true of longer spell lengths, where the spatio-temporal correlation structure is only indirectly given as input to the WG. All of these results imply that the calibrated multi-site WG not only captures the frequencies of spatially aggregated binary series very well, but also does a surprisingly good job in reproducing multi-day dry/wet spells of the Thur catchment.
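
Such catchment-wide statistics can be derived from the station-wise wet/dry series in a few lines. The sketch below is illustrative only: the function names are hypothetical, and expressing spell length frequencies as a percentage of all spells of the given state is an assumption about the normalisation, which is not spelled out above.

```python
from collections import Counter
from itertools import groupby


def catchment_state(day):
    """Classify one day from its per-station wet (True) / dry (False) flags."""
    if all(day):
        return "wet"
    if not any(day):
        return "dry"
    return "mixed"


def state_frequencies(occurrence):
    """Percentage of days on which the whole catchment is wet, dry or mixed."""
    counts = Counter(catchment_state(day) for day in occurrence)
    n = len(occurrence)
    return {state: 100.0 * counts[state] / n for state in ("wet", "dry", "mixed")}


def spell_length_frequencies(occurrence, state):
    """Percentage of spells of the given catchment state having each length."""
    states = [catchment_state(day) for day in occurrence]
    # groupby yields maximal runs of identical consecutive states.
    lengths = [sum(1 for _ in run) for key, run in groupby(states) if key == state]
    if not lengths:
        return {}
    counts = Counter(lengths)
    return {k: 100.0 * v / len(lengths) for k, v in sorted(counts.items())}
```

Here `occurrence` is a sequence of days, each holding one flag per station. Applying the same functions to the observed series and to every simulated realisation, and averaging over realisations, gives the kind of comparison summarised in Table 1.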

The above findings on the spatio-temporal correlation structure in the
occurrence process also give confidence that daily precipitation sums
aggregated over the catchment are reasonably simulated. To answer this
user-relevant question, we first analyse seasonal distributions of single-day
precipitation area sums over the time period 1961–2011 (Fig. 7). Area sums
are defined as the precipitation sum over the eight stations. Note that days
with an area sum of zero were excluded from this analysis and are not shown.
The observations (grey box plots) show in the median only a weak
inter-seasonal variability with somewhat higher sums during summer. The
spread in daily precipitation is smallest for winter and spring and largest
for summer, owing to the higher extreme precipitation values observed. Common
to all seasons is a distribution that is heavily right-skewed, ranging from
nearly dry conditions up to about 220 mm day

Daily non-zero precipitation sums over the catchment for the four seasons during 1961–2011. Daily precipitation intensity of the eight stations is summed and days with an area sum of zero are excluded. Box plots of observed daily sums (grey), of multi-site simulated time series (blue) and of single-site simulated time series (red) are shown. The WG models were run 100 times over a 51-year time period. The numbers (in percentage) indicated above the corresponding model represent the relative deviation of the simulated median from the observed median.

Compared to observations, the multi-site generator reproduces well the median
of the observed daily areal sums. The relative deviations remain rather
small, ranging from

The previous analysis has revealed a pronounced added value when incorporating spatial dependencies in the stochastic simulation of daily areal precipitation sums over the Thur.

Similarly to Sect. 4.2.1, we want to go a step further and additionally include the temporal structure. Note that by investigating spatial precipitation sums over multiple days, we explore the limits of our WG. We analyse in Fig. 8 annual maxima of observed (grey) and modelled (blue and red for multi-site and single-site, respectively) precipitation sums over several consecutive days (2, 5, and 10 days). This means that from the aggregated catchment time series, we compute temporal sums over consecutive days and take the maximum in each year.
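
The computation of these annual maxima can be sketched as follows. The function names and the data layout (a dictionary mapping each year to its list of daily area sums) are assumptions made for illustration, as is the choice that windows do not extend across year boundaries.

```python
def area_sums(station_series):
    """Daily catchment sums: add up the station values for each day."""
    return [sum(values) for values in zip(*station_series)]


def annual_max_nday_sums(area_sums_by_year, n_days):
    """Maximum precipitation sum over n_days consecutive days for each year.

    area_sums_by_year maps a year to its list of daily area sums (mm).
    Assumption: windows are evaluated within a calendar year only.
    """
    maxima = {}
    for year, daily in area_sums_by_year.items():
        window = sum(daily[:n_days])  # sum over the first window
        best = window
        for i in range(n_days, len(daily)):
            # Slide the window by one day: add the new day, drop the oldest.
            window += daily[i] - daily[i - n_days]
            best = max(best, window)
        maxima[year] = best
    return maxima
```

Applied to the observations and to each of the simulated realisations, the resulting annual maxima can then be summarised by their empirical quantiles.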

Regarding the performance of the calibrated WG in multi-site and single-site
mode, Fig. 8 shows that both clearly underestimate the observed sums. Yet,
the multi-site model exhibits much smaller deviations from the observed
distribution than the single-site model, and hence the added value of the
multi-site WG is clearly evident. In fact, the sums simulated with the
multi-site WG are larger by a factor of around 1.8 than those generated with
the single-site WG. Overall, deviations from observations are reduced from
about

The incorporation of inter-station dependencies into the stochastic model brings substantial added value over multiple single-site models regarding daily and multi-day areal precipitation sums over the Thur catchment. Similar benefits from the multi-site WG would be expected for other Alpine catchments and regions with complex topography, where correlations between sites are significant but well below unity. For very homogeneous regimes (inter-station correlation near unity), one single-site WG would be sufficient for the catchment area, whereas for low spatial correlations, several independent single-site WGs can be used.

A stochastic simulation with multi-site correlation structure comes with additional uncertainty from parameter estimations, additional implementation complexity and additional computational costs. The decision to incorporate spatial dependencies must therefore be weighed against the benefit. A careful inspection of the observed precipitation regime and its spatial structure over the catchment prior to the simulation is necessary to decide in favour of or against multi-site simulation. This is also important in terms of validation: for a large catchment area that is frequently affected by frontal passages, the validation of the precipitation generator should include more complex space–time dependency analyses. An example is the probability of a certain precipitation amount at a particular station given precipitation at a neighbouring station some days earlier.

Annual maximum precipitation summed over all eight stations and over consecutive days. The analysis is done for all days of the year. The bars (horizontal line) indicate the range between the 2.5 and 97.5 % empirical quantiles of the yearly maximum area sums during 1961–2011. The observations are plotted in grey, the multi-site simulations in blue and the single-site simulations in red. The observations comprise 51 years, and the models were run 100 times over a 51-year time period.

In the following, we want to elaborate more on the question of why we have implemented the rather simple multi-site precipitation model of Wilks (1998) and not a more sophisticated one. As already mentioned in the introduction, one premise of our work was to implement a stochastic tool that can be subsequently applied in a climate change context. This means that the number of model parameters needs to be kept limited for practical purposes such as calibration handling and evaluation of parameter changes from multi-models. An approach such as NHMM is conditioned on atmospheric circulation, changes of which would need to be constrained when used as a downscaling technique. However, from model evaluation studies it is well known that climate models are prone to substantial circulation errors (e.g. van Haren et al., 2012; van Ulden and van Oldenborgh, 2006), with effects on the local precipitation. Furthermore, the overall performance of an NHMM is highly dependent on the predictive power of atmospheric circulation patterns and the number of synoptic weather states, respectively (Schiemann and Frei, 2010). In winter, we would expect an NHMM to perform better than in summer, when the precipitation process is mainly dominated by local-scale convective processes triggered by orography. However, we need a downscaling technique that equally applies to all seasons. Also, for a small catchment scale such as the Thur, the variability of the local precipitation pattern is predominantly caused by physiographic factors, such as height differences or shielding effects, rather than by large-scale atmospheric patterns. As was shown in Table 1, on around 70 % of all days over 1961–2011, all stations in the catchment are simultaneously dry or wet. Under these circumstances the use of an NHMM would be feasible after careful calibration. For all these reasons, the precipitation generator by Wilks (1998) is in our view the more direct approach to guarantee the spatial consistency for the stations in our catchment.

For many impact applications, gridded precipitation data instead of multiple scattered stations would be beneficial. This demand could be met by interpolating the spatially consistent synthetic station data over the area of interest. A more sophisticated and elegant method, however, is to build a field generator, for instance by high-dimensional random Gaussian fields (e.g. Pegram and Clothier, 2001), random cascade models (e.g. Over and Gupta, 1996) or Poisson cluster models (e.g. Burton et al., 2008). An alternative would be to rely on geostatistical methods, for instance by prescribing a spatial correlation function at gauged and ungauged locations, which additionally requires specifying the parameters of the WG between the sites (e.g. Wilks, 2009). In regions with complex topography, this additional interpolation is not straightforward. It could be alleviated by explicitly including information on topographic aspects (e.g. altitude, aspect and slope) in a GLM (McCullagh and Nelder, 1989) or Bayesian hierarchical modelling approach (Gelman and Hill, 2006). These are appealing frameworks that allow the modelling of physiographic dependencies in the precipitation amount and occurrence model. However, this alone is not sufficient for a space–time weather generator, as the spatial dependence of daily precipitation is also determined by spatial autocorrelation and not just by the physiographic conditioning of parameters. Clearly, the development of a gridded space–time weather generator dealing with spatial autocorrelation, physiographic conditioning, intermittence and temporal autocorrelation is highly challenging and needs fundamental methodological development. This is beyond the scope of the present study, where our main focus was to develop an easy-to-use statistical downscaling tool for current and future climate.
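To make the geostatistical option more concrete, the following sketch builds a correlation matrix for arbitrary locations from a prescribed correlation function and draws spatially correlated standard normal variates from it. The exponential decay with distance, the length scale and both function names are assumptions chosen for illustration; they are not taken from Wilks (2009) or from the present study, and the sketch ignores the topographic effects discussed above.

```python
import numpy as np

def exponential_correlation(coords_km, length_scale_km):
    """Correlation matrix exp(-d / L) for locations given as (x, y) in km.

    Assumption: isotropic exponential decay with distance d.
    """
    coords = np.asarray(coords_km, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return np.exp(-dist / length_scale_km)

def correlated_normals(corr, n_days, seed=0):
    """Draw n_days vectors of standard normals with correlation matrix corr."""
    rng = np.random.default_rng(seed)
    chol = np.linalg.cholesky(corr)  # requires distinct locations
    return rng.standard_normal((n_days, corr.shape[0])) @ chol.T
```

Such variates could in principle drive a WG at ungauged locations, provided the WG parameters there are also specified, which is the difficult part in complex terrain.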

The multi-site precipitation generator of Wilks (1998) has been successfully developed, implemented and tested over the Swiss Alpine river catchment Thur. The precipitation generator treats precipitation occurrence as a Markov chain and simulates non-zero daily precipitation amounts from a mixture model of two exponential distributions. The spatial dependency is ensured by running the WG with spatially correlated random numbers. The model was calibrated on a monthly basis by using daily station data over the 51-year period from 1961 to 2011, and extensively compared to the observed record and to simulations based on multiple independent single-site WGs.
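The three ingredients summarised above can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation used in the study: it uses a first-order two-state Markov chain, a single parameter set instead of monthly parameters, the same Gaussian correlation matrix for occurrences and amounts, and an independent draw to select the exponential component. The function name and all parameter names are hypothetical. In the Wilks (1998) approach the correlations of the Gaussian variates are tuned so that the simulated precipitation reproduces the observed inter-station correlations; here they are simply taken as given.

```python
import numpy as np
from statistics import NormalDist

# Element-wise standard normal CDF (maps normals to uniforms on [0, 1]).
_phi = np.vectorize(NormalDist().cdf)

def simulate_multisite(n_days, p01, p11, alpha, beta1, beta2, corr, seed=0):
    """Simulate daily precipitation (mm) at K sites; returns shape (n_days, K).

    p01, p11: per-site probability of a wet day after a dry / wet day.
    alpha, beta1, beta2: per-site mixing weight and means of the two exponentials.
    corr: K x K correlation matrix of the driving Gaussian variates (assumed given).
    """
    rng = np.random.default_rng(seed)
    p01, p11, alpha, beta1, beta2 = map(np.asarray, (p01, p11, alpha, beta1, beta2))
    k = len(p01)
    chol = np.linalg.cholesky(np.asarray(corr, dtype=float))
    wet_prev = np.zeros(k, dtype=bool)  # assumption: start from a dry day
    precip = np.zeros((n_days, k))
    for t in range(n_days):
        # Spatially correlated normals, transformed to correlated uniforms.
        u_occ = _phi(chol @ rng.standard_normal(k))
        u_amt = _phi(chol @ rng.standard_normal(k))
        # Markov chain: wet probability depends on yesterday's state.
        wet = u_occ < np.where(wet_prev, p11, p01)
        # Mixed exponential: pick a component, then invert the exponential CDF.
        beta = np.where(rng.random(k) < alpha, beta1, beta2)
        precip[t] = np.where(wet, -beta * np.log(u_amt), 0.0)
        wet_prev = wet
    return precip
```

Passing an identity matrix as `corr` corresponds to running independent single-site generators, which allows a direct comparison of the spread of the daily area sums with and without spatial dependence.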

The main findings of this study are as follows.

The multi-site precipitation generator realistically reproduces key precipitation statistics at single stations, including the annual cycle, quantiles of non-zero precipitation amounts, multi-day spells and multi-day amount statistics.

The precipitation generator is able to generate relatively large stochastic variability. Nevertheless, this variability remains low compared to observations: the generator underestimates the observed inter-annual variability by a factor of 3.

The incorporation of inter-station dependencies into the stochastic process brings substantial added value over multiple single-site WGs. The medians of daily area sums are higher by about a factor of 1.3 than those from independent single-site models. In addition, the multi-site WG is able to capture about 95 % of the observed variability, while the single-site WG only explains about 13 %. Annual maxima of multi-day sums over the catchment increase by about a factor of 1.8 by incorporating the inter-site dependence into the stochastic simulations.

The added value is largest when the precipitation regime is subject to a large spatial and temporal heterogeneity, as is the case over the Thur catchment.

Therefore, care should be taken when using the precipitation generator as a tool for a broad risk assessment, in particular with respect to extreme events.

These inherent limitations point to potential future refinements of the presented model. (a) To better reproduce extreme precipitation, we intend to implement a three-state Markov chain model with the states dry, wet, and very wet and with state-dependent PDFs. From this, we expect a substantial improvement of 1-day and multi-day extremes as well as a better reproduction of multi-day precipitation sums. (b) To alleviate the underestimation of inter-annual variability, we will introduce a non-stationary model. This could be accomplished by sampling from a distribution of observed WG parameters (instead of taking the mean) or by formulating a regression model using large-scale atmospheric variables as predictors (see e.g. Furrer and Katz, 2007).
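Refinement (a) could look roughly like the sketch below for a single site. It is purely hypothetical, since the refinement has not been implemented here: the function name, the first-order chain, the example transition matrix in the usage note and the use of one exponential distribution per wet state are all assumptions made for illustration.

```python
import numpy as np

DRY, WET, VERY_WET = 0, 1, 2

def simulate_three_state(n_days, transition, mean_amount, seed=0):
    """Single-site daily precipitation from a three-state Markov chain.

    transition: 3 x 3 matrix, row i holds P(state tomorrow | state i today).
    mean_amount: mean of the (assumed exponential) amount distribution per
                 state; the entry for DRY is ignored.
    Returns (states, precip), both of length n_days.
    """
    rng = np.random.default_rng(seed)
    transition = np.asarray(transition, dtype=float)
    if not np.allclose(transition.sum(axis=1), 1.0):
        raise ValueError("each row of the transition matrix must sum to 1")
    states = np.empty(n_days, dtype=int)
    precip = np.zeros(n_days)
    state = DRY  # assumption: the day before the simulation is dry
    for t in range(n_days):
        state = rng.choice(3, p=transition[state])
        states[t] = state
        if state != DRY:
            # State-dependent amount distribution.
            precip[t] = rng.exponential(mean_amount[state])
    return states, precip
```

For example, `simulate_three_state(365, [[0.6, 0.35, 0.05], [0.4, 0.45, 0.15], [0.3, 0.4, 0.3]], (0.0, 5.0, 25.0))` draws one year with made-up parameters; in practice the transition probabilities and the threshold separating wet from very wet days would have to be estimated from the station record.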

Besides these methodological improvements, the precipitation generator will be subject to two extensions: (a) the coupling of daily minimum and maximum temperature as additional atmospheric variables and (b) the adjustment of the WG parameters to represent a future mean climate. Finally, the time series over the Thur catchment will serve as input for a hydrological model to assess the added value of multi- versus single-site WGs in terms of runoff and to assess the implications of the systematic biases of the WG for hydrological quantities.

This work is supported by ETH research grant CH2-01 11-1. We would like to thank the Center for Climate Systems Modeling (C2SM) at ETH Zurich for providing technical and scientific support.

Edited by: E. Morin