Assessment of meteorological extremes using a synoptic weather generator and a downscaling model based on analogues

Abstract. Natural risk studies such as flood risk assessments require long series of weather variables. As an alternative to observed series, which have a limited length, these data can be provided by weather generators. Among the large variety of existing ones, resampling methods based on analogues have the advantage of guaranteeing the physical consistency between local weather variables at each time step. However, they cannot generate values of predictands exceeding the range of observed values. Moreover, the length of the simulated series is typically limited to the length of the synoptic meteorological records used to characterize the large-scale atmospheric configuration of the generation day. To overcome these limitations, the stochastic weather generator proposed in this study combines two sampling approaches based on atmospheric analogues: (1) a synoptic weather generator in a first step, which recombines days of the 20th century to generate a 1000-year sequence of new atmospheric trajectories, and (2) a stochastic downscaling model in a second step applied to these atmospheric trajectories, in order to simulate long time series of daily regional precipitation and temperature. The method is applied to daily time series of mean areal precipitation and temperature in Switzerland. It is shown that the climatological characteristics of observed precipitation and temperature are adequately reproduced. It also improves the reproduction of extreme precipitation values, overcoming previous limitations of standard analogue-based weather generators.

Introduction
Increasing the resilience of socio-economic systems to natural hazards and identifying the required adaptations is one of today's challenges. To achieve such a goal, one must have an accurate description of both past and current climate conditions. The climate system is a complex machine which is known to fluctuate at very small timescales but also at large ones over multiple decades or centuries (Beck et al., 2007). It is necessary to study meteorological series that are as long as possible in order to capture all sources of variability and fully cover the wide range of possible meteorological situations. The same need arises for weather extremes, as return levels associated with large return periods cannot be reliably estimated without long climatic records (e.g. Moberg et al., 2006; Van den Besselaar et al., 2013). This also applies to statistical analyses of any derived variable, such as river discharge, for which multiple meteorological drivers come into play and for which extreme events correspond to the combination of very specific and atypical hydro-meteorological conditions.
Long simulations of weather variables obtained with weather generators provide accurate descriptions of the climate system and can be used for natural hazard assessments. Among the wide range of existing weather generators, stochastic ones are used to construct, via a stochastic generation process, single or multi-site time series of predictands (e.g. precipitation and temperature) based on the distributional properties of observed data. These characteristics, and consequently the weather generator parametrization, are usually determined on a monthly or seasonal basis to take seasonality into account. They can also be estimated for different families of atmospheric circulation, often referred to as weather types. The state of the art of the most common methods which have been used for the downscaling of precipitation (single or multi-site) is presented in Wilks and Wilby (1999) and in Maraun et al. (2010). More recent publications gather detailed reviews of some sub-categories of weather generators (e.g. Ailliot et al., 2015, for hierarchical models). An increasing number of studies focus on the generation of multivariate and/or multi-site series of predictands (e.g. Steinschneider and Brown, 2013; Srivastav and Simonovic, 2015; Evin et al., 2018a, b). Stochastic weather generators are able to produce large ensembles of weather time series presenting a wide diversity of multi-scale weather events. For all these reasons, they have long been used to shed light on the sensitivity and possible vulnerabilities of socio-ecosystems to climate variability (Orlowsky and Seneviratne, 2010) and to weather extremes.
Other models used for the generation of weather sequences are based on the analogue method. Since the description of the concept of analogy by Lorenz (1969), the analogue method has gained popularity over time for climate or weather downscaling. This analogue-model strategy has been applied in many studies (Boé et al., 2007; Abatzoglou and Brown, 2012; Steinschneider and Brown, 2013) and has been used to address a wide range of questions from past hydro-climatic variability (e.g. Kuentz et al., 2015; Caillouet et al., 2016) to future hydro-meteorological scenarios (e.g. Lafaysse et al., 2014; Dayon et al., 2015). The standard analogue approach hypothesizes that local weather parameters are steered by synoptic meteorology. A set of relevant large-scale atmospheric predictors is used to describe synoptic weather conditions. From the atmospheric state vector characterizing the synoptic weather of the target simulation day, atmospheric analogues of this day are identified in the available climate archive. The analogue method then assumes that similar large-scale atmospheric conditions have the same effects on local weather. The local or regional weather configuration of one of the analogue days is thus used as a weather scenario for the current simulation day. The key element of the analogue method is that it does not require any assumption on the probability distributions of predictands. This is a noteworthy advantage for predictands, such as precipitation, which have a non-normal distribution with a probability mass at zero. Most of the studies using analogues focused on precipitation and temperature either for meteorological analysis (Chardon, 2014; Ben Daoud et al., 2016) or as inputs for hydrological simulations (Marty et al., 2013; Surmaini et al., 2015). Nevertheless, analogues are increasingly used for other local variables such as wind, humidity (Casanueva et al., 2014) or even more complex indices (e.g. for wild fire; Abatzoglou and Brown, 2012). When multiple variables are to be downscaled simultaneously, another major advantage of the analogue method is that the different predictand scenarios are physically consistent and the simulated weather variables are bound to reproduce the correlations between the variables (e.g. Raynaud et al., 2017) and sites (Chardon et al., 2014). Indeed, when analogue models use the same set of predictors (atmospheric variables and analogy domains) for all predictands, all surface weather variables and sites are sampled simultaneously from the historical records, thus preserving inter-site and inter-variable dependency.
The two simulation approaches (stochastic weather generators and analogue methods) described above present some important advantages for the generation of long weather series but also some sizable drawbacks. Stochastic weather generators rely on strong assumptions about the statistical distributions of predictands. Identifying the relevant mathematical representations of the processes and achieving a robust estimation of their parameters can be difficult, especially if the meteorological records are short. Modelling the spatio-temporal dependency between variables and sites is often another challenge. Conversely, for the analogue-based approaches, the identification of relevant atmospheric variables providing good prediction skills is not straightforward. The limited length of local weather records is also a critical issue, since resampling past observations restricts the range of predicted values. In particular, the simulation of unobserved values of predictands is not possible. This can be problematic if one is interested in estimating possible extreme values of the considered variable. Furthermore, the information on synoptic atmospheric conditions required by analogue methods is generally taken from atmospheric reanalyses, which also have a limited temporal coverage (e.g. from the beginning of the 20th century for ERA-20C, the ECMWF reanalysis of the 20th century; Poli et al., 2016, and from the mid-19th century for 20CR, the Twentieth Century Reanalysis; Compo et al., 2011). The length of the generated time series is thus typically bounded by the length of the reanalyses.
In this study we propose a weather generator (hereafter SCAMP+) building upon the SCAMP (Sequential Constructive atmospheric Analogs for Multivariate weather Prediction) approach presented by Chardon et al. (2018) and making use of reshuffled atmospheric trajectories, following some of the developments by Buishand and Brandsma (2001) and Yiou (2014). With the weather scenarios generated by SCAMP being limited by the coverage of the climate reanalyses, the SCAMP+ model extends the pool of possible atmospheric trajectories. Using random transitions between past atmospheric sequences, SCAMP+ generates unobserved atmospheric trajectories on which the two-stage SCAMP approach can be applied. By exploring a wide variety of atmospheric trajectories, SCAMP+ introduces some additional large-scale variability which improves the exploration of possible weather sequences. In addition, as done in SCAMP (Chardon et al., 2018), the SCAMP+ approach includes a simple stochastic weather generator which is estimated, for each generation day, from the nearest atmospheric analogues of this day. These two steps (random atmospheric trajectories and random daily precipitation and temperature values) improve the reproduction of extreme values, overcoming previous limitations of analogue-based weather generators, usually known to underestimate observed precipitation extremes.
These developments are carried out for the exploration of hydrological extremes (extreme floods) of the Aare River basin in Switzerland (Andres et al., 2019a, b). Meteorological forcings, i.e. temperature and precipitation, are thus simulated to be used as inputs of a hydrological model, for different sub-basins of the Aare River basin. Meteorological simulations from SCAMP+ have been used in the Swiss EXAR (Hazard information for extreme flood events on the rivers Aare and Rhine) project and have proven their ability to estimate the discharge values associated with very large return periods on the Aare River. In Sect. 2, we describe in detail the test region, the data and three simulation approaches (a classical analogue method, referred to as ANALOGUE, SCAMP and SCAMP+). Section 3 presents the main results for both climatological characteristics and extreme values. Section 4 sums up the main outputs of this study and proposes some further developments and analyses.
2 Data and method

Studied region
This study is carried out on the Aare River basin, which covers almost half of Switzerland (17 700 km²). The topography varies greatly within the basin with, on the one hand, high mountains in its southern part (maximum altitude of 4270 m, Finsteraarhorn) and, on the other hand, plains in the northern part (minimum altitude of 310 m). These contrasting characteristics, coupled with the location of the basin at the crossroads of several European climatic influences, give rise to a wide diversity of possible weather situations across the year.

Atmospheric reanalysis and local weather data
The application of the analogue method requires a long archive providing an accurate description of both past synoptic weather patterns and local atmospheric conditions. Indeed, a wide range of meteorological situations available for resampling is necessary in order to identify the best analogues for the simulation (e.g. Van Den Dool et al., 1994; Horton et al., 2017). In most studies, synoptic situations are provided by atmospheric reanalyses. Here, we use the ERA-20C atmospheric reanalysis (Poli et al., 2013), which provides information on large-scale atmospheric patterns on a 6 h basis from 1900 to 2010. Data are available at a 1.25° spatial resolution. More specifically, the set of predictors used for the identification of atmospheric analogues is made of the geopotential height at 500 and 1000 hPa, the vertical velocities at 600 hPa, large-scale precipitation, and temperature. The justification of these choices will be given in Sect. 3.3.1.
The local and surface weather parameters of interest are retrieved from 105 weather stations for precipitation and 26 weather stations for temperature, which are spread out homogeneously over our target region, as presented in Fig. 1. These data are available at a daily time step from 1930 to 2014. They have been spatially aggregated in order to obtain daily time series of mean areal precipitation (MAP) and temperature (MAT) for the Aare region. The three weather generators considered in this study aim at producing scenarios of daily time series of MAP and MAT. In this study, a scenario is defined as a possible realization of the climate system under current climate conditions (i.e. the climate observed for the past few decades). It can be noticed that many applications of analogue-based approaches produce simulations at specific weather stations. However, as shown by Chardon et al. (2016) for France, the prediction skill is significantly improved when the prediction is produced for areal averages, which motivates the generation of MAP and MAT values in this study.

Description of the three models
This section presents the three different models considered and evaluated in this study.

ANALOGUE: classical analogue model
The most basic model evaluated in this study, hereafter referred to as ANALOGUE, relies on a standard two-level analogue method. For each day of the simulation period, analogue days are identified from candidate days. The candidate days extracted from the archive period, i.e. the period during which both predictors and local observations are available, are all days of the archive located within a 61 d calendar window centred on the target day. This calendar filter is expected to account for the possible seasonality of the downscaling relationship between large and small scales. For instance, candidate days for 15 May 2000 are selected within the pool of days ranging from 15 April to 14 June of each year of the archive.
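The calendar filter described above can be illustrated with a minimal sketch. The function name `candidate_days`, the archive bounds 1930 to 2010 (taken here as the overlap between ERA-20C and the station records), the exclusion of the target day itself and the handling of 29 February are all assumptions made for illustration, not details given in the text.

```python
from datetime import date, timedelta

def candidate_days(target, first_year, last_year, half_window=30):
    """Return archive dates within +/- half_window days of the target's calendar day."""
    candidates = []
    for year in range(first_year, last_year + 1):
        # Re-centre the 61 d window on the target's calendar day in each archive year.
        try:
            centre = date(year, target.month, target.day)
        except ValueError:
            # 29 February in a non-leap year: fall back to 28 February (assumption).
            centre = date(year, 2, 28)
        for offset in range(-half_window, half_window + 1):
            day = centre + timedelta(days=offset)
            # Keep days inside the archive; the target day is excluded (assumption).
            if day != target and first_year <= day.year <= last_year:
                candidates.append(day)
    return candidates

days = candidate_days(date(2000, 5, 15), 1930, 2010)
in_2000 = [d for d in days if d.year == 2000]
print(min(in_2000), max(in_2000))
```

For the 15 May 2000 example, the window in each year runs from 15 April to 14 June, in line with the text.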
The predictors used for the analogue selection were chosen based on Raynaud et al. (2017). They have been shown to guarantee both inter-variable physical consistency and good predictive skills according to the Continuous Ranked Probability Skill Score (CRPSS) for four predictands (precipitation, temperature, solar radiation and wind). In the present work, the predictors considered for each level of the two-level analogy are as follows:
- The first level of analogy is based on daily geopotential heights at 1000 and 500 hPa (HGT1000 and HGT500) as proposed by Horton et al. (2012) and Raynaud et al. (2017). The analogy score is the Teweles-Wobus score (Teweles and Wobus, 1954). This score has been found to lead to higher performance than a more classical Euclidean or Mahalanobis distance (Kendall et al., 1983; Guilbaud and Obled, 1998; Wetterhall et al., 2005). It quantifies the similarity between two geopotential fields by comparing their spatial gradients. It allows for selecting dates that have the most similar spatial patterns in terms of atmospheric circulation. From September to May, the analogy is based on the geopotential fields on both the current day D and its following day D + 1 at 12:00 UTC. Thereby, the motions of low-pressure systems and fronts are better described, and the prediction skill of the method for precipitation is improved (e.g. Obled et al., 2002; Horton and Brönnimann, 2019). In summer, only the geopotential fields on the current day are used as no similar improvement could be found with a 2 d analogy. During this first analogy level, 100 analogues are selected for each day of the target period.
- The second analogy level makes a sub-selection of 30 analogues within the 100 analogues identified in the first analogy level. The analogy score used for the selection is the root mean square error (RMSE). From September to May, the predictors are the vertical velocities at 600 hPa and the large-scale temperature at 2 m. In summer, both the vertical velocities and other predictors such as the convective available potential energy (CAPE) led to a rather poor prediction of precipitation due to the coarse resolution of the atmospheric reanalysis, which prevents it from providing an accurate simulation of convective processes. Consequently, large-scale precipitation from the reanalysis has been used as a predictor instead, resulting in predictive skills similar to the ones obtained for the rest of the year. The different predictor sets retained for summer and the rest of the year illustrate the differences typically observed between seasons for the main meteorological conditions and processes.
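The two-level selection listed above can be sketched as follows. This is a simplified illustration, not the authors' implementation: the gradient-based score is assumed to take the usual S1 form of the Teweles-Wobus score, a single 2-D field stands in for the HGT1000/HGT500 fields at D and D + 1, and the names `teweles_wobus`, `rmse` and `two_level_analogues` as well as the dictionary keys are hypothetical.

```python
import math

def teweles_wobus(field_a, field_b):
    """S1-type score between two 2-D fields (lists of rows); 0 means identical gradients."""
    num = 0.0
    den = 0.0
    rows, cols = len(field_a), len(field_a[0])
    for i in range(rows):
        for j in range(cols):
            if j + 1 < cols:  # gradient between neighbouring columns
                ga = field_a[i][j + 1] - field_a[i][j]
                gb = field_b[i][j + 1] - field_b[i][j]
                num += abs(ga - gb)
                den += max(abs(ga), abs(gb))
            if i + 1 < rows:  # gradient between neighbouring rows
                ga = field_a[i + 1][j] - field_a[i][j]
                gb = field_b[i + 1][j] - field_b[i][j]
                num += abs(ga - gb)
                den += max(abs(ga), abs(gb))
    return 100.0 * num / den if den > 0 else 0.0

def rmse(vec_a, vec_b):
    """Root mean square error between two flat predictor vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(vec_a, vec_b)) / len(vec_a))

def two_level_analogues(target, candidates, n_first=100, n_second=30):
    """Keep the n_first best days on circulation, then the n_second best on RMSE.

    target and candidates are dicts with keys 'hgt' (2-D field) and
    'local' (flat vector of second-level predictors).
    """
    first = sorted(candidates, key=lambda c: teweles_wobus(target["hgt"], c["hgt"]))[:n_first]
    return sorted(first, key=lambda c: rmse(target["local"], c["local"]))[:n_second]
```

Because the first score only compares gradients, two fields that differ by a constant offset are treated as perfect analogues, which is the behaviour expected from a circulation-pattern criterion.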
The dimensions and position of the different analogy windows used to compute the analogy measures are presented in Fig. 2. They follow the recommendations for the analogy windows optimization presented in Raynaud et al. (2017) for all predictors. With this two-step analogy, 30 scenarios of daily MAP and daily MAT are obtained for each day of the simulation period. Combined with the Schaake shuffle method described in Sect. 3.3.4, the application of the ANALOGUE model leads to 30 scenarios of 110-year time series of daily MAP and MAT.

SCAMP: combined analogue and generation of MAP and MAT values
The SCAMP model enhances the previous ANALOGUE approach, which is not able to generate daily values exceeding the range of observed precipitation and temperature. SCAMP combines the analogue method with a day-to-day adaptive and tailored downscaling method using daily distribution adjustment (Chardon et al., 2018). For each prediction day, the following discrete-continuous probability distribution proposed by Stern and Coe (1984) is fitted to the 30 MAP values obtained from the atmospheric analogues of this day:

$$F_Y(y) = (1 - \pi) + \pi \, F_{GA}(y; \alpha, \beta), \quad y \ge 0, \qquad (1)$$

where π is the precipitation occurrence probability and F_GA is the gamma distribution parameterized with a shape parameter α > 0 and a rate parameter β > 0. The π parameter is directly estimated as the complement of the proportion of dry days among the 30 analogues, and the parameters α and β of the gamma distribution are estimated by applying the maximum likelihood method to the positive precipitation intensities among the 30 MAP values.
Thirty MAP values are then sampled from the distribution model defined in Eq. (1) in order to obtain unobserved values of precipitation, possibly beyond past observations. When there are fewer than five positive MAP intensities in the analogues, we simply retrieve the MAP analogue values. This distribution model corresponds to a simplified version of the combined analogue and regression model described in Chardon et al. (2018), and we refer the reader to this paper for further information.
Similarly, for each prediction day, a Gaussian distribution F_N(μ, σ) is fitted to the 30 MAT values obtained from the analogues. A sample of 30 new MAT values is then generated from this fitted Gaussian distribution.
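The daily distribution adjustment for MAP and MAT described above can be sketched as below. This is an illustrative sketch under stated assumptions: the gamma shape is obtained with a standard closed-form approximation to the maximum-likelihood estimate rather than an exact solver, the guard for identical wet values is an added safeguard, and the function names (`fit_stern_coe`, `sample_map`, `sample_mat`) are hypothetical.

```python
import math
import random
import statistics

def fit_stern_coe(map_values):
    """Return (pi, alpha, beta) of the mixed dry/gamma model fitted to analogue MAP values."""
    wet = [v for v in map_values if v > 0]
    pi = len(wet) / len(map_values)  # occurrence probability = share of wet analogues
    mean = statistics.fmean(wet)
    s = math.log(mean) - statistics.fmean([math.log(v) for v in wet])
    # Closed-form approximation to the maximum-likelihood shape (assumption, not an exact MLE).
    alpha = (3 - s + math.sqrt((s - 3) ** 2 + 24 * s)) / (12 * s)
    beta = alpha / mean  # rate parameter: the gamma mean equals alpha / beta
    return pi, alpha, beta

def sample_map(map_values, rng, n=30, min_wet=5):
    """Draw n MAP values from the fitted model, or return the analogue values if too few are wet."""
    wet = [v for v in map_values if v > 0]
    if len(wet) < min_wet or len(set(wet)) == 1:  # second test is an added safeguard
        return list(map_values)
    pi, alpha, beta = fit_stern_coe(map_values)
    # random.gammavariate expects a scale parameter, i.e. 1 / rate.
    return [rng.gammavariate(alpha, 1.0 / beta) if rng.random() < pi else 0.0
            for _ in range(n)]

def sample_mat(mat_values, rng, n=30):
    """Draw n MAT values from a Gaussian fitted to the analogue MAT values."""
    mu = statistics.fmean(mat_values)
    sigma = statistics.stdev(mat_values)
    return [rng.gauss(mu, sigma) for _ in range(n)]
```

Since the gamma distribution is unbounded above, sampled MAP values can exceed the largest analogue value, which is the property the text relies on to go beyond past observations.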
As for the ANALOGUE approach, the Schaake shuffle reordering method is applied to the daily scenarios obtained from SCAMP. A total of 30 scenarios of 110-year time series of daily MAP and MAT are produced.

SCAMP+
As mentioned previously, the first limitation of the analogue method is related to the length of the synoptic weather information that is used to generate local predictand time series. In the present case, the length of time series that can be produced with the models ANALOGUE and SCAMP is limited to 110-year-long weather scenarios.
In SCAMP+, we extend the archive of synoptic weather information by rearranging the synoptic weather sequences, thus creating new atmospheric trajectories, used in turn as inputs to SCAMP. This generation of new trajectories makes use of atmospheric analogues, following some of the principles proposed in the weather generators described by Buishand and Brandsma (2001) and Yiou (2014). For any given day, the synoptic weather is considered to have the possibility to change its trajectory. The main hypothesis of this generation module is that if days J and K are close atmospheric analogues with atmospheric patterns heading in the same direction, then their "future" is exchangeable, and one could jump from one atmospheric trajectory to the other. In other words, day J + 1 is a possible future of day K, and conversely day K + 1 is a possible future of day J. The probability p to jump from one trajectory to any other is considered as a parameter to estimate.
The principle of a random atmospheric-trajectory generation is sketched in Fig. 3. In the present work, the only predictor used to compare the synoptic atmospheric configuration between 2 different days is the geopotential height field at 1000 hPa, for both the present day and the following day. The spatial analogy domain is the one used in Philipp et al. (2010) for the identification of Swiss weather types. The first line of Fig. 3 presents an observed atmospheric trajectory in HGT1000 from 8 to 12 February 1934. On 9 February, we look for analogues of the current day and its following day D + 1. This is done to ensure that the two initial states are similar (high-pressure system located over France on 9 February 1934 and on its analogue, 28 January 1921) and that the main features move in similar directions (high-pressure system heading southeast on both 10 February 1934 and 29 January 1921).
Practically, the five best analogues of the current atmospheric 2 d sequence are identified, and one of those sequences is then selected with a probability p to generate the new day of the new trajectory. The same method is repeated for this new day to find its future day (as illustrated in Fig. 3 for the sequence 30 January 1921-12 February 1925) and extend the new trajectory by 1 additional day. This process is repeated as long as necessary. In the present work, it was used to generate a 1000-year trajectory of daily synoptic weather situations. Rather large differences can arise after some days between the synoptic weather situation of the observed atmospheric sequence (e.g. 12 February 1934) and that of the random atmospheric trajectory (12 February 1925). As we will show later on, such a method leads to higher weather variability at multiple timescales.
To ensure that 2 consecutive days of the generated sequences belong to the appropriate season, the five 2 d analogue sequences are identified within a ±15 d moving window centred on the calendar day of the target simulation day (e.g. all June days if the target day is xxxx-06-15 in the year-month-day date format).
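The trajectory generation described above can be sketched as follows. Several choices here are assumptions made for illustration: each day is represented by a flat vector and compared with a Euclidean distance (the analogy score for this step is not stated), the jump target is drawn uniformly among the five best analogues, a jump is forced at the end of the archive, and a 365 d year is used for the calendar window. The names `calendar_gap`, `sequence_distance` and `generate_trajectory` are hypothetical.

```python
import random

def calendar_gap(doy_a, doy_b, year_length=365):
    """Circular distance in days between two days of the year."""
    diff = abs(doy_a - doy_b)
    return min(diff, year_length - diff)

def sequence_distance(fields, j, k, n_steps=2):
    """Euclidean distance between the sequences of n_steps days starting at j and k."""
    return sum((a - b) ** 2
               for d in range(n_steps)
               for a, b in zip(fields[j + d], fields[k + d])) ** 0.5

def generate_trajectory(fields, doy, n_days, p, rng, n_best=5, half_window=15, start=0):
    """Return archive indices forming a reshuffled atmospheric trajectory."""
    last = len(fields) - 1
    current = start
    trajectory = [current]
    while len(trajectory) < n_days:
        at_end = current >= last  # no observed successor: a jump is forced (assumption)
        if at_end or rng.random() < p:
            steps = 1 if at_end else 2
            # Candidate days K in the same season that have an observed successor K + 1.
            pool = [k for k in range(last)
                    if k != current and calendar_gap(doy[k], doy[current]) <= half_window]
            best = sorted(pool, key=lambda k: sequence_distance(fields, current, k, steps))[:n_best]
            if best:
                current = rng.choice(best)  # uniform choice among the best analogues (assumption)
            elif at_end:
                raise ValueError("no analogue available to continue the trajectory")
        current += 1  # day K + 1 (after a jump) or J + 1 (no jump) is the next generated day
        trajectory.append(current)
    return trajectory
```

With `p = 0` the sketch simply replays the observed sequence, while larger values of `p` produce more frequent switches between observed trajectories.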
The transition probability p from one observed trajectory to another indirectly determines the level of persistency of synoptic configurations. In this study, it has been calibrated in order to guarantee a good climatology of the large-scale atmospheric sequences. To do so, we analysed the mean frequency and duration of each of the nine weather types proposed for Switzerland by Philipp et al. (2010) in the observed synoptic series and in different reconstructed ones for transition probability p ranging from 1/10 (one transition every 10 d on average) to 1 (one transition every day on average). The results presented in Fig. 4 show that a transition probability of 1/7 is necessary to generate atmospheric trajectories that present a relevant persistency within each weather type.
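The two diagnostics used for this calibration, the frequency and the mean duration of each weather type, can be computed from a daily sequence of weather-type labels as in the following sketch. The function name `weather_type_stats` and the toy label sequence are hypothetical; the actual classification is the one of Philipp et al. (2010) and is not reproduced here.

```python
from itertools import groupby

def weather_type_stats(types):
    """Return {type: (relative frequency, mean spell duration in days)}."""
    spells = {}
    for wt, run in groupby(types):
        # Each run of identical consecutive labels is one spell of that weather type.
        spells.setdefault(wt, []).append(len(list(run)))
    n = len(types)
    return {wt: (sum(runs) / n, sum(runs) / len(runs)) for wt, runs in spells.items()}

# Toy daily sequence of weather-type labels (illustrative values only).
observed = [1, 1, 1, 2, 2, 1, 3, 3, 3, 3]
stats = weather_type_stats(observed)
```

Applying the same function to the observed series and to trajectories generated with different values of p would allow the kind of comparison summarized in Fig. 4.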
The long time series of synoptic weather generated with the above approach is then used as input to the SCAMP generator described in the previous section. The SCAMP+ approach leads to 30 scenarios of daily MAP and MAT, each of these scenarios being based on the 1000-year random atmospheric-trajectory sequence. The output of this approach, combined with the Schaake shuffle method described in the next section, is thus composed of 30 scenarios of 1000-year time series of daily MAP and MAT.

Temporal consistency: application of the Schaake shuffle
For each model (ANALOGUE, SCAMP and SCAMP+) and each day of the simulation period, 30 scenarios of daily MAP and MAT are produced. To improve the temporal and physical consistency between 2 consecutive days or between the temperature and precipitation scenarios (partially induced by the synoptic weather series), we use the Schaake shuffle method initially proposed by Clark et al. (2004). This method makes use of both the inter-variable physical and the intra-variable temporal consistency in observations to combine, at best, the outputs of any weather generator and reconstruct consistent predictand time series. It is particularly useful if one is interested in generating relevant precipitation accumulation scenarios over several days. A full description of the Schaake shuffle method can be found in Clark et al. (2004), and some applications can be found in Bellier et al. (2017) and in Schefzik (2016). Here, the Schaake shuffle consists in modifying the sequences of MAP and MAT values, preserving the association of the ranks of MAP and MAT, and rearranging sequences between days D and D + 1. Shuffled MAP and MAT sequences between consecutive days then have rank associations similar to those observed. In this study, we give priority to the temporal consistency of precipitation first. Temperature scenarios are recombined in a second step. The different components of the models ANALOGUE, SCAMP and SCAMP+ are summarized in Fig. 5.
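The core reordering step of the Schaake shuffle can be sketched for a single variable as below. This follows the general rank-matching idea of Clark et al. (2004) and is only an illustration: the choice of the historical template sequences and the two-step treatment of precipitation and temperature used in this study are not reproduced, and the function name `schaake_shuffle` is hypothetical.

```python
def schaake_shuffle(ensemble, template):
    """Reorder each day's ensemble so that its ranks follow the historical template.

    ensemble[t][k]: value of scenario k on day t (arbitrary order).
    template[t][k]: observation on day t of historical sequence k (same shape).
    """
    shuffled = []
    for members, history in zip(ensemble, template):
        ordered = sorted(members)
        # Rank of each historical value within its day (0 = smallest; ties by position).
        order = sorted(range(len(history)), key=lambda k: history[k])
        ranks = [0] * len(history)
        for rank, k in enumerate(order):
            ranks[k] = rank
        # Scenario k receives the simulated value that has the same rank as history[k].
        shuffled.append([ordered[r] for r in ranks])
    return shuffled

ensemble = [[3, 1, 2], [10, 30, 20]]
template = [[0.0, 5.0, 2.0], [1.0, 9.0, 4.0]]
result = schaake_shuffle(ensemble, template)
```

In this toy case the second historical sequence is the wettest on both days, so the second scenario receives the largest simulated value on both days, which is how day-to-day persistence is transferred to the scenarios.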

Results
This section presents different statistical properties of the scenarios obtained with the three models and discusses the performances of each model by comparison with observed statistical properties. For the sake of consistency between the outputs, we compare the 30 scenarios of 111 years obtained from ANALOGUE and SCAMP to 300 scenarios of 100 years from SCAMP+ (i.e. each scenario of 1000 years is divided into 10 scenarios of 100 years).

Climatology
For both temperature and precipitation, the three models lead to an accurate simulation of their seasonal fluctuations (Fig. 6). However, one can notice the slight overestimation of winter temperature and an underestimation of July and August precipitation. SCAMP also tends to have a smaller interannual variability compared to ANALOGUE and SCAMP+.
The distributions of seasonal precipitation amounts and seasonal temperature averages are presented in Fig. 7. Whatever the season, the three models are able to generate drier and wetter seasons than the observed ones (Fig. 7a). The very similar results obtained for ANALOGUE and SCAMP suggest that the daily distribution adjustments used in SCAMP do not introduce more variability at the seasonal scale. SCAMP+ is able to generate seasonal values that significantly exceed the maximum values simulated by ANALOGUE and SCAMP (by 100 to 200 mm). This strongly suggests that a large part of the seasonal variability comes from the variability of the synoptic weather trajectories, the unobserved weather trajectories produced by SCAMP+ leading to a wider exploration of extreme seasonal values.
The same comments can be made for spring and autumn temperatures (Fig. 7b). For those variables however, SCAMP+ fails to simulate extremely hot summers or cold winters. This limitation will be further discussed in the next section with some additional analysis and opportunities for improvement.

Daily precipitation extremes
As mentioned in Sect. 1, simple analogue methods cannot simulate unobserved precipitation extremes at the temporal resolution of the simulation (here daily). Moreover, for higher aggregation durations, they also tend to underestimate observed precipitation extremes. Figure 8 presents the precipitation values obtained with the three models for different return periods (from 2 to 200 years) and different aggregation durations (from 1 to 5 d).
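A return-level analysis of this kind can be sketched empirically as below. The estimation method behind Fig. 8 is not described in the text, so this is only one possible approach: annual maxima of n-day accumulations are ranked and assigned an empirical return period with the Weibull plotting position, years are assumed to have a fixed length, and accumulation windows do not cross year boundaries. The function names are hypothetical.

```python
def annual_maxima(daily, days_per_year, duration):
    """Annual maxima of precipitation accumulated over `duration` consecutive days."""
    maxima = []
    for start in range(0, len(daily) - days_per_year + 1, days_per_year):
        year = daily[start:start + days_per_year]
        sums = [sum(year[i:i + duration]) for i in range(len(year) - duration + 1)]
        maxima.append(max(sums))
    return maxima

def empirical_return_levels(maxima):
    """Pair each annual maximum with an empirical return period in years.

    Uses the Weibull plotting position T = (n + 1) / rank (assumption).
    """
    n = len(maxima)
    ranked = sorted(maxima, reverse=True)
    return [((n + 1) / rank, value) for rank, value in enumerate(ranked, start=1)]

# Toy series: three "years" of four days each (illustrative values only).
daily = [0, 1, 2, 0, 5, 0, 0, 1, 0, 3, 3, 0]
maxima = annual_maxima(daily, days_per_year=4, duration=2)
levels = empirical_return_levels(maxima)
```

With a 100-year scenario, the largest annual maximum is assigned a return period of about 101 years under this plotting position, which illustrates why long simulated series are needed to explore return periods of 200 years.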
Considering 1 d extreme events, ANALOGUE is obviously not able to generate precipitation accumulations that exceed the maximum observed one. Combining the analogue method with daily distribution adjustments (SCAMP) overcomes this issue with maximum values reaching 115 mm. SCAMP+ leads to similar results.
The large underestimation of daily extremes obtained with ANALOGUE leads to an important underestimation of 3 and 5 d extremes. Despite a better simulation of daily values, SCAMP does not significantly improve the reproduction of 3 and 5 d extremes. SCAMP+ outperforms both models for all durations and generates precipitation extremes in agreement with observed extremes. Whatever the return period, the variability between the different 100-year scenarios is larger with SCAMP than with ANALOGUE and much larger with SCAMP+. This again suggests that 3 to 5 d extreme events can arise from atypical synoptic conditions, possibly not available in a 110-year-long weather archive. Thanks to the random atmospheric trajectories, SCAMP+ is able to generate such conditions. For all models, we present the dispersion between the 30 annual values obtained from the 30 time series associated with the different atmospheric trajectories. This dispersion is very small for temperature and rather large in comparison for precipitation, illustrating the important uncertainty in the relationship between large and small scales for this variable in this region.

Multi-annual variability
For ANALOGUE and SCAMP, the simulated year-to-year variations of annual precipitation and temperature are in agreement with the observed ones. The successions of dry and wet or cold and warm years are well simulated in both temporality and amplitude, and the positive trend in temperature starting in 1980 is also adequately reproduced. Similar results are obtained for seasonal precipitation and temperature (not shown). These results illustrate the determinant influence of the large-scale conditions on local weather in this region and the relevance of a generation process based on atmospheric analogues.
In contrast, the chronological year-to-year variations produced by the different runs of SCAMP+ present different features. The annual precipitation and temperature time series obtained from different runs of SCAMP+ resulting from different large-scale atmospheric trajectories cannot be directly compared to the observed time series. This highlights the ability and interest of SCAMP+ to explore non-observed sequences of precipitation and temperature at annual and multi-annual scales. Finally, it should be noted that SCAMP+ simulations are not expected to reproduce the warming observed after 1980. Indeed, the different runs presented in Fig. 9b are associated with different 100-year subsets of the 1000-year atmospheric-trajectory simulation and do not include any trend (see discussion in Sect. 5).

Discussion and conclusions
The different extensions of the classical analogue method introduced in this study aim at generating long regional weather time series without suffering from the main limitations of analogue models. Indeed, due to the limited extent of the observed time series and the impossibility of simulating unobserved daily scenarios, analogue models usually underestimate observed precipitation extremes. These limitations are relaxed by SCAMP+, the weather generator proposed in this study. SCAMP+ generates unobserved and plausible atmospheric trajectories, and, in addition, provides unobserved samples of daily temperature and precipitation using distribution adjustments. Such a generation process explores larger weather variability at multiple timescales, which leads to a better reproduction of precipitation extremes. SCAMP+ is built upon a number of past studies carried out in the target region with analogue-based downscaling approaches. Different sensitivity analyses could be performed in order to assess the impact of the different modelling choices, e.g. the set of predictors used for the analogue selection, the number of analogues selected for the different analogy levels or the parameters related to the generation of atmospheric trajectories (e.g. probability of transition between large-scale trajectories). SCAMP+ is obviously not free of limitations. A first issue relates to the quality of observations used in the model, especially at the synoptic scale. ERA-20C reanalyses used here are produced using sea level pressure and wind measurements only. This guarantees a certain quality of the geopotential at 1000 hPa. The quality of 500 hPa data and of the other predictors (namely large-scale temperature, precipitation and vertical velocities) is conversely questionable, as they do not benefit from the assimilation of observed data. This may impact the quality of the downscaling method.
For instance, this could explain why the mean seasonal cycle of monthly precipitation is not well reproduced in our results (see for instance the underestimation of the mean precipitation in August). Using higher-quality data is expected to partly address such limitations. Indeed, using ERA-Interim reanalyses (Dee et al., 2011) instead of ERA-20C removes the biases and misreproductions mentioned above (not shown), with a much larger panel of weather observations being assimilated in ERA-Interim. However, ERA-Interim covers a much shorter time period than ERA-20C (roughly 50 years). Using ERA-Interim for our simulations would make the panel of observed synoptic situations much less representative of possible ones and would impact the ability of our model to generate long-term climate variability.

Figure 8. Return level analysis of extreme precipitation values associated with models ANALOGUE, SCAMP and SCAMP+ for accumulation over 1, 3 and 5 d. The grey shadings present the inter-quantile intervals at 50 %, 90 % and 99 % levels (30 × 111-year scenarios for models ANALOGUE and SCAMP and 300 × 100-year scenarios for SCAMP+).
As highlighted previously, a noticeable limitation of SCAMP+ is its difficulty in generating very hot summers or cold winters. The predictors used for the selection of the analogues may actually prevent the simulation of very cold or hot seasons. Choosing the geopotential height at 1000 hPa on 2 consecutive days guarantees similar positions of high- and low-pressure systems and comparable movements of these features for the target day and its analogues. This guarantees that the transition from one atmospheric trajectory to another is correct in terms of anticyclonic or unsettled weather, but this cannot guarantee that the transition is correct in terms of air mass temperatures. This might prevent the generation of long hot or cold sequences. A possible improvement of the method would be to include some temperature predictors in the selection of analogue days. Similarly, SCAMP+ is able to generate relevant inter-annual fluctuations of unobserved climate time series. However, long-term fluctuations do not seem to be efficiently generated (at least for temperature). These types of variations are actually driven by very large-scale or global phenomena such as the Atlantic Multidecadal Oscillation (AMO; Hurrell and Van Loon, 1997; Trigo et al., 2002; Rogers, 1997). In SCAMP+, we do not account for such driving phenomena. Introducing additional drivers such as the AMO index in the generation of atmospheric trajectories could improve the results in this respect.
Figure 9. (a) [...] is presented with the black solid line in the plots associated with ANALOGUE and SCAMP models. The grey shadings present the inter-quantile intervals at 50 %, 90 % and 99 % levels (30 × 111-year scenarios for models ANALOGUE and SCAMP and 30 × 100-year scenarios for SCAMP+). (b) Time series of annual MAT for the ANALOGUE model, SCAMP (1900-2010) and four different 100-year atmospheric trajectories of SCAMP+. The observed annual MAT (1930-2014) is presented with the black solid line in the plots associated with ANALOGUE and SCAMP models. The grey shadings present the inter-quantile intervals at 50 %, 90 % and 99 % levels (30 × 111-year scenarios for models ANALOGUE and SCAMP and 30 × 100-year scenarios for SCAMP+).

Trends in observed predictors and predictands, as a result of global warming, could be an additional issue. For instance, the mean elevation of geopotential fields is often expected to increase with mean temperature. Such trends may be detrimental for the simulations because the analogues identification process would be carried out in a non-homogenous dataset. In the present work for instance, trends in the second-analogy-level predictors (VV600, P and T) might result, to some extent, in selecting analogues preferentially within the same decade rather than distant ones. This could then reduce the reshuffling potential of the method. This issue is likely to be less critical for the first analogy level of SCAMP and for the generation of atmospheric trajectories in SCAMP+. In this case, analogues are selected according to the Teweles-Wobus score, which compares the shapes of geopotential fields and not their absolute values. Quantifying the similarity between these geopotential fields, instead of differences in magnitude, removes the influence of a potential long-term trend in this predictor.
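The insensitivity of a shape-based score to a uniform shift of the field can be illustrated with a minimal sketch of the Teweles-Wobus score in its commonly used form, which compares gradients between adjacent grid points in both directions. The function name, the grid size and the synthetic fields are hypothetical and serve only as an illustration; the sketch is not the implementation used in this study.

```python
import numpy as np


def teweles_wobus_score(field_a, field_b):
    """Teweles-Wobus (S1-type) score between two 2-D fields on the same grid.

    0 means identical gradients (same shape); 200 means opposite gradients.
    The score is undefined (division by zero) if both fields are flat.
    """
    a = np.asarray(field_a, dtype=float)
    b = np.asarray(field_b, dtype=float)
    numerator = 0.0
    denominator = 0.0
    for axis in (0, 1):  # gradients along both grid directions
        grad_a = np.diff(a, axis=axis)
        grad_b = np.diff(b, axis=axis)
        numerator += np.abs(grad_a - grad_b).sum()
        denominator += np.maximum(np.abs(grad_a), np.abs(grad_b)).sum()
    return 100.0 * numerator / denominator


# Synthetic geopotential-like fields (illustrative values only).
rng = np.random.default_rng(0)
field = 100.0 + 30.0 * rng.normal(size=(8, 12))
other = 100.0 + 30.0 * rng.normal(size=(8, 12))

print("same shape, uniform +50 offset:", teweles_wobus_score(field, field + 50.0))
print("unrelated field:", teweles_wobus_score(field, other))
```

Because only differences between neighbouring grid points enter the score, adding a constant to one field, as a uniform long-term rise in geopotential height would, leaves the score essentially unchanged, whereas a distance on absolute values would increase with the offset.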
All in all, the SCAMP+ weather generator paves the way for more developments and applications. As part of the EXAR project (see acknowledgements), the model was coupled with a spatial and temporal disaggregation model and fed a distributed hydrological model in order to generate long series of discharge data (Andres et al., 2019a, b). Additional evaluations on the inter-variable co-variability showed that the physical consistency between temperature and precipitation is well reproduced in our simulations and that the model thus efficiently simulates the precipitation phase and the statistical characteristics of liquid and solid precipitation. SCAMP+ has a low computational cost and is able to generate multiple weather sequences which are consistent with possible trajectories of large-scale atmospheric conditions, which motivates future applications to other regions and other local weather variables.
Data availability. Precipitation and temperature data were downloaded from IDAWEB (https://gate.meteoswiss.ch/idaweb/, MeteoSwiss, 2020), a data portal which provides users in the field of teaching and research with direct access to archive data of MeteoSwiss ground level monitoring networks. However, the acquired data may not be used for commercial purposes (e.g. by passing on the data to third parties or by publishing them on the internet). As a consequence, we cannot offer direct access to the data used in this study. Atmospheric predictors are taken from the European Centre for Medium-Range Weather Forecasts (ECMWF) ERA-20C atmospheric reanalysis (Poli et al., 2013), available at the following address: https://www.ecmwf.int/en/forecasts/datasets/reanalysis-datasets/era-20c (ECMWF, 2020).
Author contributions. JC and DR developed the different models considered here. DR carried out the simulations and produced the analyses and the figures presented in this study. All authors contributed to the analysis framework and to the editing of the paper.