Uncovering the shortcomings of a weather typing method

Abstract. In recent years many methods for the statistical downscaling of precipitation output from climate models have been developed. Statistical downscaling is performed under general and method-specific (structural) assumptions, but those are rarely evaluated simultaneously. This paper illustrates the verification and evaluation of the downscaling assumptions for a weather typing method. Using the observations and outputs of a global climate model ensemble, the skill of the method is evaluated for precipitation downscaling in central Belgium during the winter season (December to February). Shortcomings of the studied method have been uncovered and are identified as biases and a time-variant predictor-predictand relationship. The predictor-predictand relationship is found to be informative for historical observations but becomes inaccurate for the projected climate model output. The latter inaccuracy is explained by the increased importance of the thermodynamic processes in the precipitation changes. The results therefore question the applicability of the weather typing method for the case study location. Besides the shortcomings, the results also demonstrate the added value of the Clausius-Clapeyron relationship for precipitation amount scaling. The verification and evaluation of the downscaling assumptions are a tool to design a statistical downscaling ensemble tailored to end-user needs.

Introduction
For a 1.5 °C temperature rise, the worldwide direct flood damage is estimated to increase by 160 % to 240 % (Dottori et al., 2018). To minimise that potential impact, our society opts for two complementary strategies, namely climate mitigation and climate adaptation (Stocker et al., 2013). Consequently, vulnerability, impact and adaptation studies are gaining ground in our society (Alfieri et al., 2016; Åström et al., 2016; Brekke et al., 2009; Termonia et al., 2018; Vansteenkiste et al., 2014; Willems, 2013b). These studies require projected hydrometeorological time series and use the output of global climate models as primary information. However, the direct application of this output for impact modelling is hindered by climate model biases (Kotlarski et al., 2014; Tabari et al., 2016) and by the mismatch in temporal and spatial resolutions between the climate model output and the time series required for impact modelling (Cristiano et al., 2018; Salvadore et al., 2015). Therefore, statistical downscaling or dynamical downscaling is applied. The statistical downscaling approach bridges the resolution gap through statistical relationships between the predictors and predictand, whereas in the dynamical downscaling approach regional climate models (RCMs) and limited area climate models (LAMs) are developed. Despite the refined resolution of RCMs and LAMs, their climate model output remains biased and requires bias correction (Ehret et al., 2012; Maraun, 2016; Teutschbein and Seibert, 2012). Both downscaling approaches have strengths and shortcomings arising from their underlying assumptions (Casanueva et al., 2016; Flaounas et al., 2013; Le Roux et al., 2018; Maraun et al., 2010; Vaittinada Ayar et al., 2016).
The statistical downscaling approach builds on four general assumptions (Benestad et al., 2008; Maraun et al., 2010; Schoof, 2013):
- The relationship between the predictors and the predictand is relevant (referred to as the informative assumption). This is of importance in the development of new statistical downscaling methods (SDMs), which requires the selection of predictors (Fu et al., 2018). The predictors should relate to the physical processes explaining the predictand changes. Precipitation, more specifically, responds to large-scale atmospheric circulation and thermodynamic laws (Emori and Brown, 2005; Kröner et al., 2017; Santos et al., 2016) and, hence, sea level pressure, geopotential height, relative humidity, and/or (dew point) temperature are common predictors.
- The predictors are adequately and accurately simulated by the climate model runs (referred to as the perfect prognosis assumption). The evaluation of this assumption is mostly performed under the name of bias analysis. The bias in the predictors depends on, among others, the model resolution, parameterisation schemes, internal variability, and the choice of the reference period (Anstey et al., 2013; Arakawa, 2004; Davini et al., 2017; Deser et al., 2012; Fadhel et al., 2017; Hartung et al., 2017; Prein et al., 2015; Rybka and Tost, 2014; Tabari et al., 2016; Vanden Broucke et al., 2018; Watterson et al., 2014).
- The relationship between the predictors and the predictand remains time-invariant (referred to as the stationarity assumption). This means that the relationship between the predictors and the predictand, which has been established by using historical observations, remains applicable under climatic changes. Of all the assumptions, this one is the most difficult to validate as no future observations are available yet (Dixon et al., 2016; Lanzante et al., 2018; Salvi et al., 2016; Wang et al., 2018).
- The predictand is sensitive to the greenhouse gas scenarios. Schoof (2013) has pointed out that one predictor variable could strongly respond to the greenhouse gas scenarios while another variable would not. This observation is, for instance, applicable to changes in temperature and mean sea level pressure respectively. Moreover, due to the internal variability of the climate system and the climate-model-related uncertainties, the response of the predictor to the greenhouse gas scenarios is often masked (Van Uytven and Willems, 2018). Hence, the response of the predictand to the greenhouse gas scenarios is governed by a smart choice of predictors.
Alongside the general statistical downscaling assumptions, each SDM has method-specific or structural assumptions. They are encapsulated in the downscaling methodology, create the method strengths and limitations, and are responsible for the statistical downscaling uncertainty contribution. An overview of commonly applied SDMs for precipitation downscaling and their strengths and limitations is provided by Hewitson et al. (2014), Maraun et al. (2010), and Sunyer et al. (2015).
The main objective of this paper is to simultaneously verify and evaluate the general and structural statistical downscaling assumptions. Most studies address the general and structural statistical downscaling assumptions independently. Some studies address one or several of the general statistical downscaling assumptions (Dixon et al., 2016; Fu et al., 2018; Haberlandt et al., 2015; Hertig et al., 2017; Mendoza et al., 2016; Merkenschlager et al., 2017; Salvi et al., 2016; Tabari et al., 2016), whereas other studies address the structural assumptions by the statistical downscaling of surrogate climate model runs (Bürger et al., 2012; Gutmann et al., 2014; Hertig et al., 2018; Roberts et al., 2019; Werner and Cannon, 2016; Widmann et al., 2019; Yang et al., 2019) or by the statistical downscaling of the projected climate model output (Sørup et al., 2018; Sunyer et al., 2015; Vaittinada Ayar et al., 2016; Wang et al., 2016; Wootten et al., 2017). To objectively identify the shortcomings of statistical downscaling methods, the verification and evaluation of the general and structural assumptions should, however, be performed simultaneously. To the authors' knowledge, no papers have yet simultaneously addressed the verification of both types of assumptions.
In this paper, the verification and evaluation of the general and structural assumptions are illustrated for a weather typing (WT) SDM for the purpose of climate change impact modelling on precipitation in Belgium during winter (December to February). The studied WT SDM is the method labelled SD-B-7 by Willems and Vrac (2011). Downscaling is performed in three steps. In the first step, weather types are identified based on the mean sea level pressure patterns. In the second step, the relationship between the predictors (weather types) and predictand (point precipitation) is established by using analogues. In the last step, the precipitation amounts are scaled following the Clausius-Clapeyron (CC) relationship. Overall strengths emerge from the physical background of the SDM (Shepherd et al., 2018). This paper is organised into the following sections. Section 2 introduces the studied SDM and the hydrometeorological data. Section 3 outlines the verification of the downscaling assumptions, and the corresponding results and discussions are included in Sect. 4. Section 5 summarises the main findings and makes suggestions for future research.
2 Statistical downscaling methods, case study and data

The weather typing method
The considered WT method is the method referred to as SD-B-7 by Willems and Vrac (2011). This method has been selected over the other WT methods as it accounts for both the changes in atmospheric circulation and the potential intensification of extreme precipitation due to temperature rise. The method downscales the daily gridded climate model output to a point time series with a time step equal to the observed time series by using a three-step approach. In the first step, the Jenkinson-Collison automated Lamb WT classification system is applied and the WTs are identified. In the second step, downscaled precipitation time series are produced by using WT analogues. In the last step, the precipitation amounts are scaled by using the Clausius-Clapeyron (CC) relationship.
The 27 WTs of the classification are regrouped to 11 WTs by equally dividing the hybrid WTs over the corresponding non-directional WTs (cyclonic or anticyclonic) and directional WTs. Although this might lead to information loss (Schiemann and Frei, 2010), it leads to larger sample sizes per WT and thus more accurate SDM relationships. The use of a reduced number of WTs is also in line with previous case studies for Belgium (Brisson et al., 2011; De Niel et al., 2017; Demuzere et al., 2009; Willems and Vrac, 2011).
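As an illustration, the regrouping can be sketched in Python as follows. The sketch assumes that hybrid WTs are labelled by a leading "C" or "A" followed by a direction (e.g. "CSW"); the function name and this labelling are assumptions made here for illustration and are not taken from Willems and Vrac (2011).

```python
from collections import Counter

DIRECTIONS = ("N", "NE", "E", "SE", "S", "SW", "W", "NW")


def regroup_weather_types(counts):
    """Regroup WT day counts by splitting hybrid types equally.

    `counts` maps a WT label to a number of days. A hybrid label such as
    'CSW' (assumed labelling) contributes half of its days to 'C' and
    half to 'SW'. All other labels are kept unchanged.
    """
    regrouped = Counter()
    for label, n in counts.items():
        is_hybrid = (
            len(label) > 1 and label[0] in ("A", "C") and label[1:] in DIRECTIONS
        )
        if is_hybrid:
            regrouped[label[0]] += n / 2.0   # non-directional part
            regrouped[label[1:]] += n / 2.0  # directional part
        else:
            regrouped[label] += n
    return dict(regrouped)
```

The total number of days is preserved by construction, since each hybrid day is divided over exactly two classes.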

Step 2: statistical downscaling by analogues
Downscaled time series are produced by finding analogues for the projected climate model output. In the first step, the bias in the number of wet days is removed by using a climate-model-dependent and seasonally dependent wet-day threshold. In the next step, the downscaled precipitation time series are constructed by WT analogues.
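One possible way of deriving such a wet-day threshold is sketched below. It is assumed here that the threshold is chosen per climate model run and per season so that the simulated wet-day fraction matches the observed one; the function name and the observed wet-day definition of 0.1 mm are hypothetical and may differ from the procedure of the studied method.

```python
import numpy as np


def wet_day_threshold(model_precip, observed_precip, obs_threshold=0.1):
    """Model threshold that reproduces the observed wet-day fraction.

    Both inputs are daily precipitation amounts for one season. The
    observed wet-day definition (`obs_threshold`, in mm) is an assumption.
    """
    model_precip = np.asarray(model_precip, dtype=float)
    observed_precip = np.asarray(observed_precip, dtype=float)
    wet_fraction = np.mean(observed_precip >= obs_threshold)
    # Model days above the (1 - wet_fraction) quantile are treated as wet.
    return np.quantile(model_precip, 1.0 - wet_fraction)
```

Model days with an amount above the returned threshold are then treated as wet days in the analogue search.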
The first criteria for defining an analogue wet day are the season and WT. Consider the day d of the projected climate model output, corresponding with the season s, WT wt and a daily precipitation amount p. Then, the search for an analogue day is conducted among the observed wet days in season s for which the WT equals wt. Besides the season and the WT, the exceedance probability of the daily precipitation amount p is considered. More precisely, the exceedance probability is calculated by using the total daily precipitation amounts of the wet days occurring in the season s and corresponding with the WT wt. As such, the analogue precipitation amount for day d equals the daily precipitation amount of the observed time series with the closest exceedance probability.
In case the observed precipitation time series has a sub-daily time step, the sub-daily precipitation amounts are aggregated to daily precipitation amounts. Next, for each season and WT, the exceedance probabilities for the observed daily precipitation amounts of wet days are calculated based on the total daily precipitation amount. After determining the analogue day, the sub-daily precipitation amounts of the analogue day are resampled to produce the downscaled time series.
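The analogue search by exceedance probability can be sketched as follows for a single season and WT. The Weibull plotting position used for the empirical exceedance probability and the function names are assumptions; the text does not specify which empirical formula is applied.

```python
import numpy as np


def exceedance_probability(values):
    """Empirical exceedance probability of each value within its sample."""
    values = np.asarray(values, dtype=float)
    ranks = np.argsort(np.argsort(-values)) + 1  # rank 1 = largest amount
    return ranks / (len(values) + 1.0)           # Weibull position (assumed)


def analogue_amounts(model_wet, observed_wet):
    """Observed wet-day amounts with the closest exceedance probability.

    Both inputs hold daily amounts of wet days for one season and one WT.
    """
    observed_wet = np.asarray(observed_wet, dtype=float)
    p_model = exceedance_probability(model_wet)
    p_obs = exceedance_probability(observed_wet)
    # For each model day, index of the observed day with the nearest probability.
    nearest = np.abs(p_model[:, None] - p_obs[None, :]).argmin(axis=1)
    return observed_wet[nearest]
```

For sub-daily observations, the index of the analogue day (rather than its daily amount) would be used to resample the sub-daily amounts of that day.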

Step 3: precipitation scaling by the Clausius-Clapeyron relationship
Besides large-scale circulation patterns, precipitation also responds to thermodynamic processes. The latter processes are accounted for by precipitation scaling following the CC relationship. The CC relationship describes the water-holding capacity of air masses, which increases by approximately 7 % per degree of warming. Application of this scaling rate to precipitation intensities is valid assuming that extreme precipitation amounts are controlled by the local moisture availability and are not influenced by the large-scale atmospheric circulation patterns. In reality, however, physical processes interact and higher scaling rates are also found (Blenkinsop et al., 2018; Manola et al., 2018; Lenderink et al., 2017; Zhang et al., 2017). The CC relationship is determined at the annual timescale. The temperature rise to be applied for the CC scaling is computed by using a seasonal quantile-based approach.
Although several studies have pointed out that dew point temperature is a better predictor for extreme precipitation amounts than the average daily temperature (Van de Vyver et al., 2019; Wasko et al., 2018), average daily temperatures were considered in this study due to their availability.
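A minimal sketch of the CC scaling step is given below. The linear scaling factor (1 + rate × temperature rise), the threshold used to select the extreme amounts, and the function name are assumptions made for illustration; a compound factor (1 + rate) raised to the power of the temperature rise would be an alternative formulation.

```python
import numpy as np

CC_RATE = 0.07  # fractional increase per degree Celsius (about 7 %)


def cc_scale(amounts, delta_t, threshold, rate=CC_RATE):
    """Scale precipitation amounts at or above `threshold` by the CC rate.

    `delta_t` is the temperature rise in degrees Celsius. The linear
    factor and the `threshold` selecting extremes are assumptions.
    """
    amounts = np.asarray(amounts, dtype=float)
    factor = 1.0 + rate * delta_t
    return np.where(amounts >= threshold, amounts * factor, amounts)
```

For example, with a temperature rise of 2 °C, an amount of 10 mm above the threshold would be scaled to about 11.4 mm, while amounts below the threshold remain unchanged.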

Meteorological data
For the main station of the Royal Meteorological Institute of Belgium (RMI) in Uccle, the precipitation and average temperature time series are available for the period 1901-2000 with a 10 min and daily time step respectively. The historical WTs are identified by using the daily gridded mean sea level pressure (MSLP) output of the EMULATE, ERA40 and NCEP/NCAR (Kalnay et al., 1996) reanalysis data sets (Table 1). Hence, this study accounts for the recent findings of Horton and Brönnimann (2018) and Stryhal and Huth (2017). Both studies indicate that reanalysis data sets introduce uncertainties in the classification of WTs and the statistical downscaling step. By using daily WTs, rapidly occurring changes in the large-scale atmospheric circulation might be neglected (Åström et al., 2016). However, the winter season is of interest and for this season no rapidly evolving circulation changes, i.e. within 1 day, are expected.
The climate model ensemble, presented in Table 2, includes 93 CMIP5 climate model runs of which 33 are control runs. For the climate change impact analysis, all four representative concentration pathways (RCPs) are considered, where the RCP 2.6, 4.5, 6.0, and 8.5 sub-ensembles include 20, 28, 15, and 30 climate model runs respectively. For each climate model run, daily MSLP, precipitation and average temperature output are extracted for 1961-1990 (control period) and 2071-2100 (scenario period). The precipitation and temperature data are extracted for the grid cell covering Uccle, whereas MSLP, required for the WT identification, is extracted for a larger area covering Uccle by using the 16-point grid of the WT classification system (Fig. 1).

Verification of the statistical downscaling assumptions
The following assumptions are verified for the winter season, which includes the months of December, January and February.

Informative assumption
The informative assumption defines the existence of an informative and physically based relationship between the predictors and predictand. The predictors of the WT method are the average daily temperatures and WTs. In order to examine the informative assumption for the WTs, the WT occurrences and the precipitation statistics related to the individual WTs are determined for the period 1961-1990. The studied precipitation statistics involve the precipitation accumulation, the number of wet days and the empirical distribution of independent extreme precipitation amounts. The independent 10 min, hourly and daily precipitation amounts are determined by using a peak-over-threshold method, setting the threshold at 0.1 mm h⁻¹ and requiring at least 12 h between successive events (Willems, 2000).
In order to examine the informative assumption for the average daily temperatures, the existence of the CC relationship is verified. The independent precipitation amounts are determined by using the 10 min precipitation amount time series and the daily average temperature time series between 1901 and 2000. First, the 10 min precipitation events in the time series are identified by using a peak-over-threshold method (threshold = 0.1 mm h⁻¹ and time between successive events > 12 h). Next, the precipitation events and corresponding temperatures are classified in moving temperature bins and each bin is sorted from low to high (Manola et al., 2018). Finally, the magnification of the 90th, 95th and 99th percentile precipitation amount for increasing temperature bins is investigated.
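The temperature binning and the estimation of a scaling rate can be sketched as follows. The bin width, bin step, minimum sample size per bin, and the log-linear fit used to summarise the scaling rate are assumptions chosen for illustration; they are not specified in the text.

```python
import numpy as np


def binned_percentile(temps, amounts, q, width=2.0, step=1.0, min_count=5):
    """Percentile `q` of event amounts within moving temperature bins.

    `temps` and `amounts` hold one value per independent event.
    Bin `width`, `step` and `min_count` are assumed settings.
    """
    temps = np.asarray(temps, dtype=float)
    amounts = np.asarray(amounts, dtype=float)
    centres, values = [], []
    for centre in np.arange(temps.min(), temps.max() + step, step):
        in_bin = np.abs(temps - centre) <= width / 2.0
        if in_bin.sum() >= min_count:
            centres.append(centre)
            values.append(np.percentile(amounts[in_bin], q))
    return np.array(centres), np.array(values)


def scaling_rate(centres, values):
    """Fractional change per degree from a log-linear least-squares fit."""
    slope = np.polyfit(centres, np.log(values), 1)[0]
    return np.exp(slope) - 1.0
```

A returned rate close to 0.07 would correspond to CC scaling, and a rate close to 0.14 to twice the CC rate.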

Perfect prognosis assumption
The verification of the perfect prognosis assumption is especially of importance for the WT method. More specifically, the application of the WT analogues follows the principle of perfect prognosis methods. This means that a statistical relationship is first defined between observed predictors and observed predictand. Thereafter, the statistical relationship is applied to the projected climate model output. Consequently, the calibrated statistical relationship is not tailored to biases in the climate model output. The scaling of the precipitation amounts by the CC relationship, on the contrary, follows the principles of model output statistics methods. Those methods implicitly assume that the climate model biases are time-invariant and that, by applying changes, the biases in the projected climate model output are cancelled by the biases in the historical climate model output.
The verification of the perfect prognosis assumption involves a comparison between the observed and simulated climate model WT occurrences. The verification is conducted over the period 1961-1990 by using the historical output of 33 global climate model runs (Table 2).
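The comparison of WT occurrences can be sketched as a difference in relative occurrence frequencies. The function names are hypothetical, and the bias is expressed here in percentage points, which is an assumption about how the differences are reported.

```python
from collections import Counter


def relative_frequencies(wt_series, classes):
    """Relative occurrence (in %) of each WT class in a daily WT series."""
    counts = Counter(wt_series)
    total = len(wt_series)
    return {wt: 100.0 * counts.get(wt, 0) / total for wt in classes}


def occurrence_bias(model_series, reference_series, classes):
    """Model minus reference relative frequency, in percentage points."""
    f_model = relative_frequencies(model_series, classes)
    f_ref = relative_frequencies(reference_series, classes)
    return {wt: f_model[wt] - f_ref[wt] for wt in classes}
```

Applied to each climate model run and each reanalysis data set, a positive value indicates that the climate model run overestimates the occurrence of that WT.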

Stationarity assumption
To verify the stationarity assumption, the contributions of the dynamical and thermodynamic processes governing precipitation changes are studied over time. To this end, the observed period (1901-2000) is split into five sub-periods of 20 years: 1901-1920, 1921-1940, 1941-1960, 1961-1980, and 1981-2000. Each sub-period is thereafter considered as the scenario period for surrogate climate model runs. For instance, when 1901-1920 is selected as the scenario period, then the periods 1921-1940, 1941-1960, 1961-1980, and 1981-2000 act as control periods. When 1981-2000 is selected as the scenario period, then the periods 1901-1920, 1921-1940, 1941-1960, and 1961-1980 act as control periods. The combination of the different sub-periods yields an ensemble of 20 surrogate climate model runs. For each surrogate climate model run, the change in the average daily precipitation amounts of wet days is decomposed. More specifically, the precipitation amount changes $\Delta P$ are governed by the changes in the WT occurrences, i.e. the contribution by the dynamical processes $\Delta P_{\mathrm{dynamical}}$, and by thermodynamic, local, and/or mesoscale feedback changes, i.e. $\Delta P_{\mathrm{other}}$ (Souverijns et al., 2016). The contributions are calculated as follows:
$$\Delta P_{\mathrm{dynamical}} = \sum_{j=1}^{11} \left( N_{j,\mathrm{scen}} - N_{j,\mathrm{contr}} \right) P_{j,\mathrm{contr}},$$
$$\Delta P_{\mathrm{other}} = \sum_{j=1}^{11} \left( P_{j,\mathrm{scen}} - P_{j,\mathrm{contr}} \right) N_{j,\mathrm{scen}},$$
with
- $N_{j,\mathrm{contr}}$ the absolute occurrence frequency of wet days with WT $j$ in the climate model output for the control period,
- $N_{j,\mathrm{scen}}$ the absolute occurrence frequency of wet days with WT $j$ in the climate model output for the scenario period,
- $P_{j,\mathrm{contr}}$ the average daily precipitation amount of the wet days with WT $j$ in the climate model output for the control period, and
- $P_{j,\mathrm{scen}}$ the average daily precipitation amount of the wet days with WT $j$ in the climate model output for the scenario period.
The decomposition is also performed by using the historical and projected output of the 93-member global climate model ensemble (Table 2).
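The decomposition and the construction of the surrogate runs can be sketched as follows. The function and variable names are hypothetical; the inputs are arrays with one entry per WT, holding the wet-day occurrence frequencies N and the average wet-day precipitation amounts P for the control and scenario periods.

```python
from itertools import permutations

import numpy as np


def decompose_change(n_contr, n_scen, p_contr, p_scen):
    """Split the precipitation change into dynamical and other contributions.

    Dynamical: change in WT occurrence times the control-period amount.
    Other: change in amount per WT times the scenario-period occurrence.
    """
    n_contr, n_scen, p_contr, p_scen = (
        np.asarray(a, dtype=float) for a in (n_contr, n_scen, p_contr, p_scen)
    )
    dp_dynamical = float(np.sum((n_scen - n_contr) * p_contr))
    dp_other = float(np.sum((p_scen - p_contr) * n_scen))
    return dp_dynamical, dp_other


# Each ordered (scenario, control) pair of sub-periods is one surrogate run.
SUB_PERIODS = [(1901, 1920), (1921, 1940), (1941, 1960), (1961, 1980), (1981, 2000)]
surrogate_runs = list(permutations(SUB_PERIODS, 2))
```

The two contributions sum to the difference between the scenario and control values of the sum over all WTs of N times P, and the five sub-periods yield 5 × 4 = 20 ordered pairs.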

Response to greenhouse gas scenarios
In order to verify the response of the predictand to the greenhouse gas scenarios, the WT method is applied to the output of 93 global climate model runs (Table 2). Next, the daily precipitation amounts for the downscaled time series are compared against the observed precipitation amounts and the intensification of the extreme precipitation amounts for increasing greenhouse gas scenarios is investigated. The intensification is visually inspected, and the focus is put on the magnification of the changes for increasing greenhouse gas scenarios. Furthermore, a comparison is made between the coarse global climate model changes and the changes for the downscaled time series. For the sake of brevity, only the changes in the daily precipitation amount with a 30-year return period and in the average winter precipitation accumulation are investigated.

Structural downscaling assumptions
To investigate the added value of the CC relationship, the original SDM (with CC scaling) and the SDM without CC scaling are applied to the projected output of 93 global climate models (Table 2). The control period and the range of observations are defined as 1961-1990, and the scenario period is defined as 2071-2100. A comparison is made between the projected changes for the SDM with CC scaling and the SDM without CC scaling. The added value of the CC relationship is discussed in combination with the predictand response.
4 Results and discussions

The informative assumption
Figure 2 presents the relative WT occurrence frequencies during the winter season. The results show that the A WTs occur most frequently and represent approximately 30 % of the winter days. The W, SW, and C WTs are also identified as frequently occurring. The occurrence frequency of each of these WTs is approximately 12 %. Apart from minor details, there are no differences between the reanalysis data sets. The WT occurrence patterns are generally in agreement with the recent findings of Otero et al. (2018). They identified the A WTs as the overall dominant winter WT in Europe. The A WTs, more specifically, represent on average 25 % of the winter days. The average occurrence frequency of the C WTs in Europe is estimated at approximately 15 %, the W WTs at approximately 8 %, and the SW WTs at approximately 5 %. Note that the WT occurrences presented by Otero et al. (2018) have been averaged over the European domain. Apart from the A WTs, the W, SW, and C WTs are associated with a high precipitation accumulation and together explain up to 71 % of the total winter precipitation accumulation (Appendix, Figs. A1 and A2). Additionally, these WTs are associated with higher precipitation amounts, as for instance shown for the NCEP/NCAR reanalysis data set in Fig. 3. More specifically, the 1-year daily precipitation amount for the W WTs measures 0.51 mm h⁻¹ and is more than twice as large as the corresponding amount for the A WTs, which measures 0.19 mm h⁻¹. The NW WTs are also characterised by higher precipitation amounts and a higher wet-day frequency. However, compared to the W, SW, and C WTs, their occurrence is rather low and contributes less to the precipitation accumulation.
The relationship between precipitation and temperature is presented in Fig. 4. This figure demonstrates the intensification of the independent precipitation amounts with increasing temperature. For instance, the 90th percentile precipitation amount increases by 7 % °C⁻¹, and this increase follows the CC relationship. For temperatures higher than 10 °C, the scaling rate increases up to 14 % °C⁻¹. Similar scaling rates are obtained for the higher precipitation percentiles. For percentiles lower than the 90th, the scaling rate of 7 % °C⁻¹ is not identified. The identified CC relationship is similar to other studies for Belgium (De Troch, 2016; Van de Vyver et al., 2019) and for neighbouring regions (Lenderink and van Meijgaard, 2008). Although the CC relationship in those other studies has been established by using the dew point temperature, similar scaling rates are obtained. Considering the findings of recent studies, the application of dew point temperature is expected to better estimate the increases in the atmospheric moisture capacity and, thus, the precipitation changes (Van de Vyver et al., 2019; Wasko et al., 2018).
Figure 4. The relationship between daily average temperature and independent 10 min precipitation amounts. The relationship is defined at the annual timescale by using the entire Uccle time series (1901-2000). The CC relationship (+7 % °C⁻¹) is indicated by the grey dotted lines, whereas the 2× CC relationship (+14 % °C⁻¹) is indicated by the grey dashed lines. The black lines show the magnification of the 90th, 95th, 99th and 99.9th percentile precipitation amount for increasing temperatures.

The perfect prognosis assumption
Figure 5 compares the WT occurrences for the historical climate model outputs with those for the reanalysis data sets. The comparison reveals large biases, particularly for the W and A WTs. More precisely, the climate models overestimate the occurrence of W WTs by approximately 11 %, whereas the A WTs are underestimated by 14 %. Moreover, in contrast to the reanalysis data sets, the W WTs are the most prominently occurring WTs in the climate model outputs. These findings are in agreement with the recent study by Stryhal and Huth (2018). Using different atmospheric classification patterns, they found an overall overestimation of the westerly circulation, which is estimated to be approximately 7 % for the British Isles and increases towards central Europe by up to 21 %. Otero et al. (2018) and Stryhal and Huth (2018) also indicate that climate models have a poor performance in reproducing the occurrence of A WTs.
The overestimation of the W WTs is explained by the orientation of the North Atlantic storm track in the climate models. It has, more specifically, a zonal orientation instead of a SW-NE tilt (Pithan et al., 2016; Zappa et al., 2014). The zonal orientation results in a pronounced meridional pressure gradient, creating zonal westerly flows, which in turn impede the occurrences of anticyclones (Stryhal and Huth, 2019). Biases in the blocking frequency also arise from the climate model resolution (Anstey et al., 2013; Scaife et al., 2011; Woollings et al., 2018).
Figure 5. Relative WT occurrence in the winter season for different reanalysis data sets (dots) and climate model runs (box plots). The results are obtained for the reference period 1961-1990.
Although it would be possible to remove the bias in WT occurrences, for instance through resampling (Mehrotra and Sharma, 2019), the studied WT method does not do that. Note that such bias correction would require a technique that simultaneously accounts for the bias in the WT occurrences, the WT persistence, and the relationship between the WTs and other hydrometeorological variables.

The stationarity assumption
The contribution of the dynamical processes to the precipitation amount changes for the surrogate climate model runs is presented in Fig. 6. Based on the median results, the dynamical processes are responsible for 35 % to 55 % of the changes. The high contributions are most likely explained by the findings of Ntegeka and Willems (2008) and Willems (2013a). These authors identified multidecadal oscillations in the 100-year precipitation time series for Uccle. Some periods are characterised by higher precipitation amounts and are referred to as positive anomalies. The periods characterised by smaller precipitation amounts are referred to as negative anomalies. Willems (2013a) observed that the precipitation anomalies coincide with anomalies in the number of W WTs and with anomalies in the pressure difference between the Azores and Scandinavia. The coincidence of large-scale atmospheric circulation patterns and precipitation amounts has also been studied for other locations in Europe. In this context, Tabari and Willems (2018) identified the North Atlantic Oscillation (NAO) and the El Niño-Southern Oscillation (ENSO) signal as dominant drivers for the extreme winter precipitation amounts. Hence, the findings of Tabari and Willems (2018) and Willems (2013a) imply that large-scale atmospheric circulation influences winter precipitation in Europe. At the end of the 20th century, the dynamical processes explained only 20 % of the precipitation amount changes. This lower contribution is compensated for by a higher contribution from the thermodynamic processes. More specifically, Ntegeka and Willems (2008) point out that the higher precipitation amounts are governed by an intensification of the positive anomaly. The intensification arises from the increasing temperatures, which are in turn attributed to climatic changes. Figure 7 shows the contributions by the dynamical and the thermodynamical processes to the long-term projected changes. 
The changes in the WT occurrences account for 18 % of the total change; hence, the projected precipitation changes are primarily driven by the thermodynamic processes. This is in agreement with the findings of Kröner (2016), who investigated the drivers for precipitation changes in Europe. As the thermodynamic processes are only included to some extent in the downscaling methodology, the applicability of the SDM is questioned. Note that the application of the CC relationship is limited to the extreme precipitation amounts, while the thermodynamic processes also influence the average precipitation amounts.

Response to the greenhouse gas scenarios and the added value of the CC relationship
Climate models project a poleward shift of the Northern Hemisphere jet streams and storm tracks, resulting in the increased occurrence of zonal flows and fewer blocking occurrences (Barnes and Screen, 2015; Santos et al., 2016; Stryhal and Huth, 2019; Woollings et al., 2018). As a consequence, an increasing occurrence of W and SW WTs and a decreasing occurrence of A WTs are projected (Appendix, Fig. A3). More specifically, under the total uncertainty range, i.e. all RCPs combined, the occurrence of W WTs is projected to increase by 7 %. For the RCP sub-ensembles, the increase in W WTs is magnified from 6 % for RCP 4.5 to 11 % for RCP 8.5. The A WTs, on the contrary, decrease by 10 % for RCP 4.5 and 12 % for RCP 8.5. Using the same climate model ensemble, the median change in the average temperature is es-
Figure 7. Contribution of the dynamical processes, i.e. changes in the WT occurrence frequencies, and of other effects, for instance due to the thermodynamic processes, to the change in the average daily precipitation amount of wet winter days. The results are based on the climate model output for the scenario (2071-2100) and control period (1961-1990).
The changes in the 30-year daily winter precipitation amounts are shown in Fig. 8. While the studied SDM without CC scaling does not project any changes in the 30-year daily precipitation amount, the coarse global climate model (GCM) projections do. This indicates that the reliance of the projections on analogues involves significant shortcomings. Those shortcomings can, however, be overcome by the application of the CC relationship. As a result, an intensification of the extreme precipitation amounts is obtained for the studied SDM. The intensification is, moreover, in agreement with the theoretical estimations. The projected changes for RCP 8.5 are, for instance, estimated at 25.8 % and equal the theoretical change values (3.7 • C × 7 %). Note that the CC relationship is applied at the daily timescale and that at the daily timescale the super CC scaling rate (14 % (1 • C) −1 ) is not observed. To some extent, the estimated median change values for the studied SDM differ from the coarse climate model projections. More specifically, the differences in the median change values range between 3 % and 5 %. Besides differences in the change values, there are differences in the monotonicity of the change intensification for increasing greenhouse gas scenarios (GHSs). For the statistically downscaled changes, the intensification of the change values is monotonic due the monotonic increase in the temperature predictor. For the coarse global climate model changes, on the contrary, the monotonicity is masked by random uncertainties, climate model uncertainties and the stochastic uncertainty arising from the internal variability of the climate system (Van Uytven and Willems, 2018). The changes in average winter precipitation accumulation are shown in Fig. 9. The comparison between the coarse climate model changes and the statistically downscaled changes indicates that the statistical downscaling step increases the changes. 
More specifically, under the total uncertainty range, the statistically downscaled changes are approximately 15 % larger. This is explained by the absence of bias correction schemes, the overestimation of the W WTs (Fig. 5), which have been identified as one of the wetter WTs (Sect. 4.1), and the projected increase in these WTs (Fig. A3). Figure 9 furthermore shows that the statistically downscaled changes for the winter precipitation accumulation are not monotonic for increasing greenhouse gas scenarios. This is because the CC relationship is applied to extreme precipitation only. For that reason, the monotonicity of the temperature changes is not transferred to the changes in the average winter precipitation accumulation.
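The theoretical change value quoted above (3.7 °C × 7 %) and the restriction of the CC scaling to extreme precipitation can be illustrated with a short sketch. It is a minimal illustration under stated assumptions and not the implementation used in this study: the function names, the threshold separating extreme from non-extreme days and the daily amounts are hypothetical, and the compound variant is shown only for comparison with the linear product used in the text.

```python
CC_RATE = 0.07  # fractional increase per degree Celsius (7 % per °C, as quoted in the text)


def cc_factor(delta_t, rate=CC_RATE, compound=False):
    """Scaling factor for a temperature change delta_t (°C).

    The linear form matches the product delta_t x rate quoted in the text;
    the compound form (1 + rate) ** delta_t is an alternative convention.
    """
    if compound:
        return (1.0 + rate) ** delta_t
    return 1.0 + rate * delta_t


def scale_extremes(amounts_mm, delta_t, threshold_mm):
    """Apply CC scaling only to daily amounts at or above a (hypothetical) threshold."""
    factor = cc_factor(delta_t)
    return [a * factor if a >= threshold_mm else a for a in amounts_mm]


# Change for the RCP 8.5 temperature change quoted in the text
linear_change_pct = (cc_factor(3.7) - 1.0) * 100.0
compound_change_pct = (cc_factor(3.7, compound=True) - 1.0) * 100.0

# Hypothetical daily amounts (mm); only the last one exceeds the threshold
scaled = scale_extremes([2.0, 8.0, 25.0], delta_t=3.7, threshold_mm=20.0)
print(linear_change_pct, compound_change_pct, scaled)
```

Because only the amounts above the threshold are scaled, the change in the seasonal accumulation depends on how much of the total falls on extreme days, which is consistent with the temperature monotonicity not carrying over to the average accumulation.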

Conclusions
The studied SDM does not meet all assumptions. It is shown that the SDM has limitations and that its skill could be improved. The WT method fails, among other assumptions, the perfect prognosis assumption. As the method is applied in a perfect prognosis context, improvements should involve the bias correction of the WT occurrences. Since the simulation of large-scale atmospheric circulation patterns remains biased in RCMs (Addor et al., 2016; Jury et al., 2018), the application of a statistical bias correction method is suggested. A potential method would be the recently developed resampling approach of Mehrotra and Sharma (2019). Further extensions to the latter approach are, however, required to also address the biases in the WT persistence and the biases in the relationship between WTs and other hydrometeorological variables. Although the implementation of bias correction methods is possible, it remains questionable whether the WT method can lead to accurate precipitation downscaling. The observations in this study confirm an informative relationship between the predictors and the predictand, but this relationship is time-variant. The WT occurrences explain between 35 % and 55 % of the historical precipitation amount changes, but their contribution decreases to less than 20 % at the end of the 21st century. This means that the precipitation changes for the case study location are controlled by thermodynamic processes rather than dynamical processes (i.e. changes in WT occurrences). As the extreme precipitation amounts are scaled following the CC relationship, the thermodynamic processes are accounted for to some extent in the downscaling methodology, but not sufficiently.
Figure 9. Changes in the average winter precipitation accumulation as a function of the different RCPs. The changes are obtained by using the climate model output for 2071-2100 with respect to the output for 1961-1990.
The CC relationship produces extreme precipitation amounts outside the range of historical observations and thus anticipates the intensification of extreme events. The latter indicates the potential of the CC relationship for improving non-parametric precipitation models. The stand-alone application of the CC relationship as an SDM has recently been demonstrated by Manola et al. (2018) and Van de Vyver et al. (2019), but those SDMs also involve shortcomings (Zhang et al., 2017).
Uncovering the shortcomings of SDMs does not mean that their use is discouraged. One should not forget that other SDMs may also fail assumptions and, thus, also have shortcomings. By considering an ensemble of SDMs, the uncertainties introduced by those shortcomings can be taken into account. When SDM ensembles are considered, ensemble members could be weighted based on their skill. The latter would be similar to the existing climate model weighting techniques (Sanderson et al., 2017). The first step towards a weighted SDM ensemble is still to be made by the statistical downscaling community.
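As an illustration of what a skill-based weighting of SDM ensemble members could look like, the sketch below converts an error metric per SDM into normalised weights and computes a weighted ensemble change. It is a hypothetical example and not a method proposed or tested in this study: the exponential weight form, the scale parameter sigma, the error values and the projected changes are all assumptions chosen for illustration.

```python
import math


def skill_weights(errors, sigma):
    """Convert per-member errors into normalised weights.

    Members with a smaller error receive a larger weight. The Gaussian-type
    form exp(-(error / sigma) ** 2) is one possible choice; sigma controls
    how strongly poorly performing members are down-weighted.
    """
    raw = [math.exp(-((e / sigma) ** 2)) for e in errors]
    total = sum(raw)
    return [r / total for r in raw]


def weighted_mean(values, weights):
    """Weighted ensemble mean; weights are assumed to sum to 1."""
    return sum(v * w for v, w in zip(values, weights))


# Hypothetical validation errors and projected changes (%) for three SDMs
errors = [0.5, 1.0, 2.0]
changes = [10.0, 15.0, 30.0]

weights = skill_weights(errors, sigma=1.0)
weighted_change = weighted_mean(changes, weights)
unweighted_change = sum(changes) / len(changes)
print(weights, weighted_change, unweighted_change)
```

In this example the member with the largest error also projects the largest change, so the weighted ensemble change is lower than the unweighted mean. The choice of error metric and of sigma would need to be justified for any real application.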
Appendix A
Figure A1. Relative winter precipitation accumulation per WT for different reanalysis data sets (lines) and climate model runs (box plots). The results are obtained for the reference period 1961-1990.
Figure A2. Percentage of wet days per WT in the winter season for different reanalysis data sets (lines) and climate model runs (box plots). The results are obtained for the reference period 1961-1990.
Figure A3. Changes in the winter WT occurrences for RCP 4.5 and RCP 8.5. The changes are obtained by using the climate model outputs for 2071-2100 with respect to the output for 1961-1990.
Author contributions. EVU conceptualised and developed the approaches. Under the supervision of PW, EVU and JDN performed the formal analysis and investigation. EVU prepared the visualisation and wrote the initial draft, which was critically reviewed and revised by PW and JDN.