Towards the Development of a Pan-European Stochastic Precipitation Dataset

Heavy precipitation leading to widespread river floods is one of the main natural hazards affecting Central Europe. Since extreme precipitation events associated with devastating floods have long return periods, long-term datasets are needed to adequately quantify the frequency and intensity of these events. As long-term observations of precipitation across Europe are rare and homogeneous neither in space nor in time, they are generally not suitable to run hydrological models. In the present study, a combined approach is presented on how to generate a consistent precipitation dataset based on dynamical downscaling and statistical post-processing. Focus is given to five river catchments in Central Europe: Upper Danube, Elbe, Oder, Rhine, and Vistula. Reanalysis data are dynamically downscaled with a regional climate model and bias corrected towards observations. Empirical quantile mapping was identified as one of the most suitable methods to correct the bias in model precipitation. For most of the top ten precipitation events of large European river catchments, bias correction led to clear improvements over the raw model data. However, results for Western European rivers (e.g., Rhine) are typically better than for Eastern European rivers (e.g., Vistula), which may also be associated with observational gaps for the latter. Two examples of severe river floods are presented in more detail: the Rhine river flood in winter 1995 and the flood in the Upper Danube and Vistula in June 2009. While the former was already well represented without bias correction, for the latter, bias correction improved underestimated precipitation amounts in the Upper Danube but not in the Vistula catchment. In conclusion, this method can be applied to other extensive datasets towards the development of a Pan-European stochastic precipitation dataset.


Introduction
River floods are one of the most disastrous weather-related hazards in Central Europe, heading the list of the highest economic losses (e.g., Alfieri et al., 2018). For example, the damage caused by the devastating flood in Germany in spring 2013 has been valued at about 12 billion Euro (Merz et al., 2014). Due to the high impact of flooding on the economy, agriculture, infrastructure, and transport as well as on human life, there is a high interest in quantifying the risk of flooding for Central Europe (e.g., Ward et al., 2011; Feyen et al., 2012; Jongman et al., 2014). However, such extreme events have long return periods (e.g., Pauling and Paeth, 2007; Hirabayashi et al., 2013) and thus are rarely represented in short-term (observational) datasets. Therefore, it is expedient to use historical century-long datasets to estimate flooding risk (e.g., Feyen et al., 2012), which is generally related to the occurrence of heavy and/or long-lasting precipitation (e.g., Maddox et al., 1979; Hilker et al., 2009). Long-term observational records of precipitation are quite heterogeneous across Europe. Thus, they cannot be used to run hydrological rainfall-runoff models, whose simulations are needed to quantify the flooding risk for Europe. On the other hand, reanalyses provide homogeneous datasets covering long time periods, with the limitation of a coarse resolution. Approaches to overcome the problem of small sample sizes are either to develop stochastic precipitation models (e.g., Richardson, 1981; Ehmele and Kunz, 2018) or to downscale centennial reanalysis datasets (e.g., Stucki et al., 2016). Within the MiKlip project ("decadal climate predictions"; Marotzke et al., 2016), a regional component of the decadal prediction system was developed for Europe (Feldmann et al., submitted), leading to the generation of 5,800 years of regional climate model hindcast data for the last 60 years.
Dynamical downscaling of long-term reanalyses with or without subsequent bias correction was used in a variety of earlier studies. An overview of statistical and dynamical downscaling approaches and a discussion of their advantages and disadvantages are given in Maraun et al. (2010). In general, the magnitude as well as the temporal and spatial variability of a downscaled variable should be represented correctly. In particular, Maraun et al. (2010) found that the performance of downscaled precipitation depends on the underlying synoptic situation. Thus, precipitation associated with frontal systems can be better represented than precipitation produced by convective systems, which may be relevant for summer flooding. As an example, the Twentieth Century Reanalysis dataset (20CR) was dynamically downscaled with the Weather Research and Forecasting (WRF) model to investigate gustiness in Switzerland (Stucki et al., 2016) and the Lago Maggiore flood in 1868 (Stucki et al., 2018). While the timing and magnitude of the peak gusts are not well captured by the model, the simulated precipitation pattern agrees well with observations. Neither study applied a bias correction method. Downscaling with subsequent bias correction was used, e.g., by Dobler and Ahrens (2008) and Fang et al. (2015). Dobler and Ahrens (2008) compared different downscaling approaches for precipitation in Europe and South Asia as well as different bias correction methods (namely quantile mapping and local intensity scaling). They concluded that dynamical downscaling with a regional climate model (RCM) in combination with a bias correction method is most suitable to simulate precipitation in Europe. Fang et al. (2015) focused more on the comparison of different bias correction methods and found that empirical quantile mapping and power transformation performed best for precipitation. However, they mentioned that the selection of an accurate correction method depends on the individual case.
This study is a proof of concept where we show how to generate a precipitation dataset with a resolution of 25 km, which is high enough to be used as input for hydrological models (e.g., Maraun et al., 2010). We address the following research questions:
• Which bias correction method is most suitable to correct model precipitation?
• Are historical precipitation events related to floods represented adequately in the dataset?
We apply a dynamical downscaling with an RCM (COSMO-CLM, Rockel et al., 2008), where reanalyses are used as initial and boundary conditions. In a second step, a bias correction towards observations is applied. The focus is on large European river catchments, namely Upper Danube, Elbe, Oder, Rhine, and Vistula (Fig. 1). Although temperature was also considered, as it is a necessary input variable for hydrological models to calculate discharges, we focus on precipitation in this study and test its representation for historical flood events. The top ten flood events for each catchment are considered and two events are discussed in detail, namely the winter flood in 1995, which affected the Rhine, and the summer flood in 2009, which inundated parts of the Danube and Vistula catchments. This paper is structured as follows: The datasets used in this study are introduced in Sect. 2. Methods are presented in Sect. 3, including the dynamical downscaling approach as well as the tested bias correction methods. The bias correction methods are evaluated in Sect. 4 and their added value for historical extreme precipitation events is discussed in Sect. 5. Results for the case studies are given in Sect. 6. The general findings are discussed in Sect. 7. Finally, the conclusions are summarized in Sect. 8.

Observational data
As reference data for the bias correction as well as for a comparison with the downscaled model data, we consider E-OBS, a European land-only, daily, high-resolution gridded dataset for precipitation, surface temperature, and sea level pressure (Haylock et al., 2008; Van den Besselaar et al., 2011). In this study, E-OBS v17 is used, which shows some improvements over older versions due to updates and the inclusion of new observational stations (e.g., for Poland). A detailed comparison between older versions and v17 can be found on the official homepage. In this study, we consider daily precipitation sums on a 0.22° resolution grid (about 25 km) for the time period from 1950 to 2015.
For a quick quality control of E-OBS, we additionally use the HYRAS dataset (German: HYdrologische RASterdatensätze; hydrological raster datasets) provided by the German Weather Service (Deutscher Wetterdienst, DWD). HYRAS is a high-resolution gridded observational dataset of daily rainfall totals based on several thousand ground-based climate stations (Rauthe et al., 2013). Using a specific algorithm, the unevenly distributed observations are interpolated to regular grids of 1 × 1 km² and 5 × 5 km². During this interpolation, elevation, exposure, and climatology of a specific grid point are considered.
HYRAS covers the state area of Germany and the surrounding catchments of the rivers Rhine, Danube, and partly Elbe and Oder in the neighboring countries. HYRAS is available only for the period 1951-2006. Note that HYRAS data are inhomogeneous due to the changing number, location, and instrumentation of the used observations over the years. Furthermore, there is a certain bias in precipitation totals, especially over complex terrain, where the number of observations is limited (Kunz, 2011; Ehmele and Kunz, 2018). In this study, we use the 5 km version of HYRAS interpolated to the E-OBS grid.
We calculated the correlation coefficients between E-OBS and HYRAS daily precipitation sums for the Rhine, for the northern part of the Elbe catchment, and for the western part of the Upper Danube catchment. For the Rhine, we found a correlation coefficient of 0.9908, for the Elbe of 0.9876, and for the Danube of 0.9790. Thus, the daily precipitation sums of E-OBS and HYRAS are highly correlated. The slightly lower correlation for the Danube can be explained by less well represented orographic effects.

Reanalysis data
Reanalysis data provide the initial and boundary conditions for the RCM. Since reanalyses are not affected by changes in the data assimilation system, as is the case for archived weather analyses from operational forecasting systems, a homogeneous record of the past atmospheric evolution can be produced (Dee et al., 2011).
Two global reanalysis datasets provided by the European Centre for Medium-Range Weather Forecasts (ECMWF) are considered: ERA-Interim (Dee et al., 2011) and ERA-20C (Poli et al., 2016). ERA-Interim is available at a horizontal resolution of approximately 80 km (T255 spectral) and covers the time period from 1 January 1979 onwards (Dee et al., 2011). Due to enhancements in the data assimilation system, ERA-Interim shows improvements over the previous generation ERA-40, for example, in the representation of the hydrological cycle (Dee et al., 2011). As mentioned in the introduction, the use of centennial datasets is expedient to investigate extreme events, since these events have long return periods. A centennial record of atmospheric data is provided by the ERA-20C dataset, covering the period from 1900 to 2010 (Poli et al., 2016). ERA-20C products are available at a horizontal resolution of approximately 125 km (T159 spectral). Further information about ERA-Interim as well as ERA-20C can be found on the official ECMWF homepage.

Methods
With the aim of generating a homogeneous long-term dataset of daily precipitation sums at 25 km resolution, reanalyses (ERA-Interim and ERA-20C) are dynamically downscaled with an RCM. Afterwards, the bias of the model precipitation is corrected towards observations (E-OBS).

Dynamical downscaling
Downscaling describes a procedure in which information at larger scales is used to make predictions at local scales (e.g., Fowler et al., 2007). In general, downscaling can be classified into three categories, namely statistical, dynamical, and statistical-dynamical downscaling (e.g., Gutmann et al., 2012; Reyers et al., 2015). In this study, we apply dynamical downscaling, meaning that an RCM is run on a sub-domain of the utilized large-scale forcing data (e.g., Trzaska and Schnarr, 2014). The added value of a high-resolution RCM compared to global climate models with coarser resolution is discussed, e.g., in Feser et al. (2011). One of the key benefits is the better representation of hydrological variables, particularly over areas with complex terrain, which is important to accurately model precipitation associated with flood events (Frei et al., 2000). In this study, we used the non-hydrostatic COSMO model in climate mode (COSMO-CLM, Consortium for Small-Scale Modeling Climate Limited-area Model, hereafter CCLM; Rockel et al., 2008) in version 5.08. CCLM is the community model of the German regional climate research community and is jointly developed further by the CLM-Community. In contrast to the COSMO model operationally used by the DWD, CCLM is run without data assimilation of observations and without latent-heat nudging of radar data. Our CCLM simulations are performed on a 0.22° × 0.22° rotated grid with 32 unevenly distributed vertical levels (highest vertical resolution in the boundary layer) covering the EURO-CORDEX domain (see Fig. 1).

Bias correction
With the dynamical downscaling approach, we obtain precipitation data at the target resolution of 25 km. However, hydrological models are calibrated with observations, making them sensitive to the forcing data. Thus, a bias correction of the RCM output is necessary. Moreover, it is known that RCMs produce too many wet days with low intensities (below 0.1 mm), known as the drizzle effect (e.g., Feldmann et al., 2008). Therefore, a preceding dry-day correction had to be applied to match the number of dry days in the simulated dataset to the number of dry days in the observational dataset.
In line with Berg et al. (2012), dry days were defined as days with a daily precipitation sum below 0.1 mm. The dry-day correction was done separately for each month and included the following steps: For each grid point, the dry days were counted both in the observational data and in the model run. Thereafter, for dry days in the model run that are wet days in the observations, a small artificial precipitation amount was added. For wet days in the model run that are dry days in the observations, the model precipitation was reduced to an amount below 0.1 mm. This was done until the number of dry days in the model run was identical to the observed number of dry days.
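The dry-day correction steps above can be sketched for a single grid point and a single calendar month as follows. This is a minimal illustration, not the implementation used in the study: the function name, the replacement amounts `wet_fill` and `dry_fill`, the chronological processing order, and the choice to adjust only in the direction needed to close the gap are all assumptions. Only the 0.1 mm threshold is taken from the text.

```python
DRY_THRESHOLD = 0.1  # mm per day; dry-day definition taken from the text


def dry_day_correction(model, obs, wet_fill=0.1, dry_fill=0.0):
    """Match the number of dry days in `model` to the number in `obs`.

    `model` and `obs` are daily sums (mm) for one grid point and one
    calendar month, aligned day by day. `wet_fill` and `dry_fill` are
    hypothetical replacement amounts, not values from the study.
    """
    corrected = list(model)
    n_dry_model = sum(p < DRY_THRESHOLD for p in model)
    n_dry_obs = sum(p < DRY_THRESHOLD for p in obs)

    for day, (m, o) in enumerate(zip(model, obs)):
        if n_dry_model == n_dry_obs:
            break  # dry-day counts agree, nothing left to do
        model_dry = m < DRY_THRESHOLD
        obs_dry = o < DRY_THRESHOLD
        if n_dry_model > n_dry_obs and model_dry and not obs_dry:
            corrected[day] = wet_fill  # add a small artificial amount
            n_dry_model -= 1
        elif n_dry_model < n_dry_obs and not model_dry and obs_dry:
            corrected[day] = dry_fill  # reduce drizzle below the threshold
            n_dry_model += 1
    return corrected


# Made-up example month (values in mm, not from E-OBS or CCLM).
model = [0.05, 0.3, 0.0, 2.0, 0.02, 0.4]
obs = [0.0, 0.0, 1.5, 3.0, 0.0, 0.0]
print(dry_day_correction(model, obs))
```

Because the difference in dry-day counts never exceeds the number of mismatched days of the relevant type, the loop in this sketch always ends with equal counts.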
The bias correction is applied to the dry-day-corrected data. Five commonly used bias correction methods were tested (e.g., Fang et al., 2015):
• linear scaling (LS)
• local intensity scaling (LOCI)
• power transformation (PT)
• empirical quantile mapping (EQM)
• quantile mapping with gamma distribution (GQM)
A description of these approaches as well as references to further literature can be found in Appendix A. The needed distributions or scaling factors were calculated for each grid point by considering the eight surrounding grid points. This was done to ensure that adjacent grid points do not have independent precipitation amounts. Moreover, the distributions and scaling factors can be calculated with different calibrations, meaning e.g., the use of different time periods. Here, we compared scaling factors and distributions on a monthly, seasonal, and half-yearly basis.
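To illustrate the idea behind empirical quantile mapping with a monthly calibration, a minimal sketch is given below. It is not the formulation of Appendix A: the function names, the linear interpolation between order statistics, and the constant extrapolation outside the calibration range are assumptions made for illustration. Pooling the eight surrounding grid points would amount to concatenating their samples before sorting and is omitted here.

```python
from bisect import bisect_right


def quantile(sorted_sample, prob):
    """Linearly interpolated quantile of an ascending sample, 0 <= prob <= 1."""
    pos = prob * (len(sorted_sample) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_sample) - 1)
    frac = pos - lower
    return sorted_sample[lower] * (1 - frac) + sorted_sample[upper] * frac


def eqm_correct(value, model_sorted, obs_sorted):
    """Map a model value to the observed value at the same quantile."""
    n = len(model_sorted)
    if value <= model_sorted[0]:
        prob = 0.0  # assumption: clamp below the calibration range
    elif value >= model_sorted[-1]:
        prob = 1.0  # assumption: clamp above the calibration range
    else:
        hi = bisect_right(model_sorted, value)  # first sample value > value
        lo = hi - 1
        frac = (value - model_sorted[lo]) / (model_sorted[hi] - model_sorted[lo])
        prob = (lo + frac) / (n - 1)
    return quantile(obs_sorted, prob)


def calibrate_monthly(months, model, obs):
    """Collect sorted model and observation samples per calendar month."""
    samples = {}
    for month, m, o in zip(months, model, obs):
        mod_list, obs_list = samples.setdefault(month, ([], []))
        mod_list.append(m)
        obs_list.append(o)
    return {k: (sorted(v[0]), sorted(v[1])) for k, v in samples.items()}


# Made-up calibration data (mm per day) for two calendar months.
months = [1, 1, 1, 2, 2]
calib = calibrate_monthly(months, [3.0, 1.0, 2.0, 5.0, 4.0], [6.0, 2.0, 4.0, 10.0, 8.0])
print(eqm_correct(2.5, *calib[1]))
```

A seasonal or half-yearly calibration would follow the same pattern, with the grouping key replaced by a season or half-year index.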
The time period of ERA-Interim that is covered by E-OBS data (1979 to 2015) is defined as reference period I (RPI). For ERA-20C, only the second half of the 20th century is covered by E-OBS. Therefore, a second reference period (RPII) is defined (1950 to 2010).
In the next section, the bias correction methods and calibrations are compared against each other for RPI and RPII, respectively. This comparison is done to find a bias correction method which is suitable for our approach. We do not aim to rank these methods in an absolute sense. A more general comparison of bias correction methods (nevertheless with a focus on specific regions) is provided, for instance, by Lafon et al. (2013).

Evaluation of bias correction methods
In this section, bias correction and calibration methods are evaluated with the aim of identifying a method which leads to improvements in the representation of precipitation in all investigated river catchments. First, we compare time series by calculating statistical variables, then we show the horizontal distribution of the mean annual precipitation during the time period from 1979 to 2015.
The skill of the different bias correction methods is quantified by the root mean square error (RMSE), Pearson's correlation coefficient (R), and the skill score (S) of Taylor (2001). The calculation of the skill score can be found in Appendix B.
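As a rough illustration of these measures, the sketch below computes RMSE, R, the centered root mean square difference, and one form of the Taylor skill score for two time series. The skill score follows the first variant proposed by Taylor (2001), S = 4(1 + R) / ((σ + 1/σ)² (1 + R0)), with σ the ratio of model to observed standard deviation and R0 the maximum attainable correlation. Whether Appendix B uses this variant or the one with the fourth power, and which R0 is chosen, is not stated in this section, so both the formula choice and R0 = 1 are assumptions here, as are all function names.

```python
import math
from statistics import mean, pstdev


def rmse(model, obs):
    """Root mean square error between two aligned series."""
    return math.sqrt(mean([(m - o) ** 2 for m, o in zip(model, obs)]))


def pearson_r(model, obs):
    """Pearson correlation coefficient of two aligned series."""
    m_bar, o_bar = mean(model), mean(obs)
    cov = mean([(m - m_bar) * (o - o_bar) for m, o in zip(model, obs)])
    return cov / (pstdev(model) * pstdev(obs))


def centered_rmsd(model, obs):
    """Root mean square difference after removing both means."""
    m_bar, o_bar = mean(model), mean(obs)
    return math.sqrt(
        mean([((m - m_bar) - (o - o_bar)) ** 2 for m, o in zip(model, obs)])
    )


def taylor_skill(model, obs, r0=1.0):
    """Taylor (2001) skill score, first variant; r0 = 1 is an assumption."""
    r = pearson_r(model, obs)
    sigma_ratio = pstdev(model) / pstdev(obs)
    return 4.0 * (1.0 + r) / ((sigma_ratio + 1.0 / sigma_ratio) ** 2 * (1.0 + r0))


# Made-up catchment-mean series (mm per day), not data from the study.
obs = [1.0, 2.0, 3.0, 4.0]
model = [2.0, 4.0, 6.0, 8.0]
print(rmse(model, obs), pearson_r(model, obs), taylor_skill(model, obs))
```

In this form, S reaches 1 only if the series are perfectly correlated and have the same standard deviation, so a model with the right timing but doubled variability is still penalized.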

S and R values close to 1 and low values of the RMSE indicate a high resemblance between observations and the (corrected) model output. To calculate the statistical measures, time series (RPI for ERAI-CCLM and RPII for ERA20C-CCLM) of catchment means were considered. For the Upper Danube, the results are listed in Table 1. For all three statistical measures, the monthly calibration leads to the best results in comparison with the seasonal and the half-yearly calibration. All bias correction methods reduce the RMSE compared to the raw model runs. The lowest RMSE is found for PT with a monthly calibration. The highest values of S and R result for EQM with a monthly calibration. However, S and R of PT and GQM differ from those of EQM only at the third decimal place, which means that these methods lead to similar improvements as EQM. These statistical patterns are also illustrated as a Taylor diagram in Fig. 2. It can be seen that the centered root mean square difference (cRMSD, formula in Appendix B), R, and S are quite similar for the raw model and all bias-corrected ERAI-CCLM runs (Fig. 2 a). This is also true for ERA20C-CCLM (Fig. 2 b). For the Elbe catchment, a high correlation and skill is also found with EQM, while the lowest RMSE values result from GQM (Table S1 in the supplemental material). For the Oder and the Vistula catchments, PT shows slightly better results in skill and correlation than EQM; however, EQM has the lowest RMSE (Tables S2 and S4).
The results for the Vistula and the Oder catchments are less meaningful compared to the other river catchments. This can be attributed in particular to the lower density of stations in Poland and surrounding areas than in other countries, leading to a strong smoothing of peaks in the interpolation step and thus to an underestimation of precipitation amounts (shown for temperature by Kyselý and Plavcová, 2010). For the Rhine catchment, the correlation is higher for EQM, while the skill is higher for PT (Table S3). Moreover, the lowest RMSE values are found for GQM.
In conclusion, EQM as well as PT with a monthly calibration lead to similar improvements over the uncorrected model runs. We decided to use EQM, since previous studies have also found that the most comprehensive correction was achieved with EQM compared to distribution-based QM (such as GQM), both for linear as well as non-linear methods (e.g., Lafon et al., 2013). In addition, PT builds on the results from LOCI, meaning that more computational effort is necessary for this bias correction method.
In the following, we compare the horizontal distribution of precipitation in Europe from the model output with and without bias correction. The mean annual precipitation from 1979 to 2015 for E-OBS, and the differences to uncorrected ERAI-CCLM and to ERAI-CCLM corrected with monthly EQM, are shown in Fig. 3. In ERAI-CCLM (without bias correction), there are areas with overestimated precipitation and areas with underestimated precipitation (Fig. 3 b). Especially in the Alps, precipitation is overestimated, while, for example, Poland shows a weaker overestimation compared to observations. An underestimation of precipitation can be seen, for example, over Ireland or northern Italy. With bias correction, most areas in Europe are associated with an overestimation of precipitation, except Poland (Fig. 3 c). Thus, the overestimation of mean annual precipitation in the Alps is still present in the bias-corrected run but with reduced magnitude. Therefore, the bias correction (monthly EQM) leads to some improvements over the raw model run.
In contrast to ERA-Interim, ERA-20C also covers the first half of the century, for which E-OBS data are not available. Thus, the bias correction for the first half of the century in ERA20C-CCLM is done with the scaling factors and distributions derived from the precipitation data of the second half of the century. To justify this procedure, the intensity spectra for the five river catchments for the first and the second half of the century are compared. For the Upper Danube, the intensity spectrum is shown in Fig. 4. The occurrence of precipitation events with amounts over 50 mm is overestimated in the uncorrected ERA20C-CCLM run compared to E-OBS. However, the intensity spectrum of ERA20C-CCLM for the time period from 1900 to 1949 is similar to that for the time period from 1950 to 2010. Thus, it seems appropriate to correct the bias in precipitation for the first half of the century with the scaling factors and distributions from the second half of the century.
We conclude from the climatological perspective that bias correction leads to improvements over the raw model outputs, especially with EQM on a monthly basis. In the next two sections, the added value of bias correction for specific extreme precipitation events is investigated.
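One simple way to compare intensity spectra of two periods is sketched below, under the assumption that an intensity spectrum is the relative frequency of daily precipitation sums per intensity bin. The exact definition used for Fig. 4 is not given in this section, so the function name, the bin edges, and the sample values are purely illustrative.

```python
def intensity_spectrum(daily_sums, bin_edges):
    """Relative frequency of daily sums in each bin [edge_i, edge_i+1)."""
    counts = [0] * (len(bin_edges) - 1)
    for p in daily_sums:
        for i in range(len(counts)):
            if bin_edges[i] <= p < bin_edges[i + 1]:
                counts[i] += 1
                break
    return [c / len(daily_sums) for c in counts]


# Hypothetical bin edges in mm per day; the last bin is open-ended.
edges = [0.0, 1.0, 10.0, 50.0, float("inf")]

# Made-up samples standing in for the periods 1900-1949 and 1950-2010.
first_half = [0.0, 0.5, 3.0, 12.0, 60.0]
second_half = [0.2, 0.0, 4.5, 20.0, 55.0]

spec_a = intensity_spectrum(first_half, edges)
spec_b = intensity_spectrum(second_half, edges)
# Largest difference in relative frequency over all bins.
print(max(abs(a - b) for a, b in zip(spec_a, spec_b)))
```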

Added value of bias correction for extreme precipitation events
In this section, we discuss the added value of the bias correction for the top ten events of the five investigated river catchments.
The top ten events were identified by calculating the catchment-averaged 7-day running mean of daily precipitation sums (see Table 2). Then, we accumulated the precipitation sums over 14 days (9 days before and 4 days after the day of the peak value) for the 50 heavy precipitation events. We call an event "well captured" ("very well captured") if the model precipitation is within ±20% (±10%) of the observed precipitation. To call the bias correction helpful, we require that it reduces the initial deviation by at least 33%. Events which are not well captured in ERAI-CCLM and for which bias correction does not have an added value are printed in italics (Table 2). Events which are already well represented in the model run (indicating the accuracy of the model), so that bias correction hardly leads to further improvements, are printed in standard letters. Events printed in bold letters benefit from the bias correction. The very well captured events are underlined.
There is a clear east-west gradient in the accuracy of the bias-corrected ERAI-CCLM run. For the Rhine and the Upper Danube, eight of the ten events each are well captured. For the Rhine, the two poorly captured cases are related to timing errors (cf. Fig. S1). For the Danube, the underestimation in the two events that are not well captured is related to spatial shifts and/or errors in the representation of dynamical processes (cf. Fig. S2). For the Elbe, four cases are not well captured by the model, even with bias correction, which is related to spatial shifts and the incorrect representation of dynamical processes (cf. Fig. S2). This is also true for the Oder (cf. Fig. S3). Here, one poorly captured case (5 August 2006) results from a timing error (not shown). For the Vistula, precipitation peaks are not well represented in most of the cases, again as a consequence of spatial shifts and errors in the representation of dynamical processes (cf. Fig. S3). The fact that the Vistula floods are not well captured can have two reasons. First, observational precipitation peaks may be strongly smoothed in the interpolation process due to a low density of stations in Poland and adjacent areas (Kyselý and Plavcová, 2010). Second, there are gaps in the long-term records for stations within the Vistula catchment. We counted the missing data in the E-OBS dataset and found up to 70% of missing data at certain locations within the Oder and the Vistula catchments. Thus, we assume that for the Eastern European river catchments it is not the method that fails, but the representation of precipitation in the observational dataset.
To summarize, 38% of the top ten events for all five river catchments are not well represented by the model and bias correction does not have an added value. This agrees with findings from Bennett et al. (2014) and others, who also found that even bias-corrected RCM simulations tend to underestimate the magnitude of (multi-day) heavy precipitation events. In 22% (14%) of the events, precipitation is already (very) well represented in ERAI-CCLM, even without bias correction, showing that the model is able to simulate extreme precipitation. In 40% of the cases, bias correction leads to improvements in the representation of precipitation. Moreover, 18% of the events are even very well captured after bias correction. Similar results can be found for ERA20C-CCLM (cf. Table S5).
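The event evaluation described above can be sketched as follows. The 14-day window and the 20%, 10%, and 33% thresholds come from the text; everything else is an assumption made for illustration, including the function names, the centered form of the 7-day running mean, and the truncation of the window at the start of a series. How independent peaks are separated when selecting the top ten events is not specified in this section and is therefore left out.

```python
def running_mean(series, window=7):
    """Centered running mean; None where the window does not fit."""
    half = window // 2
    return [
        sum(series[i - half:i + half + 1]) / window
        if half <= i < len(series) - half
        else None
        for i in range(len(series))
    ]


def event_sum(series, peak, before=9, after=4):
    """Accumulate daily sums from `before` days before to `after` days after the peak."""
    start = max(0, peak - before)
    return sum(series[start:peak + after + 1])


def classify(model_sum, obs_sum):
    """Label an event by the relative deviation of model from observation."""
    deviation = abs(model_sum - obs_sum) / obs_sum
    if deviation <= 0.10:
        return "very well captured"
    if deviation <= 0.20:
        return "well captured"
    return "not well captured"


def correction_helps(raw_sum, corrected_sum, obs_sum, min_gain=0.33):
    """True if bias correction reduces the initial deviation by at least 33%."""
    raw_dev = abs(raw_sum - obs_sum)
    corrected_dev = abs(corrected_sum - obs_sum)
    return raw_dev > 0 and (raw_dev - corrected_dev) / raw_dev >= min_gain


# Made-up 14-day event sums in mm (observed, raw model, corrected model).
print(classify(85.0, 100.0), correction_helps(60.0, 80.0, 100.0))
```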

Case studies
In this section, we present the results of two heavy precipitation events (associated with severe flooding) for three Central European river catchments, namely the Rhine, the Danube, and the Vistula (Fig. 1). The considered modeled daily precipitation sums were corrected with EQM based on monthly distributions, as this method turned out to be most promising (see Sect. 4).
We show a case where precipitation is already well represented by the model, so that bias correction cannot lead to further improvements (Rhine 1995). In addition, we show a second case (June 2009) where bias correction has a clear added value for precipitation in the Upper Danube river catchment, while at the same time precipitation is still strongly underestimated even with bias correction for the Vistula river catchment.

Rhine: Flood of January 1995
In early January, snow and rainfall as well as meltwater coming from the Alps led to a saturation of the soil in the Rhine catchment (Chbab, 1995). These preconditions favored the flooding at the end of January in the catchment area, which was mainly caused by record-breaking rainfall events within the last ten days of January (Fink et al., 1996). Four fatalities and an economic loss of 500 million German Marks were reported for Germany (Fink et al., 1996).
In winter 1994/1995, a peak in the 7-day running mean of the daily precipitation sum averaged over the river catchment is visible on 25 January (Fig. 5 a). In E-OBS, the peak reaches nearly 13 mm, a value that is exceeded by both downscaled reanalyses. Overall, the timing and intensity of the heavy precipitation events are quite similar in all datasets. Therefore, there is only a small added value of the bias correction in this case. This can also be seen in the accumulated catchment-averaged precipitation (Fig. 5 b). In early February, the precipitation sum is slightly overestimated by the (bias-corrected) CCLM runs.
The high agreement between E-OBS and the bias-corrected CCLM runs can also be found in the horizontal distribution of the 7-day precipitation sum (including 22 to 28 January, Fig. 6). Outside the Rhine catchment, ERAI-CCLM and ERA20C-CCLM slightly overestimate precipitation over France (Fig. 6 b-c). Within the catchment, ERAI-CCLM shows less precipitation in the Alps (Fig. 6 b) compared to E-OBS and ERA20C-CCLM. Observed maxima in precipitation around 50° N and between 5° E and 10° E are captured by both CCLM runs.
In summary, the Rhine winter flood in 1995 is well represented by the model, since precipitation intensity, timing, and the horizontal rainfall distribution agree with observations. Due to this close resemblance between model outputs and observations, the effect of the bias correction is of minor importance here.

Danube and Vistula: Floods of June 2009
In the last eight days of June 2009, a spatially extended stationary cut-off system located over Italy led to heavy precipitation events associated with flooding in the Danube and Vistula catchments (Godina and Müller, 2009). For example, in Mostviertel, Austria, 340 mm were reported between 22 and 29 June (Godina and Müller, 2009).
In the 7-day running mean of precipitation between May and July, there is a peak reaching 13 mm on 22 June in the Danube catchment and 11 mm in the Vistula catchment (Fig. 7 a and c). For the Danube, ERAI-CCLM produces more realistic precipitation amounts than ERA20C-CCLM. On 22 June itself, the bias correction leads to a clear improvement compared to the modeled precipitation (Fig. 7 a). The peak is just slightly overestimated in bias-corrected ERAI-CCLM. Before and after 22 June, the bias correction leads to an overestimation of the precipitation intensity compared to observations. In the accumulated precipitation sums, there is an overestimation in the first half and at the end of June with bias correction. However, there is [...] Italy. This is also true for ERA20C-CCLM (Fig. 8 c). However, rainfall within the Danube catchment is slightly underestimated.
In conclusion, the model runs underestimate precipitation within the Danube and the Vistula catchments. The bias correction increases the amount of precipitation for the Danube, but an underestimation still remains in the Vistula catchment.

Discussion
Precipitation from reanalysis was dynamically downscaled with an RCM (CCLM) to generate a precipitation dataset which is suitable as input for hydrological rainfall-runoff models. We applied a statistical bias correction method (monthly EQM), in which E-OBS was used as reference data.
ERA-Interim and ERA-20C reanalyses were used as initial and boundary conditions for the RCM, and both have advantages and disadvantages. ERA-Interim is known to produce unphysical changes in the global mean precipitation (Dee et al., 2011). Additionally, the accuracy of reanalysis data depends on the availability and quality of observations (Dee et al., 2011). ERA-20C tends to overestimate the wet-day frequency and to underestimate extreme precipitation values (Kim et al., 2018). Therefore, the choice of reanalysis may be a limiting factor in this approach.
Five commonly used bias correction methods have been tested, of which EQM emerged as the most suitable to correct the bias in daily precipitation sums, as also found in several previous studies (e.g., Themeßl et al., 2011; Bennett et al., 2014). In addition, three calibration methods (monthly, seasonal, and half-yearly) were compared. Bias correction with monthly scaling factors and distributions leads to better results than the seasonal or the half-yearly calibrations. Haerter et al. (2011) have already shown that the improvements from bias correction depend on the considered timescales of the bias correction function. They concluded that this dependency results from a different statistical behavior on different timescales. Although we decided to use EQM, the other methods (LS, LOCI, PT, and GQM) also led to improvements compared to the raw model runs, which agrees with findings from Teutschbein and Seibert (2012). Accordingly, statistical measures, such as RMSE, R, or S, varied only slightly between the methods (see Fig. 2). Other approaches for bias correction can be found in e.g., Piani and Haerter (2012) or Moghim and Bras (2017). In Piani and Haerter (2012), copula coupling is presented, which means that precipitation and temperature are corrected by fitting the two-dimensional distribution of the model run to that of observations. The representation of spring floods could benefit from this approach, as temperature is more important here with respect to snow melt. Moghim and Bras (2017) used artificial neural networks (ANN) and showed that their output is suitable as input for hydrological models. Although ANNs are a powerful tool, a large effort is necessary to optimize an ANN for a specific application.
To show the benefit of bias correction, we investigated the top ten precipitation events of each river catchment for the period 1979-2010.As we used only precipitation sums to identify extreme events, the top ten cases do not include events in late winter or early spring that are mainly caused by a strong snow melt.In most of the analyzed cases, bias correction led to clear improvements towards the raw model output.In some cases, the CCLM simulations were already close to observations, so that bias correction did not lead to further improvements, indicating the capability of the model to simulate extreme precipitation.
In addition, we found a clear west-to-east gradient in the quality of the results, with better agreement for western catchments (e.g., Rhine) than for eastern ones (e.g., Vistula). The underestimation of extreme precipitation amounts in the Vistula catchment could not be improved by the bias correction in most of the cases. We think that this underrepresentation of heavy precipitation may be associated with observational gaps and a low density of stations within the Vistula catchment and surrounding areas (Haylock et al., 2008; Kyselý and Plavcová, 2010). Two case studies support this view.
The bias correction has an added value, as over- and underestimated precipitation amounts can be corrected in such a way that they are closer to observations. However, bias correction cannot improve the timing of precipitation events or fix errors in the underlying dynamical processes (Haerter et al., 2011; Maraun, 2016). The dynamics may be improved by re-running the model at a higher resolution (e.g., 2.8 km), where convection is resolved and orography is better depicted (e.g., Mass et al., 2002; Lean et al., 2008). Regarding convection, the simulation of summer floods in particular may benefit from a higher model resolution, since embedded convection can enhance precipitation in between large-scale stratiform areas (e.g., Kirshbaum and Durran, 2004). Another source of uncertainty is the choice of reference data (here E-OBS). Although E-OBS has improved clearly during the last decade (e.g., for Poland), it inevitably has uncertainties. These uncertainties are caused by errors and inhomogeneities in the station data as well as by (rounding) errors from the interpolation process (Hofstra et al., 2009).
However, the suitability of the E-OBS dataset as reference data was shown by the high correlation coefficients between E-OBS and HYRAS daily precipitation sums (for three of the five considered river catchments, namely Elbe, Upper Danube and Rhine).
In this study, only precipitation was discussed. However, we also calculated datasets for the daily 2 m temperature, as this variable is also needed as input for rainfall-runoff models. Due to its relation to snow melt, temperature plays an important role for late winter and early spring floods (e.g., Marks et al., 1998; Parajka et al., 2010). Regarding bias correction, additive linear scaling, variance scaling, and quantile mapping were tested. First results indicate that quantile mapping with a Gaussian distribution is the most promising.
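For orientation, the two simplest temperature corrections mentioned above can be sketched as follows. This is a minimal sketch assuming the common textbook formulations, in which additive scaling matches the calibration-period mean and variance scaling additionally matches the standard deviation. The function names are hypothetical, and the formulation used in this study may differ in detail (for example, by calibrating separately for each month).

```python
import numpy as np

def additive_scaling(model, model_cal, obs_cal):
    """Shift model temperature by the mean bias of the calibration period."""
    return np.asarray(model, dtype=float) + (np.mean(obs_cal) - np.mean(model_cal))

def variance_scaling(model, model_cal, obs_cal):
    """Match both the mean and the standard deviation of the observations."""
    ratio = np.std(obs_cal) / np.std(model_cal)
    # Remove the model mean, rescale the anomalies, then add the observed mean
    anomalies = np.asarray(model, dtype=float) - np.mean(model_cal)
    return anomalies * ratio + np.mean(obs_cal)
```

Unlike precipitation, temperature is corrected additively, so no lower bound at zero is needed.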

Summary and Conclusions
This study is a proof of concept on how to create a precipitation dataset which can be used as input to run hydrological models. We applied a dynamical downscaling approach with the regional climate model COSMO-CLM. ERA-Interim (∼80 km resolution) and ERA-20C (∼125 km resolution) were used as initial and boundary conditions. Then, the modeled precipitation

Figure 2. Diagram according to Taylor (2001) for displaying statistical variables. Radial black lines mark values of the standard deviation, blue lines show values of the correlation coefficient, and green lines show values of the centered RMSD. Results for the Upper Danube are shown. The red star shows E-OBS. In a) the red cross shows ERAI-CCLM and the black crosses show bias-corrected ERAI-CCLM runs. In b) the red cross shows ERA20C-CCLM and the black crosses show bias-corrected ERA20C-CCLM runs.

Table 1. Root mean square error (RMSE), Pearson's correlation coefficient (R), and Taylor's skill score (S) for the raw model outputs and for the model outputs with different bias corrections, for the Upper Danube catchment only: linear scaling (LS), local intensity scaling (LOCI), power transformation (PT), empirical quantile mapping (EQM), and quantile mapping with gamma distribution (GQM). Results for ERAI-CCLM are based on RPI and those for ERA20C-CCLM on RPII. Best values of each measure are printed in bold.
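The measures named in the caption of Table 1, together with the centered RMSD shown in Fig. 2, could be computed along the following lines. This is a sketch under stated assumptions: the skill score uses the form S = 4(1 + R) / ((σ̂ + 1/σ̂)² (1 + R0)) from Taylor (2001), with σ̂ the ratio of model to observed standard deviation and the maximum attainable correlation R0 set to 1. The variant actually used for Table 1 is not specified here, and the function name is hypothetical.

```python
import numpy as np

def taylor_stats(model, obs, r0=1.0):
    """Return RMSE, correlation R, centered RMSD, and a Taylor skill score S."""
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    rmse = np.sqrt(np.mean((model - obs) ** 2))
    r = np.corrcoef(model, obs)[0, 1]
    # Centered RMSD: RMSE after removing the mean of each series
    crmsd = np.sqrt(np.mean(((model - model.mean()) - (obs - obs.mean())) ** 2))
    sigma_hat = np.std(model) / np.std(obs)  # normalized standard deviation
    # Assumed skill score variant, with r0 the maximum attainable correlation
    s = 4.0 * (1.0 + r) / ((sigma_hat + 1.0 / sigma_hat) ** 2 * (1.0 + r0))
    return {"rmse": rmse, "r": r, "crmsd": crmsd, "s": s}
```

With this form, S approaches 1 when the modeled variance matches the observed one and the correlation approaches R0.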

Figure 3. Mean annual precipitation (in mm a⁻¹) during the time period from 1979 to 2015 for a) E-OBS, as well as corresponding difference plots for b) ERAI-CCLM, and c) ERAI-CCLM bias-corrected with monthly EQM.

Figure 4. Precipitation intensity spectrum including all grid points within the Upper Danube catchment. Yellow circles illustrate E-OBS for the time period from 1950 to 2010. Blue crosses (orange squares) illustrate uncorrected ERA20C-CCLM for the time period from 1950 to 2010 (1900 to 1949).

Figure 5. Time series of a) the 7-day running mean and b) the accumulated sum of the daily precipitation sum (in mm), both averaged over the Rhine catchment. The orange line marks the 99th percentile of the daily precipitation sum from E-OBS based on the time period between 1979 and 2010. Note that different time periods are presented in a) and b) (x-axis).

Figure 7. Similar to Fig. 5, but a) and b) show time series for the Danube and c) and d) for the Vistula.