Dynamically downscaled precipitation fields from regional climate models
(RCMs) often cannot be used directly for regional climate studies. Due to
their inherent biases, i.e., systematic over- or underestimations compared
to observations, several correction approaches have been developed. Most
of the bias correction procedures such as the quantile mapping approach
employ a transfer function that is based on the statistical differences
between RCM output and observations. In contrast to such transfer function-based statistical correction algorithms, a stochastic bias correction
technique, based on the concept of Copula theory, is developed here and
applied to correct precipitation fields from the Weather Research and
Forecasting (WRF) model. For dynamically downscaled precipitation fields we
used high-resolution (7 km, daily) WRF simulations for Germany driven by
ERA40 reanalysis data for 1971–2000. The REGNIE (REGionalisierung der NIEderschlagshöhen) data set from the German
Weather Service (DWD) is used as gridded observation data (1 km, daily) and
aggregated to 7 km for this application. The 30-year time series are
split into a calibration (1971–1985) and validation (1986–2000)
period of equal length. Based on the estimated dependence structure
(described by the Copula function) between WRF and REGNIE data and the
identified respective marginal distributions in the calibration period,
separately analyzed for the different seasons, conditional distribution
functions are derived for each time step in the validation period. This
additionally provides information about the range of the
statistically possible bias-corrected values. The results show that the
Copula-based approach efficiently corrects most of the errors in
WRF-derived precipitation for all seasons. The Copula-based correction
also performs better for wet biases than for
dry biases. In autumn and winter, the correction introduced a
small dry bias in the northwest of Germany. The average relative bias of
daily mean precipitation from WRF for the validation period is reduced
from 10 % (wet bias) to

Most climate studies operate on a regional and local
scale. Global climate models (GCMs), however, provide climatological
information only on coarse scales, usually in a horizontal resolution of
100–300 km. Since they are not able to mimic the regional- and local-scale climate variability, further refinement is necessary. For dynamical
downscaling, regional climate models (RCMs) are capable of bridging the gap
between large-scale GCM data and local-scale information to conduct climate
studies. Nevertheless, the RCM simulations usually do not agree well with
observations even if downscaled to high spatial resolutions

In this study, a Copula-based stochastic bias correction method is applied
to correct each individual time step of an RCM simulation. This is different
from the traditional transfer function-based statistical correction
approaches. The strategy of this method is the identification and
description of the underlying dependence structures between observed and
modeled climate variables (precipitation) and its application for bias
correction. It is known that the traditional measures of dependence
(e.g., Pearson's correlation coefficient) can only capture the strength of
the linear dependence as a single global parameter. Alternatively, Copulas
are able to describe the complex nonlinear dependence structure between
variables

Recently, Copulas have been used for various applications in hydrometeorology

The article is structured as follows: in Sect. 2 the data sets for this application are introduced. Section 3 briefly describes the basic theory of Copulas and the procedure of Copula-based conditional simulations to correct RCM precipitation. Results of application of the Copula-based approach for Germany are shown in Sect. 4, followed by the summary and conclusions (Sect. 5).

This section describes the data sources used for the application of the
Copula-based bias correction method to gridded data sets. The
newly developed approach is applied for Germany (Fig.

Terrain elevation of Germany (digital elevation model). The numbers represent the position
of the four specific grid cells for which the performance of the
Copula-based algorithm is analyzed in Fig.

Visualization of a bivariate Copula model consisting of two marginal distributions and a theoretical Copula function that describes the pure dependence.

Dynamically downscaled precipitation fields over Germany from an RCM
simulation

As observations, we used the 1 km gridded daily data set REGNIE

In this section the fundamentals of Copula theory are briefly
summarized. Details about Copula theory are given, e.g., in

Let (

The Copula functions provide a functional link between the two univariate
marginal distributions

As a consequence of Sklar's theorem, each complex and unknown joint
distribution

A scatter plot of the two realizations (

The next step is to estimate the theoretical Copula function

The Copula-based modeling of the dependence between

In this study, five different parametric distribution functions are tested
(Weibull, gamma, normal, generalized Pareto and exponential). For all time
series (REGNIE and WRF), the parameters of the respective distribution
functions are estimated by a standard maximum likelihood estimation
(MLE). The goodness-of-fit is evaluated in a two-stage process. Firstly,
a K–S test is applied

The BIC selects the optimum within a finite set
of models. It is based on the likelihood function and deals with the
trade-off between the goodness-of-fit of the model and its complexity:
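In its standard form the criterion reads BIC = k ln(n) - 2 ln(L), where L is the maximized likelihood, k the number of estimated parameters and n the sample size; the model with the lowest BIC is preferred. The following minimal Python sketch illustrates the selection for two of the candidate distributions (exponential and normal), whose maximum likelihood estimates have closed forms. The function names and the synthetic sample are illustrative assumptions and not the implementation used in this study.

```python
import math
import random

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion, k*ln(n) - 2*ln(L); lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

def exponential_loglik(sample):
    """Maximized log-likelihood of an exponential distribution (MLE: scale = mean)."""
    n = len(sample)
    scale = sum(sample) / n
    return -n * math.log(scale) - sum(sample) / scale

def normal_loglik(sample):
    """Maximized log-likelihood of a normal distribution (MLE: mean and biased variance)."""
    n = len(sample)
    mu = sum(sample) / n
    var = sum((x - mu) ** 2 for x in sample) / n
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)

def select_marginal(sample):
    """Return the candidate with the lowest BIC and all BIC scores."""
    n = len(sample)
    scores = {
        "exp": bic(exponential_loglik(sample), 1, n),
        "norm": bic(normal_loglik(sample), 2, n),
    }
    return min(scores, key=scores.get), scores

# Synthetic wet-day amounts (mm/day), assumed exponential for illustration only.
random.seed(0)
wet_day_amounts = [random.expovariate(1.0 / 4.0) for _ in range(500)]
best, scores = select_marginal(wet_day_amounts)
```

For the other candidates (Weibull, gamma, generalized Pareto) the likelihood has to be maximized numerically, but the BIC comparison itself is unchanged.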

Copulas from different families describe different dependence structures. To increase the accuracy of the description of the dependence, different types of Copulas are considered, since a single Copula might be incapable of capturing the dependence structure for all grid cells over the entire study area and for all seasons.
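To make the notion of different families concrete, the distribution functions of three widely used one-parametric Archimedean Copulas (Clayton, Gumbel and Frank) can be written in a few lines. The Python sketch below uses the standard textbook parameterizations; the function names and the example values are illustrative assumptions, and the Gaussian Copula is omitted because it requires a bivariate normal distribution function.

```python
import math

def clayton_cdf(u, v, theta):
    """Clayton Copula, theta > 0 (lower tail dependence)."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)

def gumbel_cdf(u, v, theta):
    """Gumbel Copula, theta >= 1 (upper tail dependence)."""
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-s ** (1.0 / theta))

def frank_cdf(u, v, theta):
    """Frank Copula, theta != 0 (no tail dependence)."""
    num = math.expm1(-theta * u) * math.expm1(-theta * v)
    return -math.log1p(num / math.expm1(-theta)) / theta

# Joint non-exceedance probability for one pair of rank-space values.
u, v, theta = 0.3, 0.6, 2.0
values = {
    "clayton": clayton_cdf(u, v, theta),
    "gumbel": gumbel_cdf(u, v, theta),
    "frank": frank_cdf(u, v, theta),
}
```

All three functions return u when v = 1, as any Copula must, but they distribute the probability mass differently in the tails, which is why the best-fitting family can differ between grid cells and seasons.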

Theoretical Copula functions used in this study.

In this study, four different one-parametric Copulas (see Table

For the Copula goodness-of-fit test we closely follow the approach as
described in

Since the dependence structure, i.e., the theoretical Copula function,
between

The Copula-based bias correction applied for this study is based on the
estimation of a Copula model for each pair of observed (

estimate the theoretical marginal distributions

transform the time series

calculate the empirical Copula

estimate the Copula parameter

calculate the Copula distribution conditioned on the variate

generate the pseudo-observations in the rank space for each time step by using the conditional Copula distribution;

transform back the random samples to the data space by using the integral transformation.
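
Steps 2 and 3 of the list above can be sketched in Python as follows. As a simplifying assumption, the rank-space values are computed here from the sample ranks (a common nonparametric choice) rather than from fitted marginal distributions, so this may differ from the transformation used in this study. The helper names `pseudo_observations` and `empirical_copula` and the sample values are hypothetical.

```python
def pseudo_observations(values):
    """Map a sample to (0, 1) via ranks, u_i = rank_i / (n + 1).

    Ties receive distinct ranks in order of appearance.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    u = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        u[idx] = rank / (n + 1)
    return u

def empirical_copula(u, v, a, b):
    """Fraction of pairs with u_i <= a and v_i <= b."""
    return sum(1 for ui, vi in zip(u, v) if ui <= a and vi <= b) / len(u)

# Illustrative wet-day amounts (mm/day) for one grid cell.
regnie = [1.2, 0.4, 7.9, 3.3, 2.1]
wrf = [2.0, 0.9, 6.5, 5.1, 1.4]
u_obs = pseudo_observations(regnie)
u_mod = pseudo_observations(wrf)
c_half = empirical_copula(u_obs, u_mod, 0.5, 0.5)
```

The empirical Copula evaluated on a grid of (a, b) values is what the theoretical Copula candidates are compared against in the goodness-of-fit step.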

The Copula-based conditional simulation is the critical step of this bias correction approach, as it forces a certain variable (observation) to take a value when another variable (RCM) is given. To assess the uncertainty associated with this prediction, the conditional prediction process (steps 6 and 7) must be repeated a large number of times. This yields a large set of random realizations and, in addition, an empirical probability density function (PDF) for each corrected time step. From the PDF, the spread of the distribution, e.g., in the form of the inter-quantile range, can be provided as an additional uncertainty criterion for the bias correction.
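
The repeated conditional sampling can be illustrated for a single time step. The Python sketch below assumes, purely for illustration, a Clayton Copula (whose conditional distribution can be inverted in closed form) and exponential marginals; the parameter values and function names are hypothetical and are not results of this study.

```python
import math
import random
import statistics

def clayton_conditional_inverse(t, u, theta):
    """Solve C(v | u) = t for v, where C(v | u) = dC/du of the Clayton Copula."""
    return ((t ** (-theta / (1.0 + theta)) - 1.0) * u ** -theta + 1.0) ** (-1.0 / theta)

def exp_cdf(x, mean):
    return 1.0 - math.exp(-x / mean)

def exp_inverse_cdf(p, mean):
    return -mean * math.log(1.0 - p)

def correct_time_step(wrf_value, wrf_mean, obs_mean, theta, n_samples, rng):
    """Sample the observation conditioned on a wet RCM value (wrf_value > 0)."""
    u = exp_cdf(wrf_value, wrf_mean)                  # RCM value in rank space
    samples = []
    for _ in range(n_samples):
        t = rng.uniform(1e-9, 1.0 - 1e-9)             # avoid the endpoints 0 and 1
        v = clayton_conditional_inverse(t, u, theta)  # sample in rank space
        samples.append(exp_inverse_cdf(v, obs_mean))  # back to data space
    return samples

rng = random.Random(42)
samples = correct_time_step(wrf_value=6.0, wrf_mean=4.4, obs_mean=4.0,
                            theta=2.0, n_samples=2000, rng=rng)
quartiles = statistics.quantiles(samples, n=4)
summary = {
    "mean": statistics.fmean(samples),
    "median": quartiles[1],
    "interquartile_range": quartiles[2] - quartiles[0],
}
```

The mean or median of `samples` can serve as the corrected value, while the interquartile range gives one possible measure of the spread of the predictive distribution.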

The implementation of a bias correction for precipitation (a discrete
variable) is more complex than a bias correction of a continuous variable,
e.g., temperature. In general four cases have to be distinguished, namely,
(0,0), (0,1), (1,0) and (1,1), where 0 denotes a dry day and 1 indicates
a wet day (see Fig.

(1,1): REGNIE and WRF precipitation

(0,1): REGNIE

(1,0): REGNIE

(0,0): both REGNIE and WRF

Different approaches exist in the literature to account for the intermittent
nature of rainfall. For example the truncated Copula suggested in

Illustration of the four cases: (0, 0) indicates that both REGNIE and WRF show no rain, (0, 1) stands for an observation with no precipitation but the RCM model shows a rain event, while (1, 0) indicates the opposite of (0, 1), (1, 1) implies that both are wet.

The proportion of the four cases over the study area for the validation time period (from 1986 to 2000).
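
The classification into the four cases and their proportions can be computed directly from paired daily series. In the Python sketch below, the wet-day threshold of 0.1 mm/day, the function name and the sample values are assumptions made for illustration; the threshold used in this study is not stated in this section.

```python
from collections import Counter

CASES = [(0, 0), (0, 1), (1, 0), (1, 1)]  # (observation, model); 1 = wet, 0 = dry

def classify_cases(obs, model, threshold=0.1):
    """Return the proportion of each (observation, model) wet/dry case."""
    counts = Counter((int(o >= threshold), int(m >= threshold))
                     for o, m in zip(obs, model))
    n = sum(counts.values())
    return {case: counts.get(case, 0) / n for case in CASES}

# Illustrative daily precipitation (mm/day) for one grid cell.
regnie = [0.0, 2.3, 0.0, 5.1, 0.0, 1.0, 0.0, 0.4]
wrf = [0.0, 3.0, 1.2, 4.0, 0.0, 0.0, 0.05, 2.2]
proportions = classify_cases(regnie, wrf)
```

In this toy series the (0,0) and (1,1) cases each account for 37.5 % of the days, and the two mismatch cases for 12.5 % each.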

In this study, we aim for an event-based correction as described in the
following: the Copula-based concept focuses on the correction of the (1,1)
cases, i.e., the positive pairs. In order to generate a complete bias
corrected time series of WRF output, the events that are not covered by the
(1,1) case are left unchanged. For the (0,0) cases, there is no error. The
errors that come from the (0,1) and (1,0) cases are not corrected by this
method. To justify this strategy, we investigated the proportion of the four
cases in the study area (see Fig.

In this section, details about the estimated Copula models are presented including information about the fitting of the marginal distributions and the theoretical bivariate Copula functions from the calibration period (1971–1985). Since the estimated marginal distributions reflect the statistical characteristics of RCM and observations, their differences are analyzed spatially. The fitted Copula models are applied for the validation period (1986–2000) to bias correct the WRF precipitation. It is found that the dependence structures vary intra-annually, therefore the performance of the algorithm is analyzed separately for the different seasons.

For both REGNIE and WRF data, five different distribution functions are
employed for each grid cell separately: generalized Pareto distribution
(gp); gamma distribution (gam); exponential distribution (exp); Weibull
distribution (wbl) and normal distribution (norm). This guarantees the
flexibility in selecting the most appropriate distribution for each grid
cell. The goodness-of-fit tests (K–S test and the BIC; see Sect.

The coincidence between REGNIE and WRF marginals is shown in the confusion
matrix. Each row of the matrix represents the distribution types of REGNIE,
while each column represents that of WRF (in %). The major diagonal shows
the fraction of concurring marginal types. The confusion matrix for the
calibration period is shown in Table

In order to assess the intra-annual variability in the precipitation time series, the marginal distributions are estimated for the different seasons (spring – MAM, summer – JJA, autumn – SON, winter – DJF).

Estimated marginal distributions of precipitation for Germany for both REGNIE (left panel) and WRF (right panel). The results are shown for the calibration period (1971–1985) and positive pairs only.

For both REGNIE and WRF data, the seasonal representation of the different
distribution types is shown in Fig.

The seasonal confusion matrices are shown in
Table

Confusion matrix between REGNIE and WRF for the different distribution types.

Seasonal confusion matrix of fitted REGNIE and WRF precipitation distribution.

Estimated marginal distribution of precipitation for the different seasons for REGNIE (left column panels) and WRF (right column panels) in Germany. The results are shown for the calibration period (1971–1985) for positive pairs only. Spring (MAM), summer (JJA), autumn (SON) and winter (DJF) are illustrated from top to bottom.

Identified Copula functions between REGNIE and WRF precipitation in the calibration period (1971 to 1985) with positive pairs.

As mentioned above in Sect.

For each grid cell the theoretical Copula function, which characterizes the
dependence structure between REGNIE and WRF data, is identified
separately. Four Copulas (Clayton, Frank, Gumbel and Gaussian) are
investigated by applying the goodness-of-fit tests described in
Sect.

In order to assess the intra-annual variability of the dependence structures
between REGNIE and WRF precipitation time series, the Copula functions are
identified for the different seasons separately. The corresponding results
are shown in Fig.

While for spring, autumn and winter the Copulas that have no pronounced tail
dependence (the Frank and Gaussian Copula) dominate (spring 49 %
(Frank)

Based on the estimated Copula model (parametric marginal distributions and
theoretical Copula functions), the conditional distribution of REGNIE
conditioned on WRF is derived for each grid cell separately (see
Sect.

Figure

It can be seen from Fig.

The proportion of grid cells, for both REGNIE and WRF, for which the K–S test failed and only the BIC is used in the goodness-of-fit procedure.

To investigate the spatial performance of the correction algorithm, the relative bias of RCM modeled mean daily precipitation (WRF) compared to gridded observations (REGNIE) is compared to that of the bias-corrected model data (B. C. WRF) for Germany.
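
The relative bias for a single grid cell can be computed as in the following Python sketch. The definition used here, 100 * (mean(model) - mean(obs)) / mean(obs), is an assumption, since the formula is not given in this section; with this sign convention, positive values indicate a wet bias. The sample values are invented for illustration.

```python
def relative_bias(model, obs):
    """Relative bias (%) of mean daily precipitation; positive = wet bias."""
    mean_model = sum(model) / len(model)
    mean_obs = sum(obs) / len(obs)
    return 100.0 * (mean_model - mean_obs) / mean_obs

# Illustrative daily values (mm/day) for one grid cell.
regnie = [2.0, 0.0, 4.0, 2.0]
wrf = [2.5, 0.5, 4.0, 1.8]
corrected_wrf = [2.1, 0.5, 3.6, 1.9]   # hypothetical bias-corrected values

bias_before = relative_bias(wrf, regnie)           # about 10 %
bias_after = relative_bias(corrected_wrf, regnie)  # about 1.25 %
```

Applying the same function to every grid cell yields the relative bias maps for the uncorrected and corrected fields.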

A comparison of corrected WRF data derived by the expectation, median and
mode of the predictive distribution with observations indicates that the
correction performs best for the expectation value (see
Fig.

Fitted Copula functions between REGNIE and WRF precipitation (calibration period (1971–1985), positive pairs only). The Copulas are identified for the different seasons (spring – MAM, summer – JJA, autumn – SON, winter – DJF).

Comparison of bias-corrected WRF data (blue) with the original WRF data
(red) and REGNIE (green) in winter 1986–1987 (positive pairs only) for
pixel 1 in Fig.

Relative bias map of mean daily precipitation for Copula-based correction by taking the expectation (left panel), median (middle panel) and the mode (right panel) as the estimator of the sampled distribution. The results are based on the validation period 1986–2000.

Relative bias of mean daily precipitation for uncorrected (left panel) and corrected WRF precipitation field (right panel). The results are based on the validation period 1986–2000.

Relative bias between uncorrected (left panels) and corrected (right panels) WRF mean daily precipitation and the REGNIE data set in Germany for the different seasons (spring – MAM, summer – JJA, autumn – SON, winter – DJF, from top to bottom). The results are derived for the validation time period (1986–2000).

Figure

A performance analysis with respect to seasonal variations is shown in
Fig.

In the following, it is further analyzed how well the model can reproduce the intra-annual variability of observed precipitation and how the performance for the different seasons is influenced by the Copula-based correction algorithm.

To investigate typical situations in detail, the results are shown for four
specific grid cells in the study area (see Fig.

Figure

Comparison of bias-corrected WRF mean monthly precipitation (blue) with REGNIE (green) and original WRF data (red) for the four selected pixels 1–4 in the validation period from 1986 to 2000. The number of the respective grid cell is noted in the upper left corner of each plot.

Daily precipitation fields over Germany for three consecutive days from 9 to 11 January 1986.

The rank correlations between RCM and REGNIE precipitation over the domain in the validation period from 1986 to 2000.

The changes of the RMSE in the validation period (1986–2000) by different bias correction methods. The green color indicates a decrease of the RMSE, while the ocher color implies an increase of the RMSE.

The root mean square errors (RMSE) and the root mean square errors for
specific probability intervals (RMSE

The percentage of the corrections that are closer to the observations. Left panel: Copula-based correction (mean regression); right panel: quantile mapping correction. The results are derived from the validation period from 1986 to 2000.
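
One way to compute such a percentage is sketched below in Python. It assumes that "closer" means a strictly smaller absolute error than that of the uncorrected WRF value at the same time step; this interpretation, the function name and the sample values are assumptions for illustration.

```python
def percent_closer(obs, raw, corrected):
    """Percentage of time steps where the corrected value is strictly
    closer to the observation than the uncorrected value."""
    closer = sum(1 for o, r, c in zip(obs, raw, corrected)
                 if abs(c - o) < abs(r - o))
    return 100.0 * closer / len(obs)

# Illustrative daily values (mm/day) for one grid cell.
regnie = [2.0, 1.0, 4.0, 3.0]
wrf = [3.0, 1.5, 3.0, 3.2]
corrected_wrf = [2.4, 1.8, 3.5, 3.1]
share = percent_closer(regnie, wrf, corrected_wrf)
```

Here three of the four corrected values lie closer to the observation than the original WRF values, giving 75 %.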

The results for grid cell 1 in Fig.

Finally, in order to investigate the spatial coherence of the bias-corrected
precipitation fields, a sequence of three selected days (from 9 to 11 January 1986)
is shown as an example in Fig.

The quantile mapping method is often used in bias correction of RCM derived
precipitation

The quantile mapping correction has been performed for comparison to the
Copula-based approach. The RMSE between the observed (REGNIE) and bias-corrected modeled data is calculated for both the Copula-based correction
and the quantile mapping method. The original RMSE (between REGNIE and WRF)
is also computed as a reference. For the Copula-based approach, we
calculated the RMSE for all the simulations with respect to the mean,
median and mode value. The changes of the RMSE by different corrections
over the study area are shown in Fig.
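
For readers unfamiliar with the reference method, a simple empirical variant of quantile mapping and the RMSE comparison can be sketched in Python as follows. The quantile mapping variant used in this study is not specified in this section, so the implementation below, together with its function names and sample values, is an illustrative assumption.

```python
import bisect
import math

def empirical_quantile_mapping(x, model_cal, obs_cal):
    """Map x through F_obs^-1(F_model(x)) using the empirical
    distributions of the calibration period."""
    model_sorted = sorted(model_cal)
    obs_sorted = sorted(obs_cal)
    count = bisect.bisect_right(model_sorted, x)   # calibration values <= x
    # Integer form of ceil(count * n_obs / n_model) - 1, clamped to a valid index.
    k = -(-count * len(obs_sorted) // len(model_sorted)) - 1
    k = min(max(k, 0), len(obs_sorted) - 1)
    return obs_sorted[k]

def rmse(a, b):
    """Root mean square error between two equally long series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

# Illustrative calibration and validation samples (mm/day).
wrf_cal = [1.0, 2.0, 3.0, 4.0, 5.0]
regnie_cal = [0.5, 1.5, 2.5, 3.5, 4.5]
wrf_val = [1.0, 3.0, 5.0]
regnie_val = [0.5, 2.5, 4.0]

mapped = [empirical_quantile_mapping(x, wrf_cal, regnie_cal) for x in wrf_val]
rmse_raw = rmse(wrf_val, regnie_val)
rmse_qm = rmse(mapped, regnie_val)
```

Because the transfer function is deterministic, each model value is mapped to exactly one corrected value, in contrast to the conditional distribution obtained from the Copula model.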

To further assess the performance of the Copula-based method, additional
performance measures are analyzed. The RMSE for different magnitudes of
observed precipitation (i.e., a quantile RMSE analysis) is computed for the
four selected grid cells (see Fig.

Furthermore, we also investigated the percentage of the corrected time steps
that are closer to the observations compared to the quantile mapping
method. The results are shown in Fig.

In this study, a Copula-based stochastic bias correction technique for RCM output is introduced. The strategy of this method is the identification and description of underlying dependence structures between RCM and observed precipitation and its application for bias correction. Copulas are able to capture the nonlinear dependencies between variables (here, between RCM and gridded observed precipitation), including a reliable description of the dependence structure in the tails of the joint distribution. This is not possible, e.g., by using a Gaussian approach or methods based on Pearson's correlation coefficient. Another, albeit more practical, advantage of this approach is that the univariate marginal distributions can be modeled independently from the dependence function, i.e., the Copula. This provides more flexibility to construct a correction model by combining different marginal distributions and Copula functions, as many parametric univariate distributions and theoretical Copulas are available.

The conditional distribution derived from the fitted Copula model forms the basis of the correction procedure. It gives access to all possible outcomes of the corrected value and additionally provides a PDF for each corrected time step.

This study is an extension of the two former studies of

The analysis is carried out on a grid cell basis: the Copula model (marginal distributions and Copula function) is estimated for each grid cell separately rather than selecting, e.g., the most dominant model. Therefore, the statistical characteristics of observed (REGNIE) and modeled data (WRF) and their dependence structure are visualized spatially and analyzed for the first time.

The BIC, as well as the K–S test, is implemented for the marginal goodness-of-fit test. From previous studies we found that very large sample sizes may bias the result of the K–S test, leading to the rejection of the null hypothesis (the sample comes from the selected distribution) most of the time.

The Copula model is estimated for every season separately. Thus, different types of precipitation genesis are not masked by a single model. This, in general, leads to stronger dependencies and more robust models.

For the dependence function it was detected that the fitted Copula families vary both in space and time (seasonally). The fact that different dependence structures exist for the different seasons indicates that the method corrects for different dominating precipitation types, i.e., convective and stratiform precipitation.

The assumption of this approach is that the dependence structure between observed and modeled precipitation is stationary over the period of interest. For the investigation of the spatial performance, the Copula correction based on the mean value is applied. The validation results show that the proposed approach successfully corrected the errors in RCM derived precipitation. It is also found that the correction method performs better for overestimation than for underestimation. By investigating the spatial coherence, the proposed method is found to be able to preserve the spatial structure of the WRF model output. This is due to the fact that the Copula-based approach is conditioned on the WRF simulation. The method adjusts the value of the WRF precipitation according to the fitted Copula model. Even though the Copula models are estimated for each grid cell, the spatial coherence is captured by the Copula model as both the Copula families as well as the marginal distributions are also spatially clustered.

Compared to the quantile mapping correction, the Copula-based method performs better in reducing the RMSE. It also yields a higher percentage of time steps that are closer to the observations after the correction. The Copula-based method is able to provide a stable correction efficiency over the entire domain, even if the rank correlations between the RCM and observed precipitation are low.

In contrast to traditional approaches such as quantile mapping, which is based on a bijective transfer function, the Copula-based stochastic bias correction technique provides the full PDF for each individual time step. This additionally provides a quality criterion for the bias correction, e.g., expressed as the spread of the PDF in the form of the inter-quantile range. Subsequent modelers using RCM-derived precipitation data can make use of the full PDF, especially if they are interested in other statistical moments or in estimating uncertainties arising from this approach.

In this study, the Copula-based bias correction is only applied for past precipitation time series. The method would need further modifications if applied to future climate scenarios. A suitable algorithm must be able to reflect changes in the marginal distributions as well as the joint distributions, taking into account possible non-stationarity of precipitation time series.

The authors acknowledge funding from China Scholarship Council (CSC), the Bavarian State Ministry of the Environment and Public Health (reference number VH-ID:32722/TUF01UF-32722) and the Federal Ministry of Education and Research as part of the research project Land Use and Climate Change interactions in Central Vietnam (LUCCi, reference number 01LL0908C). The WRF simulations were carried out at High Performance Computing Center Stuttgart (HLRS), the University of Stuttgart, within the project high-resolution regional climate modeling for Germany using WRF. We would also like to extend our appreciation to the German Weather Service (DWD) for the REGNIE data set. Finally, we acknowledge support from the Deutsche Forschungsgemeinschaft for the Open-Access Publishing Fund of Karlsruhe Institute of Technology. The article processing charges for this open-access publication were covered by a Research Centre of the Helmholtz Association.

Edited by: R. Uijlenhoet