Bias correction methods are used to calibrate climate model outputs with respect to observational records. The goal is to ensure that statistical features (such as means and variances) of climate simulations are consistent with observations. In this article, a multivariate stochastic bias correction method is developed based on optimal transport. Bias correction methods are usually defined as transfer functions between random variables. We show that such transfer functions induce a joint probability distribution between the biased random variable and its correction. Optimal transport theory allows us to construct a joint distribution that minimizes the energy spent in bias correction. This extends the classical univariate quantile mapping techniques to the multivariate case. We also propose a definition of non-stationary bias correction as a transfer of the model to the observational world, and we extend our method to this context. These methodologies are first tested on an idealized chaotic system with three variables. In these controlled experiments, the correlations between variables appear almost perfectly corrected by our method, as opposed to a univariate correction. Our methodology is also tested on daily precipitation and temperatures over 12 locations in southern France. The correction of the inter-variable and inter-site structures of temperatures and precipitation appears to be in agreement with the multi-dimensional evolution of the model, hence satisfying our suggested definition of non-stationarity.

Global climate models (GCMs) and regional climate models (RCMs)
are used to study the climate system. However, their outputs often appear
biased compared to observational references

Most of those methods are univariate, and do not take into account the
spatial and inter-variable correlations, which may alter the quality of the
corrections

This shortcoming has led to the recent development of multivariate
techniques. As mentioned by

Optimal transport theory is a natural way to measure the dissimilarity
between multivariate probability distributions

Moreover,

This paper is organized as follows. In Sect.

The general goal of this paper is the correction of a random variable,
denoted

Following

In the first part, we illustrate our bias correction method with a univariate example, starting from quantile mapping. In the second part, the mathematical theory is explained. Finally, an extension of our method to a non-stationary context is presented.

We start with the construction of a quantile mapping method in the univariate
case, i.e., with

Histogram of two Gaussian laws

We illustrate the quantile mapping method with an example in
Fig.
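
As a point of comparison for the univariate case, classical empirical quantile mapping can be sketched as follows. This is a minimal illustration, not the implementation used in this article: the function name `quantile_mapping`, the sample sizes and the parameters of the two Gaussian laws are all assumptions chosen for the example.

```python
import numpy as np


def quantile_mapping(x_biased, y_ref, x_new):
    """Map x_new through the empirical transfer function F_Y^{-1} o F_X."""
    x_sorted = np.sort(x_biased)
    # Empirical CDF of the biased sample, evaluated at x_new (values in [0, 1]).
    ranks = np.searchsorted(x_sorted, x_new, side="right") / x_sorted.size
    ranks = np.clip(ranks, 0.0, 1.0)
    # Empirical quantile function of the reference sample.
    return np.quantile(y_ref, ranks)


rng = np.random.default_rng(0)
# Two Gaussian laws with assumed (hypothetical) parameters.
x = rng.normal(loc=2.0, scale=2.0, size=5000)  # biased sample
y = rng.normal(loc=0.0, scale=1.0, size=5000)  # reference sample

z = quantile_mapping(x, y, x)  # corrected sample
print("mean before/after:", x.mean(), z.mean())
print("std  before/after:", x.std(), z.std())
```

The mapping is deterministic and monotone: the corrected values keep the ordering of the biased values, which is the property that the optimal transport formulation generalizes.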

The main point here is the following: in the univariate context, we can
perform a bias correction with only the black arrows. A realization in a cell
of

For this, let

The problem is to calculate the coefficients

The advantage of this approach is that the functional

In the next section we present the mathematical theory behind this example
with probability measures of

In the multivariate context we assume the existence of a transfer function

We argue that

We note that the problem where

We have defined a bias correction method as an element of

To select a probability law

Our next step is to explain how this minimization strategy can be extended to the multivariate case.
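
For discrete (binned) distributions, this kind of minimization can be written as a linear program over the coefficients of the joint law. The sketch below is only an illustration under stated assumptions: the squared Euclidean cost, the helper names `optimal_plan` and `correct_cell`, the randomly placed cell centers and the uniform cell masses are all hypothetical choices, and a generic solver (`scipy.optimize.linprog`) stands in for a dedicated optimal transport solver.

```python
import numpy as np
from scipy.optimize import linprog


def optimal_plan(mu, nu, cost):
    """Minimize sum(cost * gamma) over plans gamma with marginals mu and nu."""
    n, m = cost.shape
    # gamma is flattened row-major, so gamma[i, j] sits at index i * m + j.
    a_rows = np.kron(np.eye(n), np.ones((1, m)))  # sum_j gamma[i, j] = mu[i]
    a_cols = np.kron(np.ones((1, n)), np.eye(m))  # sum_i gamma[i, j] = nu[j]
    res = linprog(
        cost.ravel(),
        A_eq=np.vstack([a_rows, a_cols]),
        b_eq=np.concatenate([mu, nu]),
        bounds=(0, None),
        method="highs",
    )
    if not res.success:
        raise RuntimeError(res.message)
    return res.x.reshape(n, m)


rng = np.random.default_rng(1)
# Hypothetical 2-D cell centers for the biased and reference histograms.
x_cells = rng.normal(loc=1.0, size=(6, 2))
y_cells = rng.normal(loc=0.0, size=(5, 2))
mu = np.full(6, 1 / 6)  # mass of each biased cell (assumed uniform)
nu = np.full(5, 1 / 5)  # mass of each reference cell (assumed uniform)

# Assumed cost: squared Euclidean distance between cell centers.
cost = ((x_cells[:, None, :] - y_cells[None, :, :]) ** 2).sum(axis=-1)
gamma = optimal_plan(mu, nu, cost)


def correct_cell(i, rng):
    """Draw a corrected cell center from the conditional law gamma[i, :] / mu[i]."""
    p = np.clip(gamma[i], 0.0, None)
    p = p / p.sum()
    return y_cells[rng.choice(len(nu), p=p)]


corrected = correct_cell(0, rng)
print("transport cost:", (gamma * cost).sum())
```

Because a biased cell can send mass to several reference cells, the correction drawn by `correct_cell` is stochastic rather than a single deterministic transfer function.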

We assume that

Note that the traditional one-dimensional quantile mapping preserves the
ordering of quantiles. In the multivariate case, this type of property can be
viewed as the Monge–

Representation of bias correction in the context of climate change.

Climate models offer a valuable tool to study future realistic climate
trajectories. Climate model outputs of the present period need to be bias
corrected with respect to current observations. Future climate simulations
also need to be adjusted. However, no observation is available for the future
and clear assumptions have to be made to correct simulations for future
periods. Table

CDF-

Estimation of the unobserved random variable

Using OTC, we define two optimal plans: the optimal plan

Bivariate histogram with bin size equal to

The estimation of

transformation of

transferral of these vectors along

adaptation of these vectors to

Random variables generated by the

Finally, a realization of

We first evaluate OTC and dOTC on an idealized case.

To evaluate our bias correction method, we construct an idealized biased
case, based on the

One realization of random variable

We introduce a bias by multiplying each point of the trajectories by a
triangular matrix

The random variables

We estimate the empirical distributions

Finally, we evaluate the quality of the correction by comparing the
covariance matrices of
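
The effect of a linear, triangular bias on the covariance structure can be checked numerically. The sketch below is illustrative only: a correlated Gaussian sample stands in for the trajectories of the chaotic system, and the entries of the matrix `tri` are hypothetical values, not those used in the experiments.

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-in for the three-variable trajectories (assumption: Gaussian sample).
reference = rng.multivariate_normal(
    mean=np.zeros(3),
    cov=[[1.0, 0.3, 0.0], [0.3, 1.0, 0.2], [0.0, 0.2, 1.0]],
    size=10000,
)

# Hypothetical lower-triangular bias matrix.
tri = np.array([[1.0, 0.0, 0.0],
                [0.5, 1.0, 0.0],
                [0.3, -0.4, 1.2]])

# Each point of the trajectory is multiplied by the triangular matrix.
biased = reference @ tri.T

cov_ref = np.cov(reference, rowvar=False)
cov_biased = np.cov(biased, rowvar=False)

# A simple scalar score: Frobenius norm of the covariance difference.
score = np.linalg.norm(cov_biased - cov_ref)
print("covariance mismatch before correction:", score)
```

Since the bias is linear, the biased covariance equals `tri @ cov_ref @ tri.T`, so the off-diagonal entries are distorted, and the same score can be recomputed after a correction to measure how much of the dependence structure is recovered.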

We apply our method to correct

The correction

By contrast,

The dataset used as a reference for the bias correction (BC) is the Système
d'Analyse Fournissant des Renseignements Atmosphériques à la Neige

We test our multivariate BC method on a simulation of the Weather Research
and Forecasting (WRF) atmospheric model

SAFRAN and WRF data are re-mapped onto the same grid, with a spatial
resolution of 0.11

In both datasets, we consider daily surface air temperature (tas) and precipitation (pr). The goal of this section is to correct the bias in tas and pr in the WRF data with respect to SAFRAN.

We focus on the daily timescale over the 1970–2000 period. We correct the
warm season (May–September). The analysis and conclusions are available for
the cold season, and the corresponding figure
(i.e., Fig.

We perform two bias corrections: univariate and

For univariate correction, quantile mapping is used for the calibration
period, and CDF-

For

As we have seen in the previous section, the corrections of

The

The linear regression between evolution of

The evolution of the dependence structure is given by the evolution of the spatial
and inter-variable covariances. The minimal

Neither a linear regression nor the Spearman rank correlation between the evolution of SAFRAN and the evolution of the correction with WRF shows a statistically significant link (not shown). We conclude that the evolution of WRF is different from the evolution of SAFRAN. This indicates that it is not possible to reproduce SAFRAN during the projection period using dOTC and WRF. For example, WRF predicts an increase of between 0.2 and 0.4 K in the mean temperature, whereas SAFRAN gives an increase of between 0.2 and 1 K.
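
The two diagnostics mentioned here, a linear regression and a Spearman rank correlation between two sets of changes, can be computed as in the sketch below. The arrays `delta_ref` and `delta_corr` are synthetic placeholders (one value per site or variable), deliberately constructed to be linked so that the outputs are easy to read; they are not SAFRAN or WRF data.

```python
import numpy as np
from scipy.stats import linregress, spearmanr

rng = np.random.default_rng(3)

# Synthetic placeholders: change between two periods in the reference and
# in the correction, one value per site/variable (hypothetical).
delta_ref = rng.normal(size=200)
delta_corr = 0.8 * delta_ref + rng.normal(scale=0.3, size=200)

# Ordinary least-squares fit of delta_corr against delta_ref.
fit = linregress(delta_ref, delta_corr)

# Rank correlation, insensitive to monotone transformations of either series.
rho, p_value = spearmanr(delta_ref, delta_corr)

print(f"slope={fit.slope:.2f}, r^2={fit.rvalue ** 2:.2f}, p={fit.pvalue:.3g}")
print(f"Spearman rho={rho:.2f}, p={p_value:.3g}")
```

A slope near zero together with a large p-value for both tests would correspond to the absence of a statistical link reported in the text.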

The correction with CDF-

We conclude that the evolution of the

We have developed a new method for multivariate bias correction, generalizing quantile mapping to the multivariate case. To do so, we have developed a new theoretical framework to understand any bias correction (BC) method: any BC method is here characterized by a joint law between the biased dataset and the correction. This joint probability distribution is estimated based on optimal transport techniques, and the BC method is then referred to as optimal transport correction (OTC). A definition of non-stationary bias correction is also proposed: the evolution of the model is learned and transferred to the reference world. An extension of OTC called dynamical OTC (dOTC) has been developed to account for temporal non-stationarities.

OTC and dOTC methods have been tested on an idealized three-dimensional case
based on

Then,

This is consistent with the results of

Furthermore, although the amount of available data is very small compared to
the dimensions (2295 days and

The methods OTC and dOTC are able to correct the dependence structure
(i.e., the joint law), and not only the inter-variable and spatial
correlations. In particular, the copula function (which contains the
information about dependence) is corrected. In addition, dOTC proposes a
definition of non-stationarity, and explicitly gives what the correction
corresponds to (the evolution of the model applied to observations). In the
particular case of the temperatures/precipitation correction, compared to,
e.g.,

As a possible improvement of the method, we note that the optimal plan
can only be used to correct data points that are already known. If a new data
point is obtained and alters the estimate of the probability density
function, then the plan needs to be recomputed. However, such a situation is
relatively rare in bias correction. Indeed, corrections usually have
to be performed on climate model simulations that cover many
years or decades. This means that the
whole time series is available at once and is not continuously updated.
One possibility would be to “smooth” the optimal plan so that
it could be applied to new points without being recomputed.
Finally, a promising application of this method is the post-processing of
operational forecasts.
In such a case, the question of internal variability

OTC and dOTC are implemented in two packages: ARyga (R)
and Apyga (Python 3). These packages are available at

The supplement related to this article is available online at:

YR performed the analyses. The experiments were co-designed by YR and MV. All the authors contributed to writing the manuscript.

The authors declare that they have no conflict of interest.

This work was supported by ERC grant no. 338965-A2C2.

Edited by: Uwe Ehret
Reviewed by: Michael Muskulus and one anonymous referee