Compound events (CEs) are multivariate extreme events in which the individual contributing variables may not be extreme themselves, but their joint – dependent – occurrence causes an extreme impact. Conventional univariate statistical analysis cannot give accurate information regarding the multivariate nature of these events. We develop a conceptual model, implemented via pair-copula constructions, which allows for the quantification of the risk associated with compound events in present-day and future climate, as well as the uncertainty estimates around such risk. The model includes predictors, which could represent for instance meteorological processes that provide insight into both the involved physical mechanisms and the temporal variability of compound events. Moreover, this model enables multivariate statistical downscaling of compound events. Downscaling is required to extend the compound events' risk assessment to the past or future climate, where climate models either do not simulate realistic values of the local variables driving the events or do not simulate them at all. Based on the developed model, we study compound floods, i.e. joint storm surge and high river runoff, in Ravenna (Italy). To explicitly quantify the risk, we define the impact of compound floods as a function of sea and river levels. We use meteorological predictors to extend the analysis to the past, and get a more robust risk analysis. We quantify the uncertainties of the risk analysis, observing that they are very large due to the shortness of the available data, though this may also be the case in other studies where they have not been estimated. Ignoring the dependence between sea and river levels would result in an underestimation of risk; in particular, the expected return period of the highest compound flood observed increases from about 20 to 32 years when switching from the dependent to the independent case.

On 6 February 2015, a low-pressure system that developed over the north of
Spain moved across the island of Corsica into Italy. The low pressure itself
(Fig.

Sea level pressure and total precipitation on 6 February 2015, when the coastal area of Ravenna (indicated by the yellow dot) was hit by a compound flood.

Such a compound flood is a typical example of a compound
event (CE). CEs are multivariate extreme events in which the individual
contributing variables may not be extreme themselves, but their
joint – dependent – occurrence causes an extreme impact. The impact
of CEs may be a climatic variable such as the gauge level (e.g. for compound
floods), or other relevant variables such as fatalities or economic losses.
CEs have received little attention so far, as underlined in the report of the
Intergovernmental Panel on Climate Change on extreme events

CEs are responsible for a very broad class of impacts on society. For
example, heatwaves amplified by the lack of soil moisture, which reduces the
latent cooling, may be classified as CEs

In the recent literature, more attention has been given to the study of CEs
through multivariate statistical methods

Modelling CEs is a complex undertaking

Due to the complex dependence structure between the contributing variables,
advanced multivariate statistical models are necessary to model CEs. For
example, modelling the multivariate probability distribution of the
contributing variables with a multivariate Gaussian distribution would usually
not produce satisfactory results. A multivariate Gaussian distribution
assumes that the dependencies between all pairs are of the same type
(homogeneity of the pair dependencies) and that extreme events are not
dependent, i.e. that there is no tail dependence. Furthermore, it assumes
that all of the marginal distributions are Gaussian. To overcome these
problems, the use of copulas has been
introduced in geophysics and climate science

Here we develop a multivariate statistical model, based on PCCs, which allows for an adequate description of the dependencies between the contributing variables. The model provides a straightforward quantification of risk uncertainty, which is reduced with respect to the uncertainties obtained when computing the risk directly on the observed data of the impact. We extend the multivariate statistical model by including predictors for the contributing variables. Such predictors could represent for instance meteorological processes driving the contributing variables. The increase in model complexity due to these additional variables is accommodated through the use of PCCs. The predictors allow us to (1) gain insight into the physical processes underlying CEs, as well as into the temporal variability of CEs, and (2) statistically downscale CEs and their impacts. Downscaling may be used to statistically extend the risk assessment back in time to periods where observations of the predictors but not of the contributing variables and impacts are available, or to assess potential future changes in CEs based on climate models. Based on this model, we study compound flooding in Ravenna.

In the context of compound floods, the dependence between rainfall and sea
level has previously been studied for other regions

Here, we explicitly define the impact of compound floods as a function of sea and river levels in order to quantify the flood risk and its related uncertainties. Moreover, we quantify the risk underestimation that occurs when the dependence between sea and river levels is not considered. We identify the meteorological predictors driving the river and sea levels. By incorporating such predictors into the statistical model, we extend the analysis of compound floods into the past, where data are available for predictors but not for the river and sea level stations.

The paper is organized as follows. The Ravenna case study is discussed in
Sect.

In this study, we focus on the risk of compound floods in the coastal area of
Ravenna. The choice of the case study was motivated by the extreme event that
happened on 6 February 2015, as presented in the Introduction. On the day
prior to the event, values of up to approximately

A schematic representation of the catchment on which we focus is shown in the
black rectangle of Fig.

Hydraulic system for the Ravenna catchment. The area affected by
compound floods is marked by the red point. The impact is the water level

We develop a multivariate statistical model able to assess the risk of
compound floods in Ravenna. Our research objectives are the following.

Develop a statistical model to represent the dependencies between the contributing variables of the compound floods, via pair-copula constructions.

Explicitly define the impact of compound floods as a function of the contributing variables. This allows us to estimate the risk and the related uncertainty.

Identify the meteorological predictors for the contributing variables

Extend the analysis into the past (where data are available for the predictors but not for the contributing variables

The data used here for the contributing variables

Our statistical conditional model consists of three components: the
contributing variables

The downscaling feature is particularly useful for compound events, which are
not realistically simulated or may not even be simulated at all by available
climate models. For instance, standard global and regional climate models do
not simulate realistic runoff

More specifically, the conceptual conditional model consists of the
following.

An impact function to quantify the impact

Predictors

A conditional joint probability density function (pdf)

When the variables

An advantage of using a parametric statistical model is that this constrains
the dependencies between the contributing variables, as well as their
marginal distributions, and thereby reduces their uncertainties with respect
to empirical estimates

Pair-copula constructions (PCCs) are mathematical decompositions of
multivariate pdfs proposed by

Consider a vector

Under the assumption that the marginal distributions

The dependence of extreme events cannot be measured by overall correlation
coefficients such as Pearson, Spearman or Kendall. Given two random variables
which are uncorrelated according to such overall dependence coefficients,
there can be a significant probability of getting concurrent extremes of both
variables, i.e. a tail dependence
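The difference between overall dependence and dependence of the extremes can be illustrated with a minimal numerical sketch. It compares a Gaussian copula, which has no tail dependence, with a survival Clayton copula, which has upper tail dependence. The parameters are chosen here (as an assumption, not taken from this study) so that both have a Kendall's tau of 0.5. All function names are hypothetical, and the empirical conditional exceedance probability at a high quantile is used only as a rough finite-sample proxy for the tail dependence coefficient.

```python
import math
import random
from statistics import NormalDist


def sample_gaussian_copula(n, rho, rng):
    """Draw n pairs (u, v) from a bivariate Gaussian copula with correlation rho."""
    std_normal = NormalDist()
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        pairs.append((std_normal.cdf(z1), std_normal.cdf(z2)))
    return pairs


def sample_survival_clayton(n, theta, rng):
    """Draw n pairs from a survival Clayton copula (upper tail dependent).

    A Clayton pair (u, v) is sampled by inverting the conditional cdf of v
    given u, and then reflected to (1 - u, 1 - v).
    """
    pairs = []
    for _ in range(n):
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids division by zero
        w = 1.0 - rng.random()
        v = (u ** (-theta) * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
        pairs.append((1.0 - u, 1.0 - v))
    return pairs


def upper_tail_concentration(pairs, q):
    """Empirical P(V > q | U > q), a finite-sample proxy for upper tail dependence."""
    exceed_u = [(u, v) for u, v in pairs if u > q]
    if not exceed_u:
        return float("nan")
    return sum(1 for _, v in exceed_u if v > q) / len(exceed_u)


rng = random.Random(42)
n_samples = 100_000
# Both copulas have Kendall's tau = 0.5 with these (assumed) parameters.
gaussian_pairs = sample_gaussian_copula(n_samples, math.sin(math.pi / 4.0), rng)
clayton_pairs = sample_survival_clayton(n_samples, 2.0, rng)

lam_gaussian = upper_tail_concentration(gaussian_pairs, 0.99)
lam_clayton = upper_tail_concentration(clayton_pairs, 0.99)
print("Gaussian copula:        ", lam_gaussian)
print("Survival Clayton copula:", lam_clayton)
```

Although the two samples share the same rank correlation, concurrent exceedances of the 0.99 quantile are markedly more frequent for the survival Clayton copula, which is the behaviour that overall correlation coefficients cannot capture.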

Mathematically, given two random variables

While the number of bivariate copula families is very large

When the dimension of the pdf is large, there can be many possible,
mathematically equally valid decompositions of the copula density into a PCC.
For example, for a five-dimensional system there are 480 possible different
decompositions. For this reason,

As described in
Sect.

The extreme impact of compound events may be driven by the joint occurrence
of non-extreme contributing variables

Define the impact function:

Find the meteorological predictors of the contributing variables

Fit the five-dimensional conditional joint pdf

Given the complexity of the problem, an analytical derivation of the statistical properties of the impact is
impractical. Therefore, we apply a Monte Carlo procedure. Specifically, we simulate the contributing variables

Perform a statistical analysis of the values
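The final step of the procedure above, the statistical analysis of a long simulated impact series, can be sketched as follows. This is a minimal illustration under stated assumptions: the return level for a return period of T years is taken as the empirical (1 - 1/T) quantile of the simulated annual maxima, which may differ from the exact estimator used in this study, and the simulated impact is replaced by a Gaussian placeholder series. Function names are hypothetical.

```python
import random


def annual_maxima(daily_values, days_per_year=365):
    """Split a daily series into consecutive years and return each year's maximum."""
    n_years = len(daily_values) // days_per_year
    return [
        max(daily_values[y * days_per_year:(y + 1) * days_per_year])
        for y in range(n_years)
    ]


def empirical_return_level(maxima, return_period_years):
    """Empirical (1 - 1/T) quantile of the annual maxima (assumed definition)."""
    ordered = sorted(maxima)
    p = 1.0 - 1.0 / return_period_years
    index = min(len(ordered) - 1, int(p * len(ordered)))
    return ordered[index]


rng = random.Random(0)
# Placeholder for the simulated impact: in the study it would be obtained by
# applying the impact function to contributing variables simulated from the model.
simulated_impact = [rng.gauss(0.0, 1.0) for _ in range(365 * 1000)]

maxima = annual_maxima(simulated_impact)
levels = {T: empirical_return_level(maxima, T) for T in (2, 10, 50)}
for T, level in levels.items():
    print(f"{T:>3}-year return level: {level:.2f}")
```

The longer the simulation, the smaller the sampling error of these empirical quantiles, which is why long simulations are used.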

To make the Monte Carlo uncertainties, i.e. the sampling uncertainties due
to the model simulations, negligible, we produce long simulations. For example, to obtain
the model-based return level curve, we simulate a time series

The water level

Figure

Scatter plots of predictands

The meteorological influence on the two rivers

The river levels are influenced by the total input of water over the
catchments, which is given by the positive contribution of precipitation and
snowmelt, and by evaporation which results in a reduction of the river
runoff. Specifically, we compute the input of water

By defining the river predictor as in Eq. (

All of the terms involved in the multiple regression model
(Eq.

Sea level can be modelled as the
superposition of the barometric pressure effect, i.e. the pressure exerted by
the atmospheric weight on the water, the wind-induced surge, and an overall
annual cycle. As for the river predictor, we aggregate the different physical
contributions in a single predictor. We define the sea level predictor on day

Regression map

All the terms involved in the multiple regression model are statistically
significant at level

The results of the unconditional and conditional models are presented in the following sections.

The unconditional model reproduces the joint pdf of the contributing
variables

Scatter plots of observed (grey) against simulated (black)
contributing variables

Figure

Unconditional model. Return levels of the impact

This model allows for assessment of the change in the risk of compound floods
due to temporal variations of the meteorological predictors of the
contributing variables

Validation time series of the conditional model obtained by 6-fold
cross-validation.

The cross-validation time series of the impact

In Fig.

To estimate the risk based on predicted values of the impact during the past,
we run the simulations by conditioning on predictors of the period
1979–2015. This allows us to get a more robust estimation of the risk
compared to that obtained considering only the period 2009–2015. The return
levels in Fig.

During the period 1979–2015, there has not been any long-term trend in the
risk due to variations in the marginal distributions of the predictors or in
their dependence. To study this, we computed the return levels on moving
temporal windows in the cases described below. First, we simulated the impact
by conditioning the

Conditional model.

Compound events (CEs) are multivariate extreme events in which the contributing variables may not be extreme themselves, but their joint – dependent – occurrence causes an extreme impact. Conventional univariate statistical analysis cannot give accurate information regarding the multivariate nature of CEs and therefore the risk associated with these events.

We develop a conceptual model, implemented via pair-copula constructions (PCCs), to quantify the risk of CEs as well as the associated sampling uncertainty. This model includes predictors, which could represent for instance meteorological processes. The inclusion of predictors in the model (1) provides insight into the physical processes underlying CEs, as well as into the temporal variability of CEs, and (2) allows for statistical downscaling of CEs and their impacts. The model is in principle extendable to any number of contributing variables and predictors, given a large enough sample of data for calibration.

Downscaling may be used to statistically extend the risk assessment back in
time to periods where observations of the predictors are available but not of
the contributing variables and impacts, or to assess potential future changes
in CEs based on climate models. The conceptual model is particularly useful
for downscaling large-scale predictors from climate models in cases where the
local contributing variables driving the impacts of CEs are either not
realistically simulated or not simulated at all by the available climate
models. As such, the model can straightforwardly be used to assess future
risk of CEs based on multi-model ensembles as available from the CMIP

The model makes use of PCCs, a very powerful statistical method to model
multivariate dependencies. PCCs are particularly useful for modelling CEs,
when the contributing variable pairs have different dependence structures,
e.g. when only some of them are characterized by tail dependence. To model
such types of structures, even multivariate parametric copulas, which have
been introduced in climate science to overcome some difficulties in modelling
multivariate density distributions

The model allows for a straightforward quantification of sampling
uncertainties. In many cases, such risk uncertainties might be substantial as
observed data are often limited, and should thus be quantified. In fact,
uncertainty estimates are essential to avoid drawing conclusions that may be
misleading when uncertainties are large (as also recently discussed by

We adapt the developed conceptual model to study compound floods in Ravenna, which are floods driven by the joint occurrence of storm surge and high river level. In other words, the contributing variables of the compound floods are the river and sea levels, whose combination drives the impact, i.e. the water level in between the river and the sea.

We used the specific adaptation of the model to statistically downscale the river and sea level from meteorological predictors, and therefore estimate the impact of the compound floods as a function of the downscaled sea and river levels. The accuracy of the estimated impact appears satisfactory, such that the model is potentially interesting for use in both flood forecasting and warning. Also, the model-based expected return levels of the impact are about the same as those directly computed on observed data of the impact. Although the model-based uncertainty in these return levels is very large (due to the shortness of the available data), for return periods shorter than about 60 years it is smaller than that obtained by computing the risk directly on the observed data of the impact.

We calibrate the model over the period 2009–2015, and by including
meteorological predictors obtained from the ECMWF ERA-Interim reanalysis
dataset, we extend the analysis of compound flooding to the full period of
1979–2015, to obtain a more robust estimation of the risk. The expected
return period of the highest compound flood observed, computed over the
period 1979–2015, is 19 years (the

Ignoring the estimated dependence between sea and river levels may lead to an
underestimation of risk. Specifically, assuming independence between sea and
river levels, the expected return period of the highest compound flood
observed – computed over the period 2009–2015 – is 32 years (the

In the context of compound floods, only a few studies have explicitly
quantified the impact and the associated risks

The developed routines for working with conditional joint
probability density functions decomposed as D- or C-vines are publicly
available via the CDVineCopulaConditional R package

Sea level data of the Ravenna-Porto Corsini station were
downloaded from the Italian National Institute for Environmental Protection
and Research (ISPRA), and are available under the link

The zero reference level of river measurements is the water level in the
river defined as zero in the measurements. In general, such a zero reference
level may change during different periods of observation, for technical
reasons. As the zero reference level of rivers

In this appendix we show more details about vines, focusing on C- and D-vines. Moreover, we discuss the sampling procedure, showing the algorithms to perform the conditional sampling from C- and D-vines.

Shown below are the general expressions to decompose an

The five-dimensional vine that we use for the conditional model is shown in
Eq. (

In total, a three-dimensional copula density can be decomposed in three
different ways, and each of these vines is both a D-vine and a C-vine. For
this application we use the following vine.

In this study of compound floods, the variables

To simulate a vector

The simulation of the uniform variables from vines is discussed in

It is clear then that to sample from the conditional distribution of

Following this approach, for D-vines the number of

To apply such a sampling procedure, we developed
Algorithms

Algorithm to simulate uniform variables

Sample

Algorithm to simulate uniform variables

Sample

Finally, we underline that this is not the only way to proceed for the
conditional simulation

Statistical inference on a pdf decomposed via a PCC is in principle
computationally very demanding. As can be seen from Eq. (

To overcome these obstacles, some techniques have been developed. The
complications regarding the dependence of the copula parameters on the
estimation of the marginals can be overcome using empirical marginals
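As a minimal sketch of what using empirical marginals involves, the observations of each variable can be transformed to approximately uniform values through their ranks before the copula is fitted. The rescaling by n + 1, which keeps the values strictly inside (0, 1), is a common convention assumed here and not necessarily the one used in this study. The function name is hypothetical.

```python
def pseudo_observations(values):
    """Rank-transform a sample to (0, 1) using the empirical cdf rescaled by n + 1.

    Ties are broken by order of appearance, which is adequate for continuous data.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    return u


# Hypothetical water levels (arbitrary units) for illustration only.
levels = [3.2, 1.1, 2.5]
print(pseudo_observations(levels))
```

Because only the ranks enter the transformation, the copula parameters estimated from these values do not depend on any parametric choice for the marginal distributions.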

In this study of compound floods, for each marginal pdf we use a mixture
distribution composed of the empirical distribution and a generalized Pareto
distribution (GPD) for the extremes. For each predictor

We use the AIC to select the best vine structure among C- and D-vines (those
selected are shown in Sects.

K-plots of the pair-copula families selected for the
five-dimensional model (names of the families and parameters are shown in the
top left of each plot). The abscissa shows the empirical K-function and the
ordinate the K-function based on the fitted copula. The

The CDVineCopulaConditional

In the case of the unconditional model, the pair-copula families fitted to
the observed contributing variables

The flexibility of copula theory in modelling multivariate distributions has
led to its widespread use in the literature, and more recently in climate
science. However, we stress that once the model is fitted to observed data,
procedures to estimate the uncertainties, both in the parameter
estimates and in the choice of the model, should be considered. This is
particularly important because, owing to the limited
sample size of the available data, these uncertainties are often large and so
cannot be neglected

In this study, we find model uncertainties in the joint pdf which propagate into large uncertainties when assessing the risk of compound floods. This does not mean that such models are not useful, but rather that the results should be interpreted with these uncertainties in mind. Also, even if large, the obtained uncertainties in the risk are smaller than those obtained by computing the risk directly on the observed data of the impact, which underlines another advantage of applying such procedures.

For both the unconditional and conditional models, we use a parametric
bootstrap to assess the model and subsequent risk uncertainty, as follows.

Select
and fit a model that can reproduce the statistical characteristics of

Simulate

On each of the

From each of these

For each sample, compute the simulated impact sequence as

Estimate the uncertainties in the return levels by identifying the
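The logic of the parametric bootstrap steps above can be sketched in a deliberately simplified setting. As a loudly stated assumption, the fitted vine-based model is replaced here by a one-parameter exponential distribution, and the return level by a high quantile that is available analytically, so that the example stays short. The structure (fit, simulate samples of the observed length, refit, recompute the quantity of interest, take percentiles) is what the sketch is meant to show. All names are hypothetical.

```python
import math
import random


def fit_exponential(sample):
    """Maximum likelihood estimate of the rate of an exponential distribution."""
    return len(sample) / sum(sample)


def quantile_exponential(rate, p):
    """p-quantile of an exponential distribution with the given rate."""
    return -math.log(1.0 - p) / rate


def parametric_bootstrap_interval(sample, p, n_boot, rng, level=0.95):
    """Point estimate and percentile interval of the p-quantile via parametric bootstrap."""
    rate_hat = fit_exponential(sample)
    estimates = []
    for _ in range(n_boot):
        # Simulate from the fitted model a sample as long as the observations,
        # refit the model, and recompute the quantity of interest.
        resample = [rng.expovariate(rate_hat) for _ in range(len(sample))]
        estimates.append(quantile_exponential(fit_exponential(resample), p))
    estimates.sort()
    alpha = (1.0 - level) / 2.0
    lower = estimates[int(alpha * n_boot)]
    upper = estimates[min(n_boot - 1, int((1.0 - alpha) * n_boot))]
    return quantile_exponential(rate_hat, p), lower, upper


rng = random.Random(1)
observed = [rng.expovariate(1.0) for _ in range(50)]  # stand-in for observed data
point, lower, upper = parametric_bootstrap_interval(observed, 0.99, 500, rng)
print(f"0.99 quantile: {point:.2f}  (95 % interval: {lower:.2f} to {upper:.2f})")
```

Simulating bootstrap samples with the same length as the observed record is what makes the resulting interval reflect the shortness of the available data.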

The uncertainty in the return levels obtained via the observed data

ACF of the observed time series (shown in red) against the ACF

Given a statistical model describing time series with serial correlations, to
avoid an underestimation of the model uncertainties computed via the
bootstrap procedure, it is necessary to use a model which can reproduce the
serial correlation. During the bootstrap procedure, simulating samples
without serial correlation, and then re-fitting the model to each of them,
would mean assuming that the data carry more information than they actually
do. In fact, it is as if the effective sample size of data with serial
correlation is smaller than that of data without

Fit a linear Gaussian autoregressive model of order 1, AR(1),

Assured via the autocorrelation function (ACF) that

Simulate the residuals
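A minimal sketch of the AR(1) ingredient of the steps above is given below. It estimates the autoregressive coefficient and the residual standard deviation by least squares on the centred series and then simulates a new Gaussian series with the same serial correlation. Which series the AR(1) model is fitted to, and how its residuals are subsequently simulated in this study, is not reproduced here. The function names and the test parameters are assumptions made for illustration.

```python
import random


def fit_ar1(series):
    """Least-squares fit of x_t - mean = phi * (x_{t-1} - mean) + eps_t."""
    mean = sum(series) / len(series)
    centred = [x - mean for x in series]
    numerator = sum(a * b for a, b in zip(centred[1:], centred[:-1]))
    denominator = sum(a * a for a in centred[:-1])
    phi = numerator / denominator
    residuals = [centred[t] - phi * centred[t - 1] for t in range(1, len(centred))]
    sigma = (sum(r * r for r in residuals) / len(residuals)) ** 0.5
    return mean, phi, sigma


def simulate_ar1(n, mean, phi, sigma, rng):
    """Simulate a stationary Gaussian AR(1) series of length n (requires |phi| < 1)."""
    x = rng.gauss(0.0, sigma / (1.0 - phi ** 2) ** 0.5)  # draw from stationary law
    series = []
    for _ in range(n):
        series.append(mean + x)
        x = phi * x + rng.gauss(0.0, sigma)
    return series


rng = random.Random(7)
synthetic = simulate_ar1(20_000, mean=0.0, phi=0.6, sigma=1.0, rng=rng)
mean_hat, phi_hat, sigma_hat = fit_ar1(synthetic)
print(f"phi = {phi_hat:.3f}, sigma = {sigma_hat:.3f}")
```

Bootstrap samples generated in this way retain the serial correlation of the original series, so that re-fitting the model to them does not overstate the information content of the data.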

In Fig.

We consider this result satisfactory because our aim is to include the
serial correlation of the contributing variables

We employ the Brier score to assess the accuracy of the probabilistic
predictions of the conditional model when predicting extreme values of the
impact

The Brier skill score (BSS) measures the relative accuracy of the model under
validation over a reference model, and is defined as
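In its standard form, the Brier score is the mean squared difference between the predicted probabilities and the binary outcomes, and the skill score is one minus the ratio of the Brier score of the model to that of the reference model. The following sketch computes both under these standard definitions, which we assume correspond to the ones used here. The forecast values are invented for illustration, and the use of a constant climatological frequency as the reference is likewise an assumption.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(outcomes)


def brier_skill_score(probabilities, reference_probabilities, outcomes):
    """1 - BS / BS_ref: 1 is a perfect forecast, 0 means no gain over the reference."""
    bs = brier_score(probabilities, outcomes)
    bs_ref = brier_score(reference_probabilities, outcomes)
    return 1.0 - bs / bs_ref


# Invented example: 1 marks a day on which the impact exceeded a high threshold.
outcomes = [0, 0, 1, 0, 1, 0, 0, 0, 1, 0]
forecast = [0.1, 0.2, 0.8, 0.1, 0.7, 0.2, 0.1, 0.3, 0.6, 0.1]
# Assumed reference: constant forecast equal to the observed event frequency.
climatology = [sum(outcomes) / len(outcomes)] * len(outcomes)

bs = brier_score(forecast, outcomes)
bss = brier_skill_score(forecast, climatology, outcomes)
print(f"BS = {bs:.3f}, BSS = {bss:.3f}")
```

A positive skill score indicates that the model is more accurate than the reference, and a negative one that it is less accurate.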

To assess the quality of the conditional model, avoiding overfitting, we
perform a 6-fold cross-validation. Therefore, the original sample of data
(

Douglas Maraun had the initial idea for the study. Emanuele Bevacqua and Douglas Maraun jointly developed the study with contributions from Martin Widmann. Emanuele Bevacqua developed the statistical model with contributions from Ingrid Hobæk Haff, Douglas Maraun and Mathieu Vrac. Emanuele Bevacqua carried out the analysis with contributions from Douglas Maraun and Ingrid Hobæk Haff. Emanuele Bevacqua, Douglas Maraun and Martin Widmann interpreted the results. Emanuele Bevacqua wrote the paper with contributions from all other authors.

The authors declare that they have no conflict of interest.

Emanuele Bevacqua received funding from the Volkswagen Foundation's CE:LLO project (Az.: 88468), which also supported project meetings. The authors would like to thank Arnoldo Frigessi for hosting them, and for fruitful discussions at the Norwegian Computing Center. Emanuele Bevacqua would like to thank Colin Manning for the productive discussions, and contributions during the writing process. The authors would like to thank the anonymous reviewers for their valuable comments and suggestions which contributed to improving the quality of the paper. The data used for sea and river levels have been provided by the Italian National Institute for Environmental Protection and Research (ISPRA) and Arpae Emilia-Romagna.

Edited by: D. Koutsoyiannis
Reviewed by: J. Zscheischler and two anonymous referees