The link between streamflow extremes and climatology has been widely studied in recent decades. However, a study investigating the effect of large-scale circulation variations on the distribution of seasonal discharge extremes at the European level is missing. Here we fit a climate-informed generalized extreme value (GEV) distribution to about 600 streamflow records in Europe for each of the standard seasons, i.e., to winter, spring, summer and autumn maxima, and compare it with the classical GEV distribution with parameters invariant in time. The study adopts a Bayesian framework and covers the period 1950 to 2016. Five indices with proven influence on the European climate are examined independently as covariates, namely the North Atlantic Oscillation (NAO), the east Atlantic pattern (EA), the east Atlantic–western Russian pattern (EA/WR), the Scandinavia pattern (SCA) and the polar–Eurasian pattern (POL).

It is found that for a high percentage of stations the climate-informed model is preferred to the classical model. Particularly for NAO during winter, a strong influence on streamflow extremes is detected for large parts of Europe (preferred to the classical GEV distribution for 46 % of the stations). Climate-informed fits are characterized by spatial coherence and form patterns that resemble relations between the climate indices and seasonal precipitation, suggesting a prominent role of the considered circulation modes for flood generation. For certain regions, such as northwestern Scandinavia and the British Isles, yearly variations of the mean seasonal climate indices result in considerably different extreme value distributions and thus in highly different flood estimates for individual years that can also persist for longer time periods.

The understanding of extreme streamflow is a key issue for infrastructure design, flood risk management and (re-)insurance, and the estimation of flood probabilities has been a focus of scientific debate in recent decades. Traditionally, streamflow has been analyzed with regard to associated hydro-climatic processes acting at the catchment scale. In recent years many studies have additionally focused on the link between local streamflow and larger-scale climate mechanisms, extending beyond the catchment boundaries (Merz et al., 2014). An early example can be found in Hirschboeck (1988), who provides a detailed explanation of relationships between floods and synoptic patterns in the USA. Large-scale atmospheric patterns acting at global or continental scales have been shown to significantly influence flood magnitude and frequency at the local and regional scale. Regional in this context refers to the joint consideration of several gauges. For example, Kiem et al. (2003) stratified a regional flood index in Australia according to quantiles of the El Niño–Southern Oscillation (ENSO) index and showed that La Niña events are associated with a distinctly higher flood risk compared with El Niño events. Ward et al. (2014) found that peak discharges are strongly influenced by ENSO for a large fraction of catchments across the globe. Delgado et al. (2012) detected a dependence between the variance of the annual maximum flow at stations along the Mekong River and the intensity of the western Pacific monsoon.

This perception of climate-influenced extremes has been incorporated into flood frequency analysis by including climatic variables as covariates of extreme value distribution parameters. It is therefore assumed that the probability density function (pdf) of streamflow is not constant in time but it is conditioned on external variables. This framework, usually called nonstationary, can be particularly useful for hydro-climatic studies since the influence of the climatic phenomena on the distribution of the hydrological target variable, such as extreme streamflow, can be considered (Sun et al., 2014). This means that the whole distribution as well as certain parts of the target variable distribution, such as the tails, can be assessed by including the influence of the large-scale climate phenomenon and used for flood risk management or reinsurance purposes. This conditional or nonstationary frequency analysis has been popularized in the field of hydrology and flood research in recent years. Different covariate types have been examined for their influence on flood extremes, e.g., time (e.g., Delgado et al., 2010; Sun et al., 2015), snow cover indices (Kwon et al., 2008), reservoir indices (López and Francés, 2013; Silva et al., 2017), population measures (Villarini et al., 2009) and large-scale atmospheric and oceanic fields and indices (Delgado et al., 2014; Renard and Lall, 2014). A review of nonstationary approaches for local frequency analyses is given by Khaliq et al. (2006), while some of their limitations are discussed by Koutsoyiannis and Montanari (2015), Serinaldi and Kilsby (2015) and Serinaldi et al. (2018).

In this study, we focus on the European continent and the relation between
streamflow extremes and the large-scale atmospheric circulation. The
European climate is mainly influenced by pressure patterns acting at the
broader region covering Europe and the northern Atlantic. In particular,
five circulation modes have been shown to significantly modify the moisture
fluxes into the European domain: the North Atlantic Oscillation (NAO), the
east Atlantic (EA), the east Atlantic–western Russia (EA/WR), the
Scandinavia (SCA) and the polar–Eurasian (POL) patterns (Bartolini
et al., 2010; Casanueva et al., 2014; Rust et al., 2015; Steirou et al.,
2017). These patterns represent the first five pressure modes north of
50

Apart from Northern Hemisphere modes, the El Niño–Southern Oscillation (ENSO) has been suggested to influence the European hydrology. Significant relations have been found with precipitation and different discharge indices (Guimarães Nobre et al., 2017; Mariotti et al., 2002; Steirou et al., 2017). However, in contrast to the above-described circulation modes, ENSO does not shape the European climate and hydrology directly, but rather indirectly through the regulation of the phase of other large-scale modes, such as the EA (Iglesias et al., 2014). Other patterns acting at a smaller scale, such as the Mediterranean Oscillation (MO) and the western Mediterranean Oscillation (WMO), have also been related with hydrological variables in Europe (Criado-Aldeanueva and Soto-Navarro, 2013; Dünkeloh and Jacobeit, 2003; Martin-Vide and Lopez-Bustins, 2006). However, such modes seem to have limited importance at the continental scale.

While the relation between European hydrology and large-scale circulation has attracted much attention and has been widely studied, only a few studies have adopted a conditional flood frequency framework for the investigation of climate–flood interactions. Villarini et al. (2012) conducted a frequency analysis of annual maximum and peak-over-threshold discharge in Austria with NAO as a covariate. López and Francés (2013) examined maximum annual flows in Spain conditioned on the principal components of four winter climate modes: NAO, AO, MO and WMO. Still, a comprehensive study on streamflow extremes at the European scale has not been conducted.

Thus, this study aims at a large-scale investigation of circulation–streamflow interactions for the entire European continent by adopting a flood frequency framework. We examine seasonal streamflow maxima from more than 600 gauges covering the entire European continent and particularly investigate the influence of the five major pressure modes that directly affect the European climate: NAO, EA, EA/WR, SCA and POL. In order to quantify the effect of important hydro-climatological processes for the streamflow regimes, we investigate contemporaneous relationships only, without considering any time lags. We identify regions with a consistent influence of each particular circulation index in order to explain the spatial coherence of flood frequency. The analysis is conducted at a seasonal scale in order to better account for the intra-annual variations of the circulation characteristics and the associated seasonal shift of climate–streamflow relationships. A Bayesian framework is adopted for the flood frequency analysis because of its advantages concerning the quantification and interpretation of uncertainty. Furthermore, prior information about hydrologic extremes exists in the literature and can be used for inference.

The time period of our analysis is from 1950 to 2016, defined by the overlap
between streamflow data and circulation indices. Daily streamflow data for
the European continent were received from GRDC (Global Runoff Data Centre).
From this dataset, gauges with record lengths of at least 50 years after
1950 and with a catchment area larger than 200 km

Time series of monthly circulation indices for the period 1950–2016 were
retrieved from the Climate Prediction Center (CPC) of the National Oceanic
and Atmospheric Administration (NOAA)
(

GEV distributions with parameters invariant in time and with parameters conditioned on
the climate indices are fitted to the seasonal maximum streamflow data. For
the climate-informed models the condition of independent and identically
distributed observations of the classical GEV distribution is relaxed to include
parameters conditioned on time-varying covariates (Katz et
al., 2002). For the two types of models we use the terms “classical model”
instead of “stationary model” and “climate-informed model” instead of
“nonstationary model”. It has been suggested that if covariates have a
stochastic structure and no deterministic component, the resulting
distribution is not truly nonstationary (Montanari
and Koutsoyiannis, 2014; van Montfort and van Putten, 2002; Serinaldi and
Kilsby, 2015). As our climate covariates have no distinguishable
deterministic component (not shown), it is consequently not clear if they
result in nonstationary models. Here each streamflow gauge is handled
independently and site-specific parameters are derived. Let

In the Bayesian framework, the posterior pdf of the parameter vector is
computed as follows, based on Bayes theorem:
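In generic notation (the symbols are assumptions made here for illustration, with $\boldsymbol{\theta}$ the parameter vector and $\boldsymbol{y}$ the observed seasonal maxima at one gauge), Bayes' theorem can be sketched as:

```latex
p(\boldsymbol{\theta} \mid \boldsymbol{y})
  = \frac{p(\boldsymbol{y} \mid \boldsymbol{\theta})\, p(\boldsymbol{\theta})}
         {\int p(\boldsymbol{y} \mid \boldsymbol{\theta})\, p(\boldsymbol{\theta})\, \mathrm{d}\boldsymbol{\theta}}
  \propto p(\boldsymbol{y} \mid \boldsymbol{\theta})\, p(\boldsymbol{\theta})
```

Here $p(\boldsymbol{y} \mid \boldsymbol{\theta})$ is the likelihood and $p(\boldsymbol{\theta})$ the prior. The normalizing integral does not depend on $\boldsymbol{\theta}$, so sampling methods only need the product on the right.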

For the climate-informed distribution, parameters are assumed to be a
function

The climate-informed GEV distribution is a generalization of the classical GEV distribution. The
likelihood function is then defined as follows:
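A minimal Python sketch of such a likelihood is given below, assuming independent observations, a location parameter that is linear in one covariate (intercept and slope, as described later in the text), and constant scale and shape. The function names and the sign convention (positive shape for a heavy upper tail) are assumptions made for illustration, not the study's code.

```python
import math

def gev_logpdf(y, mu, sigma, xi):
    """Log-density of the GEV distribution; xi > 0 is assumed to mean a heavy upper tail."""
    if sigma <= 0:
        return -math.inf
    z = (y - mu) / sigma
    if abs(xi) < 1e-9:
        # Gumbel limit of the GEV distribution as xi -> 0
        return -math.log(sigma) - z - math.exp(-z)
    t = 1.0 + xi * z
    if t <= 0:
        # observation lies outside the support for these parameters
        return -math.inf
    return -math.log(sigma) - (1.0 + 1.0 / xi) * math.log(t) - t ** (-1.0 / xi)

def climate_informed_loglik(y, x, mu0, mu1, sigma, xi):
    """Sum of GEV log-densities with location mu0 + mu1 * x_t for each season t."""
    return sum(gev_logpdf(y_t, mu0 + mu1 * x_t, sigma, xi) for y_t, x_t in zip(y, x))
```

With `mu1 = 0` the function reduces to the log-likelihood of the classical GEV distribution, which mirrors the statement that the climate-informed model generalizes the classical one.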

The function

Conditional distributions of only one covariate at a time are derived, since
we are interested in the separate effect of each individual climate index on
flood quantiles. Based on the abovementioned assumptions concerning model
structure and the form of the function

Consequently, the conditional GEV distribution comprises four parameters: scale and shape
parameters, and intercept

For all covariates and seasons, models are fitted independently. No
posterior distributions from the classical approach are used as priors for
the climate-informed case. For all models, noninformative uniform priors
are used for the location parameter (for both intercept and slope) and for
the scale parameter, since no prior information is available. For the shape
parameter an informative normal distribution with mean 0.093 and standard
deviation 0.12 is used. This distribution is adopted from a global study of
extreme rainfall by Papalexiou and Koutsoyiannis (2013), which,
to our knowledge, summarizes an analysis of shape parameters using the
largest number of stations with hydrological data worldwide. Although
rainfall extremes may be characterized by slightly different shape parameters
than those of streamflow, our informative prior is very close to the
“geophysical prior” of Martins and Stedinger (2000), which is often used
to restrict the range of shape parameters based on previous hydrological
experience (Renard et al., 2013). The latter prior was not preferred because
it is bounded to the interval (

Five chains of 14 000 simulations, with the first half discarded as warmup
period, are run for all parameters. Convergence is investigated by the
potential scale reduction statistic

We apply a two-step methodology to select the optimal model among the
classical and conditional competitors. First, we assess whether the covariates
have a significant effect on our extreme streamflow models by examining the
posterior distribution of the slope

The deviance, used for the calculation of the DIC, is defined as follows:

Conditional models satisfying both criteria are preferred to the classical model. The model comparison is performed in two steps: first, for each station and season, each climate-informed competitor is pairwise compared to the classical GEV distribution. Subsequently, the model with the overall best performance is identified.
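The two criteria can be sketched in Python as follows. Two assumptions are made: the DIC takes its standard form (mean posterior deviance plus the effective number of parameters), and slope significance at the 10 % level means that the central 90 % credible interval of the slope excludes zero. All function names are hypothetical.

```python
import statistics

def dic(loglik_draws, loglik_at_posterior_mean):
    """DIC = Dbar + pD, with deviance D = -2 log L and pD = Dbar - D(theta_bar)."""
    d_bar = statistics.fmean([-2.0 * ll for ll in loglik_draws])
    d_hat = -2.0 * loglik_at_posterior_mean
    p_d = d_bar - d_hat
    return d_bar + p_d

def slope_is_significant(slope_draws, level=0.10):
    """True if the central (1 - level) credible interval of the slope excludes zero."""
    s = sorted(slope_draws)
    n = len(s)
    lower = s[int((level / 2) * (n - 1))]
    upper = s[int(round((1 - level / 2) * (n - 1)))]
    return lower > 0 or upper < 0

def prefer_climate_informed(dic_classical, dic_conditional, slope_draws):
    """Climate-informed model is preferred only if both criteria are met."""
    return dic_conditional < dic_classical and slope_is_significant(slope_draws)
```

Applying `prefer_climate_informed` once per covariate gives the pairwise comparison with the classical model; the overall best model can then be taken as the preferred candidate with the lowest DIC.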

In the classical or stationary approach one can define the

Here we assess whether the consideration of climatic drivers leads to a significant alteration of flood “effective” return levels or conditional quantiles in individual years. Differences of flood quantiles during years with high and medium values of the considered circulation indices are quantified. Since the model is linear, the effect of high and low covariate values on the extreme value distribution quantiles is approximately symmetric (it would be symmetric if the seasonal indices had a symmetric distribution around zero – see Fig. S6) and thus low covariate values are not considered. The 95th and 50th quantile of the considered climate index are chosen as high and medium index values, respectively. Index quantiles are calculated for the entire period 1950–2016.

From the No-U-Turn sampling after thinning, 3500 post-warmup sets of
parameters are obtained, each corresponding to a flood quantile (for a given
probability of exceedance). The median value of all 3500 flood quantiles is
chosen as a point estimate. The median estimate was preferred to the maximum
a posteriori (MAP) estimate because it is more representative of the
posterior distribution. Based on this approach, the percent relative
difference
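The point estimate and the relative difference can be sketched as below. The sketch assumes a location parameter linear in the covariate, a positive shape for a heavy upper tail, and a relative difference expressed with respect to the quantile at the medium index value. The function names and the tuple layout of the posterior draws are hypothetical.

```python
import math
import statistics

def gev_quantile(p_exceed, mu, sigma, xi):
    """Flood quantile exceeded with probability p_exceed (xi > 0: heavy upper tail)."""
    y = -math.log(1.0 - p_exceed)
    if abs(xi) < 1e-9:
        # Gumbel limit as xi -> 0
        return mu - sigma * math.log(y)
    return mu + sigma / xi * (y ** (-xi) - 1.0)

def median_quantile(draws, x, p_exceed=0.02):
    """Median over posterior draws (mu0, mu1, sigma, xi) of the quantile at index value x."""
    return statistics.median(
        [gev_quantile(p_exceed, mu0 + mu1 * x, sigma, xi) for mu0, mu1, sigma, xi in draws]
    )

def percent_relative_difference(draws, x_high, x_medium, p_exceed=0.02):
    """Percent difference between quantiles at high and medium index values."""
    q_high = median_quantile(draws, x_high, p_exceed)
    q_medium = median_quantile(draws, x_medium, p_exceed)
    return 100.0 * (q_high - q_medium) / q_medium
```

In this setting `draws` would hold the 3500 thinned post-warmup parameter sets, and `x_high` and `x_medium` the 95th and 50th quantiles of the seasonal climate index.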

In the previous sections, an automatic methodology for choosing an adequate model and a discussion of flood quantiles for different covariate values are presented. However, a visual comparison of point estimates and uncertainty intervals of the classical and conditional models can be useful, since it illustrates the differences but also the plausibility and possible drawbacks of the competing models. For this reason, we plot the time series of flood quantiles for a probability of exceedance of 0.02 for selected gauges and covariates based on both the classical and the climate-informed extreme value distribution. As discussed in the previous section, the median flood quantile for a probability of exceedance of 0.02 is chosen as point estimate (median quantile curve). Uncertainty of flood quantiles is quantified by means of posterior or credibility intervals, which are the Bayesian equivalent to frequentist confidence intervals, although the two types differ in their interpretation (Renard et al., 2013; Gelman et al., 2013).

For all seasonal indices, climate-informed models are preferred over the classical distribution for a large number of stations; percentages of preferred models (based on both the DIC and the significance of the slope of the location parameter) are shown in Table 1 and spatial patterns are mapped in Figs. 1–2. The climate-informed fits form spatial clusters that resemble the correlations between the climate indices and average seasonal precipitation (Figs. S1–S4), while a relation with the correlations of seasonal mean temperature is not straightforward. Particularly for NAO a dipole pattern is evident in winter, with a positive influence on extreme discharge in northern and central Europe and a negative relationship south of the Alps (Fig. 1). The intra-annual shift of the NAO pressure centers is well captured. The positive influence of NAO on flood magnitudes during summer is only detected for northern Scandinavia (Fig. 2). Similar dipole structures, resembling the correlations with seasonal mean precipitation, are found for other indices. However, there are some deviations from the precipitation patterns. For example, contradictory results are found in Scandinavia during spring and summer for the SCA index. Scandinavian rivers usually have small catchments and are fed by snowmelt, particularly in spring; consequently, in this area, both temperature and precipitation are important for runoff generation. An opposite sign between correlations with precipitation and the slope of the location parameter can also be found during autumn in northeastern Germany for the EA index.

Figure 1. Results comparing the climate-informed and the classical GEV models for all covariates examined for the winter and spring seasons. Nonsignificant models preferred only by the DIC (yellow points) are plotted on top of stations for which climate-informed models were not chosen by either of the two criteria (grey points). Preferred climate-informed models chosen by both criteria (blue and red triangles) are plotted on top of the other models so that they can be better distinguished.

Figure 2. Same as Fig. 1 but for the summer and autumn seasons.

Table 1. Percentage of stations with climate-informed fits preferred to the classical GEV distribution model. Indicated is the result of the pairwise comparison of each covariate with the classical model and the percentage of preferred fits for each covariate when all models are compared (in brackets). Results are shown per season and for mean seasonal covariates.

Table 2. Same as Table 1 but for monthly covariates for the month in which the seasonal streamflow extreme occurs.

NAO is the covariate with the highest number of significant fits in winter (46 %) and autumn (31 %) and EA in spring (32 %) and summer (18 %). High percentages of preferred climate-informed models are also found for EA and SCA in winter, which is the season where most indices are characterized by their strongest influence on the European climate (Table 1). The worst overall results are found for EA/WR in spring (3 %) and POL in summer (7 %). It can be argued that these two latter cases could occur solely by chance or due to spatial correlation of nearby flood time series; however, results are coherent in space and cover large regions, which suggests a real influence of the circulation modes on the location parameter of the extreme value distributions, restricted though to certain subregions of Europe.

Similar spatial patterns are obtained from the same analysis if monthly covariates during the month of the seasonal discharge peaks are examined (Figs. S7–S8). Clusters of stations with positive or negative slopes of the location parameter agree with those for seasonal indices; however, in most cases the percentages of preferred fits are lower for the monthly covariates, with EA/WR in spring being an exception. In particular, the role of NAO in winter and autumn and of EA during the rest of the seasons is less pronounced in the monthly-scale analysis. NAO and SCA are the covariates with the highest number of preferred fits in spring and EA together with EA/WR in summer and autumn (Table 2). Regarding the spatial patterns of preferred fits, deviations from those for seasonal covariates can be found for EA/WR, SCA and POL during spring and summer.

Figure 3. Best overall models among the five climate-informed GEV distributions and the classical GEV distribution tested for the winter and spring seasons. Mean seasonal covariates are examined.

Figure 4. Same as Fig. 3 but for the summer and autumn seasons.

For all indices examined, between 5 % and 13 % of the stations, depending on the season and the covariate, are characterized by a lower DIC for the climate-informed model, although the slope of the location parameter is not significant (illustrated as yellow points in Figs. 1 and 2). Only a few station records, up to three per season and index (not shown in Figs. 1 and 2), are characterized by a higher DIC value for the climate-informed model despite showing a significant slope. These results indicate that the DIC is a weaker criterion for model selection than the slope significance at the 10 % level.

In order to illustrate the spatial structure of the best models, the preferred model (classical or climate-informed) is mapped in Figs. 3 and 4 for each station for seasonal covariates. Spatial patterns do not resemble the pattern of significant fits for separate indices (Figs. 1, 2), since the influence of the selected climate modes on flood frequencies is overlapping for some regions and some of the indices are correlated for particular seasons (Table S1). Winter (summer) is the season with the highest (lowest) overall percentage of preferred climate-informed models: 77 % and 38 %, respectively. In winter, NAO is the most influential climate mode, being preferred over the other modes for 28 % of the gauges. The largest influence of NAO on flood frequencies is detected in central Europe, Great Britain, parts of Scandinavia and the Iberian Peninsula (Fig. 3). The first three regions also show a high fraction of SCA-influenced models, which points towards a joint effect of NAO and SCA during winter. The two indices are significantly correlated during this season (Table S1). EA is identified as the best covariate in winter for Great Britain. In spring an expansion of the EA influence towards central Europe is detected. The NAO influence is shifted to the south during the transition seasons (spring and autumn) and is completely dissolved in summer. Patterns for SCA are heterogeneous throughout the year. The same results but for monthly covariates are shown in Figs. S7 and S8. Spatial patterns resemble those for seasonal covariates. Percentages of preferred climate-informed models are included in Tables 1 and 2.

In the previous section it is shown that models with monthly covariates do not outperform those with seasonal covariates for most indices and seasons. Hence, quantiles of climate indices are calculated at the seasonal scale only (Table 3). Figures 5 and 6 show the relative differences of seasonal flood quantiles for a probability of exceedance of 0.02 between a (hypothetical) year with a climate index value equal to the 95th index quantile and a year with an index value equal to the median. For a probability of exceedance of 0.02, relative differences higher than 20 % and up to 22 % are detected in winter for NAO. For the rest of the seasons, maximum relative differences are lower than 20 % with highest values for EA/WR in autumn (marginally below 20 %). In spring and summer the highest value is considerably lower, between 11 % and 13 % for NAO and SCA in spring and EA and SCA in summer.

Table 3. Seasonal quantiles of the five climate indices: the median and the 95th quantile (in parentheses) are provided.

Table 4. Summary statistics of the median posterior shape parameter of all stations examined. Statistics are taken over all models for one season. In parentheses, the maximum deviation among all fitted models (classical and climate-informed) is provided.

A difference of 5 %–10 % is quite common for NAO in winter. For example, a
station with a positive slope of the location parameter and a probability of
exceedance of 0.02 for a maximum seasonal discharge value of 600 m

Figure 5. Percent relative difference of the streamflow for an exceedance probability of 0.02 between a (hypothetical) year with a climate index value equal to the 95th index quantile and a year with an index value equal to the median index. Results are shown for winter and spring and seasonal mean covariates.

Table 5. General information about selected sites shown in Fig. 7. Ref. code is the number of the subplot of Fig. 7.

Table 6. Climate-informed results as shown in Fig. 7. Ref. code is the
number of the subplot of Fig. 7. Mean seasonal covariates for the same
season as streamflow extremes are examined. dDIC is the difference from the
DIC value of the classical distribution.

Figure 6. Same as Fig. 5 but for the summer and autumn seasons.

The high relative differences of flood quantiles could partly reflect differences in catchment size or unreasonable posterior values of the shape parameter. A link with catchment size was, however, not found (not shown). Posterior shapes for all seasons and indices were further analyzed. Summary statistics of the median shape from the posterior distribution of each fitted model are given in Table 4. Little deviation is observed for different models (classical or climate-informed) during the same season but some inter-season variation is present. No unreasonable values are observed, and thus we assume that the use of an informative prior distribution for shape adequately restricts the posterior distributions to reasonable limits.

Figure 7. Annual maximum discharge time series (A4, B4, C4) and climate-informed quantiles (A1–A3, B1–B3, C1–C3) with credibility intervals for an exceedance probability of 0.02 and for three selected gauges (Tables 5 and 6). Climate-informed quantiles are compared with those of the classical GEV distribution. The three best climate-informed models based on the DIC are shown for each site, with increasing DIC from top to bottom.

Figure 8. Comparison of climate-informed and classical streamflow quantiles for station A (see Fig. 7 and Tables 5 and 6 for more details), an exceedance probability of 0.02 and NAO as covariate of the climate-informed model. The legend is the same as in Fig. 7. Observed streamflow is indicated with black dots.

The results for three selected gauges with high relative differences

The uncertainty bounds of the climate-informed fits can be narrower or wider
than those of the classical model. They are also asymmetric, contrary to
uncertainty bounds that result from a method using a normal approximation.
Asymmetric intervals are associated with the shape parameter of the GEV distribution and
are not uncommon (see for example Zeng et al., 2017). The range of
uncertainty bounds reflects an interplay between model complexity and the
additional information provided by the more complex models. In Fig. 7,
uncertainty bounds are narrower in the case of the “best” conditional
models (e.g., subplot A1). Uncertainty increases when extrapolations are made
towards high and low index values. This can be more easily observed in Fig. 8. For the classical case, the range is about 94 m

This study explored whether a climate-informed flood frequency analysis provides insights and can improve the estimation of flood probabilities at the European scale. A site-specific model using a Bayesian framework was developed, and five Euro-Atlantic circulation modes were investigated as potential covariates: the North Atlantic Oscillation (NAO), the east Atlantic pattern (EA), the east Atlantic–western Russian pattern (EA/WR), the Scandinavia pattern (SCA) and the polar–Eurasian pattern (POL). Streamflow was analyzed at a seasonal timescale in order to account for the variable influence of the circulation modes on the European climate during different seasons of the year. Covariates were averaged and examined at both seasonal and monthly scales, contemporaneous to the season or month of the seasonal streamflow maxima, respectively.

The developed climate-informed models were compared to the classical GEV distribution with time-invariant parameters. For most seasons and covariates investigated, the climate-informed models were preferred over the classical GEV distribution for a high percentage of stations (around 20 % on average), with the best results found in winter for NAO and EA, in spring for EA, and in autumn for NAO (Table 1). Results were shown to be coherent in space, indicating that certain regions are influenced by particular circulation modes (Figs. 1–4). In winter 77 % of the stations were found to be influenced by one of the climate modes, which indicates high potential for an improvement of flood probability estimations by including climate information in extreme value statistics. In contrast, fewer than half of the stations examined were significantly affected by at least one of the five large-scale indices during the summer season, indicating a rather convective and nonpredictable precipitation regime (Table 1).

Based on the variability of the circulation indices, we identified regions that are characterized by preferred climate-informed fits and by steep slopes of the location parameter. For models with significant slopes, variations of the climate indices lead to highly varying flood quantile estimations for the same probability of exceedance. Particularly for northwestern Scandinavia and the British Isles, variations of the climate indices result in considerably different extreme value distributions and thus highly different flood estimates for individual years (Figs. 5–6). This difference in estimates could be partly a result of unreasonable posterior values of the shape parameter; however, the use of an informative prior distribution for shape adequately restricts the posterior distributions to reasonable limits. Plots of extreme streamflow under consideration of a probability of exceedance of 0.02 indicate that the deviation between the classical and climate-informed analysis concerns not only single years but can also persist for longer time periods (Fig. 7), which reflects the decadal-scale variability of NAO and other large-scale circulation indices (Fig. S5).

Although the circulation indices examined are characterized by high intra-seasonal variability, the seasonally averaged indices provided in most cases better fits compared with monthly values (Tables 1–2). This should be emphasized, since extreme precipitation events are most likely more closely related to monthly circulation states, which better represent the moisture fluxes into the target domain. On the contrary, the catchment wetness before the flood event is likely to be influenced by the seasonal mean circulation and the associated precipitation sums. Hence, our result suggests that the skill of climate-informed extreme value distributions is to a significant extent a consequence of the important link between catchment wetness and flooding. Thus we assume, in line with recent studies (Blöschl et al., 2017; Merz et al., 2018; Schröter et al., 2015), that in many regions of Europe, catchment wetness plays an important role for flood generation.

For the selection of the best model among the classical and climate-informed
models, two criteria were adopted: the DIC and the significance of the slope of the
location parameter

The described methodology can be complemented in several ways.

The GRDC discharge dataset was obtained from the Global Runoff Data Centre,
56068 Koblenz, Germany (

The supplement related to this article is available online at:

BM conceived the original idea, and all co-authors designed the overall study. ES developed the model code with contributions from XS, performed the analysis and prepared the paper. All co-authors contributed to the interpretation of the results and writing of the paper.

The authors declare that they have no conflict of interest.

The authors are grateful to the three reviewers, Alberto Viglione, Elena Volpi and Francesco Marra, for their helpful comments and suggestions that substantially improved the paper. Alessio Domeneghetti is thanked for providing unpublished discharge data from Italy and Luis Mediero for providing discharge data from Spain and Portugal. Daniel Beiter is thanked for his support in coding and parallel computing. Xun Sun is supported by the National Key R&D Program of China (no. 2017YFE0100700) and Shanghai Pujiang Program (no. 17PJ1402500). This study was conducted in the frame of the projects “Conditional flood frequency analysis: exploring the link of flood frequency to catchment state and climate variations” and “The link of flood frequency to catchment state and climate variations”, two joint research initiatives between AXA Global P&C and GFZ, Potsdam. The authors wish to acknowledge the AXA Research Fund for financial support. The article processing charges for this open-access publication were covered by a Research Centre of the Helmholtz Association.

Edited by: Nadav Peleg

Reviewed by: Elena Volpi, Francesco Marra, and Alberto Viglione