Articles | Volume 26, issue 24
Research article
16 Dec 2022

River flooding mechanisms and their changes in Europe revealed by explainable machine learning

Shijie Jiang, Emanuele Bevacqua, and Jakob Zscheischler

Climate change may systematically impact hydrometeorological processes and their interactions, resulting in changes in flooding mechanisms. Identifying such changes is important for flood forecasting and projection. Currently, there is a lack of observational evidence regarding trends in flooding mechanisms in Europe, which requires reliable methods to disentangle emerging patterns from the complex interactions between flood drivers. Recently, numerous studies have demonstrated the skill of machine learning (ML) for predictions in hydrology, e.g., for predicting river discharge based on its relationship with meteorological drivers. The relationship, if explained properly, may provide us with new insights into hydrological processes. Here, by using a novel explainable ML framework, combined with cluster analysis, we identify three primary patterns that drive 53 968 annual maximum discharge events in around a thousand European catchments. The patterns can be associated with three catchment-wide river flooding mechanisms: recent precipitation, antecedent precipitation (i.e., excessive soil moisture), and snowmelt. The results indicate that over half of the studied catchments are controlled by a combination of the above mechanisms, especially recent precipitation in combination with excessive soil moisture, which is the dominant mechanism in one-third of the catchments. Over the past 70 years, significant changes in the dominant flooding mechanisms have been detected within a number of European catchments. Generally, the number of snowmelt-induced floods has decreased significantly, whereas floods driven by recent precipitation have increased. The detected changes in flooding mechanisms are consistent with the expected climate change responses, and we highlight the risks associated with the resulting impact on flooding seasonality and magnitude. 
Overall, the study offers a new perspective on understanding changes in weather and climate extreme events by using explainable ML and demonstrates the prospect of future scientific discoveries supported by artificial intelligence.

1 Introduction

River flooding is a pervasive natural hazard that regularly causes substantial economic, societal, and environmental damages worldwide (Tellman et al., 2021; Merz et al., 2021). With a warming atmosphere, flooding risk is projected to increase due to an intensification of the water cycle over large areas (Hirabayashi et al., 2013; Alfieri et al., 2017). For Europe, large-scale studies have revealed changes in flooding frequency, seasonality, and magnitude over the past decades, with considerable variations across catchments (Blöschl et al., 2017, 2019; Hall and Blöschl, 2018; Bertola et al., 2020; Alfieri et al., 2015). The spatial inconsistency in these trends reflects differences in flood-generating processes across the continent, which underscores the need for a better understanding of flood drivers (Keller et al., 2018).

In recent years, numerous studies have investigated river flooding mechanisms, and some of them have provided European-scale assessments (e.g., Berghuijs et al., 2016, 2019; Kemter et al., 2020; Bertola et al., 2021; Stein et al., 2020). Catchment-level floods can typically be attributed to the interaction of hydrometeorological processes, such as extreme precipitation, soil moisture excess, and snowmelt (Merz and Blöschl, 2003; Tarasova et al., 2019). The dominant controlling processes in catchments were usually identified either qualitatively by comparing the observed flood trends with the contemporaneous changes in flooding drivers (e.g., Blöschl et al., 2017, 2019) or quantitatively by calculating the seasonal similarities between flood events and potential drivers (e.g., Berghuijs et al., 2016, 2019). Such analyses revealed the dominant flood-generating processes at a catchment level, improving the understanding of climate change effects on flooding magnitude and timing. However, the methods often implicitly assume temporally consistent flood processes within a catchment (Merz et al., 2012), making it difficult to detect possible changes in flooding mechanisms themselves in a warming climate.

Flooding mechanisms that dominate one catchment are not always immutable but might shift over time, particularly in light of climate change (Hall et al., 2014). For example, increasing temperatures can affect snow dynamics in cold regions and result in more rainfall extremes, which could make snowmelt-dominated catchments more susceptible to extreme rainfall and thereby alter the regional flood seasonality and magnitudes (Davenport et al., 2020; Rottler et al., 2021; Vormoor et al., 2016). Therefore, a systematic investigation of the changes in flooding mechanisms is necessary. Yet few studies have been able to quantify how the mechanisms evolved over time on a continental scale in Europe. The identification of specific trends in flooding mechanisms requires a comprehensive understanding of hydrological processes underlying individual events (Stein et al., 2020). Currently available studies that attempted to classify river flooding processes on an event basis typically rely on multicriteria approaches, which require predefining thresholds for a variety of hydrometeorological indicators, such as the storm duration and snowmelt amount (e.g., Nied et al., 2014; Stein et al., 2021). Using a multicriteria approach, Kemter et al. (2020) identified the flooding mechanisms in Europe by classifying approximately 174 000 flood peaks and revealed their trends over the past 50 years. Likewise, Stein et al. (2020) analyzed flood events over 4155 catchments worldwide and classified them into five flood-generating processes. Despite the computational efficiency of using multicriteria approaches, the obtained insights are often dependent on the careful choice of indicators and thresholds. For example, in some cases, a small change in a threshold value modifies the classification, potentially compromising the robustness of the results (Sikorska et al., 2015). 
Alternatively, some studies grouped flood events by inductive analyses, which adopted clustering methods to obtain flood types from hydrometeorological indicators (e.g., Turkington et al., 2016; Keller et al., 2018). However, the chosen indicators (e.g., snow-covered area, day of occurrence, and 95th percentile of spatial precipitation distribution) did not unambiguously indicate flooding mechanisms, since they were not indicative of the causal contribution of flood drivers to peak discharges (Tarasova et al., 2019).

An effective way to identify flooding mechanisms for individual flood events is to quantify the contribution of possible drivers to its occurrence, which involves uncovering the implicit connections that may exist between flood events and meteorological observations. This can be achieved by machine learning (ML), which has been receiving increasing attention in Earth and climate sciences for its remarkable ability to identify and generalize predictive relations with a high-level abstract representation (Reichstein et al., 2019; Yu and Ma, 2021). In hydrology particularly, one excellent example is the prevalence of Long Short-Term Memory (LSTM) neural networks (Kratzert et al., 2018; Shen, 2018), which have been demonstrated to learn patterns conceptually consistent with qualitative understandings of how hydrological systems work as opposed to simply trivial coincidences (Kratzert et al., 2019a). Extraction of captured patterns from “black-box” ML models with feature attribution techniques (i.e., ML interpretations) may lead to theoretical advances and can assist in making new scientific discoveries, as recently demonstrated for climate, ocean, and weather applications (e.g., Toms et al., 2020; Barnes et al., 2020; Labe and Barnes, 2021), including the identification of flooding mechanisms (Jiang et al., 2022).

In this study, we revisit flooding mechanisms in Europe over the period 1950–2020 by using an improved framework based on the explainable ML methods developed by Jiang et al. (2022) and compare the results with existing studies. We base the analysis on around 1000 catchments, and the only dynamic information necessary for the analysis is precipitation, temperature, and streamflow. These three variables can be readily measured, thereby reducing the reliance on possibly uncertain estimations of fluxes and state variables (such as soil moisture). The combination of supervised-learning-based feature attribution and unsupervised-learning-based cluster analysis reduces subjectivity and uncertainty for the selection of appropriate indicators and thresholds in the categorization of flood drivers. Moreover, taking an event-level perspective, we quantify the changes that occurred in these mechanisms in the past seven decades and discuss the possible reasons for and implications of the detected changes. Overall, the study contributes to a better understanding of river flood risk and how it is affected by climate change and illustrates how explainable ML can advance knowledge about the Earth system.

Figure 1. An overview of the 1077 catchments and their properties, including (a) average elevation and slope of the catchments; (b) the catchment size; (c) the aridity index, expressed as the ratio of mean annual potential evapotranspiration (PET) to mean annual precipitation; (d) the fraction of precipitation falling as snowfall (i.e., precipitation falling with a temperature below 0 °C); and (e) the seasonality of annual maximum discharges. PET was estimated via Hamon's formulation (Hamon, 1961).

2 Data and methodologies

2.1 Data

The study considers 1077 catchments in the domain of Europe (Fig. 1a) based on the data availability of daily river discharge observations from the Global Runoff Data Centre (GRDC) dataset (last access: 1 November 2021). We restricted our analysis to catchments that have a minimum of 20 years of discharge records within 1950–2020 to ensure sufficient samples to train the ML models. The catchment areas range between 8 and 10 000 km² – very large catchments, where the effect of spatial heterogeneity of flood drivers tends to be substantial, were not considered. For those catchments, the sample size of daily discharge records ranges from 7300 to 25 753, with a median of 20 455 time steps. Overall, the selected catchments encompass a variety of geographical and climatic conditions, as illustrated by the catchment distributions in terms of average elevation, average slope, catchment size, aridity index, snowfall fraction, and flood mean date (Fig. 1). The elevation, slope, and size were derived from the Global Streamflow Indices and Metadata Archive (GSIM) (Do et al., 2018a); the aridity index and snowfall fraction were calculated from the catchment-averaged precipitation and temperature described later. In the study, floods are defined as the annual maxima (peaks) of river discharge time series in line with common practices (e.g., Blöschl et al., 2017, 2019). The above properties will also be used to discuss their relevance to the catchment-level dominant flood mechanisms.

We considered precipitation, temperature, and day length as input variables of the ML models. Using the 0.1° daily gridded precipitation and mean surface temperature data from the E-OBS dataset (version 23.1e) (Haylock et al., 2008), we calculated the catchment-averaged time series of these variables based on area-weighted averages of the data pixels within the catchment boundary. The weight of each pixel was determined by the fraction of its area covered by the relevant catchment. The catchment boundaries were obtained from the readily available GRDC and GSIM (Do et al., 2018a) databases, with GRDC being prioritized when the boundary of a catchment was available in both databases. Note that, for smaller catchments under 100 km² (approximately 0.1° × 0.1°), uncertainties may exist due to the relatively coarse spatial resolution of the meteorological data. Nonetheless, those catchments with large uncertainties will not be considered for the subsequent attribution analysis if ML models cannot capture the relationship between inputs and outputs effectively. Day length was included in the study because it was shown to improve model accuracy in a series of preliminary tests that compared models using only precipitation and temperature with models that additionally incorporated day length. Catchments where day length largely improves accuracy are mainly located in northern Europe. Day length was calculated based on the day of the year and the latitude of the catchment center by using the Brock model, following Forsythe et al. (1995).
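As an illustration of the two preprocessing steps described above, the following minimal Python sketch computes a coverage-weighted catchment average and the day length. It is not the authors' code: the function names are hypothetical, the averaging assumes that all grid cells have equal area (so only the covered fraction acts as weight), and the day length formula is the Brock model as commonly summarized in Forsythe et al. (1995).

```python
import math


def catchment_average(values, coverage):
    """Coverage-weighted mean of grid-cell values for one day.

    values   -- one value per grid cell (e.g., daily precipitation in mm)
    coverage -- fraction of each cell's area inside the catchment (0..1)
    Assumption: all cells have the same area, so coverage is the only weight.
    """
    total = sum(coverage)
    if total <= 0:
        raise ValueError("catchment does not overlap any grid cell")
    return sum(v * w for v, w in zip(values, coverage)) / total


def day_length_brock(day_of_year, latitude_deg):
    """Day length in hours (Brock model, as summarized in Forsythe et al., 1995)."""
    # Solar declination in degrees, converted to radians.
    decl = math.radians(
        23.45 * math.sin(math.radians(360.0 * (283 + day_of_year) / 365.0))
    )
    lat = math.radians(latitude_deg)
    # Clamp so that polar day and polar night give 24 h and 0 h
    # instead of a math domain error.
    cos_hour_angle = max(-1.0, min(1.0, -math.tan(lat) * math.tan(decl)))
    return 2.0 * math.degrees(math.acos(cos_hour_angle)) / 15.0


if __name__ == "__main__":
    print(catchment_average([10.0, 20.0], [1.0, 0.5]))
    print(day_length_brock(172, 60.0), day_length_brock(355, 60.0))
```

The clamping step is a design choice of this sketch that keeps the function defined for high-latitude catchments, where the sun may not set or rise on some days.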

Figure 2. The workflow of using explainable ML methods for attributing flood peaks (annual maxima of river discharge) to their drivers. (a) Diagrammatic representation of the used LSTM models. The window in the time series of discharge highlights the target output (which is a point), and the window in the inputs indicates the input features used to predict the illustrated peak discharge sample. (b) The feature importance of the inputs for predicting the peak discharge shown in (a), which was obtained by using the ML interpretation technique (namely integrated gradient). The vertical dashed lines in the windows separate the feature importance into a recent 7 d period and an earlier period to calculate the aggregated feature contributions (see main text).


2.2 Attribution framework and ML model

Figure 2 illustrates the framework of using explainable ML methods for flooding attribution in the present study, which was originally developed by Jiang et al. (2022) and involves three main steps. First, we built ML models for individual catchments to establish the nonlinear predictive maps from meteorological factors (i.e., precipitation, temperature, and day length) to daily discharges (Fig. 2a). Secondly, an ML interpretation technique was applied to interpret the trained models to quantify the contributions of the three input variables at each time step (i.e., time-wise feature importance) to the generation of respective flood events (Fig. 2b). The time-wise feature importance was further aggregated into contributions of specific features. Finally, cluster analysis was used to group the specific feature contributions from multiple flood events that had similar patterns into several categories, from which we then identified different flood mechanisms. Detailed explanations of the methods are given below.

In the study, we used the classical LSTM network (Hochreiter and Schmidhuber, 1997) as the ML model. The LSTM is one of the most popular ML architectures for modeling dynamic hydrological variables (e.g., Kratzert et al., 2018; Lees et al., 2021); it can effectively capture nonlinear and temporal dependencies between variables owing to its recurrent structure and unique gating mechanism (Gers et al., 1999). The effectiveness of the LSTM is partially due to the comparability of its formulation to the hydrological behavior of a catchment. Specifically, the backbone of the LSTM network is composed of recurrent cells that can store previous information from input sequences, which is conceptually similar to the way meteorological information (e.g., precipitation) is stored in the form of soil moisture or snowpack (Lees et al., 2022). The physically realistic mapping from inputs to outputs facilitates gaining hydrologically meaningful insights from subsequent model interpretations. Figure 2a illustrates the data flow of one sample in the LSTM model, with the dashed windows highlighting the predictors and the target variable. The input layer of the model brings in precipitation (P), temperature (T), and day length (D) over the past 180 d (i.e., $[X_1^P, X_2^P, \ldots, X_{180}^P;\ X_1^T, X_2^T, \ldots, X_{180}^T;\ X_1^D, X_2^D, \ldots, X_{180}^D]$), and the output layer produces the discharge of the same day (i.e., $y_1$). Note that we included predictors on the same day as the output in the model, since precipitation on that day could also affect the discharge, especially in small catchments with quick catchment response times. However, the conclusions do not change even if using LSTM models to predict discharge on the next day (i.e., the prediction models consider the lagged meteorological forcings up till the day before each daily discharge). The hidden layers consist of a single LSTM layer and a dense layer with 32 units. 
The number of time steps and hidden units was determined by considering both the model performance and efficiency, which had been evaluated in preliminary experiments. Preliminary experiments also suggested that using fewer time steps (e.g., 90 d) would not impair the conclusions of the study about flooding mechanisms because contributions from inputs at very early time steps to output are limited in LSTM models (i.e., memory decay) (Su and Kuo, 2019). Here, we skip the technical details of the LSTM architecture and refer to Sherstinsky (2020) for a comprehensive explanation of the fundamentals of LSTM networks.
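To make the input and output layout concrete, the sketch below builds (predictor window, target) pairs from daily series, with each window ending on the same day as the target discharge. It is only an illustration: the function name and the chronological ordering inside each window are assumptions, and the LSTM itself is not reproduced here.

```python
def make_samples(precip, temp, day_len, discharge, window=180):
    """Pair each daily discharge with the preceding `window` days of inputs.

    All four series are daily and aligned by index. The last row of each
    window is the same day as the target, mirroring the same-day setup
    described in the text. Rows are ordered from oldest to newest, which
    is an assumption of this sketch.
    """
    samples = []
    for t in range(window - 1, len(discharge)):
        start = t - window + 1
        x = [(precip[k], temp[k], day_len[k]) for k in range(start, t + 1)]
        samples.append((x, discharge[t]))
    return samples


if __name__ == "__main__":
    # Toy series with 5 days and a 3-day window (the study uses 180 days).
    p = [0.0, 5.0, 1.0, 0.0, 8.0]
    t = [2.0, 3.0, 4.0, 5.0, 6.0]
    d = [10.0, 10.1, 10.2, 10.3, 10.4]
    q = [1.0, 1.5, 1.2, 1.1, 2.0]
    for x, y in make_samples(p, t, d, q, window=3):
        print(len(x), x[-1], y)
```

Predicting the next day's discharge instead, as mentioned in the text, would correspond to pairing the same window with `discharge[t + 1]`.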

To improve the robustness of model evaluation and analysis, we fitted 10 independent LSTM models for each of the 1077 catchments. Specifically, the data for each catchment were divided into 10 folds without shuffling the temporal sequence, and each fold was tested once with a model trained with the remaining 9 folds. The predictive performance of each model was evaluated independently based on testing data, i.e., 1/10 of the data for each catchment, which ranged from 2 to 7 years due to the 20–70-year sample sizes available in studied catchments. During the training process, a portion of the training data (70 %) was repeatedly used to update the model parameters every epoch until no further decrease in the loss function was observed in the remaining 30 % (also known as validation data). The initial learning rate and maximum training epoch number were configured to 0.01 and 200, respectively, with the adaptive moments estimation (Adam) algorithm (Kingma and Ba, 2015) being used for training the models.
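The splitting scheme can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the helper names are hypothetical, and the sketch assumes that the 70 %/30 % training and validation split is made chronologically, which the text does not specify.

```python
def contiguous_folds(n_samples, n_folds=10):
    """Split indices 0..n_samples-1 into contiguous, unshuffled folds."""
    base, extra = divmod(n_samples, n_folds)
    folds, start = [], 0
    for k in range(n_folds):
        size = base + (1 if k < extra else 0)
        folds.append(range(start, start + size))
        start += size
    return folds


def train_val_test_split(n_samples, test_fold, n_folds=10, train_frac=0.7):
    """Return (train, validation, test) index lists for one of the folds.

    Assumption: the non-test data are cut chronologically into a training
    part (used for parameter updates) and a validation part (used for
    early stopping).
    """
    folds = contiguous_folds(n_samples, n_folds)
    test = list(folds[test_fold])
    rest = [i for k, fold in enumerate(folds) if k != test_fold for i in fold]
    cut = round(len(rest) * train_frac)
    return rest[:cut], rest[cut:], test


if __name__ == "__main__":
    train, val, test = train_val_test_split(100, test_fold=0)
    print(len(train), len(val), len(test))
```

Repeating the call for `test_fold` values 0 to 9 yields the 10 models per catchment, with every sample appearing in exactly one test fold.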

2.3 Model interpretations and cluster analysis

The integrated gradient (IG) technique developed by Sundararajan et al. (2017) was employed to interpret the trained models, which allows for obtaining the time-wise feature importance of the three input variables for each sample of the output (i.e., daily discharges). The IG method is a gradient-based interpretation technique that exploits the gradient of the model's output with respect to its input features to trace back the specific contributions of the inputs. It aims to assign an importance score to each feature (e.g., to the precipitation at each time step prior to the flooding). A large positive score indicates that the feature substantially increases the network output (e.g., that the precipitation at a certain time step contributes to increasing the flooding), a large negative score indicates a decrease in the network output, and a score close to zero indicates little influence on the output. The IG score for the input feature $x_i$ (e.g., precipitation at the $i$th time step) is formulated as

(1) $\phi_i(x) = (x_i - x'_i) \int_{\alpha=0}^{1} \frac{\partial f\left(x' + \alpha (x - x')\right)}{\partial x_i}\, \mathrm{d}\alpha,$

where $\partial f(x' + \alpha (x - x')) / \partial x_i$ denotes the local gradient of the network $f$ at a point interpolated from a baseline input ($x'$, when $\alpha = 0$), which is meant to represent the “absence” of feature input, to the target input ($x$, when $\alpha = 1$). An important property of the IG is completeness, which states that the IG scores add up to the difference between the output of $f$ at the target input $x$ and the baseline input $x'$, i.e., $\sum_i \phi_i(x) = f(x) - f(x')$. Therefore, the model output can be decomposed into the sum of features' individual contributions, and it enables us to examine the contribution of a group of features by summing up their individual IG scores.
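The IG definition and its completeness property can be checked numerically on a toy function. The sketch below is an illustration under stated assumptions, not the implementation used in the study: it approximates the path integral with a midpoint Riemann sum and the gradient with central finite differences, whereas a neural-network setting would normally use automatic differentiation. The function `toy_model` is a hypothetical stand-in for the trained network.

```python
def integrated_gradients(f, x, baseline, steps=200, eps=1e-5):
    """Approximate IG scores of f at x relative to a baseline input."""
    n = len(x)
    avg_grad = [0.0] * n
    for s in range(steps):
        alpha = (s + 0.5) / steps  # midpoint of the s-th sub-interval
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i in range(n):
            up = list(point)
            down = list(point)
            up[i] += eps
            down[i] -= eps
            # Central finite difference for the partial derivative.
            avg_grad[i] += (f(up) - f(down)) / (2.0 * eps) / steps
    return [(xi - b) * g for xi, b, g in zip(x, baseline, avg_grad)]


def toy_model(v):
    """Hypothetical stand-in for a trained network."""
    return v[0] * v[1] + v[2] ** 2


if __name__ == "__main__":
    x = [2.0, 3.0, 1.0]
    baseline = [0.0, 0.0, 0.0]
    scores = integrated_gradients(toy_model, x, baseline)
    print(scores)
    # Completeness: the scores sum to f(x) - f(baseline).
    print(sum(scores), toy_model(x) - toy_model(baseline))
```

For this toy function the two interacting inputs share their joint contribution equally, and the scores sum to the output difference, which is the property that justifies summing IG scores over groups of features.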

In the study, we focus specifically on the IG scores for annual maximum peak discharge events to gain insights into flooding mechanisms. Given that we trained 10 independent models, 10 sequences of time-wise feature importance were generated for each peak discharge, with each sequence having the same dimensions as the input variables (i.e., $[\phi_1^P, \phi_2^P, \ldots, \phi_{180}^P;\ \phi_1^T, \phi_2^T, \ldots, \phi_{180}^T;\ \phi_1^D, \phi_2^D, \ldots, \phi_{180}^D]$). Then, the 10 sequences were averaged into one sequence (i.e., $[\bar{\phi}_1^P, \bar{\phi}_2^P, \ldots, \bar{\phi}_{180}^P;\ \bar{\phi}_1^T, \bar{\phi}_2^T, \ldots, \bar{\phi}_{180}^T;\ \bar{\phi}_1^D, \bar{\phi}_2^D, \ldots, \bar{\phi}_{180}^D]$, which is simplified as $\phi_i$ hereafter) to reduce the impact of the stochasticity associated with training the different LSTMs. Figure 2b exemplifies the averaged IG scores corresponding to the sample shown in Fig. 2a, i.e., it shows the contribution of the three input variables to the selected annual maxima of river discharge. The warm or cool colors in the heatmap denote that the input variable at the particular time step has increased or decreased the network output, while white indicates little effect. Note that the averaged IG scores for an individual peak were computed by averaging the scores obtained from all 10 of the independent models, regardless of whether the peak was part of the training data or the testing data in the models. Overall, the IG scores extracted from the 10 models for each target peak discharge generally follow a similar pattern, though with inevitable differences due to randomness and uncertainties in training processes (see Figs. S1–S3 in the Supplement for examples). Note that using the IG scores based on the target peaks in testing datasets alone does not yield substantial impacts on our conclusion in subsequent analyses (see Figs. S4–S5).

In the following step, the sequences of averaged IG scores ($\phi_i$) can be clustered directly using time series clustering techniques based on their similar shapes, such as using the K-means method with the dynamic time warping algorithm (DTW) as the distance metric (Tavenard et al., 2020). However, the main drawback of clustering time series is the heavy computational burden. The DTW distance between any two samples has a quadratic time complexity with respect to the sequence length, which would make clustering long feature importance sequences a time-consuming process, and it would be especially challenging when dealing with tens of thousands of sequences (Salvador and Chan, 2007). Moreover, for this large-sample study that aims to understand flood mechanisms at a continental scale, it might not be necessary to distinguish the daily contributions of meteorological drivers in detail. Therefore, before carrying out the cluster analysis, we aggregated each sequence of averaged IG scores ($\phi_i$) by using a 7 d separating window, which generates a low-dimensional contribution vector with only six elements $\left[\sum_{i=1}^{7} \phi_i^P,\ \sum_{i=8}^{180} \phi_i^P,\ \sum_{i=1}^{7} \phi_i^T,\ \sum_{i=8}^{180} \phi_i^T,\ \sum_{i=1}^{7} \phi_i^D,\ \sum_{i=8}^{180} \phi_i^D\right]$. Here, $\sum_{i=1}^{7} \phi_i$ and $\sum_{i=8}^{180} \phi_i$ represent contributions of a variable in the recent 7 d and an earlier antecedent period, respectively. The separating-window size should cover the period of precipitation and snowmelt events leading to each peak discharge, which depends highly on the local characteristics. After examining the relationship between catchment area and mean event response time, Stein et al. (2020) suggested that a synoptic window of 7 d should be sufficient to guarantee the response time for large catchments. As a result, this study used a 7 d period, similar to the practice in most studies that examined flooding causes (e.g., Blöschl et al., 2017; Berghuijs et al., 2019). However, using a shorter period (e.g., 5 d) does not affect the conclusions about dominant flooding mechanisms and their trends (see discussion in Sect. 3.7). Figure 2b demonstrates the values of the aggregated feature contributions based on respective daily IG scores represented by the heatmap.
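The aggregation into a six-element contribution vector can be sketched in a few lines. The function name is hypothetical, and the sketch assumes that each score list is ordered from the most recent day (index 0, the day of the peak) to the earliest day.

```python
def aggregate_contributions(phi_p, phi_t, phi_d, recent_days=7):
    """Collapse daily IG scores into recent and antecedent sums per variable.

    phi_p, phi_t, phi_d -- averaged daily IG scores for precipitation,
    temperature, and day length, ordered from most recent to earliest day
    (an assumption of this sketch).
    Returns [recent P, antecedent P, recent T, antecedent T,
             recent D, antecedent D].
    """
    vector = []
    for phi in (phi_p, phi_t, phi_d):
        vector.append(sum(phi[:recent_days]))
        vector.append(sum(phi[recent_days:]))
    return vector


if __name__ == "__main__":
    phi_p = [1.0] * 180
    phi_t = [-0.5] * 180
    phi_d = [0.0] * 180
    print(aggregate_contributions(phi_p, phi_t, phi_d))
```

Because of the completeness property, the six elements still sum to the difference between the predicted peak and the baseline prediction, so no contribution is lost in the aggregation.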

To obtain an overall picture from the individual aggregated feature contributions, we used the K-means method to cluster the results for all annual maximum peak discharges pooled from all considered catchments. Considering that the feature importance values are correlated with the magnitude of the predicted peak discharge due to the completeness property, we normalized each accumulated vector by its Manhattan norm (i.e., dividing each element by the sum of the absolute values of all elements while keeping its sign) to make the contributions comparable across different floods. To determine the optimal cluster number for the K-means algorithm, we evaluated the cluster characteristics for candidate cluster numbers ranging from 2 to 8 using the silhouette coefficient (Rousseeuw, 1987), which reflects the separation distance between the resulting clusters. The silhouette coefficient for an individual sample is calculated as $(b - a)/\max(a, b)$, where $a$ represents the mean distance between the sample and all other points within the same cluster, and $b$ represents the mean distance between the sample and all points in the next nearest cluster. The average silhouette coefficient over all samples is an indicator of the goodness of a clustering result, which ranges from −1 to 1, with a higher score generally indicating a better cluster number choice.
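The normalization and the per-sample silhouette coefficient can be written out directly. This is a small illustrative sketch with hypothetical function names; it uses Euclidean distances and expects cluster labels from any clustering step (K-means itself is not reimplemented here).

```python
import math


def l1_normalize(vector):
    """Divide each element by the sum of absolute values, keeping signs."""
    total = sum(abs(v) for v in vector)
    return [v / total for v in vector] if total > 0 else list(vector)


def silhouette(sample_idx, points, labels):
    """Silhouette coefficient (b - a) / max(a, b) for one sample.

    Requires at least two clusters. A sample that is alone in its cluster
    gets 0.0, a common convention that is an assumption of this sketch.
    """
    own = labels[sample_idx]
    distances = {}
    for j, (point, label) in enumerate(zip(points, labels)):
        if j != sample_idx:
            d = math.dist(points[sample_idx], point)
            distances.setdefault(label, []).append(d)
    if own not in distances:
        return 0.0
    a = sum(distances[own]) / len(distances[own])
    b = min(sum(d) / len(d) for lab, d in distances.items() if lab != own)
    denom = max(a, b)
    return (b - a) / denom if denom > 0 else 0.0


if __name__ == "__main__":
    print(l1_normalize([2.0, -1.0, 1.0]))
    pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
    labs = [0, 0, 1, 1]
    print([round(silhouette(i, pts, labs), 3) for i in range(len(pts))])
```

Averaging the per-sample values over all samples gives the score that is compared across candidate cluster numbers.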

2.4 Trend analysis of flooding mechanisms

Based on the clustering results, we can identify the mechanism responsible for each annual maximum peak discharge and calculate the proportions of different flooding mechanisms at either the continental or catchment scale. The trend magnitude in these proportions was then analyzed by Theil–Sen's estimator, with the modified Mann–Kendall test (Hamed and Rao, 1998) being used to determine the significance of the trend. Specifically, at the continental scale, we estimated the overall trends of various flooding mechanisms based on their respective proportions within all the annual maximum peak discharges per year. At the catchment scale, to capture the variations of flooding mechanisms over different periods, we calculated the proportion series using a 20-year moving window in each catchment. The 20-year time frame was used to ensure an adequate sample size for reliably estimating the intra-period proportions and also to guarantee enough periods to observe decadal variability (Pagano and Garen, 2005). Only proportions that were calculated with at least 10 years of peak discharge data in each window were used to estimate the trend slope.
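A minimal sketch of the catchment-scale trend estimation is given below, assuming one mechanism label per year. The function names are hypothetical, the window handling is one possible reading of the description above, and the modified Mann–Kendall significance test is not implemented here.

```python
from statistics import median


def theil_sen_slope(t, y):
    """Median of all pairwise slopes (Theil-Sen estimator)."""
    slopes = [
        (y[j] - y[i]) / (t[j] - t[i])
        for i in range(len(t))
        for j in range(i + 1, len(t))
        if t[j] != t[i]
    ]
    return median(slopes)


def moving_window_proportion(years, labels, target, window=20, min_years=10):
    """Proportion of peaks with mechanism `target` in each moving window.

    years, labels -- year and mechanism label of each annual maximum peak
    Windows with fewer than `min_years` available peaks are skipped.
    Returns a list of (window start year, proportion).
    """
    by_year = dict(zip(years, labels))
    series = []
    for start in range(min(years), max(years) - window + 2):
        in_window = [by_year[y] for y in range(start, start + window) if y in by_year]
        if len(in_window) >= min_years:
            share = sum(1 for lab in in_window if lab == target) / len(in_window)
            series.append((start, share))
    return series


if __name__ == "__main__":
    years = list(range(1950, 1990))
    labels = ["snowmelt"] * 20 + ["recent precipitation"] * 20
    series = moving_window_proportion(years, labels, "snowmelt")
    starts = [s for s, _ in series]
    shares = [p for _, p in series]
    print(theil_sen_slope(starts, shares))
```

The median of pairwise slopes makes the estimate insensitive to single outlying windows, which is the usual reason for preferring it over a least-squares slope in this kind of analysis.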

Moreover, in order to analyze the possible causes of trends, we selected a number of regions where most catchments present consistent trends in certain mechanisms. We investigated those catchments exhibiting significant changes in flooding mechanisms and compared the temporal regional changes in flooding mechanisms with changes in potential flooding drivers. The time series of proportions in regions were calculated by applying the previously described 20-year moving window to peak discharge classifications for the considered catchments. The flooding drivers considered include annual maximum 7 d total precipitation, mean spring temperatures (January to April), and 30 d precipitation preceding the 7 d window of recent precipitation, which is a common proxy for soil moisture prior to flooding (e.g., Bertola et al., 2021). All the drivers were averaged across the catchments and then smoothed by using a 20-year moving average window as well.

Figure 3. (a) Nash–Sutcliffe efficiency (NSE) values in the testing period averaged over the 10-fold cross-validation. (b) The cumulative frequency of the averaged NSE values. (c) The distribution of the standard deviation values for the NSE values across the 10-fold cross-validation. The NSE values were calculated using all samples in respective testing datasets.

3 Results and discussion

3.1 Model predictive performance and interpretations

Before moving to the analysis of annual maximum peak discharges, we used the Nash–Sutcliffe efficiency (NSE) (Nash and Sutcliffe, 1970) to assess model accuracy in predicting discharges. The NSE value ranges from negative infinity to 1.0, and NSE > 0.5 is generally deemed satisfactory for discharge simulations (Moriasi et al., 2015). Based on the NSE value computed in the testing period for each model in the 10-fold cross-validation, we acquired the average and standard deviation of NSE values for each of the 1077 catchments, as shown in Fig. 3. The overall warm colors in the map (Fig. 3a) indicate that the model performed satisfactorily for most catchments, with the median of NSE averages reaching 0.74 (Fig. 3b). The standard deviations of NSE values (Fig. 3c) further indicate robust model performance in most cases. Accordingly, the models have effectively captured the generalizable predictive relationship between meteorological factors and discharges. As an accurate predictive relation is essential for deriving meaningful information from ML models (Murdoch et al., 2019), the subsequent analyses focus specifically on the 977 catchments (out of 1077; 91 %) with average NSE values above 0.5.
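For reference, the NSE compares the squared simulation errors with the variance of the observations around their mean. A minimal sketch, with a hypothetical function name:

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((o - s)^2) / sum((o - mean(o))^2)."""
    mean_obs = sum(observed) / len(observed)
    error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - error / variance


if __name__ == "__main__":
    obs = [1.0, 2.0, 3.0]
    print(nse(obs, obs))              # perfect simulation
    print(nse(obs, [2.0, 2.0, 2.0]))  # always predicting the observed mean
```

A value of 1 corresponds to a perfect simulation, a value of 0 to a model that is no better than always predicting the observed mean, and negative values to a model that is worse than that.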

A total of 53 968 annual maximum discharges were identified from the 977 catchments (20–70 peaks per catchment). By using the IG method, we can obtain 53 968 feature importance sequences averaged across the models from the 10-fold cross-validation. In the case shown in Fig. 2b, precipitation is the dominant driver behind the annual maximum peak discharge occurrence, showing consistently non-negative feature importance, with the precipitation peaks that occur closer to the target flood peak having a greater influence (see pronounced positive contributions in red). Nevertheless, the total contribution from antecedent precipitation is more important in predicting the peak compared with the contribution from recent precipitation, as indicated by the aggregated scores $\sum_{i=1}^{7} \phi_i^P$ and $\sum_{i=8}^{180} \phi_i^P$. The temperature, on the other hand, has an overall negative impact, which may be related to evapotranspiration that could decrease the discharge magnitude, while the influence of the day length is relatively negligible. Additionally, Fig. 4 illustrates two other typical cases of feature importance patterns, where the contributions from recent precipitation (i.e., $\sum_{i=1}^{7} \phi_i^P$) and temperature (i.e., $\sum_{i=1}^{7} \phi_i^T$), respectively, are dominant in predicting target peak discharges. The distinct patterns of predictor contribution to annual maximum peak discharge predictions suggest that these flood events were triggered by different mechanisms.

Figure 4. Additional examples to the case shown in Fig. 2, which illustrate the importance pattern of temperature, precipitation, and day length in predicting two discharge peaks from other catchments. (a) Recent precipitation contributes most to the discharge peak. (b) Recent temperature contributes most strongly to the discharge peak.


3.2 Flooding types revealed by cluster analysis

To separate the 53 968 annual maximum peak discharges into discrete groups characterized by distinct patterns of predictor contributions, we performed K-means clustering on the normalized contribution vectors. The results of the silhouette analysis suggest that clustering into three main groups would lead to the best clustering quality because it achieves the highest average silhouette coefficient, and silhouette coefficients for individual samples are reasonably distributed within each cluster (see Fig. A1 for more details). It should be noted that the clustering results here only reveal major patterns widespread in the data, with certain local and specific mechanisms unlikely to be detected.

Figure 5a–c show the distinct patterns of the three identified clusters, with cluster 1 featuring the high importance of recent temperature (Fig. 5a, a positive contribution in line with high temperature favoring snowmelt), cluster 2 featuring the dominant contributions from recent precipitation (Fig. 5b), and cluster 3 featuring the importance of antecedent precipitation events (Fig. 5c). Compared to cluster 1, clusters 2 and 3 show a generally negative effect of antecedent temperature, in line with drying favored by evapotranspiration. Moreover, annual maximum peak discharges in cluster 1 are characterized by higher contributions from day length (Fig. 5a) when compared to the other two clusters. The role of day length implies that the magnitude of these peak discharges can be partially explained by the seasonality presented by day length, which peaks around the June solstice. In contrast, the main differences between clusters 2 and 3 are due to the fractions of $\sum_{i=1}^{7} \phi_i^P$ and $\sum_{i=8}^{180} \phi_i^P$. Overall, clusters 1, 2, and 3 account for 15.5 %, 49.9 %, and 34.6 % of all the identified annual maximum peak discharges, respectively.

Figure 5. The cluster centroids and variance for the three clusters and their respective proportions of all peak discharge events in each catchment. The bars and error bars in (a)–(c) represent the cluster centroids and standard deviations of the six aggregated feature contributions. The proportions in (d)–(f) correspond to clusters 1–3, respectively.

Figure 5d–f illustrate the distributions in terms of the proportion of annual maximum peak discharges associated with each cluster within a catchment. Annual maximum peak discharges associated with high contributions from temperature (cluster 1) mainly occur in northern Europe and in mountainous regions such as the Alps (Fig. 5d), i.e., in regions with high snowfall fractions (Fig. 1d) where rising air temperature can lead to snowmelt. The spatial distribution together with the feature pattern shown in Fig. 5a indicates that these floods were probably driven by snowmelt events. In contrast, catchments within cluster 2, where recent precipitation played a decisive role in causing most floods (Fig. 5b), are primarily located in regions that have a west-facing or north-west-facing coast or mountain range, such as Ireland, Scotland, Wales, the Norwegian coast, north-west of the Iberian Peninsula, and the area extending from the Alps, the Massif Central, and the Pyrenees (Figs. 5e and 1a). These regions are characterized by a generally humid climate (Schiemann et al., 2018), as also indicated by Fig. 1c, and are strongly affected by the Northern Atlantic polar front and the associated storm tracks (Bengtsson et al., 2006) and/or by the presence of mountain barriers perpendicular to the prevailing flow direction, which force moist air to lift and condense (Isotta et al., 2014). Previous studies indicate that flooding in the regions could be largely explained by individual heavy-precipitation events (Gobiet et al., 2014; Whan et al., 2020; Blanchet and Creutin, 2017), some of which are associated with atmospheric rivers (Lavers and Villarini, 2013).

Catchments associated with cluster 3 are mostly located over the north European plain, southern Scandinavia, and parts of the British Isles (Fig. 5f). Here, information from antecedent precipitation has an overall higher weight than that from recent precipitation or other predictors (Fig. 5c), suggesting that recent precipitation alone would not suffice to explain annual maximum peak discharges. Therefore, flooding in these areas additionally relies heavily on antecedent precipitation that is stored in the form of soil moisture. For example, Nied et al. (2014) revealed that, in the Elbe River basin, some weather patterns only cause flooding in the case of preceding soil saturation. Also, Ledingham et al. (2019) found that, in southeast England, fewer than 15 % of daily flood events correspond to extreme precipitation events, lower than in the rest of Britain, which was attributed to the relevant contribution of soil moisture storage to flooding.

It should be noted that the three kinds of flooding mechanisms (i.e., snowmelt-driven, recent-precipitation-driven, and antecedent-precipitation-driven) identified from the cluster analysis using the optimal cluster number only indicate which features carry greater weights for peak discharge predictions, and they are not necessarily mutually exclusive. Particularly, the peak discharge events near the decision boundaries between the three clusters, such as those with similar Euclidean distances to at least two different “closest” centroids, are likely affected by two or more flooding processes simultaneously. For example, the events categorized as snowmelt-driven floods are probably impacted additionally by saturated soils or extreme precipitation, such as rain-on-snow events (Cohen et al., 2015). These events generally represent compound flood events that arise from several drivers occurring concurrently (Bevacqua et al., 2021; Zscheischler et al., 2018). Recently, compound events have received increasing attention (Zscheischler et al., 2020); however, this study will only focus on the main flooding types obtained from the clustering results, regardless of whether compound effects were involved.

Figure 6. The dominant flooding mechanisms and their relevance to catchment attributes and seasonality. Each dot in (b)–(e) represents one catchment. “Mixture” means that the associated catchments are dominated by two or more flooding mechanisms. For example, “mixture (r+s)” indicates that either recent precipitation (r) or snowmelt (s) is the primary cause of the annual maximum discharges for the associated catchments, and the difference between the two proportions is less than 70 %.

3.3 Dominant flooding mechanisms in Europe

The result of event-based flooding classification allows us to identify the dominant flooding mechanisms (among clusters 1–3, Fig. 5) for each catchment (Fig. 6a). A mechanism is considered dominant in a catchment if the proportion of annual maximum peak discharges attributed to it exceeds the largest proportion attributed to any other mechanism by more than 70 %. Otherwise, the catchment is regarded as being dominated by a mixture of flooding mechanisms. The mixture of mechanisms could be further classified into specific combinations based on which clusters were present in the catchment. Accordingly, for the catchments investigated in the study, 52.1 % were dominated by a mixture of mechanisms, while snowmelt, recent precipitation, and antecedent precipitation solely accounted for 10.1 %, 26.9 %, and 10.9 % of catchments, respectively. Among the mixtures of mechanisms, the combination of recent precipitation and antecedent precipitation accounted for 33.8 % of all the catchments, followed by the combination of all three mechanisms (15.8 %), the combination of recent precipitation and snowmelt (2.1 %), and the combination of antecedent precipitation and snowmelt (0.4 %).
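The catchment-level rule described above can be sketched as a small function. Two points are assumptions rather than details confirmed by the text: the 70 % margin is interpreted here as an absolute difference between proportions, and a mechanism is treated as "present" in a mixture when at least one event in the catchment was assigned to it. The mechanism names and the function `dominant_mechanism` are hypothetical.

```python
from collections import Counter

MECHANISMS = ("snowmelt", "recent", "antecedent")


def dominant_mechanism(event_labels, threshold=0.70):
    """Classify a catchment from the mechanism labels of its annual maxima."""
    if not event_labels:
        raise ValueError("at least one event is required")
    counts = Counter(event_labels)
    n = len(event_labels)
    shares = {m: counts.get(m, 0) / n for m in MECHANISMS}

    # Rank mechanisms by their share of events in this catchment.
    ranked = sorted(MECHANISMS, key=lambda m: shares[m], reverse=True)
    top, runner_up = ranked[0], ranked[1]

    # Assumed reading: margin is an absolute difference in proportions.
    if shares[top] - shares[runner_up] > threshold:
        return top

    # Assumed reading: a mechanism is "present" if it has any event.
    present = [m for m in MECHANISMS if shares[m] > 0]
    return "mixture (" + "+".join(present) + ")"


single = dominant_mechanism(["recent"] * 9 + ["antecedent"])
mixed = dominant_mechanism(["recent"] * 6 + ["antecedent"] * 4)
```

Under these assumptions, a catchment with 90 % recent-precipitation events is labelled as dominated by a single mechanism, whereas a 60/40 split is labelled as a mixture.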

It is worth noting again that the presence of a mixture of flooding mechanisms in a catchment only indicates that annual maximum discharges in the catchment are not uniformly caused by the same mechanism rather than signifying whether individual annual maximum peak discharge events are driven by multiple processes (i.e., compound events). Despite this, floods in catchments with a mixture of flooding mechanisms, in general, are more likely to be affected by two or more flooding processes, since the classification of floods in these catchments can be ambiguous (e.g., the events near the decision boundaries between clusters). For example, floods caused by both heavy precipitation and excessive soil moisture tend to present a high reliance on both recent precipitation and antecedent precipitation, which results in the catchment presenting a mixture of flooding mechanisms, depending on which feature importance is superior. Using 0.10 as a distance threshold to define events near the cluster decision boundaries (i.e., the difference between the distances from one point to its closest centroids and to its second-closest centroids is less than 0.10), 78.9 % of such events were found in catchments dominated by a mixture of mechanisms, whereas only 21.1 % were found in catchments dominated by single mechanisms.
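The boundary criterion can be illustrated with a minimal sketch: an event is flagged when the Euclidean distances from its contribution vector to the closest and second-closest centroids differ by less than 0.10. The function name and the toy two-dimensional centroids are hypothetical and serve only to show the arithmetic.

```python
from math import dist


def is_near_boundary(vector, centroids, threshold=0.10):
    """True if the two nearest centroids are almost equally distant."""
    distances = sorted(dist(vector, c) for c in centroids)
    return (distances[1] - distances[0]) < threshold


# Toy centroids in two dimensions (illustrative values only).
centroids = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]

ambiguous = is_near_boundary((0.5, 0.0), centroids)   # halfway between two centroids
clear_cut = is_near_boundary((0.1, 0.0), centroids)   # close to a single centroid
```

Counting the flagged events separately for single-mechanism and mixture catchments would then give shares comparable to those reported above.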

Table 1. Comparisons of flooding mechanisms in Europe identified by different methods.

Note: the summaries above were compiled from relevant figures or qualitative descriptions in the respective studies, and the subregions of Europe were not strictly defined. The definitions of various flooding mechanisms were not identical between the studies. * The catchment size range was not stated in the paper, and we calculated it from the original results provided by the authors.


In Fig. 6b–e, we further examine the relevance of dominant mechanisms to catchment physiographic and hydroclimatic characteristics demonstrated in Fig. 1. Unsurprisingly, snowmelt dominates flooding in regions with high snowfall fractions and obvious characteristics in latitude and altitude, where floods usually occur from May to July. The catchments dominated by antecedent precipitation are within plain terrains, where flooding occurs mainly during the winter and spring. Catchments with a gentle slope generally tend to have thicker soil, slower transmission, and therefore more potential to store antecedent precipitation (Hallema et al., 2016). In contrast, recent precipitation-dominated catchments have a broader spectrum of slopes and elevations and also experience summer floods. The distribution of catchment attributes from catchments dominated by a mixture of mechanisms is consistent with what we found based on catchments dominated by a single mechanism. For example, catchments dominated by snowmelt mixed with recent precipitation (purple in Fig. 6) or antecedent precipitation (yellow in Fig. 6) have relatively high snowfall fractions, with the former mainly occurring on areas with steep slopes (mainly in the Alps and Scandinavian mountains) and the latter mainly occurring on gentle slopes (such as parts of Finland). The catchments controlled by both recent and antecedent precipitation (light blue in Fig. 6) are located mostly in western Europe, suggesting that floods there were likely to be affected by the interaction between extreme precipitation and antecedent soil moisture, and their respective relative importance has varied between events. In addition, some catchments in the Alps, Germany, and Poland are impacted by all three mechanisms (slate gray in Fig. 6). In summary, these findings indicate that dominant flooding mechanisms differ substantially across catchments and are related to their geographic and climatic characteristics. 
In addition to elevation, slope, and snow fraction, the study by Stein et al. (2021) on catchments in the United States demonstrated that other catchment characteristics (e.g., aridity, precipitation seasonality, and mean precipitation) also significantly influence flood-generating processes. An in-depth investigation of how geographic and climatic characteristics affect flood mechanisms in European catchments can be expected in future studies.

3.4 Comparative analysis with other studies

A better understanding of the generating processes of river flooding is crucial for interpreting past flood changes and for improving future flood risk predictions. In recent years, large-scale quantitative investigations of flooding mechanisms specifically for Europe have been undertaken in several studies, with different methodologies and scales applied. For example, by using circular statistics analysis, Berghuijs et al. (2019) examined the relative importance of three flooding mechanisms based on the seasonality of floods and three potential drivers such as the largest daily precipitation, the largest daily soil moisture excess, and the largest daily snowmelt. Bertola et al. (2021) attributed changes in the magnitude of flood quantiles to changes in possible drivers by using regression analysis and determined their contributions to flood changes accordingly. In contrast to these analyses conducted at catchment or coarser levels, Kemter et al. (2020) and Stein et al. (2020) performed event-based classifications to determine flooding mechanisms in respective regions or catchments, both using predefined criteria but with different indicators and thresholds. Table 1 summarizes the main findings in these studies regarding the major flooding mechanisms per geographic subregion of Europe and compares them with those identified in this study.

As indicated in Table 1, despite the different definitions, methods, and standards in recognizing flooding mechanisms, the five studies present some consistency, especially in Northern Europe and the Alps, which are dominated by snowmelt or by snowmelt combined with extreme precipitation. Among the four previous studies, this study shows the largest consistency with Berghuijs et al. (2019), especially when it comes to the contribution of meteorological drivers to flood generation in individual catchments. However, Berghuijs et al. (2019) and Kemter et al. (2020) almost exclusively regarded floods in regions from northern France to northern Germany to be a consequence of soil moisture excess. In contrast, Bertola et al. (2021) and this study also included extreme precipitation as a crucial factor, and we have demonstrated that floods in those regions are driven by a combination of both heavy precipitation and saturated soil moisture.

In addition to methodological differences, the inconsistent catchment samples are also responsible for the divergent attribution results in different studies. As shown in Table 1, the catchments examined in this study are generally smaller and tend to be more susceptible to high-intensity rainfall. Moreover, discrepancies in the estimation of soil moisture might be an additional reason. In the absence of direct observations, soil moisture in the four previous studies was explicitly estimated by using simple water balance models (Berghuijs et al., 2019; Stein et al., 2020), reanalysis data (Kemter et al., 2020), and a proxy based on antecedent precipitation (Bertola et al., 2021). The uncertainty associated with soil moisture estimates may, however, make a difference in determining whether floods are triggered by extreme precipitation or soil moisture excess. Tarasova et al. (2020) conducted a rigorous uncertainty analysis of input data for a runoff event classification framework, emphasizing the importance of developing novel indicators to reduce these uncertainties. Here, profiting from the memory property of LSTM models, the present study identified flooding mechanisms based on long-term predictive relationships between precipitation, temperature, day length, and discharge. The method has reduced the need for accurate catchment wetness estimates, yet such uncertainty is not eliminated completely, particularly since we chose a 7 d window to distinguish between antecedent and recent precipitation. Compared to analyses at catchment or coarser levels, event-based investigations of flooding mechanisms have the advantage of allowing for the detection of stronger signals about their potential changes over time, since averaged information tends to obscure information about individual event processes and thus makes the trends imperceptible. For example, Berghuijs et al. (2019) found no discernible change in the relative importance of flood drivers for most regions in Europe, while some regional studies (e.g., Vormoor et al., 2016; Beniston and Stoffel, 2016) and event-based studies (e.g., Kemter et al., 2020) have indicated such changes.

3.5 Temporal evolutions of flooding mechanisms

To test whether the dominant mechanism has changed over the period 1950–2020, we first compared the catchment-level dominant mechanisms separately for 1950–1985 and 1985–2020 by applying the procedure implemented in Sect. 3.3. Only the 818 catchments with at least 15 years of records in each period were considered. Figure 7a summarizes the proportions of the single dominant mechanisms (represented by colorful blocks) and their combinations (represented by gray blocks) during each period along with shifts between them. The Sankey plot indicates that a majority of catchments (79.6 %) retain their dominant mechanisms and that there has not been a shift from one dominant mechanism to another (see the absence of data flow between two different blocks from left to right). However, some catchments with single mechanisms have become dominated by a mixture of mechanisms (i.e., flowing from colorful blocks to gray ones, which accounts for 7.2 % of the total), while some behave in the opposite way (7.3 %). In a few catchments with a mixture of mechanisms (5.9 %), the dominant mechanisms have also changed, though they remain mixed.

Figure 7. (a) Sankey plot indicating the proportions of single dominant flood-generating mechanisms and their combinations during two time periods, with the flow lines indicating shifts between them. The proportions were calculated based on the 818 catchments that have at least 15 years of records available in each period. (b) The evolution of the proportions of annual maximum peak discharges with the three flooding mechanisms. The shades denote the 95 % confidence interval of the proportions, which was calculated as $\hat{p} \pm 1.96 \times \sqrt{\hat{p}(1-\hat{p})/n}$ ($\hat{p}$ is the estimated proportion, and $n$ is the sample size). The dashed black lines indicate the slope of their trends estimated by Theil–Sen's estimator, with their significance being assessed by the modified Mann–Kendall test. (c–e) The spatial trends in different event-based flooding mechanisms, where the trends indicated by the colorful dots were calculated using a 20-year moving window. Markers with black edges denote catchments with significant trends (α=0.05). The black boxes highlight five hotspot regions that are discussed in the main text.
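The 95 % confidence interval of a proportion used for the shading in panel (b) follows the normal approximation, i.e., the estimated proportion plus or minus 1.96 times the square root of p(1 - p)/n. A minimal sketch of this arithmetic is given below; the function name `proportion_ci` and the example counts are hypothetical.

```python
from math import sqrt


def proportion_ci(successes, n, z=1.96):
    """Normal-approximation confidence interval for a proportion."""
    p_hat = successes / n
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# Example: 50 of 100 events assigned to one mechanism (illustrative counts).
lower, upper = proportion_ci(50, 100)
```

With these illustrative counts the half-width is 1.96 × 0.05 = 0.098, so the interval spans roughly 0.402 to 0.598.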

Although only a small fraction of catchments present a change in their dominant flooding mechanisms, Fig. 7b reveals tendencies for specific mechanisms at event levels when considering all annual maximum peak discharges in the 818 catchments over the past seven decades. In particular, the annual maximum peak discharges driven by snowmelt have been declining by 0.8 % per decade. In contrast, recent precipitation has become more dominant in causing floods, increasing by 1.1 % per decade, despite weaker significance that is probably due to the inconsistent changes from 2005 onward. Both frequency changes are probably associated with the warming atmosphere, which causes decreased snowpack (Fontrodona-Bach et al., 2018). Also, because of the rising temperatures, the atmosphere has a higher moisture-holding capacity, leading to an increase in precipitation extremes on average (Trenberth, 2011; Fischer and Knutti, 2016). These factors make it more likely that the annual floods are driven by recent precipitation and less frequently by snowmelt. Additionally, we observe an overall slight decrease in soil-moisture-excess-driven floods as a result of counterbalancing the other two trends, though the trend is not statistically significant when considering the entire period. The above conclusions hold when considering a smaller subset of catchments (460 in this case) with at least 25 years of records in each period.

Note that Fig. 7b only presents the overall trends in flooding mechanisms at the continental scale, while disparate trends that could cancel each other out may exist in different regions. Therefore, we further examined the trends in different event-based mechanisms in the 818 catchments (Fig. 7c–e), with the color representing the Theil–Sen slopes computed on the time series of respective proportions in individual catchments. The results indicate that most catchments in the Alps, which are typically dominated by snowmelt, have experienced significant decreases in snowmelt-driven floods, with similar cases having occurred in Scandinavia as well (Fig. 7c). In contrast, extreme precipitation has become a more frequent cause of annual maximum discharges in the Massif Central, the north European plain, and the Alps, while decreasing trends are observed in some regions of western Europe and, especially, southeast England (Fig. 7d). As for soil-moisture-induced floods, their proportion generally shows opposing trends relative to those of extreme precipitation (Fig. 7e).
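The catchment-level trend computation can be sketched in two steps: the proportion of events attributed to one mechanism is computed in a 20-year moving window, and the Theil–Sen slope (the median of all pairwise slopes) is then taken over the resulting time series. The function names, the placement of each value at the window centre, and the toy record are assumptions for illustration; the significance test (modified Mann–Kendall) is not reproduced here.

```python
from itertools import combinations
from statistics import median


def moving_proportion(years, labels, mechanism, window=20):
    """Share of events of one mechanism in each sliding window of years."""
    first, last = min(years), max(years)
    centres, shares = [], []
    for start in range(first, last - window + 2):
        in_window = [
            lab for yr, lab in zip(years, labels)
            if start <= yr < start + window
        ]
        if in_window:
            centres.append(start + (window - 1) / 2)  # window centre
            shares.append(in_window.count(mechanism) / len(in_window))
    return centres, shares


def theil_sen_slope(x, y):
    """Median of the slopes between all pairs of points."""
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]
    ]
    return median(slopes)


# Toy record: snowmelt-driven maxima until 1969, rain-driven afterwards.
years = list(range(1950, 1990))
labels = ["snowmelt"] * 20 + ["recent"] * 20
centres, shares = moving_proportion(years, labels, "snowmelt")
slope_per_year = theil_sen_slope(centres, shares)
slope_per_decade = 10 * slope_per_year
```

Because the estimator uses the median of pairwise slopes, a single anomalous window has little influence on the estimated trend.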

The decreasing trend in snowmelt-driven floods was also detected by Kemter et al. (2020), with a decrease of 1.65 % per decade, mainly occurring in eastern Europe, which was outside of our study area. In addition, they detected an increase in stratiform rainfall-driven floods (0.49 % per decade), mainly along the Mediterranean coast, and an increase in soil-moisture-excess-driven floods (1.55 % per decade) in the British Isles and central and northern Europe. The difference between Kemter et al. (2020) and this study probably arises from the varying study areas (the former additionally includes a large number of eastern and southern European catchments), as well as the definition of flood types. For example, their study defined soil-moisture-excess-driven floods as non-snowmelt floods when the mean soil water content was above 70 % before a time window, and the remainder were defined as stratiform rainfall-driven floods. In contrast, this study used cluster analysis for the actual contributions of precipitation events before floods, and soil-moisture-induced floods were related to annual maximum peak discharges where the contribution from antecedent precipitation is more important than recent precipitation.

3.6 Possible causes and implications of the trends

To gain insights into the causes of the identified trends, we analyze five selected regions, highlighted in Fig. 7c–e (see region numbers in Fig. 7c), which feature consistent trends in certain mechanisms. For region 1 (the Alps) and region 3 (northeast Scandinavia), catchments with significant decreasing trends in snowmelt-driven events were considered. For region 2 (southeast France) and region 4 (northern Germany), we considered catchments with significant increasing trends in extreme precipitation-driven events, as well as those presenting significant decreases for region 5 (southeast England). Figure 8 shows the temporal regional evolution of the event-level mechanisms within the considered catchments along with the change in magnitude of the annual maximum 7 d precipitation and mean spring temperatures over the past 70 years. For the two regions with significant soil moisture effect on flooding (i.e., regions 4 and 5), we additionally added the averaged trends of antecedent soil moisture conditions prior to flooding for analysis.

Figure 8. The temporal changes of the event-level mechanisms in relevant catchments within the five selected regions (see Fig. 7c), as well as the changes in average extreme precipitation (represented by annual maximum 7 d total precipitation), mean spring temperatures (represented by average temperature between January and April), and antecedent soil moisture conditions prior to flooding (represented by the 30 d total precipitation preceding the 7 d window of recent precipitation). The numbers in panel titles indicate the number of catchments considered. The proportions were calculated by a 20-year moving window, while precipitation and temperature were smoothed using a 20-year moving average window, with their values at central positions in time windows. The dashed gray lines indicate the slope of relevant trends with their significance.


Mean spring temperatures have increased significantly in all five regions (Fig. 8), confirming the previous explanations for the reduced influence of snowmelt on river discharge annual maxima in snowy areas (regions 1–3) (Vormoor et al., 2016; Beniston and Stoffel, 2016). Furthermore, in regions 1–4, the increased magnitude of maximum 7 d precipitation can explain the rise in proportions of annual maximum peak discharges driven by extreme precipitation events. In contrast, the maximum 7 d precipitation in southeast England (region 5) remained almost unchanged (Fig. 8e). Nonetheless, soil moisture before annual maximum discharges might have increased in southeast England, as indicated by the increasing antecedent precipitation accumulations, which makes annual maximum discharges more likely to be driven by soil moisture excess than by recent precipitation. Blöschl et al. (2017) stated that the region has a large subsurface water storage capacity, which is capable of storing a large amount of water that continuously increases until flooding occurs. In comparison, in northern Germany (region 4), the antecedent precipitation before annual maximum peak discharges changed only slightly (Fig. 8d), while the increase in precipitation extremes likely caused an increase in floods driven by recent heavy precipitation. Note that, here, we merely examined the monotonic trends within data over the 70 years, while the trends may vary piecewise (e.g., the changes in maximum weekly precipitation in the Alps and southeast France), the impact of which on flooding mechanisms deserves further research. These figures are robust against spatial variability within regions (see Fig. S6).

A change in flooding mechanisms may affect the seasonality and magnitude of flooding, which might ultimately impair the current flood risk management measures. For example, in catchments previously dominated by snowmelt, increasing floods from extreme precipitation and soil moisture excess may lead to shifted flood mean dates and less concentrated seasonal patterns (as exemplified in Fig. B1). By simulating daily discharge for a reference period (1961–1990) and a future period (2071–2099), Vormoor et al. (2015) predicted that floods in some Nordic catchments could even shift from spring to autumn as rain replaced snowmelt as the dominant flood-inducing process. These results suggest that, in a warmer climate, flood risk predictions in snowmelt-affected catchments should consider the interconnection between changes in flooding drivers and seasonality.

As for the impact on flooding magnitude, while it is challenging to link observed changes in individual flooding drivers alone to changes in flooding magnitudes, a link may appear, especially in light of climate change (Blöschl et al., 2019). For example, the catchments where floods are dominated by recent precipitation tend to be more susceptible to changes in extreme 7 d precipitation (Fig. B2). Despite a lack of sufficient observational evidence that the magnitude of floods increases with more extreme precipitation (Sharma et al., 2018), the trend of which is often determined jointly by both changes in rainfall and changes in antecedent soil moisture, some studies demonstrated that changed precipitation severity could alter the relationship between precipitation and streamflow (Bennett et al., 2018). When recent rainfall increases, changes in antecedent moisture conditions would become less important in modulating the response to rainfall (Wasko and Nathan, 2019). Brunner et al. (2021) indicated that it is possible to identify a catchment-specific extremeness threshold, above which precipitation increases clearly produce greater flood magnitudes, and below which flood magnitude is strongly modulated by soil moisture. Therefore, the persistent risk that recent extreme precipitation would have an increasingly decisive role in flood generation for a large proportion of catchments, as implied by Fig. 7, cannot be disregarded. Recognizing the impact of such shifts in flooding mechanisms is crucial for understanding the link between changes in precipitation and flood risk in a warming climate.

3.7 Limitations and outlooks

In this study, we trained LSTM models in a local fashion (i.e., training the model individually for each catchment) rather than a regional fashion (training a single model across multiple catchments), since the main objective of the study is to identify distinguishable patterns of meteorological variables' contributions at local scales. From a prediction standpoint, particularly for unprecedented events and ungauged basins (Nearing et al., 2021; Frame et al., 2022), regional modeling may be a better choice because it is capable of learning more general relationships from a larger variety of hydrological data (Kratzert et al., 2019b). However, for the regional modeling, both meteorological time series and static catchment attributes are used as inputs to distinguish response behaviors across time and space. Adding such static attributes would introduce substantial multicollinearities among the considered variables (see Fig. S7 for illustration). Multicollinearity might not be a problem for ML models when they are used for prediction, as long as the collinearity between variables remains stationary (Dormann et al., 2013). Nevertheless, for our study that aims to interpret the effects of predictors on responses, high multicollinearity in predictors indicates that considerable information may be shared among the collinear sets. This would result in difficulties in separating the physical effects of these variables – this is also the case in traditional regression models (Hartono et al., 2020). Therefore, interpreting flooding mechanisms with regional LSTM models may become more challenging than with local LSTM models that use only meteorological time series, since some catchment attributes would confound the interpretation. In this study, we therefore employed simple local models, which avoids confounding and multicollinearity resulting from static catchment attributes. 
However, in light of the benefit of regional modeling, which can provide insights into how flooding mechanisms vary spatially by geographic and climatic characteristics of catchments, how to deal with these challenges in the interpretation merits more exploration in future studies. An immediate question to address is whether adopting different modeling strategies will result in different interpretations regarding the gradient contributions of meteorological forcings, which ultimately leads to alternative understandings of flooding mechanisms. The emerging differences may provide us with an opportunity to gain new insights into flooding mechanisms from these models.

Multicollinearity also exists among meteorological drivers at daily scales, which requires careful handling of the interpretation results if more predictors are added. For example, radiation is usually an important driver of snowmelt that favors flooding (Merz and Blöschl, 2003), but the interpretation method might not assign it high importance when it is combined with day length as an additional predictor due to the high correlation between the two variables (see Fig. S8 for an example). This is because the interpretation technique used does not measure how important a feature is in the real world but rather how important it is to the model. Therefore, it is not necessarily better to add more input features to a model in terms of process understanding, which can even be misleading if the interpretation results are not justified by sufficient physical knowledge (Kroll and Song, 2013). In this study, instead of using more predictors that result in less interpretability, we restricted ourselves to a few input features whose effect can be relatively easily interpreted and understood. Therefore, we only selected daily precipitation, temperature, and day length as meteorological inputs, the combination of which results in uncovering three well-known flooding mechanisms. The results are physically interpretable and comparable with findings from other studies that used classical methods. Incorporating more meteorological drivers into the model might, in theory, allow for the identification of additional flooding mechanisms that may be overlooked. However, multicollinearity and confounding can pose a challenge to interpretability, especially when the recognized patterns cannot be linked to fundamental physical processes. Therefore, we leave how to resolve the trade-off as an open question for future studies.

In the clustering procedure, we chose to use a 7 d window to aggregate the daily IG scores into a low-dimensional contribution vector for the sake of efficiency in clustering lengthy time series, which could induce inevitable uncertainties and subjectivity. Despite this, additional tests indicate that our findings are similar when using a 5 d window, which is also a common interval to consider flooding drivers (e.g., Rottler et al., 2021). Specifically, based on the 5 d window, the events identified with snowmelt, recent precipitation, or antecedent precipitation as the primary causes account for 15.3 %, 48.9 %, and 35.8 % of all the 53 968 annual maximum peak discharges (Fig. S9), which is only slightly different from using a 7 d window. As for the three mechanisms in individual catchments, decreasing the window length has the least impact on identifying snowmelt-driven floods, with the absolute changes in their proportions being within 1 % for 81.5 % of catchments and within 5 % for 97.2 % of catchments. In comparison, the proportion changes for two other flooding types are more sensitive, with changes within 5 % for 76.6 % (78.0 %) of catchments in terms of recent (antecedent) precipitation-driven flooding. However, this does not affect the conclusion regarding the respective trends in flooding mechanisms (Fig. S10), indicating the robustness of the methodology. Despite this sensitivity analysis, we would like to emphasize that the selection of the separating window remains somewhat subjective, and further exploration is needed to avoid a possible bias due to arbitrary judgments in identifying flooding mechanisms.
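The catchment-level sensitivity figures quoted above amount to counting how many catchments show an absolute change in a mechanism's proportion below a given tolerance when the window length is changed. The sketch below shows that bookkeeping under the assumption that "within 1 %" and "within 5 %" refer to absolute differences in proportions; the function name and the toy proportions are hypothetical.

```python
def share_within(props_a, props_b, tolerance):
    """Fraction of catchments whose proportion changes by at most `tolerance`."""
    diffs = [abs(a - b) for a, b in zip(props_a, props_b)]
    return sum(d <= tolerance for d in diffs) / len(diffs)


# Toy proportions of one flooding type in four catchments (illustrative only).
props_7d = [0.10, 0.50, 0.30, 0.80]    # classification with a 7 d window
props_5d = [0.105, 0.53, 0.30, 0.90]   # classification with a 5 d window

within_1pct = share_within(props_7d, props_5d, 0.01)
within_5pct = share_within(props_7d, props_5d, 0.05)
```

In this toy case half of the catchments change by at most 0.01 and three quarters by at most 0.05.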

4 Conclusions

Flooding in rivers is usually caused by complex interactions between heavy precipitation, high soil moisture, and melting snow. Climate change has resulted in an overall decreased snowpack and more intense short-term precipitation extremes, which might systematically alter the interaction between flood drivers at the catchment level. To investigate whether flooding mechanisms have changed in European catchments, this study introduced a novel explainable ML method to identify flooding mechanisms. Compared with conventional classification approaches, where the results usually depend on appropriate flood process definitions and are sensitive to the selected indicators and threshold parameters, the combination of explainable ML and cluster analysis avoids such predefinitions and reduces subjectivity in the identification process. With the ML-captured feature importance of precipitation, temperature, and day length for predicting annual maximum discharges, we aggregated driver contributions in the recent 7 d and an earlier period (back to 180 d) and then applied cluster analysis to group them based on similar patterns. As a result, the method identifies three major patterns that induce floods across 977 European catchments, corresponding to three typical flooding mechanisms: recent precipitation (responsible for 49.9 % of the annual maximum discharge events), antecedent precipitation (i.e., excessive soil moisture, accounting for 34.6 %), and snowmelt (15.5 %). The results indicate that, for 26.9 % of catchments, recent precipitation is the typical main contributor to floods, while floods are typically controlled by antecedent precipitation (linked to excessive soil moisture) in 10.9 % of catchments. In around one-third (33.8 %) of catchments, floods are dominated by a combination of recent heavy precipitation and antecedent precipitation, meaning that some floods there were caused by recent rains and others were primarily driven by antecedent precipitation, although many of them were likely due to the compound effect of the two drivers. The remaining catchments are dominated by snowmelt (10.1 %) or by combinations of snowmelt with the other two drivers. The spatial distribution of the dominant flooding mechanisms reflects the variation in the catchments' geographic and climatic characteristics and is generally consistent with results reported in earlier studies, some of which did not perform event-based classifications but rather identified the overall mechanisms within individual catchments.

We further detected changes in dominant flooding mechanisms over the last 70 years in over 20.4 % of European catchments; in particular, some catchments that were previously dominated by single mechanisms became dominated by a mixture of mechanisms, and some catchments show the opposite shift. Although no catchment shifted from one single flooding mechanism to another, trends in the mechanisms were found at the event level. Specifically, when taking all annual maximum discharge events into account, those triggered by snowmelt have significantly decreased, with their proportion dropping by 0.8 % per decade. Recent 7 d precipitation, on the other hand, has become increasingly important for flooding, with flooding triggered by such recent heavy precipitation increasing by 1.1 % per decade. The changes in flooding mechanisms are largely consistent with the expected climate change responses, and we discuss the potential risks associated with the resulting effects on flooding seasonality and magnitude.
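
To make the notion of a per-decade change in the proportion of a flood type concrete, the following sketch estimates a robust linear trend with the Theil–Sen estimator (the median of all pairwise slopes). This estimator is an assumption chosen for illustration, since this section does not state how the reported rates were derived, and the yearly proportions used in the example are synthetic.

```python
from statistics import median


def theil_sen_slope_per_decade(years, values):
    """Median of all pairwise slopes, scaled from per-year to per-decade."""
    if len(years) != len(values):
        raise ValueError("years and values must have the same length")
    slopes = [
        (values[j] - values[i]) / (years[j] - years[i])
        for i in range(len(years))
        for j in range(i + 1, len(years))
        if years[j] != years[i]
    ]
    if not slopes:
        raise ValueError("need at least two distinct years")
    return 10 * median(slopes)


# Synthetic yearly proportions (in %) of one flood type, declining linearly
years = list(range(1950, 2021))
proportions = [20.0 - 0.08 * (year - 1950) for year in years]
trend = theil_sen_slope_per_decade(years, proportions)  # in % per decade
```

The estimate says nothing about statistical significance; a trend test that accounts for autocorrelation, such as the modified Mann–Kendall test of Hamed and Rao (1998), would be needed for that.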

Overall, this study highlights the usability of explainable ML in helping uncover complex and possibly non-linear changes in weather and climate extreme events in the warming Earth system. With more large-sample hydrometeorological datasets becoming readily accessible, one next step is to extend the research to a larger scale for a better understanding of variations in flooding mechanisms globally. Still, many challenges remain for future work, providing potential research opportunities. For example, the clustering procedure can be improved by developing algorithms to aggregate daily feature importance adaptively, thereby avoiding the predefined separation window while maintaining high efficiency. Moreover, regional LSTM models that incorporate static catchment attributes can be employed to capture the spatial variations in flooding mechanisms and to quantify the influence of catchments' geographical and climatic conditions on flooding processes. In addition to the integrated gradient method used in this study, other interpretation techniques might be explored further to uncover potentially valuable information when more input variables are included.

Appendix A

Figure A1. Determination of the optimal cluster number. (a) The average silhouette coefficients and total within-cluster sum of squares assessed for respective candidate cluster numbers. (b) The silhouette plots for various clusters when the cluster number is 2 or 3, where the x axis represents the silhouette coefficient for individual samples, which are ordered by the coefficients and grouped by clusters on the y axis. Panel (a) suggests that clustering the samples into either two or three groups can achieve similarly high average silhouette coefficients, while the silhouette plots for individual samples under the two candidate numbers in (b) further suggest that clustering into three groups would be the best choice because a cluster with all below-average silhouette coefficients is present when clustering into two groups. Therefore, we cluster annual maximum peak discharges into three main groups in the main text.
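
For readers unfamiliar with the silhouette coefficient (Rousseeuw, 1987), the sketch below computes it per sample for a given labeling, using Euclidean distance. It is a plain re-implementation for illustration with hypothetical toy points; it is not the code used to produce the figure.

```python
import math


def silhouette_scores(points, labels):
    """Return the silhouette coefficient s = (b - a) / max(a, b) per sample."""
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:  # singleton cluster: s is set to 0 by convention
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the sample's own cluster
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


# Toy example: two well-separated pairs of points
pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_scores(pts, [0, 0, 1, 1])  # labels follow the separation
bad = silhouette_scores(pts, [0, 1, 0, 1])   # labels cut across it
```

Averaging the scores over all samples gives the quantity compared across candidate cluster numbers in (a), while inspecting the per-sample scores within each cluster corresponds to the diagnostic in (b).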


Appendix B

Figure B1. (a) Change in mean flood dates (1985–2020 minus 1950–1985) in the 44 catchments with a significant reduction of snowmelt-driven floods in the Alps (region 1 in Fig. 7c), shown for snowmelt-driven floods and for all floods irrespective of their cause. For these catchments, the overall proportion of annual maximum discharges caused by snowmelt has decreased from 49.0 % in 1950–1985 to 36.8 % in 1985–2020. (b) The differences in mean resultant length of flood dates for the same cases as in (a). The mean resultant length is a measure in circular statistics between 0 and 1 that reflects the spread of a circular variable, with 0 indicating flood dates evenly distributed over the year and 1 indicating flood dates concentrated on a single day. It can be deduced from (a) that, following the temperature increase, snowmelt-driven floods generally occurred earlier in the year during 1985–2020 compared to 1950–1985, with a median shift of −5.9 d. On the other hand, annual peak discharges occurred later in more than half of the catchments due to the increasing presence of other types of floods. Furthermore, (b) shows that the seasonality of annual maximum discharges has become more diffuse (decreasing mean resultant length) in most catchments for the same reason, though that of snowmelt-driven floods remains relatively stable.
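
The two circular statistics used in this figure can be sketched as follows. Each flood date is mapped to an angle on the unit circle; the direction of the averaged unit vectors gives the mean date, and their length gives the mean resultant length. The year length of 365.25 d and the function name are assumptions made for illustration.

```python
import math


def circular_flood_stats(days_of_year, year_length=365.25):
    """Return (mean date in days, mean resultant length) of flood dates."""
    if not days_of_year:
        raise ValueError("need at least one flood date")
    angles = [2 * math.pi * day / year_length for day in days_of_year]
    c = sum(math.cos(a) for a in angles) / len(angles)
    s = sum(math.sin(a) for a in angles) / len(angles)
    resultant_length = math.hypot(c, s)              # between 0 and 1
    mean_angle = math.atan2(s, c) % (2 * math.pi)    # wrapped to [0, 2*pi)
    mean_day = mean_angle * year_length / (2 * math.pi)
    return mean_day, resultant_length


# Floods always on the same day: fully concentrated seasonality
same_day = circular_flood_stats([100, 100, 100, 100, 100])
# Floods spread evenly over the year: no preferred season
even = circular_flood_stats([0.0, 91.3125, 182.625, 273.9375])
```

When the mean resultant length is close to 0, the mean date carries little information because the dates have no preferred direction.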


Figure B2. The distribution of Spearman's correlations between annual maximum discharge and annual maximum 7 d precipitation for two groups of catchments (blue, recent-precipitation-dominated catchments; green, antecedent-precipitation-dominated catchments, based on Fig. 6a). It shows that the catchments where floods are dominated by recent precipitation tend to have higher correlations than antecedent-precipitation-dominated catchments, which implies that the former might be more susceptible to changes in extreme 7 d precipitation.
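
As a reminder of what Spearman's correlation measures, the sketch below computes it as the Pearson correlation of ranks, with tied values receiving their average rank. The function names and the example series are hypothetical; in the setting of this figure, one series would hold the annual maximum discharges of a catchment and the other the annual maximum 7 d precipitation.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation between two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Any strictly monotonic relationship yields a correlation of +1 or -1
rho_up = spearman([1, 2, 3, 4, 5], [1, 4, 9, 16, 25])
rho_down = spearman([1, 2, 3], [3, 2, 1])
```

Because only ranks enter the calculation, the measure captures monotonic but non-linear relationships between precipitation and discharge.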


Code and data availability

The river discharge data can be obtained from the GRDC dataset (Federal Institute of Hydrology, 2022). The E-OBS gridded precipitation and temperature dataset is available from ECA & D (ECA & D, 2022; Haylock et al., 2008). Catchment attributes and boundaries are available from Do et al. (2018b) and from the Federal Institute of Hydrology (2011) and Lehner (2012). The 30 arcsec elevation data shown in Fig. 1a are accessible from EROS (2018). The code for the explainable machine learning framework is available from Jiang (2022).


The supplement related to this article is available online.

Author contributions

SJ and JZ conceived the study. SJ performed all analyses and wrote the initial draft. All authors substantially contributed to the final draft.

Competing interests

The contact author has declared that none of the authors has any competing interests.


Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Acknowledgements

We would like to acknowledge the use of the computing resources provided by Yi Zheng, and we thank Peter Miersch for proofreading the manuscript.

Financial support

The authors acknowledge the European COST Action DAMOCLES (grant no. CA17109) and the Helmholtz Initiative and Networking Fund (Young Investigator Group COMPOUNDX; grant no. VH-NG-1537). This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant no. 101003469 (XAIDA).

The article processing charges for this open-access publication were covered by the Helmholtz Centre for Environmental Research – UFZ.

Review statement

This paper was edited by Manuela Irene Brunner and reviewed by Frederik Kratzert, Larisa Tarasova, and one anonymous referee.


References

Alfieri, L., Burek, P., Feyen, L., and Forzieri, G.: Global warming increases the frequency of river floods in Europe, Hydrol. Earth Syst. Sci., 19, 2247–2260, 2015.

Alfieri, L., Bisselink, B., Dottori, F., Naumann, G., de Roo, A., Salamon, P., Wyser, K., and Feyen, L.: Global projections of river flood risk in a warmer world, Earth's Future, 5, 171–182, 2017.

Barnes, E. A., Toms, B., Hurrell, J. W., Ebert-Uphoff, I., Anderson, C., and Anderson, D.: Indicator patterns of forced change learned by an artificial neural network, J. Adv. Model. Earth Syst., 12, e2020MS002195, 2020.

Bengtsson, L., Hodges, K. I., and Roeckner, E.: Storm tracks and climate change, J. Climate, 19, 3518–3543, 2006.

Beniston, M. and Stoffel, M.: Rain-on-snow events, floods and climate change in the Alps: Events may increase with warming up to 4 degrees C and decrease thereafter, Sci. Total Environ., 571, 228–236, 2016.

Bennett, B., Leonard, M., Deng, Y., and Westra, S.: An empirical investigation into the effect of antecedent precipitation on flood volume, J. Hydrol., 567, 435–445, 2018.

Berghuijs, W. R., Woods, R. A., Hutton, C. J., and Sivapalan, M.: Dominant flood generating mechanisms across the United States, Geophys. Res. Lett., 43, 4382–4390, https://doi.org/10.1002/2016gl068070, 2016.

Berghuijs, W. R., Harrigan, S., Molnar, P., Slater, L. J., and Kirchner, J. W.: The relative importance of different flood-generating mechanisms across Europe, Water Resour. Res., 55, 4582–4593, 2019.

Bertola, M., Viglione, A., Lun, D., Hall, J., and Blöschl, G.: Flood trends in Europe: are changes in small and big floods different?, Hydrol. Earth Syst. Sci., 24, 1805–1822, 2020.

Bertola, M., Viglione, A., Vorogushyn, S., Lun, D., Merz, B., and Blöschl, G.: Do small and large floods have the same drivers of change? A regional attribution analysis in Europe, Hydrol. Earth Syst. Sci., 25, 1347–1364, 2021.

Bevacqua, E., De Michele, C., Manning, C., Couasnon, A., Ribeiro, A. F. S., Ramos, A. M., Vignotto, E., Bastos, A., Blesic, S., Durante, F., Hillier, J., Oliveira, S. C., Pinto, J. G., Ragno, E., Rivoire, P., Saunders, K., van der Wiel, K., Wu, W. Y., Zhang, T. Y., and Zscheischler, J.: Guidelines for studying diverse types of compound weather and climate events, Earth's Future, 9, e2021EF002340, 2021.

Blanchet, J. and Creutin, J. D.: Co-occurrence of extreme daily rainfall in the French mediterranean region, Water Resour. Res., 53, 9330–9349, 2017.

Blöschl, G., Hall, J., Parajka, J., Perdigao, R. A. P., Merz, B., Arheimer, B., Aronica, G. T., Bilibashi, A., Bonacci, O., Borga, M., Canjevac, I., Castellarin, A., Chirico, G. B., Claps, P., Fiala, K., Frolova, N., Gorbachova, L., Gul, A., Hannaford, J., Harrigan, S., Kireeva, M., Kiss, A., Kjeldsen, T. R., Kohnova, S., Koskela, J. J., Ledvinka, O., Macdonald, N., Mavrova-Guirguinova, M., Mediero, L., Merz, R., Molnar, P., Montanari, A., Murphy, C., Osuch, M., Ovcharuk, V., Radevski, I., Rogger, M., Salinas, J. L., Sauquet, E., Sraj, M., Szolgay, J., Viglione, A., Volpi, E., Wilson, D., Zaimi, K., and Zivkovic, N.: Changing climate shifts timing of European floods, Science, 357, 588–590, 2017.

Blöschl, G., Hall, J., Viglione, A., Perdigao, R. A. P., Parajka, J., Merz, B., Lun, D., Arheimer, B., Aronica, G. T., Bilibashi, A., Bohac, M., Bonacci, O., Borga, M., Canjevac, I., Castellarin, A., Chirico, G. B., Claps, P., Frolova, N., Ganora, D., Gorbachova, L., Gul, A., Hannaford, J., Harrigan, S., Kireeva, M., Kiss, A., Kjeldsen, T. R., Kohnova, S., Koskela, J. J., Ledvinka, O., Macdonald, N., Mavrova-Guirguinova, M., Mediero, L., Merz, R., Molnar, P., Montanari, A., Murphy, C., Osuch, M., Ovcharuk, V., Radevski, I., Salinas, J. L., Sauquet, E., Sraj, M., Szolgay, J., Volpi, E., Wilson, D., Zaimi, K., and Zivkovic, N.: Changing climate both increases and decreases European river floods, Nature, 573, 108–111, 2019.

Brunner, M. I., Swain, D. L., Wood, R. R., Willkofer, F., Done, J. M., Gilleland, E., and Ludwig, R.: An extremeness threshold determines the regional response of floods to changes in rainfall extremes, Commun. Earth Environ., 2, 173, 2021.

Cohen, J., Ye, H. C., and Jones, J.: Trends and variability in rain-on-snow events, Geophys. Res. Lett., 42, 7115–7122, 2015.

Davenport, F. V., Herrera-Estrada, J. E., Burke, M., and Diffenbaugh, N. S.: Flood size increases nonlinearly across the western United States in response to lower snow-precipitation ratios, Water Resour. Res., 56, e2019WR025571, 2020.

Do, H. X., Gudmundsson, L., Leonard, M., and Westra, S.: The Global Streamflow Indices and Metadata Archive (GSIM) – Part 1: The production of a daily streamflow archive and metadata, Earth Syst. Sci. Data, 10, 765–785, 2018a.

Do, H. X., Gudmundsson, L., Leonard, M., and Westra, S.: The Global Streamflow Indices and Metadata Archive – Part 1: Station catalog and Catchment boundary, PANGAEA [data set], 2018b.

Dormann, C. F., Elith, J., Bacher, S., Buchmann, C., Carl, G., Carré, G., Marquéz, J. R. G., Gruber, B., Lafourcade, B., Leitão, P. J., Münkemüller, T., McClean, C., Osborne, P. E., Reineking, B., Schröder, B., Skidmore, A. K., Zurell, D., and Lautenbach, S.: Collinearity: a review of methods to deal with it and a simulation study evaluating their performance, Ecography, 36, 27–46, 2013.

ECA & D: E-OBS gridded dataset, ECA & D [data set] (last access: 1 November 2021), 2022.

EROS: Digital Elevation – Global 30 Arc-Second Elevation (GTOPO30), USGS [data set], 2018.

Federal Institute of Hydrology: Watershed Boundaries of GRDC Stations, Global Runoff Data Centre [data set] (last access: 1 November 2021), 2011.

Federal Institute of Hydrology: Global Runoff Database, Global Runoff Data Centre [data set] (last access: 1 November 2021), 2022.

Fischer, E. M. and Knutti, R.: Observed heavy precipitation increase confirms theory and early models, Nat. Clim. Change, 6, 986–991, 2016.

Fontrodona-Bach, A., van der Schrier, G., Melsen, L. A., Tank, A., and Teuling, A. J.: Widespread and accelerated decrease of observed mean and extreme snow depth over Europe, Geophys. Res. Lett., 45, 12312–12319, 2018.

Forsythe, W. C., Rykiel, E. J., Stahl, R. S., Wu, H. I., and Schoolfield, R. M.: A model comparison for daylength as a function of latitude and day of year, Ecol. Model., 80, 87–95, https://doi.org/10.1016/0304-3800(94)00034-f, 1995.

Frame, J. M., Kratzert, F., Klotz, D., Gauch, M., Shalev, G., Gilon, O., Qualls, L. M., Gupta, H. V., and Nearing, G. S.: Deep learning rainfall–runoff predictions of extreme events, Hydrol. Earth Syst. Sci., 26, 3377–3392, 2022.

Gers, F. A., Schmidhuber, J., and Cummins, F.: Learning to forget: continual prediction with LSTM, in: 1999 Ninth International Conference on Artificial Neural Networks ICANN 99, Conf. Publ. No. 470, 7–10 September 1999, Edinburgh, UK, 1999.

Gobiet, A., Kotlarski, S., Beniston, M., Heinrich, G., Rajczak, J., and Stoffel, M.: 21st century climate change in the European Alps – A review, Sci. Total Environ., 493, 1138–1151, 2014.

Hall, J. and Blöschl, G.: Spatial patterns and characteristics of flood seasonality in Europe, Hydrol. Earth Syst. Sci., 22, 3883–3901, 2018.

Hall, J., Arheimer, B., Borga, M., Brazdil, R., Claps, P., Kiss, A., Kjeldsen, T. R., Kriauciuniene, J., Kundzewicz, Z. W., Lang, M., Llasat, M. C., Macdonald, N., McIntyre, N., Mediero, L., Merz, B., Merz, R., Molnar, P., Montanari, A., Neuhold, C., Parajka, J., Perdigao, R. A. P., Plavcova, L., Rogger, M., Salinas, J. L., Sauquet, E., Schar, C., Szolgay, J., Viglione, A., and Blöschl, G.: Understanding flood regime changes in Europe: a state-of-the-art assessment, Hydrol. Earth Syst. Sci., 18, 2735–2772, 2014.

Hallema, D. W., Moussa, R., Sun, G., and McNulty, S. G.: Surface storm flow prediction on hillslopes based on topography and hydrologic connectivity, Ecol. Process., 5, 13, 2016.

Hamed, K. H. and Rao, A. R.: A modified Mann–Kendall trend test for autocorrelated data, J. Hydrol., 204, 182–196, 1998.

Hamon, W. R.: Estimating potential evapotranspiration, J. Hydraul. Div., 87, 107–120, 1961.

Hartono, N. T. P., Thapa, J., Tiihonen, A., Oviedo, F., Batali, C., Yoo, J. J., Liu, Z., Li, R., Marrón, D. F., Bawendi, M. G., Buonassisi, T., and Sun, S.: How machine learning can help select capping layers to suppress perovskite degradation, Nat. Commun., 11, 4172, 2020.

Haylock, M. R., Hofstra, N., Tank, A., Klok, E. J., Jones, P. D., and New, M.: A European daily high-resolution gridded data set of surface temperature and precipitation for 1950–2006, J. Geophys. Res.-Atmos., 113, D20119, 2008.

Hirabayashi, Y., Mahendran, R., Koirala, S., Konoshima, L., Yamazaki, D., Watanabe, S., Kim, H., and Kanae, S.: Global flood risk under climate change, Nat. Clim. Change, 3, 816–821, 2013.

Hochreiter, S. and Schmidhuber, J.: Long short-term memory, Neural Comput., 9, 1735–1780, 1997.

Isotta, F. A., Frei, C., Weilguni, V., Tadic, M. P., Lassegues, P., Rudolf, B., Pavan, V., Cacciamani, C., Antolini, G., Ratto, S. M., Munari, M., Micheletti, S., Bonati, V., Lussana, C., Ronchi, C., Panettieri, E., Marigo, G., and Vertacnik, G.: The climate of daily precipitation in the Alps: development and analysis of a high-resolution grid dataset from pan-Alpine rain-gauge data, Int. J. Climatol., 34, 1657–1675, 2014.

Jiang, S.: An interpretive deep learning framework for identifying flooding mechanisms, Zenodo [code], 2022.

Jiang, S. J., Zheng, Y., Wang, C., and Babovic, V.: Uncovering flooding mechanisms across the contiguous United States through interpretive deep learning on representative catchments, Water Resour. Res., 58, e2021WR030185, 2022.

Keller, L., Rossler, O., Martius, O., and Weingartner, R.: Delineation of flood generating processes and their hydrological response, Hydrol. Process., 32, 228–240, 2018.

Kemter, M., Merz, B., Marwan, N., Vorogushyn, S., and Blöschl, G.: Joint trends in flood magnitudes and spatial extents across Europe, Geophys. Res. Lett., 47, e2020GL087464, 2020.

Kingma, D. P. and Ba, J.: Adam: A method for stochastic optimization, in: 3rd International Conference on Learning Representations, 7–9 May 2015, San Diego, 2015.

Kratzert, F., Klotz, D., Brenner, C., Schulz, K., and Herrnegger, M.: Rainfall-runoff modelling using Long Short-Term Memory (LSTM) networks, Hydrol. Earth Syst. Sci., 22, 6005–6022, 2018.

Kratzert, F., Herrnegger, M., Klotz, D., Hochreiter, S., and Klambauer, G.: NeuralHydrology – Interpreting LSTMs in Hydrology, in: Explainable AI: Interpreting, Explaining and Visualizing Deep Learning, edited by: Samek, W., Montavon, G., Vedaldi, A., Hansen, L. K., and Müller, K.-R., Springer International Publishing, Cham, 347–362, 2019a.

Kratzert, F., Klotz, D., Shalev, G., Klambauer, G., Hochreiter, S., and Nearing, G.: Towards learning universal, regional, and local hydrological behaviors via machine learning applied to large-sample datasets, Hydrol. Earth Syst. Sci., 23, 5089–5110, 2019b.

Kroll, C. N. and Song, P.: Impact of multicollinearity on small sample hydrologic regression models, Water Resour. Res., 49, 3756–3769, 2013.

Labe, Z. M. and Barnes, E. A.: Detecting climate signals using explainable AI with single-forcing large ensembles, J. Adv. Model. Earth Syst., 13, e2021MS002464, 2021.

Lavers, D. A. and Villarini, G.: The nexus between atmospheric rivers and extreme precipitation across Europe, Geophys. Res. Lett., 40, 3259–3264, 2013.

Ledingham, J., Archer, D., Lewis, E., Fowler, H., and Kilsby, C.: Contrasting seasonality of storm rainfall and flood runoff in the UK and some implications for rainfall-runoff methods of flood estimation, Hydrol. Res., 50, 1309–1323, 2019.

Lees, T., Buechel, M., Anderson, B., Slater, L., Reece, S., Coxon, G., and Dadson, S. J.: Benchmarking data-driven rainfall-runoff models in Great Britain: a comparison of long short-term memory (LSTM)-based models with four lumped conceptual models, Hydrol. Earth Syst. Sci., 25, 5517–5534, 2021.

Lees, T., Reece, S., Kratzert, F., Klotz, D., Gauch, M., De Bruijn, J., Kumar Sahu, R., Greve, P., Slater, L., and Dadson, S. J.: Hydrological concept formation inside long short-term memory (LSTM) networks, Hydrol. Earth Syst. Sci., 26, 3079–3101, 2022.

Lehner, B.: Derivation of watershed boundaries for GRDC gauging stations based on the HydroSHEDS drainage network, Federal Institute of Hydrology (BfG), Koblenz, 18 pp. (last access: 1 November 2021), 2012.

Merz, B., Vorogushyn, S., Uhlemann, S., Delgado, J., and Hundecha, Y.: HESS Opinions “More efforts and scientific rigour are needed to attribute trends in flood time series”, Hydrol. Earth Syst. Sci., 16, 1379–1387, 2012.

Merz, B., Blöschl, G., Vorogushyn, S., Dottori, F., Aerts, J. C. J. H., Bates, P., Bertola, M., Kemter, M., Kreibich, H., Lall, U., and Macdonald, E.: Causes, impacts and patterns of disastrous river floods, Nat. Rev. Earth Environ., 2, 592–609, 2021.

Merz, R. and Blöschl, G.: A process typology of regional floods, Water Resour. Res., 39, 1340, 2003.

Moriasi, D. N., Gitau, M. W., Pai, N., and Daggupati, P.: Hydrologic and water quality models: Performance measures and evaluation criteria, T. ASABE, 58, 1763–1785, 2015. 

Murdoch, W. J., Singh, C., Kumbier, K., Abbasi-Asl, R., and Yu, B.: Definitions, methods, and applications in interpretable machine learning, P. Natl. Acad. Sci. USA, 116, 22071–22080, 2019.

Nash, J. E. and Sutcliffe, J. V.: River flow forecasting through conceptual models part I – A discussion of principles, J. Hydrol., 10, 282–290, 1970.

Nearing, G. S., Kratzert, F., Sampson, A. K., Pelissier, C. S., Klotz, D., Frame, J. M., Prieto, C., and Gupta, H. V.: What Role Does Hydrological Science Play in the Age of Machine Learning?, Water Resour. Res., 57, e2020WR028091, 2021.

Nied, M., Pardowitz, T., Nissen, K., Ulbrich, U., Hundecha, Y., and Merz, B.: On the relationship between hydro-meteorological patterns and flood types, J. Hydrol., 519, 3249–3262, 2014.

Pagano, T. and Garen, D.: A recent increase in western US streamflow variability and persistence, J. Hydrometeorol., 6, 173–179, 2005.

Reichstein, M., Camps-Valls, G., Stevens, B., Jung, M., Denzler, J., Carvalhais, N., and Prabhat: Deep learning and process understanding for data-driven Earth system science, Nature, 566, 195–204, 2019.

Rottler, E., Bronstert, A., Burger, G., and Rakovec, O.: Projected changes in Rhine River flood seasonality under global warming, Hydrol. Earth Syst. Sci., 25, 2353–2371, 2021.

Rousseeuw, P. J.: Silhouettes – A graphical aid to the interpretation and validation of cluster-analysis, J. Comput. Appl. Math., 20, 53–65, 1987.

Salvador, S. and Chan, P.: Toward accurate dynamic time warping in linear time and space, Intell. Data Anal., 11, 561–580, 2007.

Schiemann, R., Vidale, P. L., Shaffrey, L. C., Johnson, S. J., Roberts, M. J., Demory, M. E., Mizielinski, M. S., and Strachan, J.: Mean and extreme precipitation over European river basins better simulated in a 25 km AGCM, Hydrol. Earth Syst. Sci., 22, 3933–3950, 2018.

Sharma, A., Wasko, C., and Lettenmaier, D. P.: If precipitation extremes are increasing, why aren't floods?, Water Resour. Res., 54, 8545–8551, 2018.

Shen, C. P.: A transdisciplinary review of deep learning research and its relevance for water resources scientists, Water Resour. Res., 54, 8558–8593, 2018.

Sherstinsky, A.: Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) network, Physica D, 404, 132306, 2020.

Sikorska, A. E., Viviroli, D., and Seibert, J.: Flood-type classification in mountainous catchments using crisp and fuzzy decision trees, Water Resour. Res., 51, 7959–7976, 2015.

Stein, L., Pianosi, F., and Woods, R.: Event-based classification for global study of river flood generating processes, Hydrol. Process., 34, 1514–1529, 2020.

Stein, L., Clark, M. P., Knoben, W. J. M., Pianosi, F., and Woods, R. A.: How do climate and catchment attributes influence flood generating processes? A large-sample study for 671 catchments across the contiguous USA, Water Resour. Res., 57, e2020WR028300, 2021.

Su, Y. H. and Kuo, C. C. J.: On extended long short-term memory and dependent bidirectional recurrent neural network, Neurocomputing, 356, 151–161, 2019.

Sundararajan, M., Taly, A., and Yan, Q.: Axiomatic attribution for deep networks, in: Proceedings of the 34th International Conference on Machine Learning, August 2017, Sydney, 2017.

Tarasova, L., Merz, R., Kiss, A., Basso, S., Blöschl, G., Merz, B., Viglione, A., Plotner, S., Guse, B., Schumann, A., Fischer, S., Ahrens, B., Anwar, F., Bardossy, A., Buhler, P., Haberlandt, U., Kreibich, H., Krug, A., Lun, D., Muller-Thomy, H., Pidoto, R., Primo, C., Seidel, J., Vorogushyn, S., and Wietzke, L.: Causative classification of river flood events, Wiley Interdisciplin. Rev.-Water, 6, e1353, 2019.

Tarasova, L., Basso, S., Wendi, D., Viglione, A., Kumar, R., and Merz, R.: A Process-Based Framework to Characterize and Classify Runoff Events: The Event Typology of Germany, Water Resour. Res., 56, e2019WR026951, 2020.

Tavenard, R., Faouzi, J., Vandewiele, G., Divo, F., Androz, G., Holtz, C., Payne, M., Yurchak, R., Rußwurm, M., Kolar, K., and Woods, E.: Tslearn, a machine learning toolkit for time series data, J. Mach. Learn. Res., 21, 1–6, 2020.

Tellman, B., Sullivan, J. A., Kuhn, C., Kettner, A. J., Doyle, C. S., Brakenridge, G. R., Erickson, T. A., and Slayback, D. A.: Satellite imaging reveals increased proportion of population exposed to floods, Nature, 596, 80–86, 2021.

Toms, B. A., Barnes, E. A., and Ebert-Uphoff, I.: Physically interpretable neural networks for the geosciences: Applications to Earth system variability, J. Adv. Model. Earth Syst., 12, e2019MS002002, 2020.

Trenberth, K. E.: Changes in precipitation with climate change, Clim. Res., 47, 123–138, 2011.

Turkington, T., Breinl, K., Ettema, J., Alkema, D., and Jetten, V.: A new flood type classification method for use in climate change impact studies, Weather Clim. Ext., 14, 1–16, 2016.

Vormoor, K., Lawrence, D., Heistermann, M., and Bronstert, A.: Climate change impacts on the seasonality and generation processes of floods – projections and uncertainties for catchments with mixed snowmelt/rainfall regimes, Hydrol. Earth Syst. Sci., 19, 913–931, 2015.

Vormoor, K., Lawrence, D., Schlichting, L., Wilson, D., and Wong, W. K.: Evidence for changes in the magnitude and frequency of observed rainfall vs. snowmelt driven floods in Norway, J. Hydrol., 538, 33–48, 2016.

Wasko, C. and Nathan, R.: Influence of changes in rainfall and soil moisture on trends in flooding, J. Hydrol., 575, 432–441, 2019.

Whan, K., Sillmann, J., Schaller, N., and Haarsma, R.: Future changes in atmospheric rivers and extreme precipitation in Norway, Clim. Dynam., 54, 2071–2084, 2020.

Yu, S. W. and Ma, J. W.: Deep learning for geophysics: Current and future trends, Rev. Geophys., 59, e2021RG000742, 2021.

Zscheischler, J., Westra, S., van den Hurk, B., Seneviratne, S. I., Ward, P. J., Pitman, A., AghaKouchak, A., Bresch, D. N., Leonard, M., Wahl, T., and Zhang, X. B.: Future climate risk from compound events, Nat. Clim. Change, 8, 469–477, 2018.

Zscheischler, J., Martius, O., Westra, S., Bevacqua, E., Raymond, C., Horton, R. M., van den Hurk, B., AghaKouchak, A., Jezequel, A., Mahecha, M. D., Maraun, D., Ramos, A. M., Ridder, N. N., Thiery, W., and Vignotto, E.: A typology of compound weather and climate events, Nat. Rev. Earth Environ., 1, 333–347, 2020.

Short summary
Using a novel explainable machine learning approach, we investigated the contributions of precipitation, temperature, and day length to different peak discharges, thereby uncovering three primary flooding mechanisms widespread in European catchments. The results indicate that flooding mechanisms have changed in numerous catchments over the past 70 years. The study highlights the potential of artificial intelligence in revealing complex changes in extreme events related to climate change.