<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing with OASIS Tables v3.0 20080202//EN" "https://jats.nlm.nih.gov/nlm-dtd/publishing/3.0/journalpub-oasis3.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:oasis="http://docs.oasis-open.org/ns/oasis-exchange/table" xml:lang="en" dtd-version="3.0" article-type="research-article">
  <front>
    <journal-meta><journal-id journal-id-type="publisher">HESS</journal-id><journal-title-group>
    <journal-title>Hydrology and Earth System Sciences</journal-title>
    <abbrev-journal-title abbrev-type="publisher">HESS</abbrev-journal-title><abbrev-journal-title abbrev-type="nlm-ta">Hydrol. Earth Syst. Sci.</abbrev-journal-title>
  </journal-title-group><issn pub-type="epub">1607-7938</issn><publisher>
    <publisher-name>Copernicus Publications</publisher-name>
    <publisher-loc>Göttingen, Germany</publisher-loc>
  </publisher></journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.5194/hess-30-2135-2026</article-id><title-group><article-title>Interpretable feature incorporation machine-learning framework for flood magnitude estimation</article-title><alt-title>Machine learning and flood estimation</alt-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author" corresp="yes" rid="aff1 aff2">
          <name><surname>Ford</surname><given-names>Emma</given-names></name>
          <email>emma.ford@hertford.ox.ac.uk</email>
        <ext-link>https://orcid.org/0009-0004-7208-3826</ext-link></contrib>
        <contrib contrib-type="author" corresp="no" rid="aff3 aff4 aff5">
          <name><surname>Brunner</surname><given-names>Manuela I.</given-names></name>
          
        <ext-link>https://orcid.org/0000-0001-8824-877X</ext-link></contrib>
        <contrib contrib-type="author" corresp="no" rid="aff1">
          <name><surname>Christensen</surname><given-names>Hannah</given-names></name>
          
        <ext-link>https://orcid.org/0000-0001-8244-0218</ext-link></contrib>
        <contrib contrib-type="author" corresp="no" rid="aff2">
          <name><surname>Slater</surname><given-names>Louise</given-names></name>
          
        <ext-link>https://orcid.org/0000-0001-9416-488X</ext-link></contrib>
        <aff id="aff1"><label>1</label><institution>Atmospheric, Oceanic and Planetary Physics, University of Oxford, Oxford, UK</institution>
        </aff>
        <aff id="aff2"><label>2</label><institution>School of Geography and the Environment, University of Oxford, Oxford, UK</institution>
        </aff>
        <aff id="aff3"><label>3</label><institution>Institute for Atmospheric and Climate Science, ETH Zurich, Zurich, Switzerland</institution>
        </aff>
        <aff id="aff4"><label>4</label><institution>WSL Institute for Snow and Avalanche Research SLF, Davos Dorf, Switzerland</institution>
        </aff>
        <aff id="aff5"><label>5</label><institution>Climate Change, Extremes and Natural Hazards in Alpine Regions Research Center CERC, Davos Dorf, Switzerland</institution>
        </aff>
      </contrib-group>
      <author-notes><corresp id="corr1">Emma Ford (emma.ford@hertford.ox.ac.uk)</corresp></author-notes><pub-date><day>16</day><month>April</month><year>2026</year></pub-date>
      
      <volume>30</volume>
      <issue>7</issue>
      <fpage>2135</fpage><lpage>2160</lpage>
      <history>
        <date date-type="received"><day>28</day><month>March</month><year>2025</year></date>
           <date date-type="rev-request"><day>15</day><month>May</month><year>2025</year></date>
           <date date-type="rev-recd"><day>30</day><month>January</month><year>2026</year></date>
           <date date-type="accepted"><day>17</day><month>March</month><year>2026</year></date>
      </history>
      <permissions>
        <copyright-statement>Copyright: © 2026 Emma Ford et al.</copyright-statement>
        <copyright-year>2026</copyright-year>
      <license license-type="open-access"><license-p>This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this licence, visit <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link></license-p></license></permissions><self-uri xlink:href="https://hess.copernicus.org/articles/30/2135/2026/hess-30-2135-2026.html">This article is available from https://hess.copernicus.org/articles/30/2135/2026/hess-30-2135-2026.html</self-uri><self-uri xlink:href="https://hess.copernicus.org/articles/30/2135/2026/hess-30-2135-2026.pdf">The full text article is available as a PDF file from https://hess.copernicus.org/articles/30/2135/2026/hess-30-2135-2026.pdf</self-uri>
      <abstract><title>Abstract</title>

      <p id="d2e135">Fluvial floods pose severe socioeconomic and environmental risks and are projected to change in frequency and severity in future decades. Estimating the magnitude of extreme floods remains challenging, particularly for sparse tail events. This motivates the need to identify predictors across catchments and time. Synoptic-scale weather patterns (WPs) are often more temporally persistent and predictable than local meteorological variables, such as precipitation. However, the value of weather patterns as predictors for flood magnitude estimation is not well established. This study introduces a feature incorporation machine learning framework to quantify the relative contribution of synoptic, meteorological, and catchment controls on winter peak-over-threshold (POT) flood magnitudes (<inline-formula><mml:math id="M1" display="inline"><mml:mrow><mml:mo>≥</mml:mo><mml:mn mathvariant="normal">99</mml:mn></mml:mrow></mml:math></inline-formula>th percentile) in near-natural catchments across the United Kingdom (UK) benchmark network. We train Random Forest regression models for a pooled national sample and for multiple hydro-climatic regional samples. We examine model interpretability using Shapley Additive Explanations (SHAP). Additionally, we analyze the conditional probabilities of WPs co-occurring with flood events. Our results show that WPs associated with cyclonic low-pressure systems frequently coincide with flood events but add minimal value to estimating their magnitudes. In the UK model, skill is dominated by static catchment attributes, such as aridity, and by event-day precipitation, while regional differences in feature importance reflect hydro-climatic contrasts. Our findings highlight the variability in model outcomes depending on the model structure and the choice of features. 
This study also offers methodological guidance for developing large-sample machine learning models for flood estimation that integrate atmospheric predictors with traditional hydro-meteorological and geographical variables within a feature incorporation framework.</p>
  </abstract>
    
<funding-group>
<award-group id="gs1">
<funding-source>Natural Environment Research Council</funding-source>
<award-id>NE/S007474/1</award-id>
</award-group>
</funding-group>
</article-meta>
  </front>
<body>
      

<sec id="Ch1.S1" sec-type="intro">
  <label>1</label><title>Introduction</title>
      <p id="d2e157">Fluvial floods are generated by complex interactions between atmospheric, hydrological, and land-surface processes occurring across various temporal and spatial scales <xref ref-type="bibr" rid="bib1.bibx6 bib1.bibx86 bib1.bibx7 bib1.bibx65" id="paren.1"/>. The predictability of these events varies significantly due to the interplay of local and large-scale drivers, and the rarity of extreme events in observational records further complicates predictive modeling <xref ref-type="bibr" rid="bib1.bibx8 bib1.bibx6 bib1.bibx13" id="paren.2"/>. At the catchment scale, data on extreme flood events are even more limited, increasing uncertainty in both the physical understanding of flood generation mechanisms and the development of robust prediction frameworks <xref ref-type="bibr" rid="bib1.bibx93 bib1.bibx81 bib1.bibx85" id="paren.3"/>. Flood generation is also influenced by a combination of variables operating at different timescales. The complex mechanisms responsible for flood generation are often simplified and categorized into short-duration intense rainfall, saturated soil conditions, long-duration lower intensity rainfall, snowmelt, and rain on snow <xref ref-type="bibr" rid="bib1.bibx51 bib1.bibx8 bib1.bibx6 bib1.bibx58" id="paren.4"/>.</p>
      <p id="d2e172">While the main drivers of flood occurrence have been well studied, disentangling the relative importance of predictors across different timescales is challenging <xref ref-type="bibr" rid="bib1.bibx79 bib1.bibx55 bib1.bibx2" id="paren.5"/>. The difficulty arises from the complexity and nonlinearity of flood systems and the limited availability of observed extreme event data. Emerging tools such as machine learning (ML) and explainable artificial intelligence (XAI) offer significant potential for exploring driver contributions. They have been successfully applied in hydrological studies to analyze large datasets and provide insights into the importance of flood drivers <xref ref-type="bibr" rid="bib1.bibx50 bib1.bibx83 bib1.bibx82 bib1.bibx36 bib1.bibx17 bib1.bibx88" id="paren.6"/>.</p>
      <p id="d2e181">Atmospheric circulation patterns are a valuable tool for exploring the relationship between floods and atmospheric conditions over a large area, such as the UK and Europe <xref ref-type="bibr" rid="bib1.bibx45 bib1.bibx46 bib1.bibx78 bib1.bibx2 bib1.bibx20 bib1.bibx91 bib1.bibx12" id="paren.7"/>. Weather patterns (WPs) are static categories of atmospheric conditions defined over specific spatial and temporal scales, typically derived from meteorological variables such as mean sea level pressure <xref ref-type="bibr" rid="bib1.bibx62 bib1.bibx42 bib1.bibx4" id="paren.8"/>. They can be distinguished from weather regimes (or types) based on their spatio-temporal scale. For example, a cyclonic low-pressure system influencing a regional area is often defined as a WP, whereas the North Atlantic Oscillation (NAO), which operates over a much larger spatio-temporal scale, represents a weather regime <xref ref-type="bibr" rid="bib1.bibx21 bib1.bibx62" id="paren.9"/>.</p>
      <p id="d2e193">The MO-30 weather categorization scheme produced by <xref ref-type="bibr" rid="bib1.bibx62" id="text.10"/> categorizes daily synoptic-scale circulation over the United Kingdom (UK) and Europe into thirty discrete weather types. MO-30 has been widely used in previous UK-focused research to identify the atmospheric circulation patterns influencing precipitation, drought, and coastal flooding <xref ref-type="bibr" rid="bib1.bibx74 bib1.bibx75 bib1.bibx71 bib1.bibx63" id="paren.11"/>, to understand the atmospheric influence on temperature-related mortality <xref ref-type="bibr" rid="bib1.bibx35" id="paren.12"/>, and to assess projected changes in WP frequency under future climate change scenarios <xref ref-type="bibr" rid="bib1.bibx72 bib1.bibx35" id="paren.13"/>. These WPs also underpin operational decision-support forecasting tools developed by the UK Met Office and the Flood Forecasting Centre, such as the “Decider” framework. For example, “Coastal Decider” as described in <xref ref-type="bibr" rid="bib1.bibx63" id="text.14"/> and “Fluvial Decider” as described in <xref ref-type="bibr" rid="bib1.bibx75" id="text.15"/> link MO-30 WPs to regional extreme precipitation and coastal surge risk to highlight potential flood events at medium- to long-range lead times <xref ref-type="bibr" rid="bib1.bibx75 bib1.bibx63 bib1.bibx71" id="paren.16"/>. While these tools provide valuable insights and guidance for early warning and preparedness, relationships that directly link the WPs with catchment streamflow time-series or flood event characteristics have not been explored. The predictive value of MO-30 WPs for fluvial flood magnitude estimation therefore remains unknown.</p>
      <p id="d2e219">Several studies have examined the relationships between atmospheric circulation, streamflow, and flooding across Europe and the United States, showing promising results <xref ref-type="bibr" rid="bib1.bibx12 bib1.bibx78 bib1.bibx20 bib1.bibx91" id="paren.17"/>. Despite this promise, large-sample hydrological ML models for the UK have generally not incorporated synoptic-scale WPs as predictive features alongside land-surface and hydrometeorological variables, even though these features are closely interlinked <xref ref-type="bibr" rid="bib1.bibx12 bib1.bibx78 bib1.bibx73 bib1.bibx20" id="paren.18"/>. Previous studies have not evaluated the use of the MO-30 WPs as predictors for flood magnitude estimation within a large-sample, data-driven machine learning hydrological framework. Furthermore, to date, research in this field has not adopted a modeling approach that incrementally adds features, including the MO-30 WPs, across successive feature sets to evaluate and quantify their relative contributions to flood estimation. This approach can make transparent how models use features to make flood predictions, and can provide insights into the possible physical mechanisms, feature interactions, and relative importance of flood drivers.</p>
      <p id="d2e228">Recent years have seen rapid developments in ML techniques that can handle non-linear interactions, large datasets, and high variability in predictors <xref ref-type="bibr" rid="bib1.bibx24 bib1.bibx64 bib1.bibx82" id="paren.19"/>. Long Short-Term Memory (LSTM) networks have been extensively explored for predicting streamflow time-series and have proved very successful <xref ref-type="bibr" rid="bib1.bibx48 bib1.bibx49 bib1.bibx37 bib1.bibx39 bib1.bibx25" id="paren.20"/>. These methods typically focus on time-series predictions and produce outputs at the daily time-step. In contrast, our approach focuses on estimating the magnitude of extreme streamflow events that exceed the 99th percentile threshold at each site, using large-sample data rather than modeling continuous time-series data. Random Forest (RF) models have proven effective for understanding the drivers of hydrological events due to their ability to model complex, non-linear relationships and handle multiple data types while remaining relatively robust to overfitting <xref ref-type="bibr" rid="bib1.bibx88 bib1.bibx82 bib1.bibx36 bib1.bibx92" id="paren.21"/>. RF is an ensemble method that constructs multiple decision trees and aggregates predictions across these trees to provide robust outputs <xref ref-type="bibr" rid="bib1.bibx10 bib1.bibx19 bib1.bibx22" id="paren.22"/>. Prior work has employed RF and XAI tools to highlight feature importance in hydrological studies <xref ref-type="bibr" rid="bib1.bibx83 bib1.bibx36 bib1.bibx92" id="paren.23"/>. These data-driven approaches can reveal the potential drivers of extreme events and quantify their contributions to predictions <xref ref-type="bibr" rid="bib1.bibx92 bib1.bibx60 bib1.bibx88 bib1.bibx17 bib1.bibx82" id="paren.24"/>.</p>
      <p id="d2e250">Current flood estimation research often ignores the interaction between weather patterns and hydrological variables in predicting flood magnitudes. Testing the integration of synoptic-scale features alongside meteorological and hydrological variables in a predictive framework is important, as new ways to enhance extreme flood prediction are needed. This study addresses this research gap by investigating the contribution of synoptic-scale WPs, meteorological factors, and physical catchment features as drivers of winter extreme flood magnitudes in near-natural UK catchments, using large-sample RF models. In this work, we: (1) assess the conditional probabilities of MO-30 WPs on fluvial flood days and the distributions of flood magnitudes associated with the MO-30 WPs; (2) develop an interpretable, large-sample ML feature incorporation framework with seven feature sets to assess the influence of spatial identifiers, atmospheric circulation (WPs), catchment characteristics, and meteorological variables on flood magnitude estimation in near-natural UK catchments; and (3) evaluate the model performance and feature importance results for national and regional data samples.</p>
      <p id="d2e253">To achieve these objectives, we trained structured RF models across nine spatial samples (the UK national model and eight predefined hydro-climatic regional models) for each of the seven feature sets, resulting in 63 model configurations in total. Each feature set represents a successive stage in the feature-incorporation framework. The national and regional models enable comparison between large-sample learning at the UK scale and more localized learning within homogeneous hydro-climatic regions. This study advances large-sample flood estimation research by integrating both atmospheric WPs and hydrological drivers and by providing physically interpretable insights from XAI analyses for extreme flood magnitudes (≥99th percentile).</p>
</sec>
<sec id="Ch1.S2">
  <label>2</label><title>Data and methods</title>
<sec id="Ch1.S2.SS1">
  <label>2.1</label><title>Data sources</title>
      <p id="d2e271">Daily streamflow (<inline-formula><mml:math id="M2" display="inline"><mml:mrow class="unit"><mml:msup><mml:mi mathvariant="normal">m</mml:mi><mml:mn mathvariant="normal">3</mml:mn></mml:msup><mml:mspace width="0.125em" linebreak="nobreak"/><mml:msup><mml:mi mathvariant="normal">s</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>) data between 1969 and 2021 were obtained for UK Benchmark Catchments from the National River Flow Archive (NRFA), as described in <xref ref-type="bibr" rid="bib1.bibx31" id="text.25"/>. The corresponding catchment-averaged precipitation, potential evapotranspiration, and temperature variables were obtained from CAMELS-GB <xref ref-type="bibr" rid="bib1.bibx18" id="paren.26"/>, derived from the Met Office HadUK-Grid Climate Observations from <xref ref-type="bibr" rid="bib1.bibx33" id="text.27"/>. A selection of static catchment attributes for each of the catchments were obtained from <xref ref-type="bibr" rid="bib1.bibx16" id="text.28"/>. The full list of extracted variables and calculated antecedent precipitation totals, used to capture the influence of precipitation during previous days, is presented in Appendix Table <xref ref-type="table" rid="TA2"/>.</p>
      <p id="d2e309">The temporal span was chosen to maximize the availability of streamflow data for natural benchmark catchments. Benchmark catchments are minimally affected by anthropogenic activity <xref ref-type="bibr" rid="bib1.bibx31" id="paren.29"/> and were chosen so that the value of the WPs as predictors could be explored with little noise from human intervention. From the full set of benchmark catchments, we selected those with at least 95 % of data available in each water year (1 October–30 September) and at least 30 complete years of data. For each catchment, streamflow (<inline-formula><mml:math id="M3" display="inline"><mml:mrow class="unit"><mml:msup><mml:mi mathvariant="normal">m</mml:mi><mml:mn mathvariant="normal">3</mml:mn></mml:msup><mml:mspace linebreak="nobreak" width="0.125em"/><mml:msup><mml:mi mathvariant="normal">s</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>) was normalized to specific discharge (<inline-formula><mml:math id="M4" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">mm</mml:mi><mml:mspace width="0.125em" linebreak="nobreak"/><mml:msup><mml:mi mathvariant="normal">d</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>) to account for catchment size variability in the large-sample model and to enhance model generalizability.</p>
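      <p>As a minimal sketch of this normalization, assuming the standard unit conversion from volumetric flow to runoff depth, streamflow in m³ s⁻¹ can be converted to specific discharge in mm d⁻¹ by dividing the daily flow volume by the catchment area. The function and argument names below are illustrative and are not taken from the study's code.</p>
      <preformat preformat-type="code">
```python
SECONDS_PER_DAY = 86_400
M2_PER_KM2 = 1_000_000
MM_PER_M = 1_000


def specific_discharge_mm_per_day(flow_m3_per_s, area_km2):
    """Convert streamflow (m3/s) to area-normalized specific discharge (mm/d)."""
    if not area_km2 > 0:
        raise ValueError("catchment area must be positive")
    volume_m3_per_day = flow_m3_per_s * SECONDS_PER_DAY
    depth_m_per_day = volume_m3_per_day / (area_km2 * M2_PER_KM2)
    return depth_m_per_day * MM_PER_M


# Hypothetical example: 5 m3/s draining a 100 km2 catchment.
example_depth = specific_discharge_mm_per_day(5.0, 100.0)
```
      </preformat>
      <p>Under this conversion, the calculation is equivalent to multiplying the flow in m³ s⁻¹ by 86.4 and dividing by the catchment area in km².</p>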
      <p id="d2e352">These selection criteria resulted in 134 suitable catchments. For the UK-wide analysis, all selected catchments were pooled together. For regional analysis, the catchments were grouped into their corresponding Met Office Hadley Centre Observations Dataset (HadUKP) climate regions using shapefiles <xref ref-type="bibr" rid="bib1.bibx57" id="paren.30"/>. Region names and corresponding abbreviations (e.g., Northern Scotland <inline-formula><mml:math id="M5" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> NS, Southern Scotland <inline-formula><mml:math id="M6" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> SS, Central and Eastern England <inline-formula><mml:math id="M7" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> CEE) are defined in Appendix Table <xref ref-type="table" rid="TA3"/>. Daily WP classifications were obtained from the Met Office MO-30 dataset <xref ref-type="bibr" rid="bib1.bibx62" id="paren.31"/>, for the same time period as the catchment streamflow data (1969–2021). These WPs were created using an annealed k-means clustering method of mean sea level pressure (MSLP) from the European Mean sea level Pressure dataset (EMSLP) (1850–2003) <xref ref-type="bibr" rid="bib1.bibx62 bib1.bibx1" id="paren.32"/>. Lower-numbered WPs are associated with weaker MSLP anomalies, are historically more frequent, and occur more often in summer. Higher-numbered WPs are associated with stronger MSLP anomalies, are historically less frequent, and occur more often in winter <xref ref-type="bibr" rid="bib1.bibx62" id="paren.33"/>. The WPs are displayed in Fig. <xref ref-type="fig" rid="F1"/>, and the corresponding descriptions from <xref ref-type="bibr" rid="bib1.bibx62" id="text.34"/> are presented in Appendix Table <xref ref-type="table" rid="TA1"/>.</p>

      <fig id="F1" specific-use="star"><label>Figure 1</label><caption><p id="d2e401">Weather Pattern (WP) classifications from the MO-30 dataset. From <xref ref-type="bibr" rid="bib1.bibx62" id="text.35"/>.</p></caption>
          <graphic xlink:href="https://hess.copernicus.org/articles/30/2135/2026/hess-30-2135-2026-f01.jpg"/>

        </fig>

</sec>
<sec id="Ch1.S2.SS2">
  <label>2.2</label><title>Identifying flood magnitudes and dataset creation</title>
      <p id="d2e421">Our analyses focus on winter floods. The winter months of December, January, and February (DJF) were chosen for analysis because most of the largest flood events in the UK occur during this season <xref ref-type="bibr" rid="bib1.bibx47" id="paren.36"/>. Extreme flood events above the 99th percentile were identified for each catchment between 1969 and 2021 using the Peak Over Threshold (POT) method <xref ref-type="bibr" rid="bib1.bibx77 bib1.bibx76" id="paren.37"/>. The 99th percentile threshold was used to capture the most severe events while maintaining an adequate sample size. A 7 <inline-formula><mml:math id="M8" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">d</mml:mi></mml:mrow></mml:math></inline-formula> independence window was applied, following <xref ref-type="bibr" rid="bib1.bibx12" id="text.38"/>. The target variable for modeling was the flood magnitude, defined as the largest specific-discharge value (<inline-formula><mml:math id="M9" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">mm</mml:mi><mml:mspace linebreak="nobreak" width="0.125em"/><mml:msup><mml:mi mathvariant="normal">d</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>) for each independent event. The resulting sample sizes and regional breakdown of catchments and independent flood events are summarized in Appendix Table <xref ref-type="table" rid="TA3"/>.</p>
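      <p>The event identification described above can be sketched as follows. This is an illustrative, dependency-free example and not the study's implementation: it assumes that the threshold is the percentile of the full daily record at a site and that events are declustered greedily by keeping the largest peaks first, neither of which is specified in the text. The function names and the synthetic record are hypothetical.</p>
      <preformat preformat-type="code">
```python
from datetime import date, timedelta


def percentile(values, q):
    """Linear-interpolation percentile (q in 0..100) of a non-empty sequence."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)


def winter_pot_peaks(series, q=99.0, window_days=7):
    """Return independent DJF peaks from (date, specific_discharge) pairs.

    Assumptions (not confirmed by the text): the threshold is the q-th
    percentile of the full daily record, and events are declustered
    greedily, keeping the largest peaks first.
    """
    threshold = percentile([flow for _, flow in series], q)
    exceedances = [
        (day, flow)
        for day, flow in series
        if flow >= threshold and day.month in (12, 1, 2)
    ]
    peaks = []
    for day, flow in sorted(exceedances, key=lambda item: item[1], reverse=True):
        # Keep a peak only if it lies at least window_days from every kept peak.
        if all(abs((day - kept_day).days) >= window_days for kept_day, _ in peaks):
            peaks.append((day, flow))
    return sorted(peaks)


# Synthetic record: three winter peaks (two within 7 d) and one spring peak.
start = date(2000, 12, 1)
flows = [1.0] * 200
flows[5], flows[7], flows[20], flows[150] = 10.0, 8.0, 9.0, 7.0
series = [(start + timedelta(days=i), flow) for i, flow in enumerate(flows)]
peaks = winter_pot_peaks(series, q=98.0)
```
      </preformat>
      <p>In this synthetic example, a lower percentile is used only because the record is short. The smaller of the two peaks that fall within 7 d of each other is discarded, and the spring peak is excluded by the DJF filter.</p>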
      <p id="d2e461">Our analysis focuses on the day with the highest flood peak, defined in this study as the event-day. While this approach does not capture the full hydrograph duration, it intentionally isolates the peak magnitude of independent events. The POT method is widely recognized for its ability to capture multiple extreme events within a single year, providing a more comprehensive analysis of flood extremes compared to the Annual Maxima (AM) method <xref ref-type="bibr" rid="bib1.bibx54 bib1.bibx69" id="paren.39"/>.</p>
      <p id="d2e467">For the identified flood magnitude event days, feature sets 1–6 were compiled from the datasets described above for the UK and regional samples. Feature sets include a combination of: (1) spatial identifiers (latitude and longitude), used as baseline spatial features and to provide location context for the WP classification; (2) the UK Met Office MO-30 WP category on the event day; (3) the antecedent WPs (AWPs) from one to three days prior, representing synoptic scale atmospheric conditions and pre-event circulation; (4) catchment characteristics including aridity index, area, baseflow index, streamflow elasticity, maximum elevation, and runoff ratio; (5) meteorological variables such as event-day precipitation, mean/minimum/maximum temperature, and potential evapotranspiration; and (6) antecedent cumulative precipitation for 1–3 <inline-formula><mml:math id="M10" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">d</mml:mi></mml:mrow></mml:math></inline-formula> prior to the event, capturing short-term catchment wetness. The 3 <inline-formula><mml:math id="M11" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">d</mml:mi></mml:mrow></mml:math></inline-formula> antecedent window was selected based on the typical response time of small, near-natural UK catchments. Overall, eight regional flood magnitude datasets in addition to the UK national dataset were produced for subsequent modeling. A summary for each region, including the number of catchments, total flood magnitude events, and mean events per catchment is presented in Appendix Table <xref ref-type="table" rid="TA3"/>.</p>
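      <p>As an illustration of the antecedent precipitation features in feature set 6, the following sketch accumulates catchment-averaged precipitation over the one, two, and three days preceding an event day. It assumes that the event day itself is excluded from the totals; the feature names and the input values are hypothetical.</p>
      <preformat preformat-type="code">
```python
from datetime import date, timedelta


def antecedent_precip_totals(precip_by_day, event_day, max_lag_days=3):
    """Cumulative precipitation over the 1..max_lag_days days before event_day.

    precip_by_day maps datetime.date to catchment-averaged precipitation (mm).
    The event day itself is excluded (an assumption of this sketch).
    """
    totals = {}
    running_total = 0.0
    for lag in range(1, max_lag_days + 1):
        running_total += precip_by_day[event_day - timedelta(days=lag)]
        totals[f"precip_sum_{lag}d"] = running_total
    return totals


# Hypothetical daily precipitation (mm) leading up to an event on 4 January.
precip = {
    date(2001, 1, 1): 2.0,
    date(2001, 1, 2): 4.0,
    date(2001, 1, 3): 6.0,
    date(2001, 1, 4): 20.0,
}
features = antecedent_precip_totals(precip, date(2001, 1, 4))
```
      </preformat>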
      <p id="d2e488">To address multicollinearity and improve the interpretability of the final feature set, a post hoc feature pruning step was applied to feature set 6 to identify the final feature set 7. Variance Inflation Factor (VIF) analysis, as described in <xref ref-type="bibr" rid="bib1.bibx67" id="text.40"/>, was used to quantify collinearity among predictors. Predictors with <inline-formula><mml:math id="M12" display="inline"><mml:mrow><mml:mtext>VIF</mml:mtext><mml:mo>&gt;</mml:mo><mml:mn mathvariant="normal">10</mml:mn></mml:mrow></mml:math></inline-formula> were iteratively removed, prioritizing the retention of physically interpretable variables. The pruned feature set 7 represents the outcome of a targeted feature selection procedure applied separately to each sample (UK and regional). While feature sets 1–6 are the same across all models to allow direct comparison of process groups, feature set 7 is specific to each data sample. Consequently, the retained features differ slightly by region. Figure <xref ref-type="fig" rid="FB1"/> displays the retained and dropped predictors for the feature set 7 models of the UK and regional samples. Table <xref ref-type="table" rid="T1"/> gives an overview of each feature set, which draws on the data sources described in Table <xref ref-type="table" rid="TA2"/>, together with the abbreviations used for model results in Fig. <xref ref-type="fig" rid="F4"/>.</p>
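      <p>The iterative VIF pruning can be sketched as below. This is a dependency-free illustration under stated assumptions and not the procedure used in the study: the VIF of each predictor is taken as the corresponding diagonal element of the inverse correlation matrix, a hypothetical <monospace>protected</monospace> argument stands in for the retention of physically interpretable variables, and the toy predictors are invented. Constant or perfectly collinear predictors are not handled.</p>
      <preformat preformat-type="code">
```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def invert(matrix):
    """Gauss-Jordan inverse (partial pivoting) of a small, non-singular matrix."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [value / scale for value in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """VIF per predictor: the diagonal of the inverse correlation matrix."""
    names = list(columns)
    corr = [[pearson(columns[a], columns[b]) for b in names] for a in names]
    inverse = invert(corr)
    return {name: inverse[i][i] for i, name in enumerate(names)}


def prune_by_vif(columns, threshold=10.0, protected=()):
    """Iteratively drop the unprotected predictor with the largest VIF above threshold."""
    kept = dict(columns)
    while len(kept) > 1:
        scores = vif(kept)
        candidates = {k: v for k, v in scores.items() if k not in protected}
        if not candidates:
            break
        worst = max(candidates, key=candidates.get)
        if threshold >= candidates[worst]:
            break
        del kept[worst]
    return kept


# Hypothetical predictors: "b" nearly duplicates "a"; "c" is weakly related.
data = {
    "a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "b": [1.1, 1.9, 3.2, 3.8, 5.1, 6.0],
    "c": [2.0, 1.0, 4.0, 3.0, 6.0, 1.0],
}
retained = prune_by_vif(data, protected=("a",))
```
      </preformat>
      <p>In this toy example, the near-duplicate predictor is removed in the first iteration, after which the remaining VIF values fall well below the threshold of 10 and pruning stops.</p>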

<table-wrap id="T1" specific-use="star"><label>Table 1</label><caption><p id="d2e519">Summary of the feature sets used in successive RF models. Features are added cumulatively; e.g., Set 2 includes all features from Sets 1 and 2.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="3">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="justify" colwidth="6cm"/>
     <oasis:colspec colnum="3" colname="col3" align="justify" colwidth="7.5cm"/>
     <oasis:thead>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">Set</oasis:entry>
         <oasis:entry colname="col2" align="left">Features Added (abbrev.)</oasis:entry>
         <oasis:entry colname="col3" align="left">Physical Processes Represented</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">1</oasis:entry>
         <oasis:entry colname="col2" align="left">Latitude, Longitude (Lat, Lon)</oasis:entry>
         <oasis:entry colname="col3" align="left">Spatial variability linked to geographic and climatic gradients.</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">2</oasis:entry>
         <oasis:entry colname="col2" align="left">Weather Pattern (WP) category on event day (WP_t; 1–30 MO-30 types)</oasis:entry>
         <oasis:entry colname="col3" align="left">Synoptic-scale atmospheric circulation on the flood day.</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">3</oasis:entry>
         <oasis:entry colname="col2" align="left">Antecedent Weather Pattern (AWP) categories (WP_t–1, WP_t–2, WP_t–3; 1–30 MO-30 types)</oasis:entry>
         <oasis:entry colname="col3" align="left">Synoptic meteorological conditions on days preceding the event.</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">4</oasis:entry>
         <oasis:entry colname="col2" align="left">Static catchment characteristics (CC): Area, Max. Elevation, Aridity Index, Runoff Ratio, Streamflow Elasticity, Baseflow Index</oasis:entry>
         <oasis:entry colname="col3" align="left">Catchment form, storage, and long-term hydro-climatic controls.</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">5</oasis:entry>
         <oasis:entry colname="col2" align="left">Event-day hydro-meteorological variables (HM): Precipitation, Mean/Maximum/Minimum Temperature, Potential Evapotranspiration</oasis:entry>
         <oasis:entry colname="col3" align="left">Local meteorological forcing on the flood magnitude day.</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">6</oasis:entry>
         <oasis:entry colname="col2" align="left">Antecedent hydro-meteorological variables (AHM): Total Precipitation 1–3 <inline-formula><mml:math id="M13" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">d</mml:mi></mml:mrow></mml:math></inline-formula> before</oasis:entry>
         <oasis:entry colname="col3" align="left">Short-term wetness and memory effects preceding flood events.</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">7</oasis:entry>
         <oasis:entry colname="col2" align="left">Pruned feature set (non-collinear, physically interpretable predictors from Set 6)</oasis:entry>
         <oasis:entry colname="col3" align="left">Reduced-complexity model capturing dominant, physically consistent drivers.</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

</sec>
<sec id="Ch1.S2.SS3">
  <label>2.3</label><title>Conditional probabilities</title>
      <p id="d2e652">Prior to selecting the WPs as a feature set for the RF models, we examined which WPs frequently co-occurred with extreme flood event days by computing the conditional probability of each WP given that a flood occurred on that day, denoted as “P(WP <inline-formula><mml:math id="M14" display="inline"><mml:mo>|</mml:mo></mml:math></inline-formula> flood)”. This analysis was performed for (1) the UK sample dataset and (2) the regional sample datasets. To account for the cumulative influence of synoptic-scale conditions leading to floods, we extended the analysis beyond the day of the event to include AWP categories (up to three days prior).</p>
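      <p>A minimal sketch of this calculation is given below, assuming that the conditional probability is estimated empirically as the share of flood event days assigned to each WP. The function name and the example labels are hypothetical.</p>
      <preformat preformat-type="code">
```python
from collections import Counter


def wp_given_flood(flood_day_wps):
    """Estimate P(WP | flood) as the share of flood days assigned to each WP."""
    counts = Counter(flood_day_wps)
    total = sum(counts.values())
    return {wp: count / total for wp, count in sorted(counts.items())}


# Hypothetical MO-30 labels on four flood days.
event_day_wps = [30, 30, 20, 26]
probabilities = wp_given_flood(event_day_wps)
```
      </preformat>
      <p>The same function could be applied to the WP labels from one, two, or three days before each event to obtain the corresponding AWP probabilities.</p>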
<sec id="Ch1.S2.SS3.SSSx1" specific-use="unnumbered">
  <title>Distribution of flood magnitudes</title>
      <p id="d2e667">We further explored the distribution of flood magnitudes associated with each WP, and present this for the WPs most often associated with flood magnitude days. Flood magnitude (<inline-formula><mml:math id="M15" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">mm</mml:mi><mml:mspace linebreak="nobreak" width="0.125em"/><mml:msup><mml:mi mathvariant="normal">d</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>) has already been normalized. However, since larger catchments may still exhibit smaller flood magnitudes than smaller catchments, given their larger drainage areas, we also present the flood magnitudes associated with each WP stratified by catchment size. To do so, catchment sizes were categorized into lower, middle, and upper terciles (3.12–66.82, 66.82–194.81, and 194.81–1505.54 <inline-formula><mml:math id="M16" display="inline"><mml:mrow class="unit"><mml:msup><mml:mi mathvariant="normal">km</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula>, respectively), each containing approximately one-third of the data.</p>
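      <p>The tercile stratification can be sketched as follows. This is a minimal example assuming that the boundaries are the one-third and two-thirds quantiles of the catchment areas; the function name <monospace>assign_area_terciles</monospace> and the toy areas are hypothetical, so the boundaries computed here do not reproduce the values reported above.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def assign_area_terciles(areas_km2):
    """Label each catchment area as 'small', 'medium' or 'large'
    using the 1/3 and 2/3 quantiles as bin boundaries."""
    areas = np.asarray(areas_km2, dtype=float)
    lower, upper = np.quantile(areas, [1 / 3, 2 / 3])
    labels = np.where(
        areas <= lower, "small", np.where(areas <= upper, "medium", "large")
    )
    return labels, (float(lower), float(upper))


# Hypothetical catchment areas in km2 (not study data)
toy_areas = [3.5, 12.0, 40.0, 70.0, 95.0, 150.0, 210.0, 480.0, 1200.0]
labels, (lower, upper) = assign_area_terciles(toy_areas)
```
]]></preformat>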
</sec>
</sec>
<sec id="Ch1.S2.SS4">
  <label>2.4</label><title>Machine learning model structure</title>
      <p id="d2e707">Next, we developed seven RF regression models, each incorporating a new feature set, to gain insights into the roles of different features as predictors of extreme flood magnitudes (<inline-formula><mml:math id="M17" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">mm</mml:mi><mml:mspace linebreak="nobreak" width="0.125em"/><mml:msup><mml:mi mathvariant="normal">d</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>) for the UK and regional samples (see Table <xref ref-type="table" rid="TA3"/>). Each new feature set incrementally incorporates new features, as shown in Table <xref ref-type="table" rid="T1"/>, while the model architecture remains the same. The seven models were run both as a UK-wide model with all catchments pooled and for predefined regional samples (each containing only the catchments located within that region). This regional setup enhances interpretability by aligning the model structure with established hydro-climatic regions and enables insights into the performance and feature importance of regional and UK samples. The RF was implemented using the Scikit-learn Python package by <xref ref-type="bibr" rid="bib1.bibx70" id="text.41"/>. Model performance metrics and Shapley Additive Explanations (SHAP) were calculated based on the test set results. One-hot encoding (1 <inline-formula><mml:math id="M18" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> TRUE, 0 <inline-formula><mml:math id="M19" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> FALSE) was applied to the WP and AWP categories to create binary features for the model. A temporal train-test split was applied to all UK and regional models, with events between 1969 and 2010 (80 %) used for training, and events from 2011 to 2021 (20 %) used for testing. 
This split ensured that later events were unseen during model fitting and hyper-parameter optimization. This temporal validation method has been used in other recent hydrological studies <xref ref-type="bibr" rid="bib1.bibx36" id="paren.42"><named-content content-type="pre">e.g., </named-content></xref>, respects the temporal sequence of the data, and enables evaluation only on unseen years. It also reflects a realistic forecasting scenario in which future events are predicted based on past conditions <xref ref-type="bibr" rid="bib1.bibx9" id="paren.43"/>. Model hyper-parameters were optimized within the training period using <monospace>RandomizedSearchCV</monospace> to balance model complexity and performance. The final configuration selected was: <monospace>n_estimators = 1000</monospace>, <monospace>min_samples_split = 10</monospace>, <monospace>min_samples_leaf = 2</monospace>, <monospace>max_depth = None</monospace>, and <monospace>bootstrap = True</monospace>. A sensitivity analysis was also conducted to test alternative tree numbers and split parameters. A three-way split (training: 1969–2000, validation: 2001–2010, test: 2011–2021) was also evaluated but produced very similar model skill. To maximize the available data for learning hydrological relationships, the two-period split was therefore retained, providing an effective balance between robustness and data efficiency.</p>
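      <p>The workflow described above (one-hot encoding, temporal split, and randomized hyper-parameter search restricted to the training period) can be sketched with scikit-learn as follows. This is a minimal illustration on synthetic data, not the study's implementation: the variable names, the search grid, the 3-fold cross-validation, and the reduced number of trees (chosen so the example runs quickly) are all assumptions.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

rng = np.random.default_rng(0)

# Synthetic stand-in for the event table (hypothetical values, not study data)
n_events = 400
year = rng.integers(1969, 2022, size=n_events)             # event year, 1969-2021
wp = rng.integers(1, 31, size=n_events)                    # weather pattern label, 1-30
precip = rng.gamma(shape=2.0, scale=10.0, size=n_events)   # event-day precipitation
magnitude = 0.4 * precip + rng.normal(0.0, 1.0, size=n_events)

# One-hot encode the WP category (1 = TRUE, 0 = FALSE)
wp_onehot = (wp[:, None] == np.arange(1, 31)[None, :]).astype(int)
X = np.column_stack([precip, wp_onehot])
y = magnitude

# Temporal split: 1969-2010 for training, 2011-2021 for testing
train = year <= 2010
test = ~train

# Assumed example grid; the search only ever sees the training period
param_distributions = {
    "n_estimators": [50, 100],
    "min_samples_split": [2, 10],
    "min_samples_leaf": [1, 2],
    "max_depth": [None, 10],
}
search = RandomizedSearchCV(
    RandomForestRegressor(bootstrap=True, random_state=0),
    param_distributions=param_distributions,
    n_iter=4,
    cv=3,            # assumed CV scheme; the text does not specify one
    scoring="r2",
    random_state=0,
)
search.fit(X[train], y[train])

# Evaluate the refitted best model on the unseen test years only
test_r2 = search.best_estimator_.score(X[test], y[test])
```
]]></preformat>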
</sec>
<sec id="Ch1.S2.SS5">
  <label>2.5</label><title>Uncertainty quantification</title>
      <p id="d2e784">To quantify predictive uncertainty in the RF models, ensemble-based uncertainty metrics were derived from the final feature set 7 models. For each test-set sample, predictions were obtained from all individual trees within the ensemble. Two complementary metrics were employed. The first is the standard deviation of the individual tree predictions (Pred_SD), which represents the absolute spread of predictions and thus the model's absolute predictive uncertainty in <inline-formula><mml:math id="M20" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">mm</mml:mi><mml:mspace linebreak="nobreak" width="0.125em"/><mml:msup><mml:mi mathvariant="normal">d</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>. The second is the coefficient of variation (CV), which expresses the relative uncertainty. The results are presented in Fig. <xref ref-type="fig" rid="FB2"/>.</p>
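      <p>A minimal sketch of the two spread metrics is given below. It assumes that the per-tree predictions are available as an array of shape (number of trees, number of samples), that Pred_SD is the population standard deviation across trees, and that CV is Pred_SD divided by the absolute ensemble mean; the function name <monospace>ensemble_spread</monospace> and the toy values are hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def ensemble_spread(per_tree_predictions):
    """Return ensemble mean, Pred_SD and CV for each sample.

    per_tree_predictions has shape (n_trees, n_samples). For a fitted
    scikit-learn forest `rf`, such an array could be assembled with
    np.stack([tree.predict(X) for tree in rf.estimators_]).
    """
    per_tree = np.asarray(per_tree_predictions, dtype=float)
    mean = per_tree.mean(axis=0)
    pred_sd = per_tree.std(axis=0, ddof=0)   # absolute spread across trees
    cv = np.full_like(mean, np.nan)          # CV is left undefined where the mean is zero
    nonzero = mean != 0
    cv[nonzero] = pred_sd[nonzero] / np.abs(mean[nonzero])
    return mean, pred_sd, cv


# Hypothetical predictions of three trees for two samples
toy = np.array([[10.0, 2.0], [12.0, 2.0], [14.0, 2.0]])
mean, pred_sd, cv = ensemble_spread(toy)
```
]]></preformat>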
</sec>
<sec id="Ch1.S2.SS6">
  <label>2.6</label><title>Test set evaluation metrics</title>
      <p id="d2e814">Overall model performance was evaluated using the coefficient of determination <inline-formula><mml:math id="M21" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> and Percentage Bias (PBIAS), calculated once per model from all test set predictions of flood magnitudes. The overall <inline-formula><mml:math id="M22" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> represents the total proportion of variance in observed flood magnitudes explained by the model at the national or regional scale. For each model, <inline-formula><mml:math id="M23" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> was first calculated for the overall model and then at the catchment level. For the final 7 sets of models, comparisons were performed by re-evaluating the UK model on the exact subset of catchments used in each regional model, using identical test periods. Catchment level <inline-formula><mml:math id="M24" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> values were computed separately for each catchment and then aggregated (median and mean) across catchments within each region. Only catchments with at least ten events were included. This approach provides consistent comparisons across spatial scales and captures both temporal variation within catchments (intra-catchment variability) and variation between catchments (inter-catchment variability). The equations for <inline-formula><mml:math id="M25" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> and PBIAS are provided in Appendix C.</p>
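      <p>The overall and catchment-level metrics can be sketched as follows. The PBIAS sign convention used here (positive values indicate overestimation) is an assumption chosen to match the shading described for Fig. 4; the function names and toy data are hypothetical, and Appendix C remains the reference for the exact equations.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def r_squared(obs, sim):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    ss_res = np.sum((obs - sim) ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def pbias(obs, sim):
    """Percentage bias; positive values mean overestimation (assumed convention)."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return 100.0 * np.sum(sim - obs) / np.sum(obs)


def catchment_r2(catchment_ids, obs, sim, min_events=10):
    """R2 per catchment, keeping only catchments with at least min_events events."""
    ids = np.asarray(catchment_ids)
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    scores = {}
    for cid in np.unique(ids):
        mask = ids == cid
        if mask.sum() >= min_events:
            scores[cid.item()] = r_squared(obs[mask], sim[mask])
    return scores


# Hypothetical test-set events: catchment "A" has 10 events, "B" only 2
toy_ids = ["A"] * 10 + ["B"] * 2
toy_obs = np.arange(1.0, 13.0)
toy_sim = 0.9 * toy_obs                      # systematic 10 % underestimation
overall_r2 = r_squared(toy_obs, toy_sim)
overall_pbias = pbias(toy_obs, toy_sim)
per_catchment = catchment_r2(toy_ids, toy_obs, toy_sim)
median_r2 = float(np.median(list(per_catchment.values())))
```
]]></preformat>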
<sec id="Ch1.S2.SS6.SSSx1" specific-use="unnumbered">
  <title>Significance of <inline-formula><mml:math id="M26" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> change across model generations</title>
      <p id="d2e889">To evaluate the incremental effect of feature set additions on model performance, paired randomized permutation tests were conducted for the <inline-formula><mml:math id="M27" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> results. This method assesses whether changes in <inline-formula><mml:math id="M28" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> are statistically significant between successive feature sets or when compared to the baseline model (feature set 1), which contains latitude and longitude. Flood event predictability inherently varies due to hydro-meteorological conditions and the physical characteristics of catchments <xref ref-type="bibr" rid="bib1.bibx14 bib1.bibx30" id="paren.44"/>, which can introduce day-to-day and catchment-to-catchment variability in model performance. By pairing predictions for the same flood events, we control for this variability in event predictability, enabling robust comparisons across successive feature sets. Importantly, this approach does not assume normality, making it suitable for datasets with limited or heterogeneous samples. The analysis was applied to the UK sample and to each regional sample for all models. See the full equations in Appendix C.</p>
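      <p>One common way to implement such a paired test is sketched below; it is an assumed formulation, and Appendix C remains the reference for the procedure actually used. For each event, the predictions of the two models are swapped at random, the difference in <monospace>R2</monospace> is recomputed, and the two-sided p-value is the share of permutations whose absolute difference is at least as large as the observed one. The function names, the number of permutations, and the synthetic data are hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def r_squared(obs, sim):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)


def paired_permutation_test(obs, pred_a, pred_b, n_permutations=2000, seed=0):
    """Two-sided paired permutation test for R2(model B) - R2(model A)."""
    rng = np.random.default_rng(seed)
    obs, pred_a, pred_b = (np.asarray(v, dtype=float) for v in (obs, pred_a, pred_b))
    observed = r_squared(obs, pred_b) - r_squared(obs, pred_a)

    exceed = 0
    for _ in range(n_permutations):
        # Swap the two models' predictions for a random subset of events
        swap = rng.random(obs.size) < 0.5
        perm_a = np.where(swap, pred_b, pred_a)
        perm_b = np.where(swap, pred_a, pred_b)
        delta = r_squared(obs, perm_b) - r_squared(obs, perm_a)
        if abs(delta) >= abs(observed):
            exceed += 1

    # The +1 terms count the observed labelling as one of the permutations
    p_value = (exceed + 1) / (n_permutations + 1)
    return observed, p_value


# Synthetic example (not study data): model B has smaller errors than model A
demo_rng = np.random.default_rng(1)
obs = demo_rng.normal(10.0, 3.0, size=200)
pred_a = obs + demo_rng.normal(0.0, 2.0, size=200)
pred_b = obs + demo_rng.normal(0.0, 0.5, size=200)
delta_r2, p_value = paired_permutation_test(obs, pred_a, pred_b)
```
]]></preformat>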
</sec>
</sec>
<sec id="Ch1.S2.SS7">
  <label>2.7</label><title>Model interpretability with SHAP (SHapley additive exPlanations)</title>
      <p id="d2e926">SHAP values quantify each feature’s contribution to individual predictions, providing insights into local and global influences on the target variable <xref ref-type="bibr" rid="bib1.bibx41" id="paren.45"/>. By applying SHAP analysis to the test sets for the 7 sets of models, model interpretability is assessed on unseen data. This approach allows for the evaluation of generalization performance and feature influence during the independent testing period. SHAP, derived from cooperative game theory, is a powerful explainable AI tool used to interpret the outputs of machine learning models <xref ref-type="bibr" rid="bib1.bibx52" id="paren.46"/>. For each prediction in the test set, SHAP values represent the extent to which a feature contributes to deviations from the mean prediction <xref ref-type="bibr" rid="bib1.bibx89 bib1.bibx92" id="paren.47"/>. Compared to feature importance derived from Gini impurity, SHAP is more robust as it provides insights into both the direction and magnitude of the relationship between predictors and the target variable and enables analysis of local and global effects <xref ref-type="bibr" rid="bib1.bibx53 bib1.bibx52" id="paren.48"/>. See equations for SHAP calculation in Appendix C.</p>
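      <p>To make the definition concrete, the sketch below computes exact Shapley values by enumerating all feature coalitions and replacing absent features with values from a background sample. This brute-force formulation is an illustrative assumption that is only feasible for a handful of features; it is not the algorithm of the SHAP package, and the function names, the linear toy model, and the data are hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
import itertools
import math

import numpy as np


def exact_shapley(predict, x, background):
    """Exact Shapley values for one sample x.

    Features outside a coalition are marginalised by averaging the model
    output over the rows of `background`.
    """
    x = np.asarray(x, dtype=float)
    background = np.asarray(background, dtype=float)
    n = x.size

    def value(subset):
        data = background.copy()
        idx = np.array(subset, dtype=int)
        data[:, idx] = x[idx]            # fix coalition features to their values in x
        return predict(data).mean()

    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = (
                math.factorial(size) * math.factorial(n - size - 1) / math.factorial(n)
            )
            for subset in itertools.combinations(others, size):
                phi[i] += weight * (value(subset + (i,)) - value(subset))
    return phi


# Hypothetical linear model and data (not study data)
weights = np.array([2.0, -1.0, 0.5])


def linear_model(data):
    return data @ weights


rng = np.random.default_rng(0)
background = rng.normal(size=(50, 3))
samples = rng.normal(size=(5, 3))

shap_values = np.array([exact_shapley(linear_model, s, background) for s in samples])
mean_abs_shap = np.abs(shap_values).mean(axis=0)   # global ranking criterion
ranking = np.argsort(mean_abs_shap)[::-1]          # most influential feature first
```
]]></preformat>
      <p>For a linear model, each value reduces to the feature weight multiplied by the feature's deviation from its background mean, which offers a simple check of the enumeration; the values for one sample also sum to the difference between its prediction and the mean background prediction.</p>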
</sec>
</sec>
<sec id="Ch1.S3">
  <label>3</label><title>Results</title>
      <p id="d2e951">Cyclonic MO-30 WPs have the highest conditional probabilities on winter POT flood-event days across most hydro-climatic regions and in the UK-wide sample (Fig. <xref ref-type="fig" rid="F2"/>a). In the UK sample, WP 30 has the highest event-day conditional probability and occurs on 19 % of flood-event days (Fig. <xref ref-type="fig" rid="F2"/>a). Other cyclonic types (including WPs 20, 21, and 29) also show elevated conditional probabilities, with regional contrasts in the most frequent WP (Fig. <xref ref-type="fig" rid="F2"/>a). In North Scotland (NS), WP 23 is the most prominent event-day type and occurs on 16 % of flood-event days (Fig. <xref ref-type="fig" rid="F2"/>a). Blank cells indicate region–WP combinations with no recorded flood events during the study period (Fig. <xref ref-type="fig" rid="F2"/>b). Three days prior to flood events, WP 30 is less dominant than on the event day, while other higher-numbered cyclonic types (notably WPs 20 and 29) occur more frequently across multiple regions (Fig. <xref ref-type="fig" rid="F2"/>b).</p>

      <fig id="F2" specific-use="star"><label>Figure 2</label><caption><p id="d2e969">Conditional probabilities of WPs associated with flood days.  Panel <bold>(a)</bold> shows conditional probabilities on the day of the flood, and panel <bold>(b)</bold> shows probabilities of the AWPs three days prior to the flood. Regions are indicated on the <inline-formula><mml:math id="M29" display="inline"><mml:mi>y</mml:mi></mml:math></inline-formula> axis (abbreviated as in Table <xref ref-type="table" rid="TA3"/>) and WP categories on the <inline-formula><mml:math id="M30" display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula> axis.  The color scale represents the conditional probability value between 0 and 1, with blank cells indicating no recorded events for that region–WP combination.  </p></caption>
        <graphic xlink:href="https://hess.copernicus.org/articles/30/2135/2026/hess-30-2135-2026-f02.png"/>

      </fig>

      <p id="d2e1000">Across all catchments, the distributions of winter POT flood magnitudes differ by WP, with WP 23 associated with higher typical magnitudes (higher median and mean) than the other selected patterns, while WP 30 shows a comparatively lower central tendency but a wide spread (Fig. <xref ref-type="fig" rid="F3"/>a). When stratified by catchment area, small catchments exhibit the widest ranges and inter-quartile spreads in flood magnitudes for several WPs, particularly WPs 23 and 30, whereas the medium and large catchment groups show narrower distributions overall with reduced spread relative to the small catchment tercile (Fig. <xref ref-type="fig" rid="F3"/>b).</p>

      <fig id="F3" specific-use="star"><label>Figure 3</label><caption><p id="d2e1010">Comparison of flood magnitude (<inline-formula><mml:math id="M31" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">mm</mml:mi><mml:mspace width="0.125em" linebreak="nobreak"/><mml:msup><mml:mi mathvariant="normal">d</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>) distributions under selected weather patterns (WPs 20, 21, 23, 29 and 30). Panel <bold>(a)</bold> shows the distribution across all natural catchments. Panel <bold>(b)</bold> shows the distribution stratified by catchment area: Small (3.12–66.82 <inline-formula><mml:math id="M32" display="inline"><mml:mrow class="unit"><mml:msup><mml:mi mathvariant="normal">km</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula>), Medium (66.82–194.81 <inline-formula><mml:math id="M33" display="inline"><mml:mrow class="unit"><mml:msup><mml:mi mathvariant="normal">km</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula>), and Large (194.81–1505.54 <inline-formula><mml:math id="M34" display="inline"><mml:mrow class="unit"><mml:msup><mml:mi mathvariant="normal">km</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula>). These terciles are equally sized to represent three bins of catchment area data. Boxplots show the interquartile range (IQR; Q1–Q3), the median as a solid black line, and the mean as a black dotted line; whiskers extend to 1.5 <inline-formula><mml:math id="M35" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> IQR, and outliers beyond that are omitted for clarity.</p></caption>
        <graphic xlink:href="https://hess.copernicus.org/articles/30/2135/2026/hess-30-2135-2026-f03.png"/>

      </fig>

      <p id="d2e1083">The UK pooled model achieves consistently higher test-set <inline-formula><mml:math id="M36" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> values than the regional models across all feature sets, with peak performance in feature sets 6–7 (maximum <inline-formula><mml:math id="M37" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0.84</mml:mn></mml:mrow></mml:math></inline-formula> in set 6 and <inline-formula><mml:math id="M38" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0.83</mml:mn></mml:mrow></mml:math></inline-formula> in set 7) (Fig. <xref ref-type="fig" rid="F4"/>a). Regional performance varies, with SW showing the highest regional <inline-formula><mml:math id="M39" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> (0.83 in set 6; 0.82 in set 7) and CEE the lowest performance (0.37 in set 6; 0.33 in set 7) (Fig. <xref ref-type="fig" rid="F4"/>a). In most samples, the largest increases in <inline-formula><mml:math id="M40" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> occur when hydrometeorological predictors and antecedent precipitation indices are added (feature sets 5–6), whereas adding WP and AWP predictors (feature sets 2–3) generally does not improve <inline-formula><mml:math id="M41" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> relative to the feature set 1 baseline (Fig. <xref ref-type="fig" rid="F4"/>a). 
Bold values with an asterisk indicate statistically significant differences in <inline-formula><mml:math id="M42" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> relative to the feature set 1 baseline (Fig. <xref ref-type="fig" rid="F4"/>a). Across feature sets, PBIAS is predominantly negative in most samples for the higher feature sets, indicating a tendency to underestimate peak flood magnitudes in the test period (Fig. <xref ref-type="fig" rid="F4"/>b). Bias magnitude differs by region, with the most negative PBIAS values in CEE, while NS shows comparatively positive PBIAS values (Fig. <xref ref-type="fig" rid="F4"/>b).</p>

      <fig id="F4" specific-use="star"><label>Figure 4</label><caption><p id="d2e1187">Comparison of UK and regional model results across model feature sets. Panel <bold>(a)</bold> shows <inline-formula><mml:math id="M43" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> and panel <bold>(b)</bold> shows percentage bias (PBIAS). In panel <bold>(a)</bold>, bold values with an asterisk (*) denote statistically significant changes in <inline-formula><mml:math id="M44" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> compared to the feature set 1 models. In panel <bold>(b)</bold>, blue shading indicates overestimation (positive PBIAS) and red shading indicates underestimation (negative PBIAS).</p></caption>
        <graphic xlink:href="https://hess.copernicus.org/articles/30/2135/2026/hess-30-2135-2026-f04.png"/>

      </fig>

      <p id="d2e1231">Catchment-level test-set performance shows substantial within-region variability in <inline-formula><mml:math id="M45" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> across matched catchments (Fig. <xref ref-type="fig" rid="F5"/>a). The <inline-formula><mml:math id="M49" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> performance also differs between regions: SS, ES and NW show comparatively higher median <inline-formula><mml:math id="M50" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> values, whereas the CEE and SE regions exhibit lower medians and a slightly greater concentration of low-skill catchments (Fig. <xref ref-type="fig" rid="F5"/>a). 
The head-to-head comparison indicates that differences in model skill are catchment-specific, with <inline-formula><mml:math id="M51" display="inline"><mml:mrow><mml:mi mathvariant="normal">Δ</mml:mi><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> (regional minus UK) varying in sign and magnitude across the country (Fig. <xref ref-type="fig" rid="F5"/>b). Positive <inline-formula><mml:math id="M52" display="inline"><mml:mrow><mml:mi mathvariant="normal">Δ</mml:mi><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> values (red shading) identify catchments where the regional model achieves higher <inline-formula><mml:math id="M53" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula>, whereas negative values (blue shading) indicate catchments where the UK model performs better. The mixed pattern of red and blue points suggests that neither approach consistently dominates and that any regional-model advantage is spatially heterogeneous rather than uniform (Fig. <xref ref-type="fig" rid="F5"/>b). In several regions, the distributions for the UK and regional models overlap strongly, implying that improvements from regional models are modest for many catchments but can be more pronounced for specific locations, as indicated by the tails of the <inline-formula><mml:math id="M54" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M55" display="inline"><mml:mrow><mml:mi mathvariant="normal">Δ</mml:mi><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> distributions (Fig. <xref ref-type="fig" rid="F5"/>a and b). 
Overall, across all matched catchments, the UK model attains higher <inline-formula><mml:math id="M56" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> more often than the regional models (54.9 % versus 45.1 %), indicating that regional modeling does not yield a consistent overall advantage despite some catchment-specific gains (Appendix Fig. <xref ref-type="fig" rid="FB3"/>).</p>

      <fig id="F5" specific-use="star"><label>Figure 5</label><caption><p id="d2e1393">Comparison of UK and regional model performance at the catchment scale. <bold>(a)</bold> Boxplots of catchment <inline-formula><mml:math id="M57" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> for matched catchments across regions. Each box shows the interquartile range (IQR; 25th–75th percentile) with the median as a solid black line. <bold>(b)</bold> Spatial distribution of <inline-formula><mml:math id="M58" display="inline"><mml:mrow><mml:mi mathvariant="normal">Δ</mml:mi><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> (regional minus UK) across the UK, where red shading indicates higher performance of the regional model and blue shading indicates higher performance of the UK model.</p></caption>
        <graphic xlink:href="https://hess.copernicus.org/articles/30/2135/2026/hess-30-2135-2026-f05.png"/>

      </fig>

      <p id="d2e1433">Across the UK and regional feature set 7 models, the most influential predictors (ranked by mean absolute SHAP value) generally combine climatic context with event-scale forcing, with event-day precipitation and related antecedent precipitation indices frequently appearing among the highest-ranked variables (Fig. <xref ref-type="fig" rid="F6"/>). The SHAP value distributions indicate directionality, with higher precipitation values typically associated with positive SHAP values (higher predicted flood magnitudes) and lower precipitation values associated with negative SHAP values (Fig. <xref ref-type="fig" rid="F6"/>). Predictor retention and relative importance vary between regions, reflecting differences in the final pruned feature sets (Fig. <xref ref-type="fig" rid="F6"/>).</p>

      <fig id="F6" specific-use="star"><label>Figure 6</label><caption><p id="d2e1444">SHAP summary (beeswarm) plots for the UK and all regional final (feature-pruned) models (feature set 7). The number of features differs across each region. Each panel shows SHAP values representing the impact of each predictor on model output.  Predictors (<inline-formula><mml:math id="M59" display="inline"><mml:mi>y</mml:mi></mml:math></inline-formula> axis) are ranked by their mean absolute SHAP value, with the most influential at the top. The <inline-formula><mml:math id="M60" display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula> axis represents the magnitude and direction of each feature’s influence, where positive values indicate increased predicted flood magnitude and negative values indicate a reduction. The <inline-formula><mml:math id="M61" display="inline"><mml:mi>x</mml:mi></mml:math></inline-formula> axis values vary across regions. Each point corresponds to an individual prediction, with color indicating the predictor value (red <inline-formula><mml:math id="M62" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> high, blue <inline-formula><mml:math id="M63" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> low).  Definitions and units of all variables are provided in Table <xref ref-type="table" rid="TA2"/>.</p></caption>
        <graphic xlink:href="https://hess.copernicus.org/articles/30/2135/2026/hess-30-2135-2026-f06.png"/>

      </fig>

      <p id="d2e1491">Across the UK and regional feature set 7 models, the top ten predictors ranked by mean absolute SHAP (<inline-formula><mml:math id="M64" display="inline"><mml:mrow><mml:mo>|</mml:mo><mml:mtext>SHAP</mml:mtext><mml:mo>|</mml:mo></mml:mrow></mml:math></inline-formula>) show that the UK model is dominated by climatic context and precipitation forcing, with the aridity index as the most influential predictor, followed by event-day precipitation and antecedent precipitation indices (Fig. <xref ref-type="fig" rid="F7"/>). Regional rankings show both similarities and differences relative to the UK model: precipitation-related predictors remain prominent in several regions, while some regions show a larger contribution from static catchment attributes (e.g., baseflow index, area, or elevation) among the top-ranked predictors (Fig. <xref ref-type="fig" rid="F7"/>).</p>

      <fig id="F7" specific-use="star"><label>Figure 7</label><caption><p id="d2e1512">Final (pruned feature set 7) models. Mean absolute SHAP (<inline-formula><mml:math id="M65" display="inline"><mml:mrow><mml:mo>|</mml:mo><mml:mtext>SHAP</mml:mtext><mml:mo>|</mml:mo></mml:mrow></mml:math></inline-formula>) values are shown for the top ten predictors in the UK and regional feature set 7 models. Higher bars indicate stronger average influence on predicted flood magnitudes. Full variable descriptions, units, and calculation methods are listed in Table <xref ref-type="table" rid="TA2"/>.</p></caption>
        <graphic xlink:href="https://hess.copernicus.org/articles/30/2135/2026/hess-30-2135-2026-f07.png"/>

      </fig>

</sec>
<sec id="Ch1.S4">
  <label>4</label><title>Discussion</title>
<sec id="Ch1.S4.SS1">
  <label>4.1</label><title>Conditional probability and distribution</title>
      <p id="d2e1550">The dominance of cyclonic WPs on flood-event days (Fig. <xref ref-type="fig" rid="F2"/>a), such as WP 30 in the UK sample, is consistent with previous work linking these synoptic regimes to enhanced UK precipitation and flooding potential <xref ref-type="bibr" rid="bib1.bibx74 bib1.bibx63" id="paren.49"/>. Regional differences in the most frequent flood-associated WPs (e.g., WP 23 in NS) indicate that synoptic circulation provides a broad-scale atmospheric context. However, the resulting flood response likely depends on how that context interacts with regional hydro-climatic conditions and catchment properties <xref ref-type="bibr" rid="bib1.bibx29 bib1.bibx28 bib1.bibx6" id="paren.50"/>. Spatial variability may also reflect differences in the precipitation footprints and storm tracks associated with individual WPs. Coastal exposure and orographic enhancement in western regions may also play a role.</p>
      <p id="d2e1561">WP 21 provides an example of how frequency on flood-event days can differ from circulation types associated with widespread extreme precipitation. Previous analyses have shown that WP 21 can be associated with elevated precipitation across multiple UK regions <xref ref-type="bibr" rid="bib1.bibx74 bib1.bibx75" id="paren.51"/>. In the present analysis, WP 21 occurs frequently on flood-event days in several regions (notably SE, SS, and NW; Fig. <xref ref-type="fig" rid="F2"/>a), but WP 30 remains the most frequent type overall. This reinforces that the circulation type most strongly linked to extreme precipitation does not necessarily correspond to the type that co-occurs most frequently with flood events. This is because flood occurrence and magnitude also depend on antecedent wetness, catchment storage, and hydrological memory <xref ref-type="bibr" rid="bib1.bibx84 bib1.bibx12" id="paren.52"/>. Consequently, the same WP can produce different flood responses in different catchments and under different pre-event conditions.</p>
      <p id="d2e1572">The antecedent analysis (Fig. <xref ref-type="fig" rid="F2"/>b) further indicates that the synoptic context preceding flood events is not limited to the event-day circulation type. The reduced dominance of WP 30 three days prior to events, alongside more frequent occurrence of other cyclonic types (e.g., WPs 20 and 29), is consistent with the importance of multi-day circulation sequences in conditioning catchment wetness through cumulative rainfall <xref ref-type="bibr" rid="bib1.bibx5 bib1.bibx6 bib1.bibx12" id="paren.53"/>. These results motivate the inclusion of antecedent circulation descriptors as candidate predictors in the subsequent modeling framework, while also highlighting the potential for non-linear interactions between atmospheric regime, precipitation, and catchment state.</p>
      <p id="d2e1580">The conditional probability patterns (Fig. <xref ref-type="fig" rid="F2"/>a) provide synoptic-scale context for winter flood-event days, but the translation from circulation type to flood response is not one-to-one. The same WP can coincide with flood events more frequently in steep, fast-response catchments that are already wet, but less frequently in drier or more permeable catchments with higher infiltration capacity, reflecting the importance of antecedent wetness, storage, and hydrological memory <xref ref-type="bibr" rid="bib1.bibx84 bib1.bibx12" id="paren.54"/>.</p>
      <p id="d2e1589">The antecedent analysis (Fig. <xref ref-type="fig" rid="F2"/>b) indicates persistence of cyclonic circulation in the days preceding flood events. In particular, higher-numbered cyclonic types are more common three days prior to events, while WP 30 becomes less dominant and WPs 20 and 29 occur more frequently. This supports the interpretation that multi-day synoptic sequences condition catchment wetness through cumulative rainfall and soil saturation, which can increase the likelihood of flood generation <xref ref-type="bibr" rid="bib1.bibx5 bib1.bibx6 bib1.bibx12" id="paren.55"/>.</p>
      <p id="d2e1597">The magnitude distributions (Fig. <xref ref-type="fig" rid="F3"/>) demonstrate that WPs associated with frequent flood-event occurrences are not necessarily those associated with the highest flood magnitudes. Although WP 30 most frequently co-occurs with flood-event days (Fig. <xref ref-type="fig" rid="F2"/>), it is not associated with the highest median or mean flood magnitudes (Fig. <xref ref-type="fig" rid="F3"/>a). WP 30 represents a broad cyclonic regime that can persist for several days and affect large areas, which may favor frequent flood-event occurrences rather than the largest peak magnitudes <xref ref-type="bibr" rid="bib1.bibx62" id="paren.56"/>. WP 30 may therefore drive lower-intensity but longer-duration events, especially in larger catchments where prolonged rainfall is a more important factor for flood generation, and so be linked more closely to event duration than to event peak. In contrast, WP 23 is associated with higher flood magnitudes in this event-based analysis, highlighting the importance of considering both the frequency of flood-conducive circulation types and the flood magnitudes linked to those types <xref ref-type="bibr" rid="bib1.bibx62" id="paren.57"/>.</p>
      <p id="d2e1612">Catchment size modulates these WP-magnitude relationships (Fig. <xref ref-type="fig" rid="F3"/>b). Smaller catchments display wider spreads and higher upper-tail magnitudes under WPs 23 and 30, consistent with their shorter response times and sensitivity to intense rainfall. Larger catchments exhibit narrower distributions, reflecting spatial averaging and storage effects that can dampen peak responses. These differences further support that synoptic-scale regimes provide a necessary but insufficient descriptor of event-scale flood magnitude without accounting for catchment state and response characteristics.</p>
      <p id="d2e1617">Finally, changes in the frequency of specific circulation types could have implications for future flood hazards. For example, under RCP8.5, <xref ref-type="bibr" rid="bib1.bibx72" id="text.58"/> reported an increase in the occurrence of WP 23, which may be particularly relevant for regions where this WP is already prominent on flood-event days (e.g., NS; Fig. <xref ref-type="fig" rid="F2"/>). This motivates future work linking circulation-type projections to hydrological impacts using predictors that better resolve catchment states and event-scale forcing.</p>
</sec>
<sec id="Ch1.S4.SS2">
  <label>4.2</label><title>UK and regional model performance</title>
      <p id="d2e1633">Feature set 7 is the final pruned specification. Latitude/longitude and WP/AWP predictors were excluded, and the remaining predictors were pruned for collinearity (Appendix Fig. <xref ref-type="fig" rid="FB1"/>). This means that the results for feature set 7 reflect skill attributable to the retained hydrometeorological and catchment-relevant predictors, while spatial proxies and WP/AWP categories do not contribute to (and in some regions reduce) predictive performance. A brief sensitivity check indicated that increasing model complexity did not materially change test performance, supporting the robustness of the final specification.</p>
      <p id="d2e1638">The UK pooled model consistently outperforms the regional models across feature sets, reaching <inline-formula><mml:math id="M66" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0.84</mml:mn></mml:mrow></mml:math></inline-formula> in feature set 6 and <inline-formula><mml:math id="M67" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0.83</mml:mn></mml:mrow></mml:math></inline-formula> in feature set 7 (Fig. <xref ref-type="fig" rid="F4"/>a). The statistically significant improvement from the feature set 1 baseline (<inline-formula><mml:math id="M68" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0.66</mml:mn></mml:mrow></mml:math></inline-formula>) is consistent with the advantages of pooling information across a larger and more diverse set of near-natural catchments, which can improve the ability of machine-learning models to learn generalizable relationships from limited extreme-event samples <xref ref-type="bibr" rid="bib1.bibx40 bib1.bibx38 bib1.bibx82" id="paren.59"/>.</p>
      <p id="d2e1691">Across regions, model performance varies substantially, reflecting differences in dominant hydrological regimes, within-region heterogeneity, and event sample composition. The SW region achieves the highest regional performance (<inline-formula><mml:math id="M69" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0.83</mml:mn></mml:mrow></mml:math></inline-formula> in feature set 6; <inline-formula><mml:math id="M70" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0.82</mml:mn></mml:mrow></mml:math></inline-formula> in feature set 7), while the NW exhibits a large improvement relative to baseline (<inline-formula><mml:math id="M71" display="inline"><mml:mrow><mml:mo>+</mml:mo><mml:mn mathvariant="normal">0.35</mml:mn></mml:mrow></mml:math></inline-formula>) reaching <inline-formula><mml:math id="M72" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0.75</mml:mn></mml:mrow></mml:math></inline-formula>. In contrast, SS and ES show more modest improvements, reaching feature set 7 <inline-formula><mml:math id="M73" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> values of 0.60 and 0.54, respectively, consistent with the constraints of smaller regional samples.</p>
      <p id="d2e1760">CEE exhibits the lowest performance (<inline-formula><mml:math id="M74" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:mo>=</mml:mo><mml:mn mathvariant="normal">0.37</mml:mn></mml:mrow></mml:math></inline-formula> in feature set 6 and 0.33 in feature set 7), despite having the largest dataset. This region's low-relief, permeable, chalk-dominated terrain means that floods can be driven more by groundwater and storage than by event precipitation <xref ref-type="bibr" rid="bib1.bibx43 bib1.bibx16" id="paren.60"/>. The CEE region is therefore likely less sensitive to short-duration rainfall predictors. Similarly, the SE region has low performance, reaching an <inline-formula><mml:math id="M75" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> of 0.49 with hydrometeorological features and up to 0.60 by feature set 7, although this improvement is not statistically significant compared to the baseline model (feature set 1). This aligns with previous work showing that the SE is difficult to model without explicit groundwater representation <xref ref-type="bibr" rid="bib1.bibx48" id="paren.61"/>. These results underscore that regions dominated by slower-response processes require predictors that capture longer-term hydrological memory.</p>
      <p id="d2e1796">The changes in skill across feature sets highlight which predictor groups are most informative for event-scale flood magnitude estimation. In most regions, the inclusion of WP and AWP predictors (feature sets 2–3) does not improve performance relative to the spatial baseline and can reduce skill, suggesting limited incremental predictive information at daily resolution for peak magnitude estimation. This is consistent with the view that synoptic-scale circulation descriptors may not resolve the fine-scale variability associated with local precipitation extremes and catchment-scale runoff generation processes <xref ref-type="bibr" rid="bib1.bibx44" id="paren.62"/>. In contrast, the largest gains in <inline-formula><mml:math id="M76" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> occur once event-day hydrometeorological variables and short-window antecedent precipitation indices are included (feature sets 5–6), reinforcing the importance of forcing and antecedent wetness in conditioning peak magnitudes in many UK catchments <xref ref-type="bibr" rid="bib1.bibx8 bib1.bibx6 bib1.bibx5" id="paren.63"/>.</p>
      <p id="d2e1816">In feature set 7, performance remains similar to feature set 6 in the UK model and in several regional models, indicating that the pruning step removes redundant predictors while retaining most of the information relevant to prediction. The small reduction in <inline-formula><mml:math id="M77" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> between feature sets 6 and 7 reflects the expected trade-off between interpretability and predictive performance and supports the use of a reduced, physically interpretable predictor set for subsequent process attribution.</p>
      <p id="d2e1830">PBIAS results (Fig. <xref ref-type="fig" rid="F4"/>b) indicate a general tendency towards the underestimation of peak magnitudes in the test period across many regions, consistent with the scarcity of observations in the upper tail and the inherent difficulty of learning extreme responses from limited event samples. NS shows comparatively smaller bias and includes cases of overestimation, suggesting region-specific differences in error structure. The largest biases in CEE co-occur with low <inline-formula><mml:math id="M78" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula>, further indicating that key storage-related processes are not fully represented by the current predictor set. Predictive uncertainty derived from RF ensemble spread (Appendix Fig. <xref ref-type="fig" rid="FB2"/>) provides additional context, with generally low median uncertainty across samples but increased uncertainty in regions with lower skill.</p>
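      <p>As a hedged illustration of how the sign of PBIAS maps onto under- or overestimation, the following minimal sketch computes percent bias with NumPy. The function name pbias and the sign convention (predicted minus observed, so that negative values indicate aggregate underestimation) are assumptions made for this example. The opposite convention is also in common use, so the sign should be checked against the definition adopted in the methods. The input values are invented for illustration and are not results from this study.</p>
      <preformat preformat-type="code">
```python
import numpy as np


def pbias(observed, predicted):
    """Percent bias of predictions relative to observations.

    Assumed sign convention: negative values mean the predictions
    underestimate the observations on aggregate.
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 100.0 * np.sum(predicted - observed) / np.sum(observed)


# Hypothetical peak magnitudes for three test events (illustrative only).
obs = [120.0, 80.0, 200.0]
pred = [100.0, 85.0, 170.0]

# Two of the three peaks are underestimated, so the aggregate bias is negative.
print(round(pbias(obs, pred), 2))
```
      </preformat>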
</sec>
<sec id="Ch1.S4.SS3">
  <label>4.3</label><title>Catchment-scale performance</title>
      <p id="d2e1856">Catchment-level evaluation complements pooled-scale metrics by isolating within-catchment temporal predictability in the test period. Unlike the aggregate results discussed previously, which combine inter- and intra-catchment variability across regions, this analysis calculates <inline-formula><mml:math id="M79" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> individually for each catchment based on its test events and then compares these values between the UK and the corresponding regional models. Figure <xref ref-type="fig" rid="F5"/> shows these per-catchment <inline-formula><mml:math id="M80" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> distributions, with matched comparisons between the UK and regional models. At the regional scale, the southwest (SW) maintained high aggregated <inline-formula><mml:math id="M81" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> values, but per-catchment performance revealed wide variability (ranging from <inline-formula><mml:math id="M82" display="inline"><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">0.94</mml:mn></mml:mrow></mml:math></inline-formula> to 0.69), highlighting heterogeneity in local dynamics. This indicates that regional models can generalize well to broader regional trends but may not consistently capture localized hydrological processes driving flood magnitudes. Similar variability was observed in other regions, suggesting that model skill depends strongly on individual catchment characteristics, data density, and the distinct hydrometeorological drivers of extremes. Across all regions, the UK model outperformed regional models in 54.9 % of matched catchments (Appendix Fig. <xref ref-type="fig" rid="FB3"/>), consistent with the view that large-sample pooled models generally provide stronger generalization and robustness <xref ref-type="bibr" rid="bib1.bibx38 bib1.bibx40 bib1.bibx82" id="paren.64"/>.</p>
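      <p>The matched per-catchment comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this study: the function names, the catchment labels, and all numerical values are hypothetical. The coefficient of determination is computed separately from each catchment's test events and becomes negative when predictions perform worse than that catchment's own mean, which is how negative per-catchment values can arise even when pooled scores are high.</p>
      <preformat preformat-type="code">
```python
import numpy as np


def r2(observed, predicted):
    """Coefficient of determination for a single group of events."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def per_catchment_r2(catchment_ids, observed, predicted):
    """Return {catchment id: R2} using only that catchment's events."""
    ids = np.asarray(catchment_ids)
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return {
        str(cid): r2(observed[ids == cid], predicted[ids == cid])
        for cid in np.unique(ids)
    }


def share_pooled_better(r2_pooled, r2_regional):
    """Fraction of matched catchments where the pooled model scores higher."""
    shared = sorted(set(r2_pooled).intersection(r2_regional))
    wins = sum(r2_pooled[c] > r2_regional[c] for c in shared)
    return wins / len(shared)


# Hypothetical test events for two catchments (illustrative values only).
ids = ["A", "A", "A", "B", "B", "B"]
obs = [1.0, 2.0, 3.0, 2.0, 4.0, 6.0]
pred_pooled = [1.1, 2.0, 2.9, 3.0, 4.0, 5.0]
pred_regional = [1.5, 2.0, 2.5, 2.0, 4.0, 6.0]

r2_pooled = per_catchment_r2(ids, obs, pred_pooled)
r2_regional = per_catchment_r2(ids, obs, pred_regional)
print(share_pooled_better(r2_pooled, r2_regional))
```
      </preformat>
      <p>With fewer than about 20 events per catchment, each per-catchment score rests on a small sample, so individual values from such a calculation should be interpreted with caution.</p>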
      <p id="d2e1910">However, regional models achieved higher performance than the UK-wide model in 45.1 % of catchments, demonstrating that locally trained models can still outperform larger models when regional hydrological characteristics are strongly distinct. Scatter plots of per-catchment <inline-formula><mml:math id="M83" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> values in Appendix Fig. <xref ref-type="fig" rid="FB4"/> show that regional models sometimes outperform the UK model, particularly in regions with coherent hydrological regimes, suggesting that local models retain value where region-specific flood processes dominate. The stronger overall UK performance reflects the model's ability to leverage greater data diversity and capture inter-catchment variability, consistent with established findings in large-sample hydrology <xref ref-type="bibr" rid="bib1.bibx40 bib1.bibx38 bib1.bibx82" id="paren.65"/>. Despite this, the relatively modest catchment-level <inline-formula><mml:math id="M84" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> values in many regions highlight persistent challenges in modeling intra-catchment variability of the largest fluvial floods at the event scale. Limited extreme event records for individual catchments (often <inline-formula><mml:math id="M85" display="inline"><mml:mrow><mml:mo>&lt;</mml:mo><mml:mn mathvariant="normal">20</mml:mn></mml:mrow></mml:math></inline-formula> events) constrain model learning and increase noise, reducing temporal prediction reliability. These data constraints likely explain some of the low or negative <inline-formula><mml:math id="M86" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> values observed in smaller or more heterogeneous catchments. Overall, this analysis reveals that while pooled UK models achieve higher generalization, regional and catchment-specific processes still drive substantial local variability. Aggregated metrics may therefore mask poor local model skill, emphasizing the importance of targeted feature engineering and model designs capable of balancing generalization with sensitivity to local catchment dynamics.</p>
</sec>
<sec id="Ch1.S4.SS4">
  <label>4.4</label><title>Feature importance</title>
      <p id="d2e1970">Our analysis of dominant processes indicates that flood magnitude predictions are primarily controlled by a combination of climatic context, catchment features, and event-scale forcing, with aridity, catchment area, and precipitation-related features consistently among the most important (Figs. <xref ref-type="fig" rid="F6"/> and <xref ref-type="fig" rid="F7"/>). SHAP summary plots show both the direction and magnitude of individual predictor effects on model output, whereas mean absolute SHAP values summarize the overall contribution of each predictor across all events (Figs. <xref ref-type="fig" rid="F6"/> and <xref ref-type="fig" rid="F7"/>). Regional differences in predictor rankings indicate that the relative importance of climatic context, hydrometeorological forcing, and static catchment attributes varies between hydro-climatic regimes (Figs. <xref ref-type="fig" rid="F6"/> and <xref ref-type="fig" rid="F7"/>).</p>
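      <p>To make the distinction between signed SHAP values and mean absolute SHAP values concrete, the sketch below ranks predictors by mean absolute SHAP from a matrix of per-event SHAP values (rows are events, columns are predictors). It assumes such a matrix has already been obtained from an explainer. The function name, the feature names, and the numbers are hypothetical and do not reproduce values from this study.</p>
      <preformat preformat-type="code">
```python
import numpy as np


def mean_abs_shap(shap_values, feature_names):
    """Rank features by mean absolute SHAP value, largest first."""
    shap_values = np.asarray(shap_values, dtype=float)
    # Averaging absolute values discards the sign (direction of effect)
    # and keeps only the typical size of each feature's contribution.
    importance = np.abs(shap_values).mean(axis=0)
    order = np.argsort(importance)[::-1]
    return [(feature_names[i], float(importance[i])) for i in order]


# Hypothetical SHAP values for three events and three predictors.
features = ["aridity", "precip_event_day", "area"]
values = [
    [0.4, -0.2, 0.1],
    [-0.6, 0.3, 0.0],
    [0.5, 0.1, -0.2],
]

ranking = mean_abs_shap(values, features)
for name, score in ranking:
    print(name, round(score, 3))
```
      </preformat>
      <p>In this toy example the first predictor has both positive and negative contributions across events, yet it still ranks highest, because the ranking depends on magnitude only. The direction of each effect is read from the summary plots instead.</p>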
      <p id="d2e1986">In the highest-performing UK model, the aridity index is the most influential feature overall, followed by precipitation (event day) and cumulative precipitation on the day before. The aridity index, though static, represents a long-term control on catchment response by defining the climatic water balance under which hydrological processes operate <xref ref-type="bibr" rid="bib1.bibx16 bib1.bibx18 bib1.bibx56" id="paren.66"/>. Lower aridity (i.e., wetter conditions) is associated with higher flood magnitudes, suggesting that catchments with persistently higher moisture availability are more efficient at converting rainfall into runoff. While the aridity index does not vary from event to event, its importance reflects the long-term hydro-climatic context and background catchment wetness under which individual storms occur, both of which influence flood potential. The aridity index captures broad spatial gradients in climate and hydrology that may previously have been represented by latitude and longitude in the simpler, earlier feature sets. However, aridity provides a more physically interpretable and hydrologically meaningful descriptor of spatial variability across UK catchments.</p>
      <p id="d2e1992">Across both the UK and regional models, precipitation on the event day consistently emerges as a dominant predictor of extreme flood magnitudes, while antecedent precipitation plays a secondary role. In nearly all regions, the SHAP distributions confirm this behavior, with positive SHAP values associated with higher rainfall leading to higher predicted flood magnitudes. This consistency demonstrates that the models capture key physical processes, reflecting similar findings from other UK ML-based hydrological studies <xref ref-type="bibr" rid="bib1.bibx48 bib1.bibx49 bib1.bibx17" id="paren.67"/>.</p>
      <p id="d2e1998">In the SW and NW regional models, feature rankings largely mirror the UK model but reveal important regional nuances. Aridity remains a key control, but antecedent precipitation one day prior to the event gains prominence. In the SW, this could be explained by the role of soil saturation and cumulative rainfall in modulating flood magnitudes <xref ref-type="bibr" rid="bib1.bibx80 bib1.bibx27" id="paren.68"/>. The SW model also attributes greater influence to topographic and temperature-related variables, consistent with elevation-driven orographic enhancement of rainfall and temperature-linked variability in evapotranspiration and antecedent wetness that together shape flood response in steep, maritime catchments <xref ref-type="bibr" rid="bib1.bibx32 bib1.bibx80" id="paren.69"/>. In contrast, lower-performing regions such as CEE and SE show a weaker dominance of dynamic predictors. Here, static features such as baseflow index, area, or elevation rank higher. This further reflects the importance of capturing the slower, groundwater-influenced flood generation processes characteristic of these permeable, low-relief catchments.</p>
      <p id="d2e2008">Finally, while SHAP values provide valuable insight into model behavior, they represent associations rather than direct causal relationships. Non-linear feature interactions may amplify or mask underlying process signals. Therefore, SHAP-based interpretability should be used alongside physical reasoning to avoid over-interpreting model-derived relationships <xref ref-type="bibr" rid="bib1.bibx83" id="paren.70"/>. Overall, the SHAP analysis suggests that the models capture physically consistent mechanisms governing flood magnitudes, dominated by rainfall intensity and climatic wetness, while highlighting region-specific sensitivities to antecedent and physiographic factors.</p>
</sec>
<sec id="Ch1.S4.SS5">
  <label>4.5</label><title>Limitations and future recommendations</title>
      <p id="d2e2023">This study has several limitations that future research can address. First, uncertainty in flow observations remains a major challenge, particularly for extremes. High-flow discharge estimates are often derived by extrapolating stage–discharge rating curves beyond the range of direct gauge measurements and are sensitive to both rating curve uncertainty and stage measurement error, which can be substantial during flood conditions <xref ref-type="bibr" rid="bib1.bibx34 bib1.bibx59 bib1.bibx90" id="paren.71"/>. In addition, the use of daily discharge can under-represent the magnitude of short-lived flood peaks because daily averaging smooths the hydrograph <xref ref-type="bibr" rid="bib1.bibx3" id="paren.72"/>. Future work could include quality-controlled sub-daily discharge where available to provide an improved representation of peak magnitudes <xref ref-type="bibr" rid="bib1.bibx23" id="paren.73"/>.</p>
      <p id="d2e2035">The selection of near-natural catchments was a deliberate decision to isolate natural processes; however, this may have reduced variability and limited the data sample size. Static features have proven to be highly important in the models, but they do not directly capture dynamic processes such as soil moisture infiltration, groundwater levels, and snowmelt. Incorporating additional dynamic variables such as snowmelt and groundwater datasets could significantly improve predictive performance in future work. Uncertainty in precipitation data also constrains model accuracy. No rainfall product will perfectly represent local conditions, and this limitation can be amplified when modeling extremes. The inclusion of WPs was motivated by their use in operational tools (e.g., Fluvial and Coastal Decider) and by their potential to capture more predictable synoptic-scale drivers of floods. However, despite WP-flood associations, their usefulness for flood magnitude estimation proved limited in this study. This is likely due to a scale mismatch between the spatial resolution of the WPs and the localized behavior of flood events. Future research could aim to capture the synoptic and antecedent conditions driving floods more accurately.</p>
      <p id="d2e2038">Furthermore, while aggregated UK and regional metrics provide a useful overall picture, they may obscure substantial heterogeneity at the catchment scale. The lower and more variable catchment-level <inline-formula><mml:math id="M87" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> values observed in this study highlight the importance of developing multi-scale evaluation frameworks that explicitly assess local predictive skill and uncertainty. Finally, the study deliberately focuses on DJF events exceeding the 99th percentile threshold in near-natural catchments, which means that the findings represent a subset of hydrological regimes and may not fully generalize to managed or urbanized systems.</p>
</sec>
</sec>
<sec id="Ch1.S5" sec-type="conclusions">
  <label>5</label><title>Conclusions</title>
      <p id="d2e2062">This study presents a comprehensive feature-incorporation framework for applying ML models to quantify the contributions of different predictor sets in flood magnitude estimation across natural UK catchments. By comparing feature sets consistently for a pooled UK model and multiple regional models, we quantify the extent to which different static and dynamic variables influence model performance and interpretability. The UK model achieved the highest predictive performance in the final model specifications (feature sets 6 and 7; <inline-formula><mml:math id="M88" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> 0.83–0.84), demonstrating the benefits of large-sample pooling and of diverse training data for capturing broad hydrological variability. However, catchment-level evaluation showed substantial heterogeneity in skill within each region. Some catchments are predicted well, whereas others remain challenging for both the pooled and regional models. This variability indicates that achieving consistently high performance at local scales depends on representing fine-scale catchment properties and event-specific processes that are not fully captured by the available predictors.</p>
      <p id="d2e2076">This study also provides, within a large-sample hydrological ML framework, the first quantification of the limited role of the Met Office WPs in flood magnitude estimation. Although cyclonic WPs are frequently associated with flood-event days, including WP and antecedent WP predictors does not improve test-set performance and can reduce skill in some regions. This suggests that, at catchment scale, WP categories provide limited additional information beyond direct hydrometeorological forcing variables. Where future work aims to understand or exploit circulation–flood linkages, improvements are more likely to come from higher-resolution circulation descriptors (e.g., moisture transport, storm-track metrics, or circulation indices tied more directly to rainfall persistence and intensity) or alternative classifications designed specifically for hydrological extremes.</p>
      <p id="d2e2079">SHAP-based process analysis showed that both static and dynamic hydrometeorological features were critical for estimating flood magnitudes. The aridity index was the most influential feature in the UK model. Dynamic variables, such as event-day precipitation and antecedent precipitation on the preceding days, also strongly influenced flood estimation. Slower-response, groundwater-influenced regions (e.g., CEE and SE) remain more challenging to predict, underscoring the need for longer-term storage and groundwater indicators in future modeling frameworks. Future developments in flood magnitude estimation should aim to combine the generalizability of large-sample models with engineered features that represent the processes relevant to lower-performing regions. Such advances are essential for developing scalable, data-driven approaches that can inform flood risk assessment and forecasting in a changing climate. In doing so, ML models can achieve both broader applicability and enhanced predictive skill across national, regional, and catchment scales.</p>
</sec>

      
      </body>
    <back><app-group>

<app id="App1.Ch1.S1">
  <label>Appendix A</label><title>Supplementary tables</title>

<table-wrap id="TA1"><label>Table A1</label><caption><p id="d2e2098">Descriptions of the MO-30 weather pattern categories, reproduced from the dataset provided by <xref ref-type="bibr" rid="bib1.bibx62" id="text.74"/>.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="4">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="justify" colwidth="200pt"/>
     <oasis:colspec colnum="3" colname="col3" align="right"/>
     <oasis:colspec colnum="4" colname="col4" align="justify" colwidth="200pt"/>
     <oasis:thead>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">No.</oasis:entry>
         <oasis:entry colname="col2" align="left">Category</oasis:entry>
         <oasis:entry colname="col3">No.</oasis:entry>
         <oasis:entry colname="col4" align="left">Category</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">1</oasis:entry>
         <oasis:entry colname="col2" align="left">Unbiased northwesterly</oasis:entry>
         <oasis:entry colname="col3">16</oasis:entry>
         <oasis:entry colname="col4" align="left">Anticyclonic south-southeasterly with a high east of Denmark</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">2</oasis:entry>
         <oasis:entry colname="col2" align="left">Cyclonic southwesterly with a returning polar maritime air-mass</oasis:entry>
         <oasis:entry colname="col3">17</oasis:entry>
         <oasis:entry colname="col4" align="left">Anticyclonic east-southeasterly with a high over Denmark</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">3</oasis:entry>
         <oasis:entry colname="col2" align="left">Anticyclonic southwesterly with a high pressure ridge over northern France</oasis:entry>
         <oasis:entry colname="col3">18</oasis:entry>
         <oasis:entry colname="col4" align="left">Anticyclonic southwesterly with a high over northern France</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">4</oasis:entry>
         <oasis:entry colname="col2" align="left">Unbiased westerly</oasis:entry>
         <oasis:entry colname="col3">19</oasis:entry>
         <oasis:entry colname="col4" align="left">Unbiased northerly with a low east of Denmark</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">5</oasis:entry>
         <oasis:entry colname="col2" align="left">Unbiased southerly with high pressure centred over Scandinavia</oasis:entry>
         <oasis:entry colname="col3">20</oasis:entry>
         <oasis:entry colname="col4" align="left">Cyclonic westerly with an intense low near Iceland</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">6</oasis:entry>
         <oasis:entry colname="col2" align="left">Anticyclonic Azores high extension towards the UK</oasis:entry>
         <oasis:entry colname="col3">21</oasis:entry>
         <oasis:entry colname="col4" align="left">Cyclonic southwesterly with a deep low south of Iceland</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">7</oasis:entry>
         <oasis:entry colname="col2" align="left">Cyclonic southwesterly with a low centred west-northwest of Ireland</oasis:entry>
         <oasis:entry colname="col3">22</oasis:entry>
         <oasis:entry colname="col4" align="left">Cyclonic southerly with a low west of Ireland</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">8</oasis:entry>
         <oasis:entry colname="col2" align="left">Cyclonic westerly with a low centred near Shetland</oasis:entry>
         <oasis:entry colname="col3">23</oasis:entry>
         <oasis:entry colname="col4" align="left">Unbiased westerly and very windy in the north</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">9</oasis:entry>
         <oasis:entry colname="col2" align="left">Anticyclonic north-northeasterly with a high centred near Iceland</oasis:entry>
         <oasis:entry colname="col3">24</oasis:entry>
         <oasis:entry colname="col4" align="left">Cyclonic northerly with a low in the North Sea</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">10</oasis:entry>
         <oasis:entry colname="col2" align="left">Anticyclonic west-southwesterly with a slight Azores high ridge</oasis:entry>
         <oasis:entry colname="col3">25</oasis:entry>
         <oasis:entry colname="col4" align="left">Anticyclonic northerly with a high centred in the Irish Sea</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">11</oasis:entry>
         <oasis:entry colname="col2" align="left">Cyclonic with a low centred over southern UK</oasis:entry>
         <oasis:entry colname="col3">26</oasis:entry>
         <oasis:entry colname="col4" align="left">Cyclonic northwesterly with a low near Norway – very windy</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">12</oasis:entry>
         <oasis:entry colname="col2" align="left">Anticyclonic southerly with a high over Poland</oasis:entry>
         <oasis:entry colname="col3">27</oasis:entry>
         <oasis:entry colname="col4" align="left">Anticyclonic easterly with a high in the Norwegian Sea</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">13</oasis:entry>
         <oasis:entry colname="col2" align="left">Anticyclonic northwesterly with a high southwest of Ireland</oasis:entry>
         <oasis:entry colname="col3">28</oasis:entry>
         <oasis:entry colname="col4" align="left">Cyclonic southeasterly with a low southwest of the UK</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">14</oasis:entry>
         <oasis:entry colname="col2" align="left">Cyclonic north-northwesterly with a low near southern Sweden</oasis:entry>
         <oasis:entry colname="col3">29</oasis:entry>
         <oasis:entry colname="col4" align="left">Cyclonic south-southwesterly with a deep low west of Ireland</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">15</oasis:entry>
         <oasis:entry colname="col2" align="left">Unbiased southwesterly, very windy in northwest Britain</oasis:entry>
         <oasis:entry colname="col3">30</oasis:entry>
         <oasis:entry colname="col4" align="left">Cyclonic west-southwesterly with a deep low southeast of Iceland</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

<table-wrap id="TA2"><label>Table A2</label><caption><p id="d2e2371">Variables used in the RF models.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="4">
     <oasis:colspec colnum="1" colname="col1" align="justify" colwidth="3.6cm"/>
     <oasis:colspec colnum="2" colname="col2" align="justify" colwidth="1.3cm"/>
     <oasis:colspec colnum="3" colname="col3" align="justify" colwidth="3.0cm"/>
     <oasis:colspec colnum="4" colname="col4" align="justify" colwidth="7.8cm"/>
     <oasis:thead>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1" align="left">Variable</oasis:entry>
         <oasis:entry colname="col2" align="left">Units</oasis:entry>
         <oasis:entry colname="col3" align="left">Source</oasis:entry>
         <oasis:entry colname="col4" align="left">Description/Definition</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1" align="left">Specific discharge</oasis:entry>
         <oasis:entry colname="col2" align="left"><inline-formula><mml:math id="M89" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">mm</mml:mi><mml:mspace linebreak="nobreak" width="0.125em"/><mml:msup><mml:mi mathvariant="normal">d</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col3" align="left">
                    <xref ref-type="bibr" rid="bib1.bibx66" id="text.75"/>
                  </oasis:entry>
         <oasis:entry colname="col4" align="left">Streamflow (<inline-formula><mml:math id="M90" display="inline"><mml:mrow class="unit"><mml:msup><mml:mi mathvariant="normal">m</mml:mi><mml:mn mathvariant="normal">3</mml:mn></mml:msup><mml:mspace width="0.125em" linebreak="nobreak"/><mml:msup><mml:mi mathvariant="normal">s</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>) at gauging stations normalized to specific discharge using catchment area.</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1" align="left">Latitude, Longitude</oasis:entry>
         <oasis:entry colname="col2" align="left">degrees</oasis:entry>
         <oasis:entry colname="col3" align="left">
                    <xref ref-type="bibr" rid="bib1.bibx16" id="text.76"/>
                  </oasis:entry>
         <oasis:entry colname="col4" align="left">Coordinates of the catchment centroid.</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1" align="left">Precipitation (event day)</oasis:entry>
         <oasis:entry colname="col2" align="left"><inline-formula><mml:math id="M91" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">mm</mml:mi><mml:mspace width="0.125em" linebreak="nobreak"/><mml:msup><mml:mi mathvariant="normal">d</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col3" align="left">HadUK-Grid as presented in <xref ref-type="bibr" rid="bib1.bibx33" id="text.77"/> consistent with <xref ref-type="bibr" rid="bib1.bibx18" id="text.78"/></oasis:entry>
         <oasis:entry colname="col4" align="left">Catchment-averaged daily precipitation.</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1" align="left">Total precipitation 1 <inline-formula><mml:math id="M92" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">d</mml:mi></mml:mrow></mml:math></inline-formula> before, 2 <inline-formula><mml:math id="M93" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">d</mml:mi></mml:mrow></mml:math></inline-formula> before, 3 <inline-formula><mml:math id="M94" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">d</mml:mi></mml:mrow></mml:math></inline-formula> before</oasis:entry>
         <oasis:entry colname="col2" align="left"><inline-formula><mml:math id="M95" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">mm</mml:mi><mml:mspace linebreak="nobreak" width="0.125em"/><mml:msup><mml:mi mathvariant="normal">d</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col3" align="left">HadUK-Grid as presented in <xref ref-type="bibr" rid="bib1.bibx33" id="text.79"/> consistent with <xref ref-type="bibr" rid="bib1.bibx18" id="text.80"/></oasis:entry>
         <oasis:entry colname="col4" align="left">Total precipitation accumulated on the 1–3 antecedent days (including the event magnitude day), representing catchment wetness.</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1" align="left">Potential evapotranspiration</oasis:entry>
         <oasis:entry colname="col2" align="left"><inline-formula><mml:math id="M96" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">mm</mml:mi><mml:mspace linebreak="nobreak" width="0.125em"/><mml:msup><mml:mi mathvariant="normal">d</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col3" align="left">Hydro-PE as presented in <xref ref-type="bibr" rid="bib1.bibx11" id="text.81"/> consistent with <xref ref-type="bibr" rid="bib1.bibx18" id="text.82"/></oasis:entry>
         <oasis:entry colname="col4" align="left">Daily potential evapotranspiration averaged across the catchment.</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1" align="left">Mean temperature, minimum temperature, and maximum temperature</oasis:entry>
         <oasis:entry colname="col2" align="left"><inline-formula><mml:math id="M97" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">°</mml:mi><mml:mi mathvariant="normal">C</mml:mi></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col3" align="left">HadUK-Grid as presented in <xref ref-type="bibr" rid="bib1.bibx33" id="text.83"/> consistent with <xref ref-type="bibr" rid="bib1.bibx18" id="text.84"/></oasis:entry>
         <oasis:entry colname="col4" align="left">Catchment mean, minimum and maximum daily temperature.</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1" align="left">Area</oasis:entry>
         <oasis:entry colname="col2" align="left"><inline-formula><mml:math id="M98" display="inline"><mml:mrow class="unit"><mml:msup><mml:mi mathvariant="normal">km</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col3" align="left">
                    <xref ref-type="bibr" rid="bib1.bibx16" id="text.85"/>
                  </oasis:entry>
         <oasis:entry colname="col4" align="left">Total catchment drainage area.</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1" align="left">Aridity index</oasis:entry>
         <oasis:entry colname="col2" align="left">–</oasis:entry>
         <oasis:entry colname="col3" align="left">
                    <xref ref-type="bibr" rid="bib1.bibx16" id="text.86"/>
                  </oasis:entry>
         <oasis:entry colname="col4" align="left"><inline-formula><mml:math id="M99" display="inline"><mml:mrow><mml:mtext>AI</mml:mtext><mml:mo>=</mml:mo><mml:mtext>PET</mml:mtext><mml:mo>/</mml:mo><mml:mi>P</mml:mi></mml:mrow></mml:math></inline-formula>, where PET <inline-formula><mml:math id="M100" display="inline"><mml:mo>=</mml:mo></mml:math></inline-formula> mean potential evapotranspiration and <inline-formula><mml:math id="M101" display="inline"><mml:mrow><mml:mi>P</mml:mi><mml:mo>=</mml:mo></mml:mrow></mml:math></inline-formula> mean precipitation; higher values indicate drier climate.</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1" align="left">Runoff ratio</oasis:entry>
         <oasis:entry colname="col2" align="left">–</oasis:entry>
         <oasis:entry colname="col3" align="left">
                    <xref ref-type="bibr" rid="bib1.bibx16" id="text.87"/>
                  </oasis:entry>
         <oasis:entry colname="col4" align="left"><inline-formula><mml:math id="M102" display="inline"><mml:mrow><mml:mtext>RR</mml:mtext><mml:mo>=</mml:mo><mml:msub><mml:mover accent="true"><mml:mi>Q</mml:mi><mml:mo mathvariant="normal">‾</mml:mo></mml:mover><mml:mtext>year</mml:mtext></mml:msub><mml:mo>/</mml:mo><mml:msub><mml:mover accent="true"><mml:mi>P</mml:mi><mml:mo mathvariant="normal">‾</mml:mo></mml:mover><mml:mtext>year</mml:mtext></mml:msub></mml:mrow></mml:math></inline-formula>; fraction of precipitation converted to streamflow, indicating catchment runoff efficiency.</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1" align="left">Streamflow elasticity</oasis:entry>
         <oasis:entry colname="col2" align="left">–</oasis:entry>
         <oasis:entry colname="col3" align="left">
                    <xref ref-type="bibr" rid="bib1.bibx16" id="text.88"/>
                  </oasis:entry>
         <oasis:entry colname="col4" align="left">Sensitivity of streamflow to precipitation, approximated by <inline-formula><mml:math id="M103" display="inline"><mml:mrow><mml:mi mathvariant="normal">d</mml:mi><mml:mi>ln⁡</mml:mi><mml:mi>Q</mml:mi><mml:mo>/</mml:mo><mml:mi mathvariant="normal">d</mml:mi><mml:mi>ln⁡</mml:mi><mml:mi>P</mml:mi></mml:mrow></mml:math></inline-formula> from log–log regression of annual <inline-formula><mml:math id="M104" display="inline"><mml:mi>Q</mml:mi></mml:math></inline-formula> on <inline-formula><mml:math id="M105" display="inline"><mml:mi>P</mml:mi></mml:math></inline-formula>.</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1" align="left">Baseflow index</oasis:entry>
         <oasis:entry colname="col2" align="left">–</oasis:entry>
         <oasis:entry colname="col3" align="left">
                    <xref ref-type="bibr" rid="bib1.bibx16" id="text.89"/>
                  </oasis:entry>
         <oasis:entry colname="col4" align="left">Proportion of total streamflow contributed by baseflow.</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1" align="left">Maximum elevation</oasis:entry>
         <oasis:entry colname="col2" align="left"><inline-formula><mml:math id="M106" display="inline"><mml:mrow class="unit"><mml:mi mathvariant="normal">m</mml:mi></mml:mrow></mml:math></inline-formula></oasis:entry>
         <oasis:entry colname="col3" align="left">
                    <xref ref-type="bibr" rid="bib1.bibx16" id="text.90"/>
                  </oasis:entry>
         <oasis:entry colname="col4" align="left">Maximum elevation within the catchment.</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1" align="left">WPs (e.g., wp_1)</oasis:entry>
         <oasis:entry colname="col2" align="left">–</oasis:entry>
         <oasis:entry colname="col3" align="left">
                    <xref ref-type="bibr" rid="bib1.bibx62" id="text.91"/>
                  </oasis:entry>
         <oasis:entry colname="col4" align="left">Daily synoptic-scale weather-pattern classification (MO-30) based on mean sea-level pressure anomalies.</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1" align="left">AWPs (e.g., wp_1-1)</oasis:entry>
         <oasis:entry colname="col2" align="left">–</oasis:entry>
         <oasis:entry colname="col3" align="left">
                    <xref ref-type="bibr" rid="bib1.bibx62" id="text.92"/>
                  </oasis:entry>
         <oasis:entry colname="col4" align="left">Antecedent weather-pattern categories for one, two, and three days prior to flooding, representing antecedent atmospheric conditions.</oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

<table-wrap id="TA3"><label>Table A3</label><caption><p id="d2e2897">Regional flood event summary. Regions are ordered by number of catchments (highest: SW; lowest: NS). Short names (e.g., “SE”) are used throughout.</p></caption><oasis:table frame="topbot"><oasis:tgroup cols="4">
     <oasis:colspec colnum="1" colname="col1" align="left"/>
     <oasis:colspec colnum="2" colname="col2" align="right"/>
     <oasis:colspec colnum="3" colname="col3" align="right"/>
     <oasis:colspec colnum="4" colname="col4" align="right"/>
     <oasis:thead>
       <oasis:row>
         <oasis:entry colname="col1">Region</oasis:entry>
         <oasis:entry colname="col2">Total events</oasis:entry>
         <oasis:entry colname="col3">Catchments</oasis:entry>
         <oasis:entry colname="col4">Average number of</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1"/>
         <oasis:entry colname="col2"/>
         <oasis:entry colname="col3"/>
         <oasis:entry colname="col4">events per catchment</oasis:entry>
       </oasis:row>
     </oasis:thead>
     <oasis:tbody>
       <oasis:row>
         <oasis:entry colname="col1">Southwest England and South Wales (SW)</oasis:entry>
         <oasis:entry colname="col2">1828</oasis:entry>
         <oasis:entry colname="col3">33</oasis:entry>
         <oasis:entry colname="col4">55</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Central and Eastern England (CEE)</oasis:entry>
         <oasis:entry colname="col2">1018</oasis:entry>
         <oasis:entry colname="col3">20</oasis:entry>
         <oasis:entry colname="col4">51</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Northwest England and North Wales (NW)</oasis:entry>
         <oasis:entry colname="col2">1061</oasis:entry>
         <oasis:entry colname="col3">19</oasis:entry>
         <oasis:entry colname="col4">56</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Southeast England (SE)</oasis:entry>
         <oasis:entry colname="col2">730</oasis:entry>
         <oasis:entry colname="col3">17</oasis:entry>
         <oasis:entry colname="col4">43</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">Northeast England (NE)</oasis:entry>
         <oasis:entry colname="col2">694</oasis:entry>
         <oasis:entry colname="col3">14</oasis:entry>
         <oasis:entry colname="col4">50</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">South Scotland (SS)</oasis:entry>
         <oasis:entry colname="col2">687</oasis:entry>
         <oasis:entry colname="col3">12</oasis:entry>
         <oasis:entry colname="col4">57</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1">East Scotland (ES)</oasis:entry>
         <oasis:entry colname="col2">511</oasis:entry>
         <oasis:entry colname="col3">11</oasis:entry>
         <oasis:entry colname="col4">47</oasis:entry>
       </oasis:row>
       <oasis:row rowsep="1">
         <oasis:entry colname="col1">North Scotland (NS)</oasis:entry>
         <oasis:entry colname="col2">405</oasis:entry>
         <oasis:entry colname="col3">8</oasis:entry>
         <oasis:entry colname="col4">51</oasis:entry>
       </oasis:row>
       <oasis:row>
         <oasis:entry colname="col1"><bold>UK Overall</bold></oasis:entry>
         <oasis:entry colname="col2"><bold>6934</bold></oasis:entry>
         <oasis:entry colname="col3"><bold>134</bold></oasis:entry>
         <oasis:entry colname="col4"><bold>52</bold></oasis:entry>
       </oasis:row>
     </oasis:tbody>
   </oasis:tgroup></oasis:table></table-wrap>

</app>

<app id="App1.Ch1.S2">
  <label>Appendix B</label><title>Supplementary figures</title>
      <p id="d2e3095">Feature set 7 corresponds to the final pruned model specification used in the main analysis. The initial predictor pool included geographic coordinates (latitude and longitude), synoptic WPs and AWPs, static catchment attributes, hydrometeorological event-day predictors, and antecedent hydrometeorological indices. Latitude and longitude were included initially to provide spatial context for synoptic circulation (i.e., the same WP can have different hydrometeorological implications depending on location). However, as these coordinates are not physically interpretable predictors of flood magnitude and primarily act as spatial proxies, they were removed prior to the final pruning stage to ensure that the retained predictors remained physically meaningful.</p>
      <p id="d2e3098">WP and AWP predictors were also removed prior to final pruning. This decision was based on model comparison results across successive feature sets (Fig. <xref ref-type="fig" rid="F4"/>), which showed that adding WP and AWP predictors did not improve test-set performance relative to the baseline in most regions and, in some cases, reduced skill. Removing WP/AWP prior to collinearity pruning therefore simplified the model without loss of predictive performance, and ensured that the final specification emphasized event-scale and catchment-relevant drivers of flood magnitude.</p>
      <p id="d2e3103">After excluding latitude, longitude and WP/AWP predictors, variance inflation factor (VIF) pruning was applied to the remaining predictor set to reduce multicollinearity and improve interpretability. The set of retained predictors differs by region because the pruning was performed separately for each regional model. The predictors retained after pruning for each model are summarized in Fig. <xref ref-type="fig" rid="FB1"/>.</p>
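      <p>As an illustration of the pruning step, the following Python sketch repeatedly drops the predictor with the largest VIF until all remaining VIFs are at or below a threshold. The iterative scheme, the threshold of 10 (a common rule of thumb), and the function names <monospace>vif</monospace> and <monospace>vif_prune</monospace> are assumptions made for illustration only; they are not taken from the study's configuration.</p>
      <preformat><![CDATA[
```python
import numpy as np


def vif(X):
    """Variance inflation factor of each column of X (rows = events)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        # Regress predictor j on all other predictors plus an intercept.
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        ss_res = float(resid @ resid)
        ss_tot = float(((y - y.mean()) ** 2).sum())
        r2 = 1.0 - ss_res / ss_tot
        # VIF_j = 1 / (1 - R_j^2); treat exact collinearity as infinite.
        out[j] = np.inf if r2 >= 1.0 else 1.0 / (1.0 - r2)
    return out


def vif_prune(X, names, threshold=10.0):
    """Drop the highest-VIF predictor until all VIFs are <= threshold.

    The threshold value is an illustrative assumption.
    """
    X = np.asarray(X, dtype=float)
    names = list(names)
    while X.shape[1] > 1:
        v = vif(X)
        worst = int(np.argmax(v))
        if v[worst] <= threshold:
            break
        X = np.delete(X, worst, axis=1)
        names.pop(worst)
    return names
```
]]></preformat>
      <p>Applying such a routine separately to each regional predictor matrix would yield region-specific retained sets, in line with the regional differences described above.</p>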
      <p id="d2e3110">A sensitivity analysis was also performed to assess whether increasing model complexity affected predictive performance. In particular, model configurations with larger ensemble sizes (e.g., 2000 trees compared with 1000 trees) were tested and produced near-identical test-set performance. This indicates that the reported results are not sensitive to ensemble size, supporting the robustness of the final model specification.</p><fig id="FB1"><label>Figure B1</label><caption><p id="d2e3116">Feature set 7 predictors retained across UK and regional models after removal of latitude, longitude, WPs, and AWPs, followed by VIF pruning. Each colored square indicates that the corresponding predictor was retained in that region’s model, while blank cells denote exclusion.</p></caption>
        
        <graphic xlink:href="https://hess.copernicus.org/articles/30/2135/2026/hess-30-2135-2026-f08.png"/>

      </fig>

      <fig id="FB2"><label>Figure B2</label><caption><p id="d2e3129">Ensemble-based relative uncertainty (Coefficient of Variation, CV) by region for the final feature-pruned models (Feature Set 7).  The CV (<inline-formula><mml:math id="M107" display="inline"><mml:mrow><mml:mtext>CV</mml:mtext><mml:mo>=</mml:mo><mml:mtext>Pred_SD</mml:mtext><mml:mo>/</mml:mo><mml:mo>|</mml:mo><mml:mtext>Pred_Mean</mml:mtext><mml:mo>|</mml:mo></mml:mrow></mml:math></inline-formula>) rescales ensemble prediction spread by mean predicted flood magnitude, allowing comparison across regions with differing event scales.  Boxes indicate inter-quartile range (25th–75th percentile), whiskers extend to 1.5 <inline-formula><mml:math id="M108" display="inline"><mml:mo>×</mml:mo></mml:math></inline-formula> IQR, and circles denote outliers.  Lower CV values indicate higher ensemble agreement and lower predictive uncertainty.  Regional models exhibit similar median CV values (<inline-formula><mml:math id="M109" display="inline"><mml:mrow><mml:mo>≈</mml:mo><mml:mn mathvariant="normal">0.15</mml:mn></mml:mrow></mml:math></inline-formula>–0.25), while slightly higher uncertainty is observed in NE, SE, and NS.</p></caption>
        
        <graphic xlink:href="https://hess.copernicus.org/articles/30/2135/2026/hess-30-2135-2026-f09.png"/>

      </fig>

<fig id="FB3"><label>Figure B3</label><caption><p id="d2e3180">Per-catchment performance difference between regional and UK models (<inline-formula><mml:math id="M110" display="inline"><mml:mrow><mml:mi mathvariant="normal">Δ</mml:mi><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula>) for matched catchments (i.e. same catchments in regional and UK models). Positive values indicate higher regional model performance. Overall, the UK model achieved higher <inline-formula><mml:math id="M111" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> in 54.9 % of catchments, while regional models performed better in 45.1 %. Boxplots show the interquartile range (IQR; 25th–75th percentile), with the median as a solid black line, whiskers extending to <inline-formula><mml:math id="M112" display="inline"><mml:mrow><mml:mo>±</mml:mo><mml:mn mathvariant="normal">1.5</mml:mn><mml:mo>×</mml:mo></mml:mrow></mml:math></inline-formula> IQR, and outliers shown as points.</p></caption>
        
        <graphic xlink:href="https://hess.copernicus.org/articles/30/2135/2026/hess-30-2135-2026-f10.png"/>

      </fig>

      <fig id="FB4"><label>Figure B4</label><caption><p id="d2e3230">Scatter plots of per-catchment <inline-formula><mml:math id="M113" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> comparing UK (matched subset) and regional models for each region. Points above the dashed <inline-formula><mml:math id="M114" display="inline"><mml:mrow><mml:mn mathvariant="normal">1</mml:mn><mml:mo>:</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:math></inline-formula> line indicate catchments where the regional model achieved higher <inline-formula><mml:math id="M115" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula>. While the UK model generally performs better overall, regional models outperform in some areas, indicating potential benefits of region-specific model calibration.</p></caption>
        
        <graphic xlink:href="https://hess.copernicus.org/articles/30/2135/2026/hess-30-2135-2026-f11.png"/>

      </fig>


</app>

<app id="App1.Ch1.S3">
  <label>Appendix C</label><title>Supplementary methods and equations</title>
      <p id="d2e3286"><italic>Uncertainty metric definition.</italic> The coefficient of variation was calculated by:

          <disp-formula id="App1.Ch1.S3.E1" content-type="numbered"><label>C1</label><mml:math id="M116" display="block"><mml:mrow><mml:mtext>CV</mml:mtext><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mtext>Pred_SD</mml:mtext><mml:mrow><mml:mo>|</mml:mo><mml:mtext>Pred_Mean</mml:mtext><mml:mo>|</mml:mo><mml:mo>+</mml:mo><mml:mi mathvariant="italic">ε</mml:mi></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>

        where Pred_SD is the standard deviation of the ensemble predictions, Pred_Mean is the ensemble mean prediction, and <inline-formula><mml:math id="M117" display="inline"><mml:mrow><mml:mi mathvariant="italic">ε</mml:mi><mml:mo>=</mml:mo><mml:msup><mml:mn mathvariant="normal">10</mml:mn><mml:mrow><mml:mo>-</mml:mo><mml:mn mathvariant="normal">6</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> prevents division by zero. This metric expresses uncertainty as a fraction of the predicted magnitude. Appendix Fig. <xref ref-type="fig" rid="FB2"/> shows the ensemble-based relative uncertainty, expressed as the coefficient of variation (CV), for the final pruned feature set 7 models. Median CV values across most regions fall between 0.15 and 0.25, indicating generally good model stability and agreement among ensemble members. The SS and NW models show the lowest uncertainty, with compact interquartile ranges, suggesting stronger ensemble agreement in these regions. In contrast, slightly higher and more variable CV values occur in the SE and NE regions, suggesting greater ensemble spread and reduced confidence in predictions. These regions also show lower model skill in Fig. <xref ref-type="fig" rid="F4"/> of the main text and are known to exhibit heterogeneous or groundwater-dominated hydrological responses, which may increase predictive uncertainty. The UK and SW models show moderate uncertainty with broader tails, likely reflecting the wider range of hydrological conditions represented in their training data. Overall, the low median CVs indicate that ensemble variability remains limited and that the RF models are internally stable.</p>
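      <p>A minimal Python sketch of Eq. (<xref ref-type="disp-formula" rid="App1.Ch1.S3.E1"/>) is given below. It assumes that the predictions of the individual ensemble members are available as an array of shape (members, events) and that the spread is the population standard deviation across members; the function name <monospace>ensemble_cv</monospace> and the array layout are hypothetical.</p>
      <preformat><![CDATA[
```python
import numpy as np

EPS = 1e-6  # epsilon from Eq. (C1), guards against division by zero


def ensemble_cv(member_preds, eps=EPS):
    """Per-event coefficient of variation of ensemble predictions.

    member_preds: array-like of shape (n_members, n_events).
    """
    preds = np.asarray(member_preds, dtype=float)
    pred_mean = preds.mean(axis=0)
    # Assumption: population standard deviation (ddof=0) across members.
    pred_sd = preds.std(axis=0)
    return pred_sd / (np.abs(pred_mean) + eps)
```
]]></preformat>
      <p>Regional summaries such as the medians discussed above could then be obtained by taking the median of the returned values over the events of each region.</p>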
      <p id="d2e3342"><italic>Performance metrics.</italic> The equations for the performance metrics calculated on each test set per model generation are as follows:</p>
      <p id="d2e3348"><inline-formula><mml:math id="M118" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> <italic>(R-squared)</italic> measures the proportion of variance in actual flood magnitudes captured by the model predictions.</p>
      <p id="d2e3365"><inline-formula><mml:math id="M119" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> values are often reported between 0 and 1, with values closer to 1 indicating a better fit. They can also take negative values when the model performs worse than predicting the mean of the observations. <inline-formula><mml:math id="M120" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> was assessed at the national, regional, and individual catchment levels. The formula is as follows <xref ref-type="bibr" rid="bib1.bibx15" id="paren.93"/>:

          <disp-formula id="App1.Ch1.S3.E2" content-type="numbered"><label>C2</label><mml:math id="M121" display="block"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:msubsup><mml:mo>∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:msubsup><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mi>i</mml:mi></mml:msub><mml:msup><mml:mo>)</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow><mml:mrow><mml:msubsup><mml:mo>∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:msubsup><mml:mo>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo mathvariant="normal">‾</mml:mo></mml:mover><mml:msup><mml:mo>)</mml:mo><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:mfrac></mml:mstyle></mml:mrow></mml:math></disp-formula>

        where <inline-formula><mml:math id="M122" display="inline"><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is the observed value, <inline-formula><mml:math id="M123" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is the predicted value, and <inline-formula><mml:math id="M124" display="inline"><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo mathvariant="normal">‾</mml:mo></mml:mover></mml:math></inline-formula> is the mean of the observed values.</p>
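      <p>For reference, Eq. (<xref ref-type="disp-formula" rid="App1.Ch1.S3.E2"/>) can be written as the short Python function below. The function name <monospace>r_squared</monospace> is chosen for illustration; the sketch also shows how negative values arise when predictions are worse than the observed mean.</p>
      <preformat><![CDATA[
```python
import numpy as np


def r_squared(y_obs, y_pred):
    """R^2 as in Eq. (C2): 1 - SS_res / SS_tot."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_obs - y_pred) ** 2)        # residual sum of squares
    ss_tot = np.sum((y_obs - y_obs.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot
```
]]></preformat>
      <p>For observations (1, 2, 3), perfect predictions give 1, predicting the mean (2, 2, 2) gives 0, and the reversed predictions (3, 2, 1) give a negative value.</p>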
      <p id="d2e3515"><italic>Percentage Bias (PBIAS)</italic> evaluates whether the model tends to overestimate or underestimate the observed values. In this definition, a negative PBIAS indicates underestimation (predictions are lower than observed values), and a positive PBIAS indicates overestimation (predictions are higher than observed values). The formula, consistent with this sign convention, is as follows <xref ref-type="bibr" rid="bib1.bibx87" id="paren.94"/>:

          <disp-formula id="App1.Ch1.S3.E3" content-type="numbered"><label>C3</label><mml:math id="M125" display="block"><mml:mrow><mml:mtext>PBIAS</mml:mtext><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:msubsup><mml:mo>∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:msubsup><mml:mo>(</mml:mo><mml:msub><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo mathvariant="normal" stretchy="false">^</mml:mo></mml:mover><mml:mi>i</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:msubsup><mml:mo>∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:msubsup><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:mfrac></mml:mstyle><mml:mo>×</mml:mo><mml:mn mathvariant="normal">100</mml:mn></mml:mrow></mml:math></disp-formula>

        where <inline-formula><mml:math id="M126" display="inline"><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is the observed value and <inline-formula><mml:math id="M127" display="inline"><mml:mrow><mml:msub><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo stretchy="false" mathvariant="normal">^</mml:mo></mml:mover><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> is the predicted value.</p>
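      <p>As a quick check of the sign convention in Eq. (C3), the following minimal Python sketch computes PBIAS for a small set of hypothetical peak flows. The function name <monospace>pbias</monospace> and the numerical values are illustrative assumptions and are not taken from the analysis code or data of this study.</p>
      <preformat preformat-type="code">
```python
import numpy as np


def pbias(observed, predicted):
    """Percentage bias as in Eq. (C3): positive values mean overestimation."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 100.0 * np.sum(predicted - observed) / np.sum(observed)


# Hypothetical peak flows (illustrative values only, not data from this study).
obs = np.array([10.0, 20.0, 30.0, 40.0])
pred_low = 0.9 * obs   # every prediction 10 % below the observation
pred_high = 1.1 * obs  # every prediction 10 % above the observation

pbias_low = pbias(obs, pred_low)
pbias_high = pbias(obs, pred_high)
print(pbias_low, pbias_high)
```
      </preformat>
      <p>With these inputs the sketch returns a PBIAS of about −10 % for the uniformly low predictions and about +10 % for the uniformly high ones, matching the convention stated above.</p>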
      <p id="d2e3616"><italic>Permutation testing for significance.</italic> Permutation testing, a non-parametric approach widely used in machine learning and environmental sciences, provides a robust framework for evaluating the statistical significance of observed effects without assuming data normality <xref ref-type="bibr" rid="bib1.bibx26 bib1.bibx68" id="paren.95"/>. It is particularly useful in cases with small sample sizes or heterogeneous data <xref ref-type="bibr" rid="bib1.bibx61 bib1.bibx26" id="paren.96"/>.</p>
      <p id="d2e3628">First, for a given region, predictions for the same flood events were extracted across model generations, and the observed difference in <inline-formula><mml:math id="M128" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> between the two model generations was calculated as:

          <disp-formula id="App1.Ch1.S3.E4" content-type="numbered"><label>C4</label><mml:math id="M129" display="block"><mml:mrow><mml:mi mathvariant="normal">Δ</mml:mi><mml:msubsup><mml:mi>R</mml:mi><mml:mtext>obs</mml:mtext><mml:mn mathvariant="normal">2</mml:mn></mml:msubsup><mml:mo>=</mml:mo><mml:msubsup><mml:mi>R</mml:mi><mml:mrow><mml:msub><mml:mtext>Gen</mml:mtext><mml:mrow><mml:mi>j</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mn mathvariant="normal">2</mml:mn></mml:msubsup><mml:mo>-</mml:mo><mml:msubsup><mml:mi>R</mml:mi><mml:mrow><mml:msub><mml:mtext>Gen</mml:mtext><mml:mi>j</mml:mi></mml:msub></mml:mrow><mml:mn mathvariant="normal">2</mml:mn></mml:msubsup><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p>
      <p id="d2e3689">To simulate the null hypothesis (<inline-formula><mml:math id="M130" display="inline"><mml:mrow><mml:msub><mml:mi>H</mml:mi><mml:mn mathvariant="normal">0</mml:mn></mml:msub></mml:mrow></mml:math></inline-formula>), which assumes no systematic difference in <inline-formula><mml:math id="M131" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula>, model predictions for the same flood events were randomly shuffled between the two successive feature sets. The shuffled predictions were used to recalculate <inline-formula><mml:math id="M132" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> for each feature set, and the difference was computed as:

          <disp-formula id="App1.Ch1.S3.E5" content-type="numbered"><label>C5</label><mml:math id="M133" display="block"><mml:mrow><mml:mi mathvariant="normal">Δ</mml:mi><mml:msubsup><mml:mi>R</mml:mi><mml:mtext>shuffled</mml:mtext><mml:mn mathvariant="normal">2</mml:mn></mml:msubsup><mml:mo>=</mml:mo><mml:msubsup><mml:mi>R</mml:mi><mml:mrow><mml:msub><mml:mtext>shuffled, Gen</mml:mtext><mml:mrow><mml:mi>j</mml:mi><mml:mo>+</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mn mathvariant="normal">2</mml:mn></mml:msubsup><mml:mo>-</mml:mo><mml:msubsup><mml:mi>R</mml:mi><mml:mrow><mml:msub><mml:mtext>shuffled, Gen</mml:mtext><mml:mi>j</mml:mi></mml:msub></mml:mrow><mml:mn mathvariant="normal">2</mml:mn></mml:msubsup><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p>
      <p id="d2e3773">This shuffling process was repeated <inline-formula><mml:math display="inline"><mml:mrow><mml:mi>B</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1000</mml:mn></mml:mrow></mml:math></inline-formula> times to construct a null distribution of <inline-formula><mml:math id="M134" display="inline"><mml:mrow><mml:mi mathvariant="normal">Δ</mml:mi><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> values. Finally, the two-sided <inline-formula><mml:math id="M135" display="inline"><mml:mi>p</mml:mi></mml:math></inline-formula>-value was calculated as the proportion of shuffled differences whose absolute value was at least as large as that of the observed difference, with <inline-formula><mml:math display="inline"><mml:mrow><mml:mi>I</mml:mi><mml:mo>(</mml:mo><mml:mo>⋅</mml:mo><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> denoting the indicator function (1 if the condition holds and 0 otherwise):

          <disp-formula id="App1.Ch1.S3.E6" content-type="numbered"><label>C6</label><mml:math id="M136" display="block"><mml:mrow><mml:mi>p</mml:mi><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:msubsup><mml:mo>∑</mml:mo><mml:mrow><mml:mi>b</mml:mi><mml:mo>=</mml:mo><mml:mn mathvariant="normal">1</mml:mn></mml:mrow><mml:mi>B</mml:mi></mml:msubsup><mml:mi>I</mml:mi><mml:mfenced open="(" close=")"><mml:mrow><mml:mo>|</mml:mo><mml:mi mathvariant="normal">Δ</mml:mi><mml:msubsup><mml:mi>R</mml:mi><mml:mrow><mml:mtext>shuffled</mml:mtext><mml:mo>,</mml:mo><mml:mi>b</mml:mi></mml:mrow><mml:mn mathvariant="normal">2</mml:mn></mml:msubsup><mml:mo>|</mml:mo><mml:mo>≥</mml:mo><mml:mo>|</mml:mo><mml:mi mathvariant="normal">Δ</mml:mi><mml:msubsup><mml:mi>R</mml:mi><mml:mtext>obs</mml:mtext><mml:mn mathvariant="normal">2</mml:mn></mml:msubsup><mml:mo>|</mml:mo></mml:mrow></mml:mfenced></mml:mrow><mml:mi>B</mml:mi></mml:mfrac></mml:mstyle><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula></p>
      <p id="d2e3857">If <inline-formula><mml:math id="M137" display="inline"><mml:mrow><mml:mi>p</mml:mi><mml:mo>&lt;</mml:mo><mml:mn mathvariant="normal">0.05</mml:mn></mml:mrow></mml:math></inline-formula>, the observed change in <inline-formula><mml:math id="M138" display="inline"><mml:mrow><mml:msup><mml:mi>R</mml:mi><mml:mn mathvariant="normal">2</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> was considered statistically significant, indicating that the feature set changes had a meaningful effect on model performance.</p>
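      <p>The procedure in Eqs. (C4) to (C6) can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the analysis code of this study: the coefficient of determination is taken in its usual form (one minus the ratio of residual to total sum of squares), "shuffling" is interpreted as swapping the two models' predictions for the same flood event with probability 0.5, and the function names, random seeds, and synthetic data are all hypothetical.</p>
      <preformat preformat-type="code">
```python
import numpy as np


def r2_score(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def paired_permutation_test(observed, pred_a, pred_b, n_perm=1000, seed=0):
    """Two-sided permutation p-value for R2(pred_b) - R2(pred_a).

    Assumption: predictions are exchanged between the two models event by
    event, each with probability 0.5, to emulate the null hypothesis.
    """
    observed = np.asarray(observed, dtype=float)
    pred_a = np.asarray(pred_a, dtype=float)
    pred_b = np.asarray(pred_b, dtype=float)
    rng = np.random.default_rng(seed)

    # Observed difference, Eq. (C4).
    delta_obs = r2_score(observed, pred_b) - r2_score(observed, pred_a)

    count = 0
    for _ in range(n_perm):
        swap = rng.random(observed.size) >= 0.5
        shuf_a = np.where(swap, pred_b, pred_a)
        shuf_b = np.where(swap, pred_a, pred_b)
        # Shuffled difference, Eq. (C5).
        delta = r2_score(observed, shuf_b) - r2_score(observed, shuf_a)
        if abs(delta) >= abs(delta_obs):
            count += 1

    # Proportion of shuffles at least as extreme as observed, Eq. (C6).
    return delta_obs, count / n_perm


# Synthetic example (not data from this study): the second model has
# much smaller errors than the first.
data_rng = np.random.default_rng(42)
obs = data_rng.gamma(shape=2.0, scale=50.0, size=200)
pred_old = obs + data_rng.normal(0.0, 40.0, size=200)
pred_new = obs + data_rng.normal(0.0, 10.0, size=200)

delta_obs, p_value = paired_permutation_test(obs, pred_old, pred_new)
print(delta_obs, p_value)
```
      </preformat>
      <p>In this synthetic case the improvement is large relative to the spread of the shuffled differences, so the sketch yields a small <italic>p</italic>-value; passing two identical prediction vectors instead gives a difference of zero and a <italic>p</italic>-value of one.</p>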
      <p id="d2e3884"><italic>SHAP definition.</italic> Following <xref ref-type="bibr" rid="bib1.bibx53" id="text.97"/>, <xref ref-type="bibr" rid="bib1.bibx52" id="text.98"/>, and <xref ref-type="bibr" rid="bib1.bibx92" id="text.99"/>, SHAP values are defined as follows.</p>
      <p id="d2e3899">The SHAP value <inline-formula><mml:math id="M139" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>f</mml:mi><mml:mo>,</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> for a feature <inline-formula><mml:math id="M140" display="inline"><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> quantifies that feature’s contribution to the model’s prediction:

          <disp-formula id="App1.Ch1.S3.E7" content-type="numbered"><label>C7</label><mml:math id="M141" display="block"><mml:mrow><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>f</mml:mi><mml:mo>,</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo><mml:mo>=</mml:mo><mml:munder><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>S</mml:mi><mml:mo>⊆</mml:mo><mml:mi>N</mml:mi><mml:mo>∖</mml:mo><mml:mo mathvariant="italic">{</mml:mo><mml:mi>i</mml:mi><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:munder><mml:mstyle displaystyle="true"><mml:mfrac style="display"><mml:mrow><mml:mo>|</mml:mo><mml:mi>S</mml:mi><mml:mo>|</mml:mo><mml:mi mathvariant="normal">!</mml:mi><mml:mo>(</mml:mo><mml:mo>|</mml:mo><mml:mi>N</mml:mi><mml:mo>|</mml:mo><mml:mo>-</mml:mo><mml:mo>|</mml:mo><mml:mi>S</mml:mi><mml:mo>|</mml:mo><mml:mo>-</mml:mo><mml:mn mathvariant="normal">1</mml:mn><mml:mo>)</mml:mo><mml:mi mathvariant="normal">!</mml:mi></mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:mi>N</mml:mi><mml:mo>|</mml:mo><mml:mi mathvariant="normal">!</mml:mi></mml:mrow></mml:mfrac></mml:mstyle><mml:mfenced open="(" close=")"><mml:mrow><mml:mi>f</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mi>S</mml:mi><mml:mo>∪</mml:mo><mml:mo mathvariant="italic">{</mml:mo><mml:mi>i</mml:mi><mml:mo mathvariant="italic">}</mml:mo></mml:mrow></mml:msub><mml:mo>)</mml:mo><mml:mo>-</mml:mo><mml:mi>f</mml:mi><mml:mo>(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>S</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mfenced></mml:mrow></mml:math></disp-formula>

        where <inline-formula><mml:math id="M142" display="inline"><mml:mrow><mml:msub><mml:mi mathvariant="italic">ϕ</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>(</mml:mo><mml:mi>f</mml:mi><mml:mo>,</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:math></inline-formula> represents the SHAP value of feature <inline-formula><mml:math id="M143" display="inline"><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M144" display="inline"><mml:mi>f</mml:mi></mml:math></inline-formula> denotes the model’s predictive function, <inline-formula><mml:math id="M145" display="inline"><mml:mi>N</mml:mi></mml:math></inline-formula> is the set of all features, and <inline-formula><mml:math id="M146" display="inline"><mml:mi>S</mml:mi></mml:math></inline-formula> is any subset of <inline-formula><mml:math id="M147" display="inline"><mml:mi>N</mml:mi></mml:math></inline-formula> excluding feature <inline-formula><mml:math id="M148" display="inline"><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula>. 
Here, <inline-formula><mml:math id="M149" display="inline"><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>S</mml:mi></mml:msub></mml:mrow></mml:math></inline-formula> represents the input restricted to the feature subset <inline-formula><mml:math id="M150" display="inline"><mml:mi>S</mml:mi></mml:math></inline-formula>, and <inline-formula><mml:math id="M151" display="inline"><mml:mrow><mml:mo>|</mml:mo><mml:mi>N</mml:mi><mml:mo>|</mml:mo></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M152" display="inline"><mml:mrow><mml:mo>|</mml:mo><mml:mi>S</mml:mi><mml:mo>|</mml:mo></mml:mrow></mml:math></inline-formula> denote the cardinalities (numbers of features) of the sets <inline-formula><mml:math id="M153" display="inline"><mml:mi>N</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M154" display="inline"><mml:mi>S</mml:mi></mml:math></inline-formula>.</p>
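      <p>To make Eq. (C7) concrete, the following Python sketch evaluates it by brute-force enumeration of all subsets for a three-feature toy model. It is an illustration only: the model, the inputs, and the function names are hypothetical, and the model output for a subset is obtained here by setting the features outside the subset to a fixed baseline, which is one common assumption among several. Practical SHAP implementations generally rely on faster approximate or model-specific algorithms, because the number of subsets grows exponentially with the number of features.</p>
      <preformat preformat-type="code">
```python
from itertools import combinations
from math import factorial


def shapley_values(f, x, baseline):
    """Exact Shapley values from Eq. (C7) by enumerating every subset S.

    Assumption: f(x_S) keeps the features in S at their values in x and
    replaces all other features by the corresponding baseline values.
    """
    n = len(x)

    def value(subset):
        z = [x[j] if j in subset else baseline[j] for j in range(n)]
        return f(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):  # |S| runs from 0 to |N| - 1
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    """Hypothetical model: one additive term plus one interaction term."""
    return 2.0 * z[0] + 3.0 * z[1] * z[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```
      </preformat>
      <p>For this toy model the sketch attributes 2 to the first feature and splits the interaction term equally between the other two (9 each), so that the attributions sum to the difference of 20 between the model output at the input and at the baseline.</p>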
</app>
  </app-group><notes notes-type="codeavailability"><title>Code availability</title>

      <p id="d2e4191">The code is available on request.</p>
  </notes><notes notes-type="dataavailability"><title>Data availability</title>

      <p id="d2e4197">The datasets used in this paper are all publicly available. They can be downloaded from the National River Flow Archive, the online Met Office HadUKP Regional Dataset and corresponding shapefiles, the CAMELS-GB dataset by <xref ref-type="bibr" rid="bib1.bibx16" id="text.100"/> and <xref ref-type="bibr" rid="bib1.bibx18" id="text.101"/>, and the Met Office Weather Patterns by <xref ref-type="bibr" rid="bib1.bibx62" id="text.102"/>.</p>
  </notes><notes notes-type="authorcontribution"><title>Author contributions</title>

      <p id="d2e4212">EF designed the experiments, wrote the code, and conducted the analysis under the supervision of HC, MB, and LS. MB, HC, and LS revised and edited the manuscript.</p>
  </notes><notes notes-type="competinginterests"><title>Competing interests</title>

      <p id="d2e4218">At least one of the (co-)authors is a member of the editorial board of <italic>Hydrology and Earth System Sciences</italic>. The peer-review process was guided by an independent editor, and the authors also have no other competing interests to declare.</p>
  </notes><notes notes-type="disclaimer"><title>Disclaimer</title>

      <p id="d2e4227">Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. The authors bear the ultimate responsibility for providing appropriate place names. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.</p>
  </notes><ack><title>Acknowledgements</title><p id="d2e4233">EF would like to thank the UKRI Natural Environment Research Council for funding the work conducted in this paper (award no. NE/S007474/1). EF would also like to thank the data providers Gemma Coxon and Yanchen Zheng, and the reviewers for their helpful comments, which improved the manuscript. HC was funded by Natural Environment Research Council grant no. NE/P018238/1, through a Leverhulme Trust Research Leadership Award, and through the EERIE project (Grant Agreement no. 101081383) funded by the European Union. The University of Oxford's contribution to EERIE is funded by UK Research and Innovation (UKRI) under the UK government’s Horizon Europe funding guarantee (grant no. 10049639). LS is supported by UKRI (UKRI2054). Views and opinions expressed are, however, those of the author(s) only and do not necessarily reflect those of the European Union or the European Climate Infrastructure and Environment Executive Agency (CINEA). Neither the European Union nor the granting authority can be held responsible for them.</p></ack><notes notes-type="financialsupport"><title>Financial support</title>

      <p id="d2e4238">This research has been supported by the Natural Environment Research Council (grant no. NE/S007474/1).</p>
  </notes><notes notes-type="reviewstatement"><title>Review statement</title>

      <p id="d2e4244">This paper was edited by Alberto Guadagnini and reviewed by two anonymous referees.</p>
  </notes><ref-list>
    <title>References</title>

      <ref id="bib1.bibx1"><label>Ansell et al.(2006)Ansell, Jones, Allan, Lister, Parker, Brunet, Moberg, Jacobeit, Brohan, Rayner, Aguilar, Alexandersson, Barriendos, Brandsma, Cox, Della-Marta, Drebs, Founda, Gerstengarbe, Hickey, Jónsson, Luterbacher, Nordli, Oesterle, Petrakis, Philipp, Rodwell, Saladie, Sigro, Slonosky, Srnec, Swail, García-Suárez, Tuomenvirta, Wang, Wanner, Werner, Wheeler, and Xoplaki</label><mixed-citation>Ansell, T. J., Jones, P. D., Allan, R. J., Lister, D., Parker, D. E., Brunet, M., Moberg, A., Jacobeit, J., Brohan, P., Rayner, N. A., Aguilar, E., Alexandersson, H., Barriendos, M., Brandsma, T., Cox, N. J., Della-Marta, P. M., Drebs, A., Founda, D., Gerstengarbe, F., Hickey, K., Jónsson, T., Luterbacher, J., Nordli, Ø., Oesterle, H., Petrakis, M., Philipp, A., Rodwell, M. J., Saladie, O., Sigro, J., Slonosky, V., Srnec, L., Swail, V., García-Suárez, A. M., Tuomenvirta, H., Wang, X., Wanner, H., Werner, P., Wheeler, D., and Xoplaki, E.: Daily Mean Sea Level Pressure Reconstructions for the European–North Atlantic Region for the Period 1850–2003, J. Climate, 19, 2717–2742, <ext-link xlink:href="https://doi.org/10.1175/JCLI3775.1" ext-link-type="DOI">10.1175/JCLI3775.1</ext-link>, 2006.</mixed-citation></ref>
      <ref id="bib1.bibx2"><label>Bárdossy and Filiz(2005)</label><mixed-citation>Bárdossy, A. and Filiz, F.: Identification of flood producing atmospheric circulation patterns, J. Hydrol., 313, 48–57, <ext-link xlink:href="https://doi.org/10.1016/j.jhydrol.2005.02.006" ext-link-type="DOI">10.1016/j.jhydrol.2005.02.006</ext-link>, 2005.</mixed-citation></ref>
      <ref id="bib1.bibx3"><label>Bartens et al.(2024)Bartens, Shehu, and Haberlandt</label><mixed-citation>Bartens, A., Shehu, B., and Haberlandt, U.: Flood frequency analysis using mean daily flows vs. instantaneous peak flows, Hydrol. Earth Syst. Sci., 28, 1687–1709, <ext-link xlink:href="https://doi.org/10.5194/hess-28-1687-2024" ext-link-type="DOI">10.5194/hess-28-1687-2024</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bibx4"><label>Beck and Philipp(2010)</label><mixed-citation>Beck, C. and Philipp, A.: Evaluation and comparison of circulation type classifications for the European domain, Phys. Chem. Earth Pt. A/B/C, 35, 374–387, <ext-link xlink:href="https://doi.org/10.1016/J.PCE.2010.01.001" ext-link-type="DOI">10.1016/J.PCE.2010.01.001</ext-link>, 2010.</mixed-citation></ref>
      <ref id="bib1.bibx5"><label>Berghuijs et al.(2016)Berghuijs, Woods, Hutton, and Sivapalan</label><mixed-citation>Berghuijs, W. R., Woods, R. A., Hutton, C. J., and Sivapalan, M.: Dominant flood generating mechanisms across the United States, Geophys. Res. Lett., 43, 4382–4390, <ext-link xlink:href="https://doi.org/10.1002/2016GL068070" ext-link-type="DOI">10.1002/2016GL068070</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bibx6"><label>Berghuijs et al.(2019)Berghuijs, Harrigan, Molnar, Slater, and Kirchner</label><mixed-citation>Berghuijs, W. R., Harrigan, S., Molnar, P., Slater, L. J., and Kirchner, J. W.: The Relative Importance of Different Flood-Generating Mechanisms Across Europe, Water Resour. Res., 55, 4582–4593, <ext-link xlink:href="https://doi.org/10.1029/2019WR024841" ext-link-type="DOI">10.1029/2019WR024841</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx7"><label>Bertola et al.(2020)Bertola, Viglione, Lun, Hall, and Blöschl</label><mixed-citation>Bertola, M., Viglione, A., Lun, D., Hall, J., and Blöschl, G.: Flood trends in Europe: are changes in small and big floods different?, Hydrol. Earth Syst. Sci., 24, 1805–1822, <ext-link xlink:href="https://doi.org/10.5194/hess-24-1805-2020" ext-link-type="DOI">10.5194/hess-24-1805-2020</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx8"><label>Blöschl et al.(2019)Blöschl, Hall, Viglione, Perdigão, Parajka, Merz, Lun, Arheimer, Aronica, Bilibashi, Boháč, Bonacci, Borga, Čanjevac, Castellarin, Chirico, Claps, Frolova, Ganora, Gorbachova, Gül, Hannaford, Harrigan, Kireeva, Kiss, Kjeldsen, Kohnová, Koskela, Ledvinka, Macdonald, Mavrova-Guirguinova, Mediero, Merz, Molnar, Montanari, Murphy, Osuch, Ovcharuk, Radevski, Salinas, Sauquet, Šraj, Szolgay, Volpi, Wilson, Zaimi, and Živković</label><mixed-citation>Blöschl, G., Hall, J., Viglione, A., Perdigão, R. A., Parajka, J., Merz, B., Lun, D., Arheimer, B., Aronica, G. T., Bilibashi, A., Boháč, M., Bonacci, O., Borga, M., Čanjevac, I., Castellarin, A., Chirico, G. B., Claps, P., Frolova, N., Ganora, D., Gorbachova, L., Gül, A., Hannaford, J., Harrigan, S., Kireeva, M., Kiss, A., Kjeldsen, T. R., Kohnová, S., Koskela, J. J., Ledvinka, O., Macdonald, N., Mavrova-Guirguinova, M., Mediero, L., Merz, R., Molnar, P., Montanari, A., Murphy, C., Osuch, M., Ovcharuk, V., Radevski, I., Salinas, J. L., Sauquet, E., Šraj, M., Szolgay, J., Volpi, E., Wilson, D., Zaimi, K., and Živković, N.: Changing climate both increases and decreases European river floods, Nature, 573, 108–111, <ext-link xlink:href="https://doi.org/10.1038/S41586-019-1495-6" ext-link-type="DOI">10.1038/S41586-019-1495-6</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx9"><label>Botache et al.(2023)Botache, Dingel, Huhnstock, Ehresmann, and Sick</label><mixed-citation>Botache, D., Dingel, K., Huhnstock, R., Ehresmann, A., and Sick, B.: Unraveling the Complexity of Splitting Sequential Data: Tackling Challenges in Video and Time Series Analysis, arXiv [preprint], <ext-link xlink:href="https://doi.org/10.48550/arXiv.2307.14294" ext-link-type="DOI">10.48550/arXiv.2307.14294</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bibx10"><label>Breiman(2001)</label><mixed-citation>Breiman, L.: Random forests, Mach. Learn., 45, 5–32, <ext-link xlink:href="https://doi.org/10.1023/A:1010933404324" ext-link-type="DOI">10.1023/A:1010933404324</ext-link>, 2001.</mixed-citation></ref>
      <ref id="bib1.bibx11"><label>Brown et al.(2023)Brown, Robinson, Kay, Chapman, Bell, and Blyth</label><mixed-citation>Brown, M., Robinson, E., Kay, A., Chapman, R., Bell, V., and Blyth, E.: Potential evapotranspiration derived from HadUK-Grid 1km gridded climate observations 1969–2022 (Hydro-PE HadUK-Grid), <ext-link xlink:href="https://doi.org/10.5285/BEB62085-BA81-480C-9ED0-2D31C27FF196" ext-link-type="DOI">10.5285/BEB62085-BA81-480C-9ED0-2D31C27FF196</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bibx12"><label>Brunner and Dougherty(2022)</label><mixed-citation>Brunner, M. I. and Dougherty, E. M.: Varying Importance of Storm Types and Antecedent Conditions for Local and Regional Floods, Water Resour. Res., 58, <ext-link xlink:href="https://doi.org/10.1029/2022WR033249" ext-link-type="DOI">10.1029/2022WR033249</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bibx13"><label>Brunner and Slater(2022)</label><mixed-citation>Brunner, M. I. and Slater, L. J.: Extreme floods in Europe: going beyond observations using reforecast ensemble pooling, Hydrol. Earth Syst. Sci., 26, 469–482, <ext-link xlink:href="https://doi.org/10.5194/hess-26-469-2022" ext-link-type="DOI">10.5194/hess-26-469-2022</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bibx14"><label>Brunner et al.(2021)Brunner, Slater, Tallaksen, and Clark</label><mixed-citation>Brunner, M. I., Slater, L., Tallaksen, L. M., and Clark, M.: Challenges in modeling and predicting floods and droughts: A review, Wiley Interdisciplinary Reviews: Water, 8, e1520, <ext-link xlink:href="https://doi.org/10.1002/WAT2.1520" ext-link-type="DOI">10.1002/WAT2.1520</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bibx15"><label>Chicco et al.(2021)Chicco, Warrens, and Jurman</label><mixed-citation>Chicco, D., Warrens, M. J., and Jurman, G.: The coefficient of determination R-squared is more informative than SMAPE, MAE, MAPE, MSE and RMSE in regression analysis evaluation, PeerJ Computer Science, 7, 1–24, <ext-link xlink:href="https://doi.org/10.7717/PEERJ-CS.623" ext-link-type="DOI">10.7717/PEERJ-CS.623</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bibx16"><label>Coxon et al.(2020)Coxon, Addor, Bloomfield, Freer, Fry, Hannaford, Howden, Lane, Lewis, Robinson, Wagener, and Woods</label><mixed-citation>Coxon, G., Addor, N., Bloomfield, J. P., Freer, J., Fry, M., Hannaford, J., Howden, N. J. K., Lane, R., Lewis, M., Robinson, E. L., Wagener, T., and Woods, R.: CAMELS-GB: hydrometeorological time series and landscape attributes for 671 catchments in Great Britain, Earth Syst. Sci. Data, 12, 2459–2483, <ext-link xlink:href="https://doi.org/10.5194/essd-12-2459-2020" ext-link-type="DOI">10.5194/essd-12-2459-2020</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx17"><label>Coxon et al.(2024)Coxon, McMillan, Bloomfield, Bolotin, Dean, Kelleher, Slater, and Zheng</label><mixed-citation>Coxon, G., McMillan, H., Bloomfield, J. P., Bolotin, L., Dean, J. F., Kelleher, C., Slater, L., and Zheng, Y.: Wastewater discharges and urban land cover dominate urban hydrology signals across England and Wales, Environ. Res. Lett., 19, 084016, <ext-link xlink:href="https://doi.org/10.1088/1748-9326/AD5BF2" ext-link-type="DOI">10.1088/1748-9326/AD5BF2</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bibx18"><label>Coxon et al.(2025)Coxon, Zheng, Barbedo, Cooper, Fileni, Fowler, Fry, Green, Gribbin, Harfoot, Lewis, Gondim, Neto, Qiu, Salwey, and Wendt</label><mixed-citation>Coxon, G., Zheng, Y., Barbedo, R., Cooper, H., Fileni, F., Fowler, H. J., Fry, M., Green, A., Gribbin, T., Harfoot, H., Lewis, E., Gondim, G., Neto, R., Qiu, X., Salwey, S., and Wendt, D. E.: CAMELS-GB v2: hydrometeorological time series and landscape attributes for 671 catchments in Great Britain, Earth Syst. Sci. Data Discuss. [preprint], <ext-link xlink:href="https://doi.org/10.5194/essd-2025-608" ext-link-type="DOI">10.5194/essd-2025-608</ext-link>, in review, 2025.</mixed-citation></ref>
      <ref id="bib1.bibx19"><label>Cutler et al.(2012)Cutler, Cutler, and Stevens</label><mixed-citation>Cutler, A., Cutler, D. R., and Stevens, J. R.: Random Forests, Ensemble Machine Learning, 157–175, <ext-link xlink:href="https://doi.org/10.1007/978-1-4419-9326-7_5" ext-link-type="DOI">10.1007/978-1-4419-9326-7_5</ext-link>, 2012.</mixed-citation></ref>
      <ref id="bib1.bibx20"><label>Duckstein et al.(1993)Duckstein, Bárdossy, and Bogárdi</label><mixed-citation>Duckstein, L., Bárdossy, A., and Bogárdi, I.: Linkage between the occurrence of daily atmospheric circulation patterns and floods: an Arizona case study, J. Hydrol., 143, 413–428, <ext-link xlink:href="https://doi.org/10.1016/0022-1694(93)90202-K" ext-link-type="DOI">10.1016/0022-1694(93)90202-K</ext-link>, 1993.</mixed-citation></ref>
      <ref id="bib1.bibx21"><label>Fabiano et al.(2021)Fabiano, Meccia, Davini, Ghinassi, and Corti</label><mixed-citation>Fabiano, F., Meccia, V. L., Davini, P., Ghinassi, P., and Corti, S.: A regime view of future atmospheric circulation changes in northern mid-latitudes, Weather Clim. Dynam., 2, 163–180, <ext-link xlink:href="https://doi.org/10.5194/wcd-2-163-2021" ext-link-type="DOI">10.5194/wcd-2-163-2021</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bibx22"><label>Fawagreh et al.(2014)Fawagreh, Gaber, and Elyan</label><mixed-citation>Fawagreh, K., Gaber, M. M., and Elyan, E.: Random forests: From early developments to recent advancements, Systems Science and Control Engineering, 2, 602–609, <ext-link xlink:href="https://doi.org/10.1080/21642583.2014.956265" ext-link-type="DOI">10.1080/21642583.2014.956265</ext-link>, 2014.</mixed-citation></ref>
      <ref id="bib1.bibx23"><label>Fileni et al.(2023)Fileni, Fowler, Lewis, McLay, and Yang</label><mixed-citation>Fileni, F., Fowler, H. J., Lewis, E., McLay, F., and Yang, L.: A quality-control framework for sub-daily flow and level data for hydrological modelling in Great Britain, Hydrol. Res., 54, 1357–1367, <ext-link xlink:href="https://doi.org/10.2166/NH.2023.045" ext-link-type="DOI">10.2166/NH.2023.045</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bibx24"><label>Fleming et al.(2021)Fleming, Watson, Ellenson, Cannon, and Vesselinov</label><mixed-citation>Fleming, S. W., Watson, J. R., Ellenson, A., Cannon, A. J., and Vesselinov, V. C.: Machine learning in Earth and environmental science requires education and research policy reforms, Nat. Geosci., 14, 878–880, <ext-link xlink:href="https://doi.org/10.1038/s41561-021-00865-3" ext-link-type="DOI">10.1038/s41561-021-00865-3</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bibx25"><label>Frame et al.(2022)Frame, Kratzert, Klotz, Gauch, Shalev, Gilon, Qualls, Gupta, and Nearing</label><mixed-citation>Frame, J. M., Kratzert, F., Klotz, D., Gauch, M., Shalev, G., Gilon, O., Qualls, L. M., Gupta, H. V., and Nearing, G. S.: Deep learning rainfall–runoff predictions of extreme events, Hydrol. Earth Syst. Sci., 26, 3377–3392, <ext-link xlink:href="https://doi.org/10.5194/hess-26-3377-2022" ext-link-type="DOI">10.5194/hess-26-3377-2022</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bibx26"><label>Graham et al.(2014)Graham, Mathur, and Baldwin</label><mixed-citation>Graham, Y., Mathur, N., and Baldwin, T.: Randomized Significance Tests in Machine Translation, Proceedings of the Annual Meeting of the Association for Computational Linguistics, 266–274, <ext-link xlink:href="https://doi.org/10.3115/V1/W14-3333" ext-link-type="DOI">10.3115/V1/W14-3333</ext-link>, 2014.</mixed-citation></ref>
      <ref id="bib1.bibx27"><label>Griffin et al.(2019)Griffin, Vesuviano, and Stewart</label><mixed-citation>Griffin, A., Vesuviano, G., and Stewart, E.: Have trends changed over time? A study of UK peak flow data and sensitivity to observation period, Nat. Hazards Earth Syst. Sci., 19, 2157–2167, <ext-link xlink:href="https://doi.org/10.5194/nhess-19-2157-2019" ext-link-type="DOI">10.5194/nhess-19-2157-2019</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx28"><label>Griffin et al.(2024)Griffin, Kay, Sayers, Bell, Stewart, and Carr</label><mixed-citation>Griffin, A., Kay, A. L., Sayers, P., Bell, V., Stewart, E., and Carr, S.: Widespread flooding dynamics under climate change: characterising floods using grid-based hydrological modelling and regional climate projections, Hydrol. Earth Syst. Sci., 28, 2635–2650, <ext-link xlink:href="https://doi.org/10.5194/hess-28-2635-2024" ext-link-type="DOI">10.5194/hess-28-2635-2024</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bibx29"><label>Griffin et al.(2025)Griffin, Vesuviano, Wilson, Sefton, Turner, Armitage, and Suman</label><mixed-citation>Griffin, A., Vesuviano, G., Wilson, D., Sefton, C., Turner, S., Armitage, R., and Suman, G.: Putting the English Flooding of 2019–2021 in the Context of Antecedent Conditions, J. Flood Risk Manag., 18, e70016, <ext-link xlink:href="https://doi.org/10.1111/JFR3.70016" ext-link-type="DOI">10.1111/JFR3.70016</ext-link>, 2025.</mixed-citation></ref>
      <ref id="bib1.bibx30"><label>Hakim et al.(2024)Hakim, Gernowo, and Nirwansyah</label><mixed-citation>Hakim, D. K., Gernowo, R., and Nirwansyah, A. W.: Flood prediction with time series data mining: Systematic review, Natural Hazards Research, 4, 194–220, <ext-link xlink:href="https://doi.org/10.1016/J.NHRES.2023.10.001" ext-link-type="DOI">10.1016/J.NHRES.2023.10.001</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bibx31"><label>Harrigan et al.(2018)Harrigan, Hannaford, Muchan, and Marsh</label><mixed-citation>Harrigan, S., Hannaford, J., Muchan, K., and Marsh, T. J.: Designation and trend analysis of the updated UK Benchmark Network of river flow stations: the UKBN2 dataset, Hydrol. Res., 49, 552–567, <ext-link xlink:href="https://doi.org/10.2166/NH.2017.058" ext-link-type="DOI">10.2166/NH.2017.058</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx32"><label>Hendry et al.(2019)Hendry, Haigh, Nicholls, Winter, Neal, Wahl, Joly-Laugel, and Darby</label><mixed-citation>Hendry, A., Haigh, I. D., Nicholls, R. J., Winter, H., Neal, R., Wahl, T., Joly-Laugel, A., and Darby, S. E.: Assessing the characteristics and drivers of compound flooding events around the UK coast, Hydrol. Earth Syst. Sci., 23, 3117–3139, <ext-link xlink:href="https://doi.org/10.5194/hess-23-3117-2019" ext-link-type="DOI">10.5194/hess-23-3117-2019</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx33"><label>Hollis et al.(2019)Hollis, McCarthy, Kendon, Legg, and Simpson</label><mixed-citation>Hollis, D., McCarthy, M., Kendon, M., Legg, T., and Simpson, I.: HadUK-Grid – A new UK dataset of gridded climate observations, Geosci. Data J., 6, 151–159, <ext-link xlink:href="https://doi.org/10.1002/GDJ3.78" ext-link-type="DOI">10.1002/GDJ3.78</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx34"><label>Horner et al.(2018)Horner, Renard, Le Coz, Branger, McMillan, and Pierrefeu</label><mixed-citation>Horner, I., Renard, B., Le Coz, J., Branger, F., McMillan, H. K., and Pierrefeu, G.: Impact of Stage Measurement Errors on Streamflow Uncertainty, Water Resour. Res., 54, 1952–1976, <ext-link xlink:href="https://doi.org/10.1002/2017WR022039" ext-link-type="DOI">10.1002/2017WR022039</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx35"><label>Huang et al.(2020)Huang, Charlton-Perez, Lee, Neal, Sarran, and Sun</label><mixed-citation>Huang, W. T. K., Charlton-Perez, A., Lee, R. W., Neal, R., Sarran, C., and Sun, T.: Weather regimes and patterns associated with temperature-related excess mortality in the UK: a pathway to sub-seasonal risk forecasting, Environ. Res. Lett., 15, 124052, <ext-link xlink:href="https://doi.org/10.1088/1748-9326/ABCBBA" ext-link-type="DOI">10.1088/1748-9326/ABCBBA</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx36"><label>Jiang et al.(2022)Jiang, Bevacqua, and Zscheischler</label><mixed-citation>Jiang, S., Bevacqua, E., and Zscheischler, J.: River flooding mechanisms and their changes in Europe revealed by explainable machine learning, Hydrol. Earth Syst. Sci., 26, 6339–6359, <ext-link xlink:href="https://doi.org/10.5194/hess-26-6339-2022" ext-link-type="DOI">10.5194/hess-26-6339-2022</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bibx37"><label>Kratzert et al.(2018)Kratzert, Klotz, Brenner, Schulz, and Herrnegger</label><mixed-citation>Kratzert, F., Klotz, D., Brenner, C., Schulz, K., and Herrnegger, M.: Rainfall–runoff modelling using Long Short-Term Memory (LSTM) networks, Hydrol. Earth Syst. Sci., 22, 6005–6022, <ext-link xlink:href="https://doi.org/10.5194/hess-22-6005-2018" ext-link-type="DOI">10.5194/hess-22-6005-2018</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx38"><label>Kratzert et al.(2019)Kratzert, Klotz, Shalev, Klambauer, Hochreiter, and Nearing</label><mixed-citation>Kratzert, F., Klotz, D., Shalev, G., Klambauer, G., Hochreiter, S., and Nearing, G.: Towards learning universal, regional, and local hydrological behaviors via machine learning applied to large-sample datasets, Hydrol. Earth Syst. Sci., 23, 5089–5110, <ext-link xlink:href="https://doi.org/10.5194/hess-23-5089-2019" ext-link-type="DOI">10.5194/hess-23-5089-2019</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx39"><label>Kratzert et al.(2022)Kratzert, Gauch, Nearing, and Klotz</label><mixed-citation>Kratzert, F., Gauch, M., Nearing, G., and Klotz, D.: NeuralHydrology – A Python library for Deep Learning research in hydrology, J. Open Source Softw., 7, 4050, <ext-link xlink:href="https://doi.org/10.21105/JOSS.04050" ext-link-type="DOI">10.21105/JOSS.04050</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bibx40"><label>Kratzert et al.(2024)Kratzert, Gauch, Klotz, and Nearing</label><mixed-citation>Kratzert, F., Gauch, M., Klotz, D., and Nearing, G.: HESS Opinions: Never train a Long Short-Term Memory (LSTM) network on a single basin, Hydrol. Earth Syst. Sci., 28, 4187–4201, <ext-link xlink:href="https://doi.org/10.5194/hess-28-4187-2024" ext-link-type="DOI">10.5194/hess-28-4187-2024</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bibx41"><label>Lamane et al.(2024)Lamane, Mouhir, Moussadek, Baghdad, Kisi, and El Bilali</label><mixed-citation>Lamane, H., Mouhir, L., Moussadek, R., Baghdad, B., Kisi, O., and El Bilali, A.: Interpreting machine learning models based on SHAP values in predicting suspended sediment concentration, Int. J. Sediment Res., <ext-link xlink:href="https://doi.org/10.1016/J.IJSRC.2024.10.002" ext-link-type="DOI">10.1016/J.IJSRC.2024.10.002</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bibx42"><label>Lamb(1972)</label><mixed-citation>Lamb, H. H.: British Isles weather types and a register of the daily sequence of circulation patterns, 1861–1971, <uri>https://openlibrary.org/works/OL3523120W/British_Isles_weather_types_and_a_register_of_the_daily_sequence_of_circulation_patterns_1861-1971</uri>, 1972.</mixed-citation></ref>
      <ref id="bib1.bibx43"><label>Lane et al.(2019)Lane, Coxon, Freer, Wagener, Johnes, Bloomfield, Greene, Macleod, and Reaney</label><mixed-citation>Lane, R. A., Coxon, G., Freer, J. E., Wagener, T., Johnes, P. J., Bloomfield, J. P., Greene, S., Macleod, C. J. A., and Reaney, S. M.: Benchmarking the predictive capability of hydrological models for river flow and flood peak predictions across over 1000 catchments in Great Britain, Hydrol. Earth Syst. Sci., 23, 4011–4032, <ext-link xlink:href="https://doi.org/10.5194/hess-23-4011-2019" ext-link-type="DOI">10.5194/hess-23-4011-2019</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx44"><label>Lavers et al.(2010)Lavers, Prudhomme, and Hannah</label><mixed-citation>Lavers, D., Prudhomme, C., and Hannah, D. M.: Large-scale climate, precipitation and British river flows: Identifying hydroclimatological connections and dynamics, J. Hydrol., 395, 242–255, <ext-link xlink:href="https://doi.org/10.1016/J.JHYDROL.2010.10.036" ext-link-type="DOI">10.1016/J.JHYDROL.2010.10.036</ext-link>, 2010.</mixed-citation></ref>
      <ref id="bib1.bibx45"><label>Lavers et al.(2012)Lavers, Villarini, Allan, Wood, and Wade</label><mixed-citation>Lavers, D. A., Villarini, G., Allan, R. P., Wood, E. F., and Wade, A. J.: The detection of atmospheric rivers in atmospheric reanalyses and their links to British winter floods and the large-scale climatic circulation, J. Geophys. Res.-Atmos., 117, 20106, <ext-link xlink:href="https://doi.org/10.1029/2012JD018027" ext-link-type="DOI">10.1029/2012JD018027</ext-link>, 2012.</mixed-citation></ref>
      <ref id="bib1.bibx46"><label>Lavers et al.(2020)Lavers, Ralph, Richardson, and Pappenberger</label><mixed-citation>Lavers, D. A., Ralph, F. M., Richardson, D. S., and Pappenberger, F.: Improved forecasts of atmospheric rivers through systematic reconnaissance, better modelling, and insights on conversion of rain to flooding, Commun. Earth Environ., 1, 1–7, <ext-link xlink:href="https://doi.org/10.1038/s43247-020-00042-1" ext-link-type="DOI">10.1038/s43247-020-00042-1</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx47"><label>Ledingham et al.(2019)Ledingham, Archer, Lewis, Fowler, and Kilsby</label><mixed-citation>Ledingham, J., Archer, D., Lewis, E., Fowler, H., and Kilsby, C.: Contrasting seasonality of storm rainfall and flood runoff in the UK and some implications for rainfall-runoff methods of flood estimation, Hydrol. Res., 50, 1309–1323, <ext-link xlink:href="https://doi.org/10.2166/NH.2019.040" ext-link-type="DOI">10.2166/NH.2019.040</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx48"><label>Lees et al.(2021)Lees, Buechel, Anderson, Slater, Reece, Coxon, and Dadson</label><mixed-citation>Lees, T., Buechel, M., Anderson, B., Slater, L., Reece, S., Coxon, G., and Dadson, S. J.: Benchmarking data-driven rainfall–runoff models in Great Britain: a comparison of long short-term memory (LSTM)-based models with four lumped conceptual models, Hydrol. Earth Syst. Sci., 25, 5517–5534, <ext-link xlink:href="https://doi.org/10.5194/hess-25-5517-2021" ext-link-type="DOI">10.5194/hess-25-5517-2021</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bibx49"><label>Lees et al.(2022)Lees, Reece, Kratzert, Klotz, Gauch, De Bruijn, Kumar Sahu, Greve, Slater, and Dadson</label><mixed-citation>Lees, T., Reece, S., Kratzert, F., Klotz, D., Gauch, M., De Bruijn, J., Kumar Sahu, R., Greve, P., Slater, L., and Dadson, S. J.: Hydrological concept formation inside long short-term memory (LSTM) networks, Hydrol. Earth Syst. Sci., 26, 3079–3101, <ext-link xlink:href="https://doi.org/10.5194/hess-26-3079-2022" ext-link-type="DOI">10.5194/hess-26-3079-2022</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bibx50"><label>Ley et al.(2024)Ley, Bormann, and Casper</label><mixed-citation>Ley, A., Bormann, H., and Casper, M.: Linking explainable artificial intelligence and soil moisture dynamics in a machine learning streamflow model, Hydrol. Res., 55, 613–627, <ext-link xlink:href="https://doi.org/10.2166/NH.2024.003" ext-link-type="DOI">10.2166/NH.2024.003</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bibx51"><label>Liu et al.(2022)Liu, Feng, Gu, Zhang, Beck, Zhang, and Yan</label><mixed-citation>Liu, J., Feng, S., Gu, X., Zhang, Y., Beck, H. E., Zhang, J., and Yan, S.: Global changes in floods and their drivers, J. Hydrol., 614, 128553, <ext-link xlink:href="https://doi.org/10.1016/J.JHYDROL.2022.128553" ext-link-type="DOI">10.1016/J.JHYDROL.2022.128553</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bibx52"><label>Lundberg and Lee(2017)</label><mixed-citation>Lundberg, S. and Lee, S.: A unified approach to interpreting model predictions, Advances in Neural Information Processing Systems, 30, <uri>https://proceedings.neurips.cc/paper_files/paper/2017/file/8a20a8621978632d76c43dfd28b67767-Paper.pdf</uri>, 2017.</mixed-citation></ref>
      <ref id="bib1.bibx53"><label>Lundberg et al.(2020)Lundberg, Erion, Chen, DeGrave, Prutkin, Nair, Katz, Himmelfarb, Bansal, and Lee</label><mixed-citation>Lundberg, S. M., Erion, G., Chen, H., DeGrave, A., Prutkin, J. M., Nair, B., Katz, R., Himmelfarb, J., Bansal, N., and Lee, S. I.: From local explanations to global understanding with explainable AI for trees, Nature Machine Intelligence, 2, 56–67, <ext-link xlink:href="https://doi.org/10.1038/s42256-019-0138-9" ext-link-type="DOI">10.1038/s42256-019-0138-9</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx54"><label>Mailhot et al.(2013)Mailhot, Lachance-Cloutier, Talbot, and Favre</label><mixed-citation>Mailhot, A., Lachance-Cloutier, S., Talbot, G., and Favre, A. C.: Regional estimates of intense rainfall based on the Peak-Over-Threshold (POT) approach, J. Hydrol., 476, 188–199, <ext-link xlink:href="https://doi.org/10.1016/J.JHYDROL.2012.10.036" ext-link-type="DOI">10.1016/J.JHYDROL.2012.10.036</ext-link>, 2013.</mixed-citation></ref>
      <ref id="bib1.bibx55"><label>Massari et al.(2023)Massari, Pellet, Tramblay, Crow, Gründemann, Hascoet, Penna, Modanesi, Brocca, Camici, and Marra</label><mixed-citation>Massari, C., Pellet, V., Tramblay, Y., Crow, W. T., Gründemann, G. J., Hascoet, T., Penna, D., Modanesi, S., Brocca, L., Camici, S., and Marra, F.: On the relation between antecedent basin conditions and runoff coefficient for European floods, J. Hydrol., 625, 130012, <ext-link xlink:href="https://doi.org/10.1016/J.JHYDROL.2023.130012" ext-link-type="DOI">10.1016/J.JHYDROL.2023.130012</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bibx56"><label>Meira Neto et al.(2020)Meira Neto, Roy, de Oliveira, and Troch</label><mixed-citation>Meira Neto, A. A., Roy, T., de Oliveira, P. T. S., and Troch, P. A.: An Aridity Index-Based Formulation of Streamflow Components, Water Resour. Res., 56, e2020WR027123, <ext-link xlink:href="https://doi.org/10.1029/2020WR027123" ext-link-type="DOI">10.1029/2020WR027123</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx57"><label>Merz et al.(2016)Merz, Nguyen, and Vorogushyn</label><mixed-citation>Merz, B., Nguyen, V. D., and Vorogushyn, S.: Temporal clustering of floods in Germany: Do flood-rich and flood-poor periods exist?, J. Hydrol., 541, 824–838, <ext-link xlink:href="https://doi.org/10.1016/J.JHYDROL.2016.07.041" ext-link-type="DOI">10.1016/J.JHYDROL.2016.07.041</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bibx58"><label>Merz and Blöschl(2003)</label><mixed-citation>Merz, R. and Blöschl, G.: A process typology of regional floods, Water Resour. Res., 39, <ext-link xlink:href="https://doi.org/10.1029/2002WR001952" ext-link-type="DOI">10.1029/2002WR001952</ext-link>, 2003.</mixed-citation></ref>
      <ref id="bib1.bibx59"><label>Morlot et al.(2014)Morlot, Perret, Favre, and Jalbert</label><mixed-citation>Morlot, T., Perret, C., Favre, A. C., and Jalbert, J.: Dynamic rating curve assessment for hydrometric stations and computation of the associated uncertainties: Quality and station management indicators, J. Hydrol., 517, 173–186, <ext-link xlink:href="https://doi.org/10.1016/J.JHYDROL.2014.05.007" ext-link-type="DOI">10.1016/J.JHYDROL.2014.05.007</ext-link>, 2014.</mixed-citation></ref>
      <ref id="bib1.bibx60"><label>Mushtaq et al.(2024)Mushtaq, Akhtar, Hashmi, Masood, and Saeed</label><mixed-citation>Mushtaq, H., Akhtar, T., Hashmi, M. Z. u. R., Masood, A., and Saeed, F.: Hydrologic interpretation of machine learning models for 10-daily streamflow simulation in climate sensitive upper Indus catchments, Theor. Appl. Climatol., 155, 5525–5542, <ext-link xlink:href="https://doi.org/10.1007/s00704-024-04932-8" ext-link-type="DOI">10.1007/s00704-024-04932-8</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bibx61"><label>Nariya et al.(2023)Nariya, Mills, Sorger, and Sokolov</label><mixed-citation>Nariya, M. K., Mills, C. E., Sorger, P. K., and Sokolov, A.: Paired evaluation of machine-learning models characterizes effects of confounders and outliers, Patterns, 4, 100791, <ext-link xlink:href="https://doi.org/10.1016/J.PATTER.2023.100791" ext-link-type="DOI">10.1016/J.PATTER.2023.100791</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bibx62"><label>Neal et al.(2016)Neal, Fereday, Crocker, and Comer</label><mixed-citation>Neal, R., Fereday, D., Crocker, R., and Comer, R. E.: A flexible approach to defining weather patterns and their application in weather forecasting over Europe, Meteorol. Appl., 23, 389–400, <ext-link xlink:href="https://doi.org/10.1002/met.1563" ext-link-type="DOI">10.1002/met.1563</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bibx63"><label>Neal et al.(2018)Neal, Dankers, Saulter, Lane, Millard, Robbins, and Price</label><mixed-citation>Neal, R., Dankers, R., Saulter, A., Lane, A., Millard, J., Robbins, G., and Price, D.: Use of probabilistic medium- to long-range weather-pattern forecasts for identifying periods with an increased likelihood of coastal flooding around the UK, Meteorol. Appl., 25, 534–547, <ext-link xlink:href="https://doi.org/10.1002/MET.1719" ext-link-type="DOI">10.1002/MET.1719</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx64"><label>Nevo et al.(2022)Nevo, Morin, Gerzi Rosenthal, Metzger, Barshai, Weitzner, Voloshin, Kratzert, Elidan, Dror, Begelman, Nearing, Shalev, Noga, Shavitt, Yuklea, Royz, Giladi, Peled Levi, Reich, Gilon, Maor, Timnat, Shechter, Anisimov, Gigi, Levin, Moshe, Ben-Haim, Hassidim, and Matias</label><mixed-citation>Nevo, S., Morin, E., Gerzi Rosenthal, A., Metzger, A., Barshai, C., Weitzner, D., Voloshin, D., Kratzert, F., Elidan, G., Dror, G., Begelman, G., Nearing, G., Shalev, G., Noga, H., Shavitt, I., Yuklea, L., Royz, M., Giladi, N., Peled Levi, N., Reich, O., Gilon, O., Maor, R., Timnat, S., Shechter, T., Anisimov, V., Gigi, Y., Levin, Y., Moshe, Z., Ben-Haim, Z., Hassidim, A., and Matias, Y.: Flood forecasting with machine learning models in an operational framework, Hydrol. Earth Syst. Sci., 26, 4013–4032, <ext-link xlink:href="https://doi.org/10.5194/hess-26-4013-2022" ext-link-type="DOI">10.5194/hess-26-4013-2022</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bibx65"><label>Nied et al.(2014)Nied, Pardowitz, Nissen, Ulbrich, Hundecha, and Merz</label><mixed-citation>Nied, M., Pardowitz, T., Nissen, K., Ulbrich, U., Hundecha, Y., and Merz, B.: On the relationship between hydro-meteorological patterns and flood types, J. Hydrol., 519, 3249–3262, <ext-link xlink:href="https://doi.org/10.1016/j.jhydrol.2014.09.089" ext-link-type="DOI">10.1016/j.jhydrol.2014.09.089</ext-link>, 2014.</mixed-citation></ref>
      <ref id="bib1.bibx66"><label>NRFA(2023)</label><mixed-citation>NRFA: National River Flow Archive (NRFA): River flow and catchment shapefiles for Great Britain, <uri>https://nrfa.ceh.ac.uk/</uri> (last access: 15 April 2026), 2023.</mixed-citation></ref>
      <ref id="bib1.bibx67"><label>O'Brien(2007)</label><mixed-citation>O'Brien, R. M.: A Caution Regarding Rules of Thumb for Variance Inflation Factors, Qual. Quant., 41, 673–690, <ext-link xlink:href="https://doi.org/10.1007/S11135-006-9018-6" ext-link-type="DOI">10.1007/S11135-006-9018-6</ext-link>, 2007.</mixed-citation></ref>
      <ref id="bib1.bibx68"><label>Ojala and Garriga(2010)</label><mixed-citation>Ojala, M. and Garriga, G. C.: Permutation Tests for Studying Classifier Performance, J. Mach. Learn. Res., 11, 1833–1863, 2010.</mixed-citation></ref>
      <ref id="bib1.bibx69"><label>Pan et al.(2022)Pan, Rahman, Haddad, and Ouarda</label><mixed-citation>Pan, X., Rahman, A., Haddad, K., and Ouarda, T. B.: Peaks-over-threshold model in flood frequency analysis: a scoping review, Stoch. Env. Res. Risk A., 36, 2419–2435, <ext-link xlink:href="https://doi.org/10.1007/S00477-022-02174-6" ext-link-type="DOI">10.1007/S00477-022-02174-6</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bibx70"><label>Pedregosa et al.(2011)Pedregosa, Varoquaux, Gramfort, Michel, Thirion, Grisel, Blondel, Müller, Nothman, Louppe, Prettenhofer, Weiss, Dubourg, Vanderplas, Cournapeau, Brucher, and Perrot</label><mixed-citation>Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Müller, A., Nothman, J., Louppe, G., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Cournapeau, D., Brucher, M., and Perrot, M.: Scikit-learn: Machine Learning in Python, J. Mach. Learn. Res., 12, 2825–2830, 2011.</mixed-citation></ref>
      <ref id="bib1.bibx71"><label>Perks et al.(2023)Perks, Bernie, Lowe, and Neal</label><mixed-citation>Perks, R. J., Bernie, D., Lowe, J., and Neal, R.: The influence of future weather pattern changes and projected sea-level rise on coastal flood impacts around the UK, Climatic Change, 176, 1–21, <ext-link xlink:href="https://doi.org/10.1007/S10584-023-03496-2" ext-link-type="DOI">10.1007/S10584-023-03496-2</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bibx72"><label>Pope et al.(2021)Pope, Brown, Fung, Hanlon, Neal, Palin, and Reid</label><mixed-citation>Pope, J. O., Brown, K., Fung, F., Hanlon, H. M., Neal, R., Palin, E. J., and Reid, A.: Investigation of future climate change over the British Isles using weather patterns, Clim. Dynam., 58, 2405–2419, <ext-link xlink:href="https://doi.org/10.1007/s00382-021-06031-0" ext-link-type="DOI">10.1007/s00382-021-06031-0</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bibx73"><label>Prudhomme and Genevier(2011)</label><mixed-citation>Prudhomme, C. and Genevier, M.: Can atmospheric circulation be linked to flooding in Europe?, Hydrol. Process., 25, 1180–1190, <ext-link xlink:href="https://doi.org/10.1002/HYP.7879" ext-link-type="DOI">10.1002/HYP.7879</ext-link>, 2011.</mixed-citation></ref>
      <ref id="bib1.bibx74"><label>Richardson et al.(2018)Richardson, Fowler, Kilsby, and Neal</label><mixed-citation>Richardson, D., Fowler, H. J., Kilsby, C. G., and Neal, R.: A new precipitation and drought climatology based on weather patterns, Int. J. Climatol., 38, 630–648, <ext-link xlink:href="https://doi.org/10.1002/JOC.5199" ext-link-type="DOI">10.1002/JOC.5199</ext-link>, 2018.</mixed-citation></ref>
      <ref id="bib1.bibx75"><label>Richardson et al.(2020)Richardson, Neal, Dankers, Mylne, Cowling, Clements, and Millard</label><mixed-citation>Richardson, D., Neal, R., Dankers, R., Mylne, K., Cowling, R., Clements, H., and Millard, J.: Linking weather patterns to regional extreme precipitation for highlighting potential flood events in medium- to long-range forecasts, Meteorol. Appl., 27, <ext-link xlink:href="https://doi.org/10.1002/met.1931" ext-link-type="DOI">10.1002/met.1931</ext-link>, 2020.</mixed-citation></ref>
      <ref id="bib1.bibx76"><label>Rodding Kjeldsen and Prosdocimi(2023)</label><mixed-citation>Rodding Kjeldsen, T. and Prosdocimi, I.: Use of peak over threshold data for flood frequency estimation: An application at the UK national scale, J. Hydrol., 626, 130235, <ext-link xlink:href="https://doi.org/10.1016/J.JHYDROL.2023.130235" ext-link-type="DOI">10.1016/J.JHYDROL.2023.130235</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bibx77"><label>Rosso(2015)</label><mixed-citation>Rosso, G.: Extreme Value Theory for Time Series using Peak-Over-Threshold method, <uri>https://api.semanticscholar.org/CorpusID:88521862</uri>, 2015.</mixed-citation></ref>
      <ref id="bib1.bibx78"><label>Schlef et al.(2019)Schlef, Moradkhani, and Lall</label><mixed-citation>Schlef, K. E., Moradkhani, H., and Lall, U.: Atmospheric Circulation Patterns Associated with Extreme United States Floods Identified via Machine Learning, Sci. Rep.-UK, 9, 1–12, <ext-link xlink:href="https://doi.org/10.1038/s41598-019-43496-w" ext-link-type="DOI">10.1038/s41598-019-43496-w</ext-link>, 2019.</mixed-citation></ref>
      <ref id="bib1.bibx79"><label>Scussolini et al.(2024)Scussolini, Luu, Philip, Berghuijs, Eilander, Aerts, Kew, van Oldenborgh, Toonen, Volkholz, and Coumou</label><mixed-citation>Scussolini, P., Luu, L. N., Philip, S., Berghuijs, W. R., Eilander, D., Aerts, J. C., Kew, S. F., van Oldenborgh, G. J., Toonen, W. H., Volkholz, J., and Coumou, D.: Challenges in the attribution of river flood events, WIRes Clim. Change, 15, e874, <ext-link xlink:href="https://doi.org/10.1002/WCC.874" ext-link-type="DOI">10.1002/WCC.874</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bibx80"><label>Sefton et al.(2021)Sefton, Muchan, Parry, Matthews, Barker, Turner, and Hannaford</label><mixed-citation>Sefton, C., Muchan, K., Parry, S., Matthews, B., Barker, L. J., Turner, S., and Hannaford, J.: The 2019/2020 floods in the UK: a hydrological appraisal, Weather, 76, 378–384, <ext-link xlink:href="https://doi.org/10.1002/WEA.3993" ext-link-type="DOI">10.1002/WEA.3993</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bibx81"><label>Sillmann et al.(2017)Sillmann, Thorarinsdottir, Keenlyside, Schaller, Alexander, Hegerl, Seneviratne, Vautard, Zhang, and Zwiers</label><mixed-citation>Sillmann, J., Thorarinsdottir, T., Keenlyside, N., Schaller, N., Alexander, L. V., Hegerl, G., Seneviratne, S. I., Vautard, R., Zhang, X., and Zwiers, F. W.: Understanding, modeling and predicting weather and climate extremes: Challenges and opportunities, Weather and Climate Extremes, 18, 65–74, <ext-link xlink:href="https://doi.org/10.1016/J.WACE.2017.10.003" ext-link-type="DOI">10.1016/J.WACE.2017.10.003</ext-link>, 2017.</mixed-citation></ref>
      <ref id="bib1.bibx82"><label>Slater et al.(2024)Slater, Coxon, Brunner, McMillan, Yu, Zheng, Khouakhi, Moulds, and Berghuijs</label><mixed-citation>Slater, L., Coxon, G., Brunner, M., McMillan, H., Yu, L., Zheng, Y., Khouakhi, A., Moulds, S., and Berghuijs, W.: Spatial Sensitivity of River Flooding to Changes in Climate and Land Cover Through Explainable AI, Earths Future, 12, e2023EF004035, <ext-link xlink:href="https://doi.org/10.1029/2023EF004035" ext-link-type="DOI">10.1029/2023EF004035</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bibx83"><label>Slater et al.(2025)Slater, Blougouras, Deng, Deng, Ford, Hoek Van Dijke, Huang, Jiang, Liu, Moulds, Schepen, Yin, and Zhang</label><mixed-citation>Slater, L., Blougouras, G., Deng, L., Deng, Q., Ford, E., Hoek Van Dijke, A., Huang, F., Jiang, S., Liu, Y., Moulds, S., Schepen, A., Yin, J., and Zhang, B.: Challenges and opportunities of ML and explainable AI in large-sample hydrology, Philos. T. R. Soc. A, 383, <ext-link xlink:href="https://doi.org/10.1098/rsta.2024.0287" ext-link-type="DOI">10.1098/rsta.2024.0287</ext-link>, 2025.</mixed-citation></ref>
      <ref id="bib1.bibx84"><label>Staudinger et al.(2025)Staudinger, Kauzlaric, Mas, Evin, Hingray, and Viviroli</label><mixed-citation>Staudinger, M., Kauzlaric, M., Mas, A., Evin, G., Hingray, B., and Viviroli, D.: The role of antecedent conditions in translating precipitation events into extreme floods at the catchment scale and in a large-basin context, Nat. Hazards Earth Syst. Sci., 25, 247–265, <ext-link xlink:href="https://doi.org/10.5194/nhess-25-247-2025" ext-link-type="DOI">10.5194/nhess-25-247-2025</ext-link>, 2025.</mixed-citation></ref>
      <ref id="bib1.bibx85"><label>Tabari(2021)</label><mixed-citation>Tabari, H.: Extreme value analysis dilemma for climate change impact assessment on global flood and extreme precipitation, J. Hydrol., 593, 125932, <ext-link xlink:href="https://doi.org/10.1016/J.JHYDROL.2020.125932" ext-link-type="DOI">10.1016/J.JHYDROL.2020.125932</ext-link>, 2021.</mixed-citation></ref>
      <ref id="bib1.bibx86"><label>Tarasova et al.(2023)Tarasova, Lun, Merz, Blöschl, Basso, Bertola, Miniussi, Rakovec, Samaniego, Thober, and Kumar</label><mixed-citation>Tarasova, L., Lun, D., Merz, R., Blöschl, G., Basso, S., Bertola, M., Miniussi, A., Rakovec, O., Samaniego, L., Thober, S., and Kumar, R.: Shifts in flood generation processes exacerbate regional flood anomalies in Europe, Commun. Earth Environ., 4, 1–12, <ext-link xlink:href="https://doi.org/10.1038/s43247-023-00714-8" ext-link-type="DOI">10.1038/s43247-023-00714-8</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bibx87"><label>Towler et al.(2023)Towler, Foks, Dugger, Dickinson, Essaid, Gochis, Viger, and Zhang</label><mixed-citation>Towler, E., Foks, S. S., Dugger, A. L., Dickinson, J. E., Essaid, H. I., Gochis, D., Viger, R. J., and Zhang, Y.: Benchmarking high-resolution hydrologic model performance of long-term retrospective streamflow simulations in the contiguous United States, Hydrol. Earth Syst. Sci., 27, 1809–1825, <ext-link xlink:href="https://doi.org/10.5194/hess-27-1809-2023" ext-link-type="DOI">10.5194/hess-27-1809-2023</ext-link>, 2023.</mixed-citation></ref>
      <ref id="bib1.bibx88"><label>van Hamel and Brunner(2024)</label><mixed-citation>van Hamel, A. and Brunner, M. I.: Trends and Drivers of Water Temperature Extremes in Mountain Rivers, Water Resour. Res., 60, e2024WR037518, <ext-link xlink:href="https://doi.org/10.1029/2024WR037518" ext-link-type="DOI">10.1029/2024WR037518</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bibx89"><label>Wang et al.(2016)Wang, Li, Pu, Wen, Shugart, Xiong, and Jin</label><mixed-citation>Wang, Y., Li, Y., Pu, W., Wen, K., Shugart, Y. Y., Xiong, M., and Jin, L.: Random Bits Forest: a Strong Classifier/Regressor for Big Data, Sci. Rep.-UK, 6, 1–8, <ext-link xlink:href="https://doi.org/10.1038/srep30086" ext-link-type="DOI">10.1038/srep30086</ext-link>, 2016.</mixed-citation></ref>
      <ref id="bib1.bibx90"><label>Westerberg et al.(2022)Westerberg, Sikorska-Senoner, Viviroli, Vis, and Seibert</label><mixed-citation>Westerberg, I. K., Sikorska-Senoner, A. E., Viviroli, D., Vis, M., and Seibert, J.: Hydrological model calibration with uncertain discharge data, Hydrolog. Sci. J., 67, 2441–2456, <ext-link xlink:href="https://doi.org/10.1080/02626667.2020.1735638" ext-link-type="DOI">10.1080/02626667.2020.1735638</ext-link>, 2022.</mixed-citation></ref>
      <ref id="bib1.bibx91"><label>Wilby(1993)</label><mixed-citation>Wilby, R. L.: The influence of variable weather patterns on river water quantity and quality regimes, Int. J. Climatol., 13, 447–459, <ext-link xlink:href="https://doi.org/10.1002/JOC.3370130408" ext-link-type="DOI">10.1002/JOC.3370130408</ext-link>, 1993.</mixed-citation></ref>
      <ref id="bib1.bibx92"><label>Xu et al.(2024)Xu, Lin, Hu, Chen, Zhang, Xiao, and Xu</label><mixed-citation>Xu, Y., Lin, K., Hu, C., Chen, X., Zhang, J., Xiao, M., and Xu, C.-Y.: Uncovering the Dynamic Drivers of Floods Through Interpretable Deep Learning, Earths Future, 12, e2024EF004751, <ext-link xlink:href="https://doi.org/10.1029/2024EF004751" ext-link-type="DOI">10.1029/2024EF004751</ext-link>, 2024.</mixed-citation></ref>
      <ref id="bib1.bibx93"><label>Yuan and Lozano-Durán(2024)</label><mixed-citation>Yuan, Y. and Lozano-Durán, A.: Limits to extreme event forecasting in chaotic systems, Physica D, 467, 134246, <ext-link xlink:href="https://doi.org/10.1016/J.PHYSD.2024.134246" ext-link-type="DOI">10.1016/J.PHYSD.2024.134246</ext-link>, 2024.</mixed-citation></ref>

  </ref-list></back>
    <!--<article-title-html>Interpretable feature incorporation machine-learning framework for flood magnitude estimation</article-title-html>
<abstract-html/>
<ref-html id="bib1.bib1"><label>Ansell et al.(2006)Ansell, Jones, Allan, Lister, Parker, Brunet, Moberg, Jacobeit, Brohan, Rayner, Aguilar, Alexandersson, Barriendos, Brandsma, Cox, Della-Marta, Drebs, Founda, Gerstengarbe, Hickey, Jónsson, Luterbacher, Nordli, Oesterle, Petrakis, Philipp, Rodwell, Saladie, Sigro, Slonosky, Srnec, Swail, García-Suárez, Tuomenvirta, Wang, Wanner, Werner, Wheeler, and Xoplaki</label><mixed-citation>
       Ansell, T. J., Jones, P. D., Allan, R. J., Lister, D., Parker, D. E., Brunet, M., Moberg, A., Jacobeit, J., Brohan, P., Rayner, N. A., Aguilar, E., Alexandersson, H., Barriendos, M., Brandsma, T., Cox, N. J., Della-Marta, P. M., Drebs, A., Founda, D., Gerstengarbe, F., Hickey, K., Jónsson, T., Luterbacher, J., Nordli, Ø., Oesterle, H., Petrakis, M., Philipp, A., Rodwell, M. J., Saladie, O., Sigro, J., Slonosky, V., Srnec, L., Swail, V., García-Suárez, A. M., Tuomenvirta, H., Wang, X., Wanner, H., Werner, P., Wheeler, D., and Xoplaki, E.: Daily Mean Sea Level Pressure Reconstructions for the European–North Atlantic Region for the Period 1850–2003, J. Climate, 19, 2717–2742, <a href="https://doi.org/10.1175/JCLI3775.1" target="_blank">https://doi.org/10.1175/JCLI3775.1</a>, 2006.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib2"><label>Bárdossy and Filiz(2005)</label><mixed-citation>
       Bárdossy, A. and Filiz, F.: Identification of flood producing atmospheric circulation patterns, J. Hydrol., 313, 48–57, <a href="https://doi.org/10.1016/j.jhydrol.2005.02.006" target="_blank">https://doi.org/10.1016/j.jhydrol.2005.02.006</a>, 2005.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib3"><label>Bartens et al.(2024)Bartens, Shehu, and Haberlandt</label><mixed-citation>
       Bartens, A., Shehu, B., and Haberlandt, U.: Flood frequency analysis using mean daily flows vs. instantaneous peak flows, Hydrol. Earth Syst. Sci., 28, 1687–1709, <a href="https://doi.org/10.5194/hess-28-1687-2024" target="_blank">https://doi.org/10.5194/hess-28-1687-2024</a>, 2024. 
    </mixed-citation></ref-html>
<ref-html id="bib1.bib4"><label>Beck and Philipp(2010)</label><mixed-citation>
       Beck, C. and Philipp, A.: Evaluation and comparison of circulation type classifications for the European domain, Phys. Chem. Earth Pt. A/B/C, 35, 374–387, <a href="https://doi.org/10.1016/J.PCE.2010.01.001" target="_blank">https://doi.org/10.1016/J.PCE.2010.01.001</a>, 2010.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib5"><label>Berghuijs et al.(2016)Berghuijs, Woods, Hutton, and Sivapalan</label><mixed-citation>
       Berghuijs, W. R., Woods, R. A., Hutton, C. J., and Sivapalan, M.: Dominant flood generating mechanisms across the United States, Geophys. Res. Lett., 43, 4382–4390, <a href="https://doi.org/10.1002/2016GL068070" target="_blank">https://doi.org/10.1002/2016GL068070</a>, 2016.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib6"><label>Berghuijs et al.(2019)Berghuijs, Harrigan, Molnar, Slater, and Kirchner</label><mixed-citation>
       Berghuijs, W. R., Harrigan, S., Molnar, P., Slater, L. J., and Kirchner, J. W.: The Relative Importance of Different Flood-Generating Mechanisms Across Europe, Water Resour. Res., 55, 4582–4593, <a href="https://doi.org/10.1029/2019WR024841" target="_blank">https://doi.org/10.1029/2019WR024841</a>, 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib7"><label>Bertola et al.(2020)Bertola, Viglione, Lun, Hall, and Blöschl</label><mixed-citation>
       Bertola, M., Viglione, A., Lun, D., Hall, J., and Blöschl, G.: Flood trends in Europe: are changes in small and big floods different?, Hydrol. Earth Syst. Sci., 24, 1805–1822, <a href="https://doi.org/10.5194/hess-24-1805-2020" target="_blank">https://doi.org/10.5194/hess-24-1805-2020</a>, 2020. 
    </mixed-citation></ref-html>
<ref-html id="bib1.bib8"><label>Blöschl et al.(2019)Blöschl, Hall, Viglione, Perdigão, Parajka, Merz, Lun, Arheimer, Aronica, Bilibashi, Boháč, Bonacci, Borga, Čanjevac, Castellarin, Chirico, Claps, Frolova, Ganora, Gorbachova, Gül, Hannaford, Harrigan, Kireeva, Kiss, Kjeldsen, Kohnová, Koskela, Ledvinka, Macdonald, Mavrova-Guirguinova, Mediero, Merz, Molnar, Montanari, Murphy, Osuch, Ovcharuk, Radevski, Salinas, Sauquet, Šraj, Szolgay, Volpi, Wilson, Zaimi, and Živković</label><mixed-citation>
       Blöschl, G., Hall, J., Viglione, A., Perdigão, R. A., Parajka, J., Merz, B., Lun, D., Arheimer, B., Aronica, G. T., Bilibashi, A., Boháč, M., Bonacci, O., Borga, M., Čanjevac, I., Castellarin, A., Chirico, G. B., Claps, P., Frolova, N., Ganora, D., Gorbachova, L., Gül, A., Hannaford, J., Harrigan, S., Kireeva, M., Kiss, A., Kjeldsen, T. R., Kohnová, S., Koskela, J. J., Ledvinka, O., Macdonald, N., Mavrova-Guirguinova, M., Mediero, L., Merz, R., Molnar, P., Montanari, A., Murphy, C., Osuch, M., Ovcharuk, V., Radevski, I., Salinas, J. L., Sauquet, E., Šraj, M., Szolgay, J., Volpi, E., Wilson, D., Zaimi, K., and Živković, N.: Changing climate both increases and decreases European river floods, Nature, 573, 108–111, <a href="https://doi.org/10.1038/S41586-019-1495-6" target="_blank">https://doi.org/10.1038/S41586-019-1495-6</a>, 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib9"><label>Botache et al.(2023)Botache, Dingel, Huhnstock, Ehresmann, and Sick</label><mixed-citation>
       Botache, D., Dingel, K., Huhnstock, R., Ehresmann, A., and Sick, B.: Unraveling the Complexity of Splitting Sequential Data: Tackling Challenges in Video and Time Series Analysis, arXiv [preprint], <a href="https://doi.org/10.48550/arXiv.2307.14294" target="_blank">https://doi.org/10.48550/arXiv.2307.14294</a>, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib10"><label>Breiman(2001)</label><mixed-citation>
       Breiman, L.: Random forests, Mach. Learn., 45, 5–32, <a href="https://doi.org/10.1023/A:1010933404324" target="_blank">https://doi.org/10.1023/A:1010933404324</a>, 2001.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib11"><label>Brown et al.(2023)Brown, Robinson, Kay, Chapman, Bell, and Blyth</label><mixed-citation>
       Brown, M., Robinson, E., Kay, A., Chapman, R., Bell, V., and Blyth, E.: Potential evapotranspiration derived from HadUK-Grid 1km gridded climate observations 1969–2022 (Hydro-PE HadUK-Grid), <a href="https://doi.org/10.5285/BEB62085-BA81-480C-9ED0-2D31C27FF196" target="_blank">https://doi.org/10.5285/BEB62085-BA81-480C-9ED0-2D31C27FF196</a>, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib12"><label>Brunner and Dougherty(2022)</label><mixed-citation>
      
Brunner, M. I. and Dougherty, E. M.: Varying Importance of Storm Types and Antecedent Conditions for Local and Regional Floods, Water Resour. Res., 58, <a href="https://doi.org/10.1029/2022WR033249" target="_blank">https://doi.org/10.1029/2022WR033249</a>, 2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib13"><label>Brunner and Slater(2022)</label><mixed-citation>
      
Brunner, M. I. and Slater, L. J.: Extreme floods in Europe: going beyond observations using reforecast ensemble pooling, Hydrol. Earth Syst. Sci., 26, 469–482, <a href="https://doi.org/10.5194/hess-26-469-2022" target="_blank">https://doi.org/10.5194/hess-26-469-2022</a>, 2022. 
    </mixed-citation></ref-html>
<ref-html id="bib1.bib14"><label>Brunner et al.(2021)Brunner, Slater, Tallaksen, and Clark</label><mixed-citation>
       Brunner, M. I., Slater, L., Tallaksen, L. M., and Clark, M.: Challenges in modeling and predicting floods and droughts: A review, Wiley Interdisciplinary Reviews: Water, 8, e1520, <a href="https://doi.org/10.1002/WAT2.1520" target="_blank">https://doi.org/10.1002/WAT2.1520</a>, 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib15"><label>Chicco et al.(2021)Chicco, Warrens, and Jurman</label><mixed-citation>
       Chicco, D., Warrens, M. J., and Jurman, G.: The coefficient of determination R-squared is more informative than SMAPE, MAE, MAPE, MSE and RMSE in regression analysis evaluation, PeerJ Computer Science, 7, 1–24, <a href="https://doi.org/10.7717/PEERJ-CS.623" target="_blank">https://doi.org/10.7717/PEERJ-CS.623</a>, 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib16"><label>Coxon et al.(2020)Coxon, Addor, Bloomfield, Freer, Fry, Hannaford, Howden, Lane, Lewis, Robinson, Wagener, and Woods</label><mixed-citation>
       Coxon, G., Addor, N., Bloomfield, J. P., Freer, J., Fry, M., Hannaford, J., Howden, N. J. K., Lane, R., Lewis, M., Robinson, E. L., Wagener, T., and Woods, R.: CAMELS-GB: hydrometeorological time series and landscape attributes for 671 catchments in Great Britain, Earth Syst. Sci. Data, 12, 2459–2483, <a href="https://doi.org/10.5194/essd-12-2459-2020" target="_blank">https://doi.org/10.5194/essd-12-2459-2020</a>, 2020. 
    </mixed-citation></ref-html>
<ref-html id="bib1.bib17"><label>Coxon et al.(2024)Coxon, McMillan, Bloomfield, Bolotin, Dean, Kelleher, Slater, and Zheng</label><mixed-citation>
       Coxon, G., McMillan, H., Bloomfield, J. P., Bolotin, L., Dean, J. F., Kelleher, C., Slater, L., and Zheng, Y.: Wastewater discharges and urban land cover dominate urban hydrology signals across England and Wales, Environ. Res. Lett., 19, 084016, <a href="https://doi.org/10.1088/1748-9326/AD5BF2" target="_blank">https://doi.org/10.1088/1748-9326/AD5BF2</a>, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib18"><label>Coxon et al.(2025)Coxon, Zheng, Barbedo, Cooper, Fileni, Fowler, Fry, Green, Gribbin, Harfoot, Lewis, Gondim, Neto, Qiu, Salwey, and Wendt</label><mixed-citation>
       Coxon, G., Zheng, Y., Barbedo, R., Cooper, H., Fileni, F., Fowler, H. J., Fry, M., Green, A., Gribbin, T., Harfoot, H., Lewis, E., Gondim, G., Neto, R., Qiu, X., Salwey, S., and Wendt, D. E.: CAMELS-GB v2: hydrometeorological time series and landscape attributes for 671 catchments in Great Britain, Earth Syst. Sci. Data Discuss. [preprint], <a href="https://doi.org/10.5194/essd-2025-608" target="_blank">https://doi.org/10.5194/essd-2025-608</a>, in review, 2025.  
    </mixed-citation></ref-html>
<ref-html id="bib1.bib19"><label>Cutler et al.(2012)Cutler, Cutler, and Stevens</label><mixed-citation>
       Cutler, A., Cutler, D. R., and Stevens, J. R.: Random Forests, Ensemble Machine Learning, 157–175, <a href="https://doi.org/10.1007/978-1-4419-9326-7_5" target="_blank">https://doi.org/10.1007/978-1-4419-9326-7_5</a>, 2012.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib20"><label>Duckstein et al.(1993)Duckstein, Bárdossy, and Bogárdi</label><mixed-citation>
       Duckstein, L., Bárdossy, A., and Bogárdi, I.: Linkage between the occurrence of daily atmospheric circulation patterns and floods: an Arizona case study, J. Hydrol., 143, 413–428, <a href="https://doi.org/10.1016/0022-1694(93)90202-K" target="_blank">https://doi.org/10.1016/0022-1694(93)90202-K</a>, 1993.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib21"><label>Fabiano et al.(2021)Fabiano, Meccia, Davini, Ghinassi, and Corti</label><mixed-citation>
       Fabiano, F., Meccia, V. L., Davini, P., Ghinassi, P., and Corti, S.: A regime view of future atmospheric circulation changes in northern mid-latitudes, Weather Clim. Dynam., 2, 163–180, <a href="https://doi.org/10.5194/wcd-2-163-2021" target="_blank">https://doi.org/10.5194/wcd-2-163-2021</a>, 2021. 
    </mixed-citation></ref-html>
<ref-html id="bib1.bib22"><label>Fawagreh et al.(2014)Fawagreh, Gaber, and Elyan</label><mixed-citation>
       Fawagreh, K., Gaber, M. M., and Elyan, E.: Random forests: From early developments to recent advancements, Systems Science and Control Engineering, 2, 602–609, <a href="https://doi.org/10.1080/21642583.2014.956265" target="_blank">https://doi.org/10.1080/21642583.2014.956265</a>, 2014.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib23"><label>Fileni et al.(2023)Fileni, Fowler, Lewis, McLay, and Yang</label><mixed-citation>
       Fileni, F., Fowler, H. J., Lewis, E., McLay, F., and Yang, L.: A quality-control framework for sub-daily flow and level data for hydrological modelling in Great Britain, Hydrol. Res., 54, 1357–1367, <a href="https://doi.org/10.2166/NH.2023.045" target="_blank">https://doi.org/10.2166/NH.2023.045</a>, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib24"><label>Fleming et al.(2021)Fleming, Watson, Ellenson, Cannon, and Vesselinov</label><mixed-citation>
       Fleming, S. W., Watson, J. R., Ellenson, A., Cannon, A. J., and Vesselinov, V. C.: Machine learning in Earth and environmental science requires education and research policy reforms, Nat. Geosci., 14, 878–880, <a href="https://doi.org/10.1038/s41561-021-00865-3" target="_blank">https://doi.org/10.1038/s41561-021-00865-3</a>, 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib25"><label>Frame et al.(2022)Frame, Kratzert, Klotz, Gauch, Shalev, Gilon, Qualls, Gupta, and Nearing</label><mixed-citation>
       Frame, J. M., Kratzert, F., Klotz, D., Gauch, M., Shalev, G., Gilon, O., Qualls, L. M., Gupta, H. V., and Nearing, G. S.: Deep learning rainfall–runoff predictions of extreme events, Hydrol. Earth Syst. Sci., 26, 3377–3392, <a href="https://doi.org/10.5194/hess-26-3377-2022" target="_blank">https://doi.org/10.5194/hess-26-3377-2022</a>, 2022. 
    </mixed-citation></ref-html>
<ref-html id="bib1.bib26"><label>Graham et al.(2014)Graham, Mathur, and Baldwin</label><mixed-citation>
       Graham, Y., Mathur, N., and Baldwin, T.: Randomized Significance Tests in Machine Translation, Proceedings of the Annual Meeting of the Association for Computational Linguistics, 266–274, <a href="https://doi.org/10.3115/V1/W14-3333" target="_blank">https://doi.org/10.3115/V1/W14-3333</a>, 2014.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib27"><label>Griffin et al.(2019)Griffin, Vesuviano, and Stewart</label><mixed-citation>
       Griffin, A., Vesuviano, G., and Stewart, E.: Have trends changed over time? A study of UK peak flow data and sensitivity to observation period, Nat. Hazards Earth Syst. Sci., 19, 2157–2167, <a href="https://doi.org/10.5194/nhess-19-2157-2019" target="_blank">https://doi.org/10.5194/nhess-19-2157-2019</a>, 2019. 
    </mixed-citation></ref-html>
<ref-html id="bib1.bib28"><label>Griffin et al.(2024)Griffin, Kay, Sayers, Bell, Stewart, and Carr</label><mixed-citation>
      Griffin, A., Kay, A. L., Sayers, P., Bell, V., Stewart, E., and Carr, S.: Widespread flooding dynamics under climate change: characterising floods using grid-based hydrological modelling and regional climate projections, Hydrol. Earth Syst. Sci., 28, 2635–2650, <a href="https://doi.org/10.5194/hess-28-2635-2024" target="_blank">https://doi.org/10.5194/hess-28-2635-2024</a>, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib29"><label>Griffin et al.(2025)Griffin, Vesuviano, Wilson, Sefton, Turner, Armitage, and Suman</label><mixed-citation>
       Griffin, A., Vesuviano, G., Wilson, D., Sefton, C., Turner, S., Armitage, R., and Suman, G.: Putting the English Flooding of 2019–2021 in the Context of Antecedent Conditions, J. Flood Risk Manag., 18, e70016, <a href="https://doi.org/10.1111/JFR3.70016" target="_blank">https://doi.org/10.1111/JFR3.70016</a>, 2025.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib30"><label>Hakim et al.(2024)Hakim, Gernowo, and Nirwansyah</label><mixed-citation>
       Hakim, D. K., Gernowo, R., and Nirwansyah, A. W.: Flood prediction with time series data mining: Systematic review, Natural Hazards Research, 4, 194–220, <a href="https://doi.org/10.1016/J.NHRES.2023.10.001" target="_blank">https://doi.org/10.1016/J.NHRES.2023.10.001</a>, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib31"><label>Harrigan et al.(2018)Harrigan, Hannaford, Muchan, and Marsh</label><mixed-citation>
       Harrigan, S., Hannaford, J., Muchan, K., and Marsh, T. J.: Designation and trend analysis of the updated UK Benchmark Network of river flow stations: the UKBN2 dataset, Hydrol. Res., 49, 552–567, <a href="https://doi.org/10.2166/NH.2017.058" target="_blank">https://doi.org/10.2166/NH.2017.058</a>, 2018.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib32"><label>Hendry et al.(2019)Hendry, Haigh, Nicholls, Winter, Neal, Wahl, Joly-Laugel, and Darby</label><mixed-citation>
       Hendry, A., Haigh, I. D., Nicholls, R. J., Winter, H., Neal, R., Wahl, T., Joly-Laugel, A., and Darby, S. E.: Assessing the characteristics and drivers of compound flooding events around the UK coast, Hydrol. Earth Syst. Sci., 23, 3117–3139, <a href="https://doi.org/10.5194/hess-23-3117-2019" target="_blank">https://doi.org/10.5194/hess-23-3117-2019</a>, 2019. 
    </mixed-citation></ref-html>
<ref-html id="bib1.bib33"><label>Hollis et al.(2019)Hollis, McCarthy, Kendon, Legg, and Simpson</label><mixed-citation>
       Hollis, D., McCarthy, M., Kendon, M., Legg, T., and Simpson, I.: HadUK-Grid – A new UK dataset of gridded climate observations, Geosci. Data J., 6, 151–159, <a href="https://doi.org/10.1002/GDJ3.78" target="_blank">https://doi.org/10.1002/GDJ3.78</a>, 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib34"><label>Horner et al.(2018)Horner, Renard, Le Coz, Branger, McMillan, and Pierrefeu</label><mixed-citation>
       Horner, I., Renard, B., Le Coz, J., Branger, F., McMillan, H. K., and Pierrefeu, G.: Impact of Stage Measurement Errors on Streamflow Uncertainty, Water Resour. Res., 54, 1952–1976, <a href="https://doi.org/10.1002/2017WR022039" target="_blank">https://doi.org/10.1002/2017WR022039</a>, 2018.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib35"><label>Huang et al.(2020)Huang, Charlton-Perez, Lee, Neal, Sarran, and Sun</label><mixed-citation>
       Huang, W. T. K., Charlton-Perez, A., Lee, R. W., Neal, R., Sarran, C., and Sun, T.: Weather regimes and patterns associated with temperature-related excess mortality in the UK: a pathway to sub-seasonal risk forecasting, Environ. Res. Lett., 15, 124052, <a href="https://doi.org/10.1088/1748-9326/ABCBBA" target="_blank">https://doi.org/10.1088/1748-9326/ABCBBA</a>, 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib36"><label>Jiang et al.(2022)Jiang, Bevacqua, and Zscheischler</label><mixed-citation>
       Jiang, S., Bevacqua, E., and Zscheischler, J.: River flooding mechanisms and their changes in Europe revealed by explainable machine learning, Hydrol. Earth Syst. Sci., 26, 6339–6359, <a href="https://doi.org/10.5194/hess-26-6339-2022" target="_blank">https://doi.org/10.5194/hess-26-6339-2022</a>, 2022. 
    </mixed-citation></ref-html>
<ref-html id="bib1.bib37"><label>Kratzert et al.(2018)Kratzert, Klotz, Brenner, Schulz, and Herrnegger</label><mixed-citation>
       Kratzert, F., Klotz, D., Brenner, C., Schulz, K., and Herrnegger, M.: Rainfall–runoff modelling using Long Short-Term Memory (LSTM) networks, Hydrol. Earth Syst. Sci., 22, 6005–6022, <a href="https://doi.org/10.5194/hess-22-6005-2018" target="_blank">https://doi.org/10.5194/hess-22-6005-2018</a>, 2018.  
    </mixed-citation></ref-html>
<ref-html id="bib1.bib38"><label>Kratzert et al.(2019)Kratzert, Klotz, Shalev, Klambauer, Hochreiter, and Nearing</label><mixed-citation>
       Kratzert, F., Klotz, D., Shalev, G., Klambauer, G., Hochreiter, S., and Nearing, G.: Towards learning universal, regional, and local hydrological behaviors via machine learning applied to large-sample datasets, Hydrol. Earth Syst. Sci., 23, 5089–5110, <a href="https://doi.org/10.5194/hess-23-5089-2019" target="_blank">https://doi.org/10.5194/hess-23-5089-2019</a>, 2019. 
    </mixed-citation></ref-html>
<ref-html id="bib1.bib39"><label>Kratzert et al.(2022)Kratzert, Gauch, Nearing, and Klotz</label><mixed-citation>
       Kratzert, F., Gauch, M., Nearing, G., and Klotz, D.: NeuralHydrology – A Python library for Deep Learning research in hydrology, J. Open Source Softw., 7, 4050, <a href="https://doi.org/10.21105/JOSS.04050" target="_blank">https://doi.org/10.21105/JOSS.04050</a>, 2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib40"><label>Kratzert et al.(2024)Kratzert, Gauch, Klotz, and Nearing</label><mixed-citation>
       Kratzert, F., Gauch, M., Klotz, D., and Nearing, G.: HESS Opinions: Never train a Long Short-Term Memory (LSTM) network on a single basin, Hydrol. Earth Syst. Sci., 28, 4187–4201, <a href="https://doi.org/10.5194/hess-28-4187-2024" target="_blank">https://doi.org/10.5194/hess-28-4187-2024</a>, 2024. 
    </mixed-citation></ref-html>
<ref-html id="bib1.bib41"><label>Lamane et al.(2024)Lamane, Mouhir, Moussadek, Baghdad, Kisi, and El Bilali</label><mixed-citation>
       Lamane, H., Mouhir, L., Moussadek, R., Baghdad, B., Kisi, O., and El Bilali, A.: Interpreting machine learning models based on SHAP values in predicting suspended sediment concentration, Int. J. Sediment Res., <a href="https://doi.org/10.1016/J.IJSRC.2024.10.002" target="_blank">https://doi.org/10.1016/J.IJSRC.2024.10.002</a>, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib42"><label>Lamb(1972)</label><mixed-citation>
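       Lamb, H. H.: British Isles weather types and a register of the daily sequence of circulation patterns, 1861–1971, <a href="https://openlibrary.org/works/OL3523120W/British_Isles_weather_types_and_a_register_of_the_daily_sequence_of_circulation_patterns_1861-1971" target="_blank">https://openlibrary.org/works/OL3523120W/British_Isles_weather_types_and_a_register_of_the_daily_sequence_of_circulation_patterns_1861-1971</a>, 1972.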

    </mixed-citation></ref-html>
<ref-html id="bib1.bib43"><label>Lane et al.(2019)Lane, Coxon, Freer, Wagener, Johnes, Bloomfield, Greene, Macleod, and Reaney</label><mixed-citation>
       Lane, R. A., Coxon, G., Freer, J. E., Wagener, T., Johnes, P. J., Bloomfield, J. P., Greene, S., Macleod, C. J. A., and Reaney, S. M.: Benchmarking the predictive capability of hydrological models for river flow and flood peak predictions across over 1000 catchments in Great Britain, Hydrol. Earth Syst. Sci., 23, 4011–4032, <a href="https://doi.org/10.5194/hess-23-4011-2019" target="_blank">https://doi.org/10.5194/hess-23-4011-2019</a>, 2019. 
    </mixed-citation></ref-html>
<ref-html id="bib1.bib44"><label>Lavers et al.(2010)Lavers, Prudhomme, and Hannah</label><mixed-citation>
       Lavers, D., Prudhomme, C., and Hannah, D. M.: Large-scale climate, precipitation and British river flows: Identifying hydroclimatological connections and dynamics, J. Hydrol., 395, 242–255, <a href="https://doi.org/10.1016/J.JHYDROL.2010.10.036" target="_blank">https://doi.org/10.1016/J.JHYDROL.2010.10.036</a>, 2010.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib45"><label>Lavers et al.(2012)Lavers, Villarini, Allan, Wood, and Wade</label><mixed-citation>
       Lavers, D. A., Villarini, G., Allan, R. P., Wood, E. F., and Wade, A. J.: The detection of atmospheric rivers in atmospheric reanalyses and their links to British winter floods and the large-scale climatic circulation, J. Geophys. Res.-Atmos., 117, 20106, <a href="https://doi.org/10.1029/2012JD018027" target="_blank">https://doi.org/10.1029/2012JD018027</a>, 2012.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib46"><label>Lavers et al.(2020)Lavers, Ralph, Richardson, and Pappenberger</label><mixed-citation>
       Lavers, D. A., Ralph, F. M., Richardson, D. S., and Pappenberger, F.: Improved forecasts of atmospheric rivers through systematic reconnaissance, better modelling, and insights on conversion of rain to flooding, Commun. Earth Environ., 1, 1–7, <a href="https://doi.org/10.1038/s43247-020-00042-1" target="_blank">https://doi.org/10.1038/s43247-020-00042-1</a>, 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib47"><label>Ledingham et al.(2019)Ledingham, Archer, Lewis, Fowler, and Kilsby</label><mixed-citation>
       Ledingham, J., Archer, D., Lewis, E., Fowler, H., and Kilsby, C.: Contrasting seasonality of storm rainfall and flood runoff in the UK and some implications for rainfall-runoff methods of flood estimation, Hydrol. Res., 50, 1309–1323, <a href="https://doi.org/10.2166/NH.2019.040" target="_blank">https://doi.org/10.2166/NH.2019.040</a>, 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib48"><label>Lees et al.(2021)Lees, Buechel, Anderson, Slater, Reece, Coxon, and Dadson</label><mixed-citation>
       Lees, T., Buechel, M., Anderson, B., Slater, L., Reece, S., Coxon, G., and Dadson, S. J.: Benchmarking data-driven rainfall–runoff models in Great Britain: a comparison of long short-term memory (LSTM)-based models with four lumped conceptual models, Hydrol. Earth Syst. Sci., 25, 5517–5534, <a href="https://doi.org/10.5194/hess-25-5517-2021" target="_blank">https://doi.org/10.5194/hess-25-5517-2021</a>, 2021. 
    </mixed-citation></ref-html>
<ref-html id="bib1.bib49"><label>Lees et al.(2022)Lees, Reece, Kratzert, Klotz, Gauch, De Bruijn, Kumar Sahu, Greve, Slater, and Dadson</label><mixed-citation>
       Lees, T., Reece, S., Kratzert, F., Klotz, D., Gauch, M., De Bruijn, J., Kumar Sahu, R., Greve, P., Slater, L., and Dadson, S. J.: Hydrological concept formation inside long short-term memory (LSTM) networks, Hydrol. Earth Syst. Sci., 26, 3079–3101, <a href="https://doi.org/10.5194/hess-26-3079-2022" target="_blank">https://doi.org/10.5194/hess-26-3079-2022</a>, 2022. 
    </mixed-citation></ref-html>
<ref-html id="bib1.bib50"><label>Ley et al.(2024)Ley, Bormann, and Casper</label><mixed-citation>
       Ley, A., Bormann, H., and Casper, M.: Linking explainable artificial intelligence and soil moisture dynamics in a machine learning streamflow model, Hydrol. Res., 55, 613–627, <a href="https://doi.org/10.2166/NH.2024.003" target="_blank">https://doi.org/10.2166/NH.2024.003</a>, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib51"><label>Liu et al.(2022)Liu, Feng, Gu, Zhang, Beck, Zhang, and Yan</label><mixed-citation>
       Liu, J., Feng, S., Gu, X., Zhang, Y., Beck, H. E., Zhang, J., and Yan, S.: Global changes in floods and their drivers, J. Hydrol., 614, 128553, <a href="https://doi.org/10.1016/J.JHYDROL.2022.128553" target="_blank">https://doi.org/10.1016/J.JHYDROL.2022.128553</a>, 2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib52"><label>Lundberg and Lee(2017)</label><mixed-citation>
      
Lundberg, S. and Lee, S.: A unified approach to interpreting model predictions, Advances in Neural Information Processing Systems, 30, <a href="https://proceedings.neurips.cc/paper_files/paper/2017/file/8a20a8621978632d76c43dfd28b67767-Paper.pdf" target="_blank">https://proceedings.neurips.cc/paper_files/paper/2017/file/8a20a8621978632d76c43dfd28b67767-Paper.pdf</a>, 2017.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib53"><label>Lundberg et al.(2020)Lundberg, Erion, Chen, DeGrave, Prutkin, Nair, Katz, Himmelfarb, Bansal, and Lee</label><mixed-citation>
       Lundberg, S. M., Erion, G., Chen, H., DeGrave, A., Prutkin, J. M., Nair, B., Katz, R., Himmelfarb, J., Bansal, N., and Lee, S. I.: From local explanations to global understanding with explainable AI for trees, Nature Machine Intelligence, 2, 56–67, <a href="https://doi.org/10.1038/s42256-019-0138-9" target="_blank">https://doi.org/10.1038/s42256-019-0138-9</a>, 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib54"><label>Mailhot et al.(2013)Mailhot, Lachance-Cloutier, Talbot, and Favre</label><mixed-citation>
       Mailhot, A., Lachance-Cloutier, S., Talbot, G., and Favre, A. C.: Regional estimates of intense rainfall based on the Peak-Over-Threshold (POT) approach, J. Hydrol., 476, 188–199, <a href="https://doi.org/10.1016/J.JHYDROL.2012.10.036" target="_blank">https://doi.org/10.1016/J.JHYDROL.2012.10.036</a>, 2013.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib55"><label>Massari et al.(2023)Massari, Pellet, Tramblay, Crow, Gründemann, Hascoet, Penna, Modanesi, Brocca, Camici, and Marra</label><mixed-citation>
       Massari, C., Pellet, V., Tramblay, Y., Crow, W. T., Gründemann, G. J., Hascoet, T., Penna, D., Modanesi, S., Brocca, L., Camici, S., and Marra, F.: On the relation between antecedent basin conditions and runoff coefficient for European floods, J. Hydrol., 625, 130012, <a href="https://doi.org/10.1016/J.JHYDROL.2023.130012" target="_blank">https://doi.org/10.1016/J.JHYDROL.2023.130012</a>, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib56"><label>Meira Neto et al.(2020)Meira Neto, Roy, de Oliveira, and Troch</label><mixed-citation>
       Meira Neto, A. A., Roy, T., de Oliveira, P. T. S., and Troch, P. A.: An Aridity Index-Based Formulation of Streamflow Components, Water Resour. Res., 56, e2020WR027123, <a href="https://doi.org/10.1029/2020WR027123" target="_blank">https://doi.org/10.1029/2020WR027123</a>, 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib57"><label>Merz et al.(2016)Merz, Nguyen, and Vorogushyn</label><mixed-citation>
       Merz, B., Nguyen, V. D., and Vorogushyn, S.: Temporal clustering of floods in Germany: Do flood-rich and flood-poor periods exist?, J. Hydrol., 541, 824–838, <a href="https://doi.org/10.1016/J.JHYDROL.2016.07.041" target="_blank">https://doi.org/10.1016/J.JHYDROL.2016.07.041</a>, 2016.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib58"><label>Merz and Blöschl(2003)</label><mixed-citation>
       Merz, R. and Blöschl, G.: A process typology of regional floods, Water Resour. Res., 39, <a href="https://doi.org/10.1029/2002WR001952" target="_blank">https://doi.org/10.1029/2002WR001952</a>, 2003.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib59"><label>Morlot et al.(2014)Morlot, Perret, Favre, and Jalbert</label><mixed-citation>
       Morlot, T., Perret, C., Favre, A. C., and Jalbert, J.: Dynamic rating curve assessment for hydrometric stations and computation of the associated uncertainties: Quality and station management indicators, J. Hydrol., 517, 173–186, <a href="https://doi.org/10.1016/J.JHYDROL.2014.05.007" target="_blank">https://doi.org/10.1016/J.JHYDROL.2014.05.007</a>, 2014.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib60"><label>Mushtaq et al.(2024)Mushtaq, Akhtar, Hashmi, Masood, and Saeed</label><mixed-citation>
       Mushtaq, H., Akhtar, T., Hashmi, M. Z. u. R., Masood, A., and Saeed, F.: Hydrologic interpretation of machine learning models for 10-daily streamflow simulation in climate sensitive upper Indus catchments, Theor. Appl. Climatol., 155, 5525–5542, <a href="https://doi.org/10.1007/s00704-024-04932-8" target="_blank">https://doi.org/10.1007/s00704-024-04932-8</a>, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib61"><label>Nariya et al.(2023)Nariya, Mills, Sorger, and Sokolov</label><mixed-citation>
       Nariya, M. K., Mills, C. E., Sorger, P. K., and Sokolov, A.: Paired evaluation of machine-learning models characterizes effects of confounders and outliers, Patterns, 4, 100791, <a href="https://doi.org/10.1016/J.PATTER.2023.100791" target="_blank">https://doi.org/10.1016/J.PATTER.2023.100791</a>, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib62"><label>Neal et al.(2016)Neal, Fereday, Crocker, and Comer</label><mixed-citation>
       Neal, R., Fereday, D., Crocker, R., and Comer, R. E.: A flexible approach to defining weather patterns and their application in weather forecasting over Europe, Meteorol. Appl., 23, 389–400, <a href="https://doi.org/10.1002/met.1563" target="_blank">https://doi.org/10.1002/met.1563</a>, 2016.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib63"><label>Neal et al.(2018)Neal, Dankers, Saulter, Lane, Millard, Robbins, and Price</label><mixed-citation>
       Neal, R., Dankers, R., Saulter, A., Lane, A., Millard, J., Robbins, G., and Price, D.: Use of probabilistic medium- to long-range weather-pattern forecasts for identifying periods with an increased likelihood of coastal flooding around the UK, Meteorol. Appl., 25, 534–547, <a href="https://doi.org/10.1002/MET.1719" target="_blank">https://doi.org/10.1002/MET.1719</a>, 2018.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib64"><label>Nevo et al.(2022)Nevo, Morin, Gerzi Rosenthal, Metzger, Barshai, Weitzner, Voloshin, Kratzert, Elidan, Dror, Begelman, Nearing, Shalev, Noga, Shavitt, Yuklea, Royz, Giladi, Peled Levi, Reich, Gilon, Maor, Timnat, Shechter, Anisimov, Gigi, Levin, Moshe, Ben-Haim, Hassidim, and Matias</label><mixed-citation>
       Nevo, S., Morin, E., Gerzi Rosenthal, A., Metzger, A., Barshai, C., Weitzner, D., Voloshin, D., Kratzert, F., Elidan, G., Dror, G., Begelman, G., Nearing, G., Shalev, G., Noga, H., Shavitt, I., Yuklea, L., Royz, M., Giladi, N., Peled Levi, N., Reich, O., Gilon, O., Maor, R., Timnat, S., Shechter, T., Anisimov, V., Gigi, Y., Levin, Y., Moshe, Z., Ben-Haim, Z., Hassidim, A., and Matias, Y.: Flood forecasting with machine learning models in an operational framework, Hydrol. Earth Syst. Sci., 26, 4013–4032, <a href="https://doi.org/10.5194/hess-26-4013-2022" target="_blank">https://doi.org/10.5194/hess-26-4013-2022</a>, 2022. 
    </mixed-citation></ref-html>
<ref-html id="bib1.bib65"><label>Nied et al.(2014)Nied, Pardowitz, Nissen, Ulbrich, Hundecha, and Merz</label><mixed-citation>
       Nied, M., Pardowitz, T., Nissen, K., Ulbrich, U., Hundecha, Y., and Merz, B.: On the relationship between hydro-meteorological patterns and flood types, J. Hydrol., 519, 3249–3262, <a href="https://doi.org/10.1016/j.jhydrol.2014.09.089" target="_blank">https://doi.org/10.1016/j.jhydrol.2014.09.089</a>, 2014.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib66"><label>NRFA(2023)</label><mixed-citation>
       NRFA: National River Flow Archive (NRFA): River flow and catchment shapefiles for Great Britain, <a href="https://nrfa.ceh.ac.uk/" target="_blank">https://nrfa.ceh.ac.uk/</a> (last access: 15 April 2026), 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib67"><label>O'Brien(2007)</label><mixed-citation>
       O'Brien, R. M.: A Caution Regarding Rules of Thumb for Variance Inflation Factors, Qual. Quant., 41, 673–690, <a href="https://doi.org/10.1007/S11135-006-9018-6" target="_blank">https://doi.org/10.1007/S11135-006-9018-6</a>, 2007.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib68"><label>Ojala and Garriga(2010)</label><mixed-citation>
      
Ojala, M. and Garriga, G. C.: Permutation Tests for Studying Classifier Performance, J. Mach. Learn. Res., 11, 1833–1863, 2010.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib69"><label>Pan et al.(2022)Pan, Rahman, Haddad, and Ouarda</label><mixed-citation>
       Pan, X., Rahman, A., Haddad, K., and Ouarda, T. B.: Peaks-over-threshold model in flood frequency analysis: a scoping review, Stoch. Env. Res. Risk A., 36, 2419–2435, <a href="https://doi.org/10.1007/S00477-022-02174-6" target="_blank">https://doi.org/10.1007/S00477-022-02174-6</a>, 2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib70"><label>Pedregosa et al.(2011)Pedregosa, Varoquaux, Gramfort, Michel, Thirion, Grisel, Blondel, Müller, Nothman, Louppe, Prettenhofer, Weiss, Dubourg, Vanderplas, Cournapeau, Brucher, and Perrot</label><mixed-citation>
       Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Müller, A., Nothman, J., Louppe, G., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Cournapeau, D., Brucher, M., and Perrot, M.: Scikit-learn: Machine Learning in Python, J. Mach. Learn. Res., 12, 2825–2830, 2011.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib71"><label>Perks et al.(2023)Perks, Bernie, Lowe, and Neal</label><mixed-citation>
       Perks, R. J., Bernie, D., Lowe, J., and Neal, R.: The influence of future weather pattern changes and projected sea-level rise on coastal flood impacts around the UK, Climatic Change, 176, 1–21, <a href="https://doi.org/10.1007/S10584-023-03496-2" target="_blank">https://doi.org/10.1007/S10584-023-03496-2</a>, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib72"><label>Pope et al.(2021)Pope, Brown, Fung, Hanlon, Neal, Palin, and Reid</label><mixed-citation>
       Pope, J. O., Brown, K., Fung, F., Hanlon, H. M., Neal, R., Palin, E. J., and Reid, A.: Investigation of future climate change over the British Isles using weather patterns, Clim. Dynam., 58, 2405–2419, <a href="https://doi.org/10.1007/s00382-021-06031-0" target="_blank">https://doi.org/10.1007/s00382-021-06031-0</a>, 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib73"><label>Prudhomme and Genevier(2011)</label><mixed-citation>
      
Prudhomme, C. and Genevier, M.: Can atmospheric circulation be linked to flooding in Europe?, Hydrol. Process., 25, 1180–1190, <a href="https://doi.org/10.1002/HYP.7879" target="_blank">https://doi.org/10.1002/HYP.7879</a>, 2011.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib74"><label>Richardson et al.(2018)Richardson, Fowler, Kilsby, and Neal</label><mixed-citation>
       Richardson, D., Fowler, H. J., Kilsby, C. G., and Neal, R.: A new precipitation and drought climatology based on weather patterns, Int. J. Climatol., 38, 630–648, <a href="https://doi.org/10.1002/JOC.5199" target="_blank">https://doi.org/10.1002/JOC.5199</a>, 2018.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib75"><label>Richardson et al.(2020)Richardson, Neal, Dankers, Mylne, Cowling, Clements, and Millard</label><mixed-citation>
       Richardson, D., Neal, R., Dankers, R., Mylne, K., Cowling, R., Clements, H., and Millard, J.: Linking weather patterns to regional extreme precipitation for highlighting potential flood events in medium- to long-range forecasts, Meteorol. Appl., 27, <a href="https://doi.org/10.1002/met.1931" target="_blank">https://doi.org/10.1002/met.1931</a>, 2020.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib76"><label>Rodding Kjeldsen and Prosdocimi(2023)</label><mixed-citation>
       Rodding Kjeldsen, T. and Prosdocimi, I.: Use of peak over threshold data for flood frequency estimation: An application at the UK national scale, J. Hydrol., 626, 130235, <a href="https://doi.org/10.1016/J.JHYDROL.2023.130235" target="_blank">https://doi.org/10.1016/J.JHYDROL.2023.130235</a>, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib77"><label>Rosso(2015)</label><mixed-citation>
       Rosso, G.: Extreme Value Theory for Time Series using Peak-Over-Threshold method, <a href="https://api.semanticscholar.org/CorpusID:88521862" target="_blank">https://api.semanticscholar.org/CorpusID:88521862</a>, 2015.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib78"><label>Schlef et al.(2019)Schlef, Moradkhani, and Lall</label><mixed-citation>
       Schlef, K. E., Moradkhani, H., and Lall, U.: Atmospheric Circulation Patterns Associated with Extreme United States Floods Identified via Machine Learning, Sci. Rep.-UK, 9, 1–12, <a href="https://doi.org/10.1038/s41598-019-43496-w" target="_blank">https://doi.org/10.1038/s41598-019-43496-w</a>, 2019.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib79"><label>Scussolini et al.(2024)Scussolini, Luu, Philip, Berghuijs, Eilander, Aerts, Kew, van Oldenborgh, Toonen, Volkholz, and Coumou</label><mixed-citation>
       Scussolini, P., Luu, L. N., Philip, S., Berghuijs, W. R., Eilander, D., Aerts, J. C., Kew, S. F., van Oldenborgh, G. J., Toonen, W. H., Volkholz, J., and Coumou, D.: Challenges in the attribution of river flood events, WIRes Clim. Change, 15, e874, <a href="https://doi.org/10.1002/WCC.874" target="_blank">https://doi.org/10.1002/WCC.874</a>, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib80"><label>Sefton et al.(2021)Sefton, Muchan, Parry, Matthews, Barker, Turner, and Hannaford</label><mixed-citation>
       Sefton, C., Muchan, K., Parry, S., Matthews, B., Barker, L. J., Turner, S., and Hannaford, J.: The 2019/2020 floods in the UK: a hydrological appraisal, Weather, 76, 378–384, <a href="https://doi.org/10.1002/WEA.3993" target="_blank">https://doi.org/10.1002/WEA.3993</a>, 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib81"><label>Sillmann et al.(2017)Sillmann, Thorarinsdottir, Keenlyside, Schaller, Alexander, Hegerl, Seneviratne, Vautard, Zhang, and Zwiers</label><mixed-citation>
       Sillmann, J., Thorarinsdottir, T., Keenlyside, N., Schaller, N., Alexander, L. V., Hegerl, G., Seneviratne, S. I., Vautard, R., Zhang, X., and Zwiers, F. W.: Understanding, modeling and predicting weather and climate extremes: Challenges and opportunities, Weather and Climate Extremes, 18, 65–74, <a href="https://doi.org/10.1016/J.WACE.2017.10.003" target="_blank">https://doi.org/10.1016/J.WACE.2017.10.003</a>, 2017.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib82"><label>Slater et al.(2024)Slater, Coxon, Brunner, McMillan, Yu, Zheng, Khouakhi, Moulds, and Berghuijs</label><mixed-citation>
       Slater, L., Coxon, G., Brunner, M., McMillan, H., Yu, L., Zheng, Y., Khouakhi, A., Moulds, S., and Berghuijs, W.: Spatial Sensitivity of River Flooding to Changes in Climate and Land Cover Through Explainable AI, Earths Future, 12, e2023EF004035, <a href="https://doi.org/10.1029/2023EF004035" target="_blank">https://doi.org/10.1029/2023EF004035</a>, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib83"><label>Slater et al.(2025)Slater, Blougouras, Deng, Deng, Ford, Hoek Van Dijke, Huang, Jiang, Liu, Moulds, Schepen, Yin, and Zhang</label><mixed-citation>
       Slater, L., Blougouras, G., Deng, L., Deng, Q., Ford, E., Hoek Van Dijke, A., Huang, F., Jiang, S., Liu, Y., Moulds, S., Schepen, A., Yin, J., and Zhang, B.: Challenges and opportunities of ML and explainable AI in large-sample hydrology, Philos. T. R. Soc. A, 383, <a href="https://doi.org/10.1098/rsta.2024.0287" target="_blank">https://doi.org/10.1098/rsta.2024.0287</a>, 2025.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib84"><label>Staudinger et al.(2025)Staudinger, Kauzlaric, Mas, Evin, Hingray, and Viviroli</label><mixed-citation>
       Staudinger, M., Kauzlaric, M., Mas, A., Evin, G., Hingray, B., and Viviroli, D.: The role of antecedent conditions in translating precipitation events into extreme floods at the catchment scale and in a large-basin context, Nat. Hazards Earth Syst. Sci., 25, 247–265, <a href="https://doi.org/10.5194/nhess-25-247-2025" target="_blank">https://doi.org/10.5194/nhess-25-247-2025</a>, 2025.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib85"><label>Tabari(2021)</label><mixed-citation>
       Tabari, H.: Extreme value analysis dilemma for climate change impact assessment on global flood and extreme precipitation, J. Hydrol., 593, 125932, <a href="https://doi.org/10.1016/J.JHYDROL.2020.125932" target="_blank">https://doi.org/10.1016/J.JHYDROL.2020.125932</a>, 2021.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib86"><label>Tarasova et al.(2023)Tarasova, Lun, Merz, Blöschl, Basso, Bertola, Miniussi, Rakovec, Samaniego, Thober, and Kumar</label><mixed-citation>
       Tarasova, L., Lun, D., Merz, R., Blöschl, G., Basso, S., Bertola, M., Miniussi, A., Rakovec, O., Samaniego, L., Thober, S., and Kumar, R.: Shifts in flood generation processes exacerbate regional flood anomalies in Europe, Communications Earth &amp; Environment, 4, 1–12, <a href="https://doi.org/10.1038/s43247-023-00714-8" target="_blank">https://doi.org/10.1038/s43247-023-00714-8</a>, 2023.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib87"><label>Towler et al.(2023)Towler, Foks, Dugger, Dickinson, Essaid, Gochis, Viger, and Zhang</label><mixed-citation>
       Towler, E., Foks, S. S., Dugger, A. L., Dickinson, J. E., Essaid, H. I., Gochis, D., Viger, R. J., and Zhang, Y.: Benchmarking high-resolution hydrologic model performance of long-term retrospective streamflow simulations in the contiguous United States, Hydrol. Earth Syst. Sci., 27, 1809–1825, <a href="https://doi.org/10.5194/hess-27-1809-2023" target="_blank">https://doi.org/10.5194/hess-27-1809-2023</a>, 2023. 
    </mixed-citation></ref-html>
<ref-html id="bib1.bib88"><label>van Hamel and Brunner(2024)</label><mixed-citation>
       van Hamel, A. and Brunner, M. I.: Trends and Drivers of Water Temperature Extremes in Mountain Rivers, Water Resour. Res., 60, e2024WR037518, <a href="https://doi.org/10.1029/2024WR037518" target="_blank">https://doi.org/10.1029/2024WR037518</a>, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib89"><label>Wang et al.(2016)Wang, Li, Pu, Wen, Shugart, Xiong, and Jin</label><mixed-citation>
       Wang, Y., Li, Y., Pu, W., Wen, K., Shugart, Y. Y., Xiong, M., and Jin, L.: Random Bits Forest: a Strong Classifier/Regressor for Big Data, Sci. Rep.-UK, 6, 1–8, <a href="https://doi.org/10.1038/srep30086" target="_blank">https://doi.org/10.1038/srep30086</a>, 2016.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib90"><label>Westerberg et al.(2022)Westerberg, Sikorska-Senoner, Viviroli, Vis, and Seibert</label><mixed-citation>
       Westerberg, I. K., Sikorska-Senoner, A. E., Viviroli, D., Vis, M., and Seibert, J.: Hydrological model calibration with uncertain discharge data, Hydrolog. Sci. J., 67, 2441–2456, <a href="https://doi.org/10.1080/02626667.2020.1735638" target="_blank">https://doi.org/10.1080/02626667.2020.1735638</a>, 2022.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib91"><label>Wilby(1993)</label><mixed-citation>
       Wilby, R. L.: The influence of variable weather patterns on river water quantity and quality regimes, Int. J. Climatol., 13, 447–459, <a href="https://doi.org/10.1002/JOC.3370130408" target="_blank">https://doi.org/10.1002/JOC.3370130408</a>, 1993.


    </mixed-citation></ref-html>
<ref-html id="bib1.bib92"><label>Xu et al.(2024)Xu, Lin, Hu, Chen, Zhang, Xiao, and Xu</label><mixed-citation>
       Xu, Y., Lin, K., Hu, C., Chen, X., Zhang, J., Xiao, M., and Xu, C.-Y.: Uncovering the Dynamic Drivers of Floods Through Interpretable Deep Learning, Earths Future, 12, e2024EF004751, <a href="https://doi.org/10.1029/2024EF004751" target="_blank">https://doi.org/10.1029/2024EF004751</a>, 2024.

    </mixed-citation></ref-html>
<ref-html id="bib1.bib93"><label>Yuan and Lozano-Durán(2024)</label><mixed-citation>
       Yuan, Y. and Lozano-Durán, A.: Limits to extreme event forecasting in chaotic systems, Physica D, 467, 134246, <a href="https://doi.org/10.1016/J.PHYSD.2024.134246" target="_blank">https://doi.org/10.1016/J.PHYSD.2024.134246</a>, 2024.

    </mixed-citation></ref-html>--></article>
