Compound flood (CF) modeling enables the simulation of nonlinear water level dynamics in which concurrent or successive flood drivers synergize, producing larger impacts than those from individual drivers. However, CF modeling is subject to four main sources of uncertainty: (i) the initial condition, (ii) the forcing (or boundary) conditions, (iii) the model parameters, and (iv) the model structure. These sources of uncertainty, if not quantified and effectively reduced, cascade in series throughout the modeling chain and compromise the accuracy of CF hazard assessments. Here, we characterize cascading uncertainty using linked process-based and machine learning (PB–ML) models for a well-known CF event, namely, Hurricane Harvey in Galveston Bay, TX. For this, we run a set of hydrodynamic model scenarios to quantify isolated and cascading uncertainty in terms of maximum water level residuals; additionally, we track the evolution of residuals during the onset, peak, and dissipation of Hurricane Harvey. We then develop multiple linear regression (MLR) and PB–ML models to estimate the relative and cumulative contribution of the four sources of uncertainty to total uncertainty over time. Results from this study show that the proposed PB–ML model captures “hidden” nonlinear associations and interactions among the sources of uncertainty, thereby outperforming conventional MLR models. The model structure and forcing conditions are the main sources of uncertainty in CF modeling, and their corresponding model scenarios, or input features, contribute to 56 % of variance reduction in the estimation of maximum water level residuals. Following these results, we conclude that PB–ML models are a feasible alternative for quantifying cascading uncertainty in CF modeling.

It is estimated that nearly half (46 %) of the gross domestic product (GDP) in the US is generated in coastal shoreline counties that are frequently exposed to multiple flood hazards (NOAA Digital Coast, 2020). Similarly, nearly 129 million people in the US (39 % of its population) currently live in low-lying areas at risk of inland and coastal flooding (NOAA, 2022). In the past 5 years (2018–2023), the National Centers for Environmental Information has reported 489 fatalities and over USD 327 billion in total damages as a result of tropical cyclones, during which heavy rainfall and storm surge exacerbate coastal flood impacts (NCEI, 2023). Terrestrial and coastal flood drivers of (non-)extreme nature that either coincide or unfold in close succession trigger compound flood (CF) events such as those already witnessed in US history, i.e., hurricanes Katrina (2005), Sandy (2012), Harvey (2017), Florence (2018), Ida (2021), Ian (2022), and Idalia (2023). CF events in low-lying areas are typically associated with tropical or extratropical cyclones for which rainfall–runoff, wind-driven storm surge, or both can be classified as dominant flood hazard drivers (Bevacqua et al., 2020; Bilskie and Hagen, 2018; Eilander et al., 2020; Ganguli and Merz, 2019b). In addition, the role of waves, tides, and nonlinear interactions in extreme water levels (WLs) can be crucial for the accurate simulation and/or prediction of CF events, as reported in several studies (Ganguli and Merz, 2019a; Hsu et al., 2023; Nasr et al., 2021; Serafin et al., 2017).

CF modeling can be performed via multivariate statistical analysis (Bensi et al., 2020; Jalili Pirani and Najafi, 2023; Sadegh et al., 2018), process-based modeling (Bates et al., 2021; Sanders et al., 2023; Santiago-Collazo et al., 2019), and even “hybrid” methods that link statistical and process-based models to alleviate computational burden by focusing on the most likely pair-wise forcing conditions given the statistical dependence among flood drivers (Abbaszadeh et al., 2022; Gori et al., 2020; Moftakhari et al., 2019; Serafin et al., 2019). Statistical analyses enable the prediction of future CF events, the reliability of which largely depends on the length of data records. This means that a detailed CF hazard assessment over a given spatial domain requires the availability of both data records and computational resources for handling large datasets. For hindcasting purposes, CF events are simulated using process-based models, as they can incorporate physical features in the underlying digital elevation model (DEM), including local hydrodynamic attributes and geomorphologic characteristics, i.e., tidal and riverine channels, artificial waterways, and flood infrastructure (Marsooli and Wang, 2020; Muñoz et al., 2020; Salehi, 2018). Another advantage of process-based modeling is the ability to simulate complex WL dynamics such as backwater effects, tidal propagation, and overtopping in estuarine environments and urban settings that are usually ignored in point-based statistical analyses (Gallien et al., 2018; Kumbier et al., 2018; Leijnse et al., 2021). Also, process-based models can simulate complex CF dynamics in coastal to inland transition zones where hydrological and coastal processes determine flood extent, duration, and inundation depth (Bilskie et al., 2021; Jafarzadegan et al., 2023; Peña et al., 2022). 
Nevertheless, CF modeling is subject to uncertainties that interact and cascade in series throughout the modeling chain if they are not treated appropriately (Beven et al., 2005; Hasan Tanim and Goharian, 2021; Meresa et al., 2021).

In general, uncertainties in process-based modeling can be classified into four main sources: (i) the initial condition, (ii) the forcing (or boundary) conditions, (iii) the model parameters, and (iv) the model structure (Beven et al., 2005; Moradkhani et al., 2018; Vrugt, 2016). Initial and forcing conditions are essentially model inputs to any process-based models; however, their isolated effects on WL dynamics are often analyzed separately, as reported in diverse hydrological (Abbaszadeh et al., 2019; Jafarzadegan et al., 2021a; Kohanpur et al., 2023) and coastal studies (Bakhtyar et al., 2020; Marsooli and Wang, 2020; Muñoz et al., 2022a). On the other hand, model parameters and structure are intrinsic to the process-based models under consideration, and they can differ depending on the physical process and forcing drivers to be solved (e.g., hydrological or coastal models). The first source of uncertainty involves inaccuracies in the geometry of the system, which is spatially represented with light detection and ranging (lidar) elevation data. These inaccuracies also include bathymetric (Cea and French, 2012; Neal et al., 2021; Parodi et al., 2020) and topographic errors, such as those reported in tidal wetland regions (Alizad et al., 2018; Cooper et al., 2019; Rogers et al., 2018). Elevation errors in coastal wetlands can reach values up to 0.65 m and are usually estimated as the vertical difference between lidar-derived DEMs and ground-truth elevation collected during real-time kinematic surveys (Medeiros et al., 2015; Rogers et al., 2016).

Uncertainty stemming from forcing or boundary conditions is linked with the characteristics of the instruments and sensors that measure WL or streamflow, such as analog-to-digital recorders and acoustic Doppler current profilers, respectively (NOAA, 2000; USGS, 2021). Notably, this uncertainty can also arise from a posteriori assumptions (or generalizations) in operational hurricane-induced coastal flood forecasting. For example, the Coastal Emergency Risks Assessment (CERA) portal provides real-time storm surge, wave, and flood guidance for the Gulf and Atlantic coasts of the US under the assumption that river flow and local rainfall contributions to flooding are relatively small compared with those driven by storm surge (CERA, 2023). Although this assumption might be valid for non-estuarine regions, ignoring nonlinear interactions among flood drivers in freshwater-influenced stretches of the coast can lead to an underestimation of CF hazards, especially in coastal-to-inland transition zones characterized by tidally influenced rivers (Bakhtyar et al., 2020; Yin et al., 2021; Muñoz et al., 2022b). Nevertheless, we acknowledge the ongoing work of CERA to incorporate freshwater inflow in CF simulations and flood guidance, as demonstrated in a pilot study in Louisiana.

Another important source of uncertainty in CF modeling is associated with model parameters, such as the antecedent soil moisture condition (e.g., infiltration capacity) and Manning's roughness coefficient, the latter of which is present in the bottom stress component of the momentum equation (see Sect. 2.3). Although soil moisture might influence CF dynamics, especially at the onset of flood events, modelers often assume that soils are already saturated for practical purposes. In contrast, Manning's roughness coefficient helps account for bed friction exerted by the vegetation, seabed, riverbed, and sinuosity and irregularity of channel cross sections (Attari and Hosseini, 2019; Bhola et al., 2019; Yen, 2002). Thus, hydrodynamic models rely on a rigorous static (or dynamic) calibration of roughness coefficients to capture the onset, peak, and dissipation of WLs as well as CF dynamics (Jafarzadegan et al., 2021a; Liu et al., 2018; Mayo et al., 2014). However, conducting model calibration is a computationally intensive procedure that requires a suitable strategy to explore and exploit the parameter space, such as Monte Carlo and Latin hypercube sampling techniques (Helton and Davis, 2003; Kuczera and Parent, 1998). For that reason, flood hazard assessments often assume stationarity of model parameters under the premise that calibrated roughness coefficients for a specific event are adequate for a range of unseen flood scenarios (Domeneghetti et al., 2013; Meresa et al., 2021).

The fourth source of uncertainty refers to limitations or a priori (theoretical) assumptions that are necessary to simplify the representation of oceanic, hydrological, and meteorological processes in regard to flood generation and routing (Moradkhani et al., 2018; Nearing et al., 2016; Pappenberger et al., 2006). Moreover, uncertainty derived from the model structure accounts for model coupling approaches, such as one-way, two-way, tightly, and fully coupled (Bilskie et al., 2021; Muñoz et al., 2021; Santiago-Collazo et al., 2019), as well as the model configuration, which refers to inherent “reduced-physics” schemes to solve the conservation of mass and momentum equations (see Sect. 2.3). For example, reduced-physics numerical schemes are devised to ignore local acceleration, pressure gradient, viscosity, and/or Coriolis terms (Brunner, 2016; Leijnse et al., 2021; Lesser et al., 2004). Nonetheless, such schemes are designed to optimize the modeling procedure, i.e., reduce the computational cost or time required by high-fidelity process-based models while ensuring an acceptable accuracy in the simulation of WL and CF dynamics.

Methods for uncertainty quantification vary with respect to complexity and application and have been discussed in detail in recent review studies (Abbaszadeh et al., 2022; Beven et al., 2018; Xu et al., 2023). These methods include linear associations and first-order second-moment approximations (Taylor et al., 2015; Thompson et al., 2008), generalized likelihood estimations (Aronica et al., 2002; Domeneghetti et al., 2013), sensitivity analyses (Alipour et al., 2022; Hall et al., 2005; Savage et al., 2016), multi-model ensemble methods (Duan et al., 2007; Kodra et al., 2020; Madadgar and Moradkhani, 2014; Najafi and Moradkhani, 2016), and data assimilation (Abbaszadeh et al., 2019; Moradkhani et al., 2018; Pathiraja et al., 2018).

In recent years, researchers have explored linked process-based and machine learning (PB–ML) models for uncertainty analysis. Hu et al. (2019) developed an integrated framework consisting of ML and reduced-order models for rapid flood prediction and uncertainty quantification. Specifically, they reported that forcing conditions (e.g., incoming waves) are the main source of uncertainty for predicting water surface elevation resulting from tsunamis. Moreover, they quantified such an uncertainty via prescriptive analytics in long short-term memory (LSTM) networks, i.e., inverse functions. Anaraki et al. (2021) proposed a hybrid modeling framework that combines hydrological models and ML for flood frequency analysis under climate change conditions. They indicated that the selection of hydrological models (e.g., model structure) is a critical source of uncertainty based on fuzzy and analysis of variance methods. Chaudhary et al. (2022) developed a deep learning ensemble model that is trained with hydrodynamic model outputs to predict urban flood hazards at high spatial resolution. They estimated total predictive uncertainty in terms of aleatory and epistemic uncertainty by focusing on model inputs and model parameters (e.g., deep learning model's weights). Also, they reported that both sources of uncertainty follow the pattern of maximum water depth residuals and that aleatory and epistemic uncertainty are sharper and fuzzier for higher residual values, respectively.

Nevertheless, there is a fundamental gap in terms of understanding the evolution of uncertainty sources in CF modeling as well as their cascading effects propagating in the modeling chain and ultimately leading to total uncertainty. Notably, there is a need for a robust and computationally efficient methodology that enables a proper characterization of the spatiotemporal evolution of uncertainty throughout CF events. Here, we aim at characterizing the spatiotemporal evolution of uncertainty during a well-known CF event, namely, Hurricane Harvey in Galveston Bay, TX. For this, we develop a PB–ML model framework that combines two different hydrodynamic models as well as (non-)linear regression methods in order to quantify isolated and cascading uncertainty in terms of maximum WL residuals. Also, we leverage the regression models to track the evolution of WL residuals during the onset, peak, and dissipation of Hurricane Harvey. Based on a rigorously trained PB–ML model, we are able to estimate the relative and cumulative contribution of the four sources of uncertainty to total uncertainty over time.

The following sections describe the publicly available data used to develop two different hydrodynamic models for Galveston Bay, namely, Delft3D Flexible Mesh (Delft3D-FM) and the US Army Corps of Engineers' River Analysis System (HEC-RAS), as well as linear and nonlinear regression models. We then introduce the proposed PB–ML framework to characterize uncertainty in CF events, discuss the results, and provide key remarks in the conclusion section.

Model domain of Galveston Bay, TX.

We select Galveston Bay (G-Bay) as the study area to leverage multiple spatiotemporal datasets and official reports that help calibrate and validate hydrodynamic models (Valle-Levinson et al., 2020; Sebastian et al., 2021; Rego and Li, 2010; East et al., 2008). G-Bay is the seventh largest estuary in the US and connects Houston, TX, with the Gulf of Mexico via a complex system consisting of bayous, interior bays, and rivers (Fig. 1a). G-Bay is a shallow estuary of 2 m depth, 56 km length, and 31 km width (on average) that comprises an area of approximately 1600 km

We simulate two CF events in G-Bay, namely, Hurricane Ike and Hurricane Harvey, that hit the Gulf of Mexico in September 2008 and August 2017, respectively (Fig. 1a). These hurricanes were selected not only because they were the most recent and relevant CF events in G-Bay but also because they were driven by dominant coastal (storm surge) and terrestrial (rainfall–runoff) flood drivers, respectively. Hurricane Ike made landfall as a Category-2 event on the Saffir–Simpson scale in the eastern part of Galveston Island, TX, on 13 September 2008. Ike produced storm surges up to 4 m near Sabine Pass and 480 mm of cumulative precipitation over southeastern TX that together led to maximum inundation depths up to 3 m a.g.l. (above ground level) in Galveston County (Berg, 2009; Rego and Li, 2010). Hurricane Harvey, on the other hand, reached Category 4 near Rockport, TX, on 24 August 2017 and made a second landfall near Cameron, LA, on 29 August 2017. Harvey generated total cumulative precipitation amounts ranging from 0.64 m up to 1.52 m over southeastern Texas and subsequent pluvial flooding in the upper river reaches of the Buffalo Bayou river with maximum inundation depths of 3 m a.g.l. (Blake and Zelinsky, 2018). In addition to heavy rainfall, a wind-driven storm surge triggered compound coastal flooding over the region that lasted 3–8 d (Valle-Levinson et al., 2020; Huang et al., 2021).

We use publicly available data to develop and calibrate hydrodynamic models of G-Bay (Fig. 1b, c). To resemble physical conditions prior to Hurricane Ike, we consider the legacy “Galveston, Texas Coastal Digital Elevation Model” obtained from the NOAA's National Geophysical Data Center (

Forcing or boundary conditions (BCs) consist of data time series of WL and river discharge that are obtained from the NOAA's Tides & Currents portal (

We develop hydrodynamic models using two different model software packages in order to simulate compound coastal flooding. We then analyze the uncertainty stemming from model structural inadequacy reflected in the model configuration and numerical scheme. Specifically, we set up 2D models in Delft3D-FM (version 2021.3) and HEC-RAS (version 6.3). Both models have been widely used in pluvial, fluvial, and coastal flood studies and have achieved satisfactory results (Bakhtyar et al., 2020; Liu et al., 2018; Muñoz et al., 2021; Shustikova et al., 2019). Delft3D-FM can be set up in 2D (depth-averaged) mode to solve the continuity (Eq. 1) and Reynolds-averaged Navier–Stokes equations (Eqs. 2 and 3) for incompressible fluids, uniform density, and vertical length scales that are significantly smaller than the horizontal ones (Lesser et al., 2004; Roelvink and Van Banning, 1995). In a similar way, HEC-RAS solves 2D unsteady flow, and recent model developments (e.g., version 6.3 onwards) include gridded wind and precipitation forcing input in the momentum conservation equations (USACE, 2023).

The first hydrodynamic model is developed in Delft3D-FM using an unstructured finite-volume grid that consists of triangular cells with a spatially varying size. Unstructured grids help capture geomorphological and urban features with greater detail than conventional nested, structured grids (Kumar et al., 2009; Muñoz et al., 2022a). These features include the G-Bay entrance, artificial channels in Houston, intracoastal waterways, lateral floodplains, wetland regions, and bottleneck-like connections between G-Bay and both the Buffalo Bayou and San Jacinto rivers (Fig. 1c). Triangular cell sizes are set to decrease from 3 km at the open-ocean boundary in the Gulf of Mexico down to 5 m in Harris County. This ensures a detailed simulation of CF dynamics in natural and urban settings. Similarly, the second hydrodynamic model is developed in 2D HEC-RAS using an unstructured finite-volume grid. The mesh consists of polygons of varying cell size, and the numerical scheme used to solve the shallow water equations is set to the Eulerian–Lagrangian method (SWE-ELM). This, in turn, ensures that the model solves all terms in Eqs. (1)–(3), except for atmospheric pressure, which 2D HEC-RAS does not currently support. In addition, we force the mesh generation with a cell size and spatial distribution similar to that of the Delft3D-FM model. Although there is no straightforward procedure to transfer the mesh properties and/or spatial characteristics between the two hydrodynamic models, we ensure that geomorphological and urban features are correctly delineated by conducting an extensive mesh refinement in critical locations, as suggested in similar studies (Muñoz et al., 2021; Shustikova et al., 2019). The time step is controlled by the Courant–Friedrichs–Lewy condition, with a maximum value of 0.7 for both models. Also, model outputs are generated with an hourly interval for calibration and validation purposes.
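As an illustration of how a Courant–Friedrichs–Lewy limit of 0.7 constrains the time step on a mesh whose cells range from 3 km to 5 m, the sketch below computes the largest admissible time step. It assumes an advective Courant number, C = |u| Δt / Δx; each software package applies its own definition internally, and the velocities and the function name `cfl_time_step` are hypothetical.

```python
import numpy as np

def cfl_time_step(cell_size_m, velocity_ms, courant_max=0.7):
    """Largest time step (s) such that |u| * dt / dx <= courant_max in every cell.

    Assumes an advective Courant number; this is an illustrative definition,
    not necessarily the one used internally by Delft3D-FM or HEC-RAS.
    """
    cell_size_m = np.asarray(cell_size_m, dtype=float)
    speed = np.abs(np.asarray(velocity_ms, dtype=float))
    # The most restrictive cell (small size, high velocity) sets the global step
    return float(np.min(courant_max * cell_size_m / speed))

# Hypothetical cells: open ocean (3 km), bay (50 m), urban channel (5 m)
cell_sizes = [3000.0, 50.0, 5.0]
velocities = [0.5, 1.0, 1.5]  # m/s, illustrative values only
dt = cfl_time_step(cell_sizes, velocities)
print(f"Maximum time step: {dt:.2f} s")
```

In this toy case the 5 m urban cell governs the time step, which is why fine local refinement raises the computational cost of the whole simulation.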

After the mesh generation process, we consider multiple USGS river discharge stations in the G-Bay model as upstream BCs, including Whiteoak Bayou (USGS 08074500), Buffalo Bayou (USGS 08073700), Brays Bayou (USGS 08075000), Sims Bayou (USGS 08075500), Berry Bayou (USGS 08075605), Greens Bayou (USGS 08076000), Hunting Bayou (USGS 08075770), Vince Bayou (USGS 08075730), Clear Creek (USGS 08077600), Goose Creek (USGS 08067525), Cedar Creek (USGS 08067500), and Trinity River (USGS 08067252). Also, due to the lack of available river-gauge stations located immediately downstream of Lake Houston dam, river flow from Lake Houston dam to the San Jacinto River is estimated as the sum of upstream freshwater input to the lake (Fig. 1). Such an estimation is realistic because the dam is not operating as a flood control structure anymore and was overflowed by extreme river discharge as a result of Hurricane Harvey (Sebastian et al., 2021; Valle-Levinson et al., 2020). Regarding the downstream BCs, we force the G-Bay model with storm tides using WL records from two tide-gauge stations, namely, Freeport Harbor (NOAA 8772471) and Galveston Bay Entrance (NOAA 8771341). These stations are located offshore and are a good proxy for coastal WL propagating from the open-ocean boundary. To account for WL variability arising from atmospheric variables, we include reanalysis data of 10 m wind velocity and atmospheric (sea level) pressure in the model simulations. Note that the latest versions of 2D HEC-RAS allow the user to simulate wind speeds even though the atmospheric pressure component is yet to be implemented in Eqs. (2) and (3).

Lastly, we retrieve rainfall data from a relatively dense rain-gauge network of the Harris County Flood Warning System. These data have been validated and further used to estimate flood damages in G-Bay (Sebastian et al., 2021). In addition, we complement these data with “total precipitation” reanalysis datasets from ERA5 in order to estimate rainfall patterns in coastal areas beyond Harris County and over the Gulf of Mexico (Fig. 1). Specifically, we use the “inverse distance weight” as the interpolation method in ArcGIS with an output cell size of 1 km (i.e., the shortest Euclidean distance between existing rain gauges), a search radius of 5 points, and a power function of 2. The interpolation method as well as the aforementioned values follow those suggested in other studies and are validated through sensitivity analysis (Sebastian et al., 2021; Ahrens, 2006; Otieno et al., 2014).
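The inverse distance weighting described above (5 nearest points, power of 2) can be sketched in a few lines. This is a minimal NumPy illustration of the general technique, not the ArcGIS implementation; the gauge coordinates, rainfall values, and the function name `idw` are hypothetical.

```python
import numpy as np

def idw(gauge_xy, gauge_values, target_xy, n_neighbors=5, power=2.0):
    """Inverse-distance-weighted estimate at each target from its nearest gauges."""
    gauge_xy = np.asarray(gauge_xy, dtype=float)
    gauge_values = np.asarray(gauge_values, dtype=float)
    target_xy = np.asarray(target_xy, dtype=float)

    # Pairwise Euclidean distances, shape (n_targets, n_gauges)
    dist = np.linalg.norm(target_xy[:, None, :] - gauge_xy[None, :, :], axis=2)

    # Keep the k nearest gauges per target (fewer if the network is smaller)
    k = min(n_neighbors, gauge_xy.shape[0])
    nearest = np.argsort(dist, axis=1)[:, :k]
    d = np.take_along_axis(dist, nearest, axis=1)
    v = gauge_values[nearest]

    # Weights decay with distance**power; guard against division by zero
    w = 1.0 / np.maximum(d, 1e-12) ** power
    estimate = np.sum(w * v, axis=1) / np.sum(w, axis=1)

    # A target that coincides with a gauge takes that gauge's value exactly
    exact = d[:, 0] == 0.0
    estimate[exact] = v[exact, 0]
    return estimate

# Hypothetical gauges (x, y in km) and rainfall depths (mm)
gauges = [(0.0, 0.0), (2.0, 0.0)]
rain = [10.0, 20.0]
print(idw(gauges, rain, [(1.0, 0.0), (0.0, 0.0)]))
```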

G-Bay is influenced by multiple flood drivers, including local rainfall, river discharge, and storm tides. At low river flow rates, tides propagate from the ocean boundary in the landward direction where they attenuate and eventually vanish due to bottom friction (Hoitink and Jay, 2016; Bolla Pittaluga et al., 2015). Therefore, we ensure that the model setup (Sect. 2.3.1) is adequate to simulate tidal propagation across the model domain. Specifically, we simulate tidal dynamics in Delft3D-FM by setting barotropic tides from the TPXO 8.0 global inverse tide model as forcing data at the open-ocean boundary (Egbert and Erofeeva, 2002). We then run 100 ensemble model realizations for a 1-year simulation window without incorporating any additional forcing (e.g., only tides) and considering a range of plausible Manning's roughness values (

Manning's roughness values tested for model calibration in Delft3D-FM/2D HEC-RAS.

Evaluation of tidal propagation in Galveston Bay.

Furthermore, we conduct harmonic analyses and retrieve tidal amplitudes and phases for the main tidal constituents in G-Bay including K2, S2, M2, N2, K1, S1, P1, O1, and Q1 (NOAA, 2000). We then compare observed and simulated tidal characteristics at each tide-gauge station using a

Next, we calibrate

Model calibration at selected tide-gauge stations in Galveston Bay. Model performance is evaluated in terms of the RMSE, NSE, and KGE for

To validate the hydrodynamic models, we generate composite maps representing maximum WLs within the simulation period of Hurricane Ike and Hurricane Harvey and compare those maps against USGS's high-water marks collected in the aftermath of both hurricanes (Fig. 4a). We evaluate the accuracy of the composite maps by comparing observed and simulated maximum WLs (Fig. 4b). Data points that fall along the

Validation of the Delft3D-FM and 2D HEC-RAS models in Galveston Bay.
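The skill metrics named for model calibration (RMSE, NSE, and KGE) can be computed from paired observed and simulated WLs as sketched below. The KGE follows the commonly used formulation based on correlation, variability ratio, and bias ratio, which is an assumption here because the text does not state which variant is used; the sample WL values are hypothetical.

```python
import numpy as np

def rmse(obs, sim):
    """Root-mean-square error, in the units of the inputs."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(np.sqrt(np.mean((sim - obs) ** 2)))

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2))

def kge(obs, sim):
    """Kling-Gupta efficiency (assumed formulation: correlation, std ratio, mean ratio)."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    r = np.corrcoef(obs, sim)[0, 1]
    alpha = sim.std() / obs.std()    # variability ratio
    beta = sim.mean() / obs.mean()   # bias ratio
    return float(1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2))

# Hypothetical hourly WLs (m) at one tide gauge, with a constant 0.1 m bias
observed = np.array([0.2, 0.5, 1.1, 0.8, 0.3])
simulated = observed + 0.1
print(rmse(observed, simulated), nse(observed, simulated), kge(observed, simulated))
```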

We propose five scenarios to analyze the effects of isolated and total uncertainty on CF hazard assessment for Hurricane Harvey (Table 2). The first scenario focuses on the initial condition of the system, including topographic and bathymetric data in coastal DEMs. Recently, Holmquist and Windham-Myers (2022) produced a DEM of relative tidal marsh elevation for the conterminous US using land cover classes derived from the 2010 Coastal Change Analysis Program (C-CAP). As this DEM accounts for elevation errors within coastal wetlands, we conveniently evaluate uncertainty from the initial condition of the system using the aforementioned DEM and NOAA's CUDEM in hydrodynamic simulations (see Sect. 2.2). The second scenario represents uncertainty derived from forcing conditions that are often neglected in real-time hurricane-induced flood forecasts and advisories (CERA, 2023). Specifically, such flood forecasts assume that riverine flow and local rainfall contributions to flooding are relatively small compared with those driven by storm surge. Following this reasoning, we will analyze the impact of local rainfall and riverine flow in CF dynamics by turning off those two forcing conditions in hydrodynamic simulations.

Model scenarios for simulating isolated and total uncertainty in G-Bay.

The third scenario analyzes uncertainty stemming from model parameters, including bed channel and floodplain roughness coefficients associated with land cover classes from the NLCD (Fig. 1a, Table 1). Here, we simulate compound coastal flooding triggered by Hurricane Harvey using two sets of optimal (calibrated)

The proposed model scenarios are further modified to quantify their relative contribution to total uncertainty using regression models. Here, we report uncertainties in terms of WL residuals computed between each scenario and the best model setup for simulating Hurricane Harvey. Such a model setup consists of the calibrated Delft3D-FM model that accounts for elevation errors within coastal wetlands and additionally incorporates the effects of river flow and local rainfall in CF modeling (Figs. 3, 4). As 2D HEC-RAS generates raster-based flood maps, we extract WLs at each grid node of Delft3D-FM to compute the corresponding WL residuals. This helps ensure consistency in uncertainty analysis over the model domain. Also, note that WL residuals evolve in time and space, and their magnitude attributed to the sources of uncertainty is represented by the four model scenarios. Therefore, we first compute the maximum WL residual across the model domain for the entire simulation period (i.e., 10 d window) as well as time-evolving residuals with an interval of 6 h. This time interval is set to comply with the timing of hurricane advisories of the National Hurricane Center and, thus, enable the construction of 40 datasets within the 10 d simulation window, i.e., WL forecasts and advisories every 6 h.
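The residual bookkeeping described above can be illustrated with synthetic arrays. The sketch assumes hourly model output on a shared set of grid nodes and interprets the maximum WL residual as the difference between per-node maximum WLs of a scenario and the reference setup; that interpretation, the array sizes, and the function names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_hours, n_nodes = 240, 6  # 10 d of hourly output; hypothetical tiny mesh

# Synthetic WLs (m): reference ("best") setup and one uncertainty scenario
wl_reference = rng.normal(1.0, 0.3, size=(n_hours, n_nodes))
wl_scenario = wl_reference + rng.normal(0.0, 0.05, size=(n_hours, n_nodes))

def max_wl_residual(scenario, reference):
    """Per-node difference between maximum WLs over the simulation window.

    Positive values indicate overestimation relative to the reference.
    """
    return scenario.max(axis=0) - reference.max(axis=0)

def time_evolving_residuals(scenario, reference, step_hours=6):
    """Per-node residuals sampled every `step_hours` from hourly output."""
    return (scenario - reference)[::step_hours]

max_residual = max_wl_residual(wl_scenario, wl_reference)
residuals_6h = time_evolving_residuals(wl_scenario, wl_reference)
print(max_residual.shape, residuals_6h.shape)  # one map, then 40 six-hourly maps
```

Sampling 240 hourly steps every 6 h yields the 40 datasets mentioned in the text.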

WL residuals obtained from physically based model simulations are used as input features for multiple linear regression (MLR) and nonlinear models (Fig. 5). First, we fit the input features using an MLR model and consider it a benchmark for evaluating the benefits of nonlinear models, including the process-based machine learning (PB–ML) model. The goal of the MLR model is to estimate total uncertainty as the target variable in terms of WL residuals. As we are dealing with WL residuals in meters that have a comparable order of magnitude, we do not scale or normalize the input features prior to the fitting process. This also simplifies the evaluation of the fitted model and the quantification of the relative importance of each source of uncertainty based on fitted regression weights. We do, however, identify and remove outliers in the input features, especially those arising at the edges of the mesh and those around the BCs. We use the “statsmodels API” package in Python to conduct a robust fitting of input features (

Schematic of multiple linear regression and process-based machine learning models to quantify cascading uncertainty. Input features and target variable are reported in terms of water level residuals derived from hydrodynamic simulations of Hurricane Harvey. The target variable contains all sources of uncertainty and their implicit cascading effects.
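A minimal sketch of the MLR benchmark is given below. It uses ordinary least squares from NumPy on synthetic residuals, swapped in for the statsmodels fit named in the text, so it shows the structure of the regression (four scenario residuals as inputs, total-uncertainty residual as target) rather than the authors' exact procedure. All data and weights are synthetic.

```python
import numpy as np

rng = np.random.default_rng(42)
n_points = 500

# Synthetic WL residuals (m) for scenarios S1-S4; unscaled, as in the text
X = rng.normal(0.0, 0.2, size=(n_points, 4))

# Synthetic target (S5): a linear mix plus noise, for illustration only
true_weights = np.array([0.1, 0.6, 0.2, 0.8])
y = 0.05 + X @ true_weights + rng.normal(0.0, 0.01, size=n_points)

# Ordinary least squares with an intercept column
A = np.column_stack([np.ones(n_points), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
intercept, weights = coef[0], coef[1:]

# Coefficient of determination of the fitted model
pred = A @ coef
r2 = 1.0 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)

for name, w in zip(["S1", "S2", "S3", "S4"], weights):
    print(f"{name}: weight = {w:.3f}")
print(f"R^2 = {r2:.3f}")
```

Because the inputs share the same units and order of magnitude, the fitted weights can be compared directly as a first indication of relative importance.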

We conducted a preliminary analysis to identify the best nonlinear ML model that predicts total uncertainty (not shown here for brevity). Among those models, we find that a random forest regressor outperforms support vector machine and artificial neural network (e.g., multilayer perceptron) models, which agrees with the results of multiple flood studies (Chen et al., 2020; Mosavi et al., 2018; Schoppa et al., 2020). Random forest (RF) is a nonparametric ensemble algorithm that builds multiple decision trees based on bootstrapped samples drawn with replacement (Breiman, 2001). The advantage of RF over other nonlinear regression models lies in its simplicity and easy implementation for efficient regression and classification tasks. Also, RF connects input and target features with complex and nonlinear associations and provides estimates of feature importance to predict the target variable (Alipour et al., 2020; Wang et al., 2015). We develop an RF regressor model and conduct a thorough model evaluation in Python using the “scikit-learn” package. WL residuals from each scenario are set as input (S1–S4) and target (S5) features, and our analysis is focused on both time-evolving and maximum residuals across the model domain. We split the data into training (80 %) and validation (20 %) datasets using the total number of data points after outlier removal. In the context of hydrodynamic modeling, outliers are unrealistic WLs emerging around upstream and downstream BC lines as well as the edges of the model domain. Such values are extreme values, either positive or negative, that do not reflect WL dynamics within the model domain. Therefore, we mask out such values using a buffer polygon in ArcGIS and proceed with the training and validation datasets using realistic WLs (i.e., 1 093 501 data points). Those data points represent the number of grid nodes generated in Delft3D-FM (see Sect. 2.3.1). Next, we conduct hyperparameter tuning to build decision trees and estimate optimal (calibrated) values for each parameter using an HPC system (Table 3).

Hyperparameter grid and optimal (calibrated) parameter values for the RF regressor model.
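The RF workflow described above (80/20 split, fitting, validation, feature importance) can be sketched with scikit-learn as follows. The data are synthetic, with an interaction term added so that the target is not purely linear, and the hyperparameter values are placeholders rather than the calibrated values of Table 3.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)
n_points = 2000

# Synthetic WL residuals (m) for scenarios S1-S4 (input features)
X = rng.normal(0.0, 0.2, size=(n_points, 4))

# Synthetic target (S5) with a nonlinear S2-S4 interaction, for illustration only
y = (0.1 * X[:, 0] + 0.6 * X[:, 1] + 0.2 * X[:, 2] + 0.8 * X[:, 3]
     + 2.0 * X[:, 1] * X[:, 3] + rng.normal(0.0, 0.01, size=n_points))

# 80 % training / 20 % validation split
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Placeholder hyperparameters; not the tuned values reported in the paper
rf = RandomForestRegressor(n_estimators=100, random_state=0)
rf.fit(X_train, y_train)

r2_val = rf.score(X_val, y_val)  # R^2 on the held-out validation set
importance = dict(zip(["S1", "S2", "S3", "S4"], rf.feature_importances_))

print(f"Validation R^2 = {r2_val:.3f}")
for name, value in importance.items():
    print(f"{name}: importance = {value:.3f}")
```

The impurity-based importances sum to 1, which makes them convenient for reporting the relative contribution of each scenario to the prediction of the target residuals.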

We use the scikit-learn package to find optimal values and also account for overfitting issues through a cross-validation (CV) process. For this, we generate an initial grid of parameter values within a specified range using the LHS technique (Helton and Davis, 2003). Then, we use a “

Scenarios S1 to S4 are designed to analyze the effect of each source of uncertainty on CF hazard assessment (Figs. 6, S3). We conveniently display maximum WL residuals across the model domain where positive or negative values indicate an overestimation or underestimation of WLs, respectively. Scenario S1 accounts for elevation errors in lidar-derived DEMs that lead to a complex heterogenous pattern with both over- and underestimation of maximum WL residuals (Fig. 6a, b). This scenario displays an overestimation of WLs in the tributaries of the Buffalo Bayou river, the Houston and Galveston navigation channels, and intracoastal waterways in the lower part of G-Bay. Such an overestimation can be explained by inconsistencies in bathymetric data between the tidal marsh DEM and NOAA's CUDEM. In addition, this scenario shows an underestimation of WLs in the upper part of G-Bay, including the head of the Buffalo Bayou's tributaries and the San Jacinto River that is surrounded by coastal wetlands (Fig. 1a). Wetlands are natural buffers that dissipate extreme WLs and attenuate storm surge with a rate of 1.7–25 cm km

Effect of isolated and total uncertainty on compound flood hazard assessment in Galveston Bay. Maximum water level residuals represent model scenarios with uncertainty stemming from the

Scenario S2 focuses on uncertainty derived from forcing conditions including river flow and local rainfall. This scenario leads to an underestimation of WLs across the model domain (Fig. S3a, b). The effect of neglecting forcing conditions on CF hazard assessment is more evident on the northwest side of G-Bay, where CF was driven by heavy rainfall and extreme river flow triggering urban flooding in Harris County. In fact, Hurricane Harvey caused urban flooding in the city of Houston due to an unprecedented rainfall depth greater than 1.5 m as well as an extreme river flow conveyed by the Buffalo Bayou and San Jacinto rivers (Blake and Zelinsky, 2018; Sebastian et al., 2021; Valle-Levinson et al., 2020). Scenario S3 analyzes the influence of model parameters that are conventionally calibrated for historical flood events (e.g., Hurricane Ike) and used as a proxy for simulating future flood events (Domeneghetti et al., 2013; Meresa et al., 2021). This scenario exhibits an overestimation of WLs that is particularly evident on the northwest side of G-Bay (Fig. S3c, d). Such an overestimation can be related to the peak WL of Hurricane Ike (

Scenario S4 accounts for uncertainty derived from the model structure and capabilities of 2D HEC-RAS compared with those of Delft3D-FM. This scenario leads to a complex heterogeneous pattern with both over- and underestimation of maximum WL residuals across the model domain (Fig. 6c, d). Specifically, this scenario displays an overestimation of WLs in the San Jacinto River, Moses and Anahuac lakes, and Freeport, whereas it results in an underestimation in Harris County, the Buffalo Bayou river and its tributaries, the Houston and Galveston navigation channels, and intracoastal waterways in the lower part of G-Bay. This complex pattern highlights the capability (or inability) of the hydrodynamic models to account for atmospheric pressure in the conservation of momentum (Eqs. 2 and 3) and to capture coastal geomorphological and urban features in the mesh generation process, i.e., structured vs. unstructured grids (Bates, 2022, 2023).

Lastly, scenario S5 is named total uncertainty because it accounts for the isolated and cascading effects of the sources of uncertainty on spatiotemporal CF hazard assessment (Fig. 6e, f). This scenario displays an overall underestimation of maximum WL residuals on the northwest side of G-Bay, which is similar to the pattern observed in scenario S2 (i.e., forcing conditions). Likewise, it displays both over- and underestimation of WL residuals in the lower part of G-Bay, resembling the patterns of scenarios S4 and S1, respectively (i.e., model structure and input data). In contrast, the overestimation pattern of scenario S3 is not visually reflected in scenario S5. This discrepancy is explained in the following section.

We fit and train MLR and ML models using maximum WL residuals as an indicator of CF hazard in G-Bay (Table 4). Regression weights are statistically significant for all four scenarios (

Multiple linear regression fitting on maximum water level residuals.

Isolated and total uncertainty reported in terms of water level residuals (in meters).

Confidence intervals are obtained from the statsmodels API package available in Python.

Overall, the absolute magnitude of regression weights agrees well with the rank resulting from either Pearson's or Kendall's correlation coefficients. This suggests that the uncertainties stemming from the forcing conditions and model structure are crucial for estimating total uncertainty in CF hazard assessment (Fig. 7g). Other flood studies have shown similar results and demonstrated that uncertainty stemming from forcing conditions is even more important than that of the remaining sources, especially for flood prediction and inundation mapping in riverine systems (Alipour et al., 2020; Jafarzadegan et al., 2023; Pappenberger et al., 2008; Savage et al., 2016). The initial condition of the system and model parameters are also relevant sources of uncertainty, but they show a relatively low regression weight. Note that if WL residuals of scenarios S1, S3, and S4 are kept invariant, any perturbation of scenario S2 will result in a nearly identical response of scenario S5 as well as a negative offset of 12 cm. Although MLR models help analyze the influence of each input feature on the estimation of total uncertainty, they do not capture hidden associations and/or interactions among the input features. This, in turn, reduces the effectiveness of MLR models in characterizing cascading effects on total uncertainty.
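To make the interpretation of the regression weights and the offset concrete, the following minimal sketch fits a linear model of the form S5 = b0 + b1·S1 + b2·S2 + b3·S3 + b4·S4 by ordinary least squares. It uses plain NumPy rather than the statsmodels package used in this study, and all data are synthetic. Only the near-unit weight of S2 and the offset of −0.12 m mirror the text above; the remaining weights, the noise level, and the variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes = 2_000  # placeholder sample size

# Synthetic WL residuals (m) for scenarios S1-S4 (one column per scenario)
S = rng.normal(0.0, 0.3, size=(n_nodes, 4))

# Assumed weights: only the S2 weight (~1) and the offset (-0.12 m)
# follow the text; the other values are purely illustrative
true_weights = np.array([0.2, 1.0, -0.1, 0.5])
true_offset = -0.12
s5 = true_offset + S @ true_weights + rng.normal(0.0, 0.05, n_nodes)

# Design matrix with a leading column of ones for the intercept
A = np.column_stack([np.ones(n_nodes), S])
coef, *_ = np.linalg.lstsq(A, s5, rcond=None)
offset, weights = coef[0], coef[1:]

# Goodness of fit on the fitted data
resid = s5 - A @ coef
rmse = float(np.sqrt(np.mean(resid**2)))
r2 = float(1.0 - np.sum(resid**2) / np.sum((s5 - s5.mean()) ** 2))

# Rank the scenarios by the absolute magnitude of their regression weights
names = ["S1", "S2", "S3", "S4"]
ranking = [names[i] for i in np.argsort(-np.abs(weights))]

print(f"offset = {offset:.3f} m, weights = {np.round(weights, 3)}")
print(f"RMSE = {rmse:.3f} m, R2 = {r2:.3f}, ranking = {ranking}")
```

Because the model is additive, each weight describes the response of S5 to one scenario with the others held fixed; an interaction between two scenarios would end up in the residual term rather than in any weight.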

To overcome this limitation, we determine whether the proposed PB–ML model (i.e., RF regressor) outperforms the MLR model based on the aforementioned evaluation metrics. In this regard, the score metrics evince a substantial improvement by the PB–ML model: the RMSE decreases to 0.28 m, whereas

The advantage of permutation over feature importance analysis is that the former method circumvents any overfitting issues by focusing the analysis on validation data (

We track the trajectory of the four sources of uncertainty during the onset, peak, and dissipation of Hurricane Harvey using a 6 h interval for the entire simulation period (Fig. 8a). For this, we fit MLR and train PB–ML models for each of the 40 datasets containing WL residuals and report relative and cumulative contributions as well as models' performance in terms of the

Evolution of water level residuals as a proxy for total uncertainty during the onset, peak, and dissipation of Hurricane Harvey.

These results are somewhat similar to those of other studies that analyzed how the influence of forcing (boundary) conditions and model parameters changes during flood events (Alipour et al., 2022; Jafarzadegan et al., 2021b; Savage et al., 2016). Although those studies identify forcing conditions as the most influential factor for flood inundation mapping, uncertainty stemming from model structure is not explicitly analyzed; however, it is recognized as a determinant factor of the results. Lastly, the evaluation metrics computed for the 40 datasets indicate that RF regressor models (dashed line) outperform the benchmark MLR models (solid line) in the simulation period (Fig. 8c). Note that the

In the present study, we characterize isolated and cascading uncertainty during the onset, peak, and dissipation of Hurricane Harvey in Galveston Bay, TX. For this, we develop two hydrodynamic models (i.e., Delft3D-FM and 2D HEC-RAS) to simulate compound coastal flooding and conduct compound flood (CF) hazard assessment. The calibrated and validated models help simulate a set of scenarios that reflect uncertainties stemming from the initial condition, forcing conditions, model parameters, and model structure. We then train a physics-informed machine learning model (PB–ML) to estimate total uncertainty in terms of water level (WL) residuals and evaluate the model's performance with respect to a benchmark multiple linear regression (MLR) model. The effects of isolated uncertainty on CF hazard assessment match the spatial patterns observed in the total uncertainty scenario across the model domain, especially for the scenarios that reflect uncertainty from the initial condition of the system, forcing conditions, and model structure. Conversely, the scenarios representing total uncertainty and that from model parameters exhibit a negative correlation, resulting in a discrepancy of spatial patterns across the model domain. Nevertheless, we estimate that the forcing conditions, model structure, initial condition, and model parameters contribute to 56 %, 20 %, 18 %, and 6 % of variance reduction in the PB–ML model, respectively. These values agree well with the rank of regression weights estimated with the MLR model, supporting the conclusion that the forcing (boundary) condition is the main contributor to total uncertainty in CF hazard assessment.

Regarding CF modeling, we observe an interplay of relative importance where model structure and the initial condition of the system are the main sources of total uncertainty at the onset of Hurricane Harvey. However, their relative importance drops around the peak WL because forcing conditions become a more relevant source of uncertainty until the dissipation of WLs. Also, the importance of model parameters remains almost invariant during the onset, peak, and dissipation of Hurricane Harvey. Nonetheless, model structure is the main contributor to variance reduction (49 %) followed by the forcing conditions (23 %), initial condition (20 %), and model parameters (8 %). Lastly, MLR models are not suitable to characterize total uncertainty, as their performance is sensitive to the peak WL, as evinced in the evaluation metrics (e.g., RMSE,

Delft3D and HEC-RAS are freely available open-source hydrodynamic models; the model source code is available from

All data used in this study are publicly available (see Sect. 2.2 for details). Specifically, the legacy “Galveston, Texas Coastal Digital Elevation Model” was obtained from NOAA's National Geophysical Data Center (

The supplement related to this article is available online at:

DFM contributed to conceptualization, methodology, validation, formal analysis, writing the original draft, and visualization. HMof and HMor edited the original draft of the manuscript and supervised the research project.

The contact author has declared that none of the authors has any competing interests.

This article is part of the special issue “Methodological innovations for the analysis and management of compound risk and multi-risk, including climate-related and geophysical hazards (NHESS/ESD/ESSD/GC/HESS inter-journal SI)”. It is not associated with a conference.

The authors would like to thank the two anonymous reviewers for their thoughtful comments and input that helped improve the quality of this work.

Partial financial support for this study has been provided by the National Science Foundation, CAS-Climate program (grant nos. 480948 and 2223893). Partial support has also been provided through funding awarded to the Cooperative Institute for Research to Operations in Hydrology (CIROH) via the NOAA cooperative agreement with The University of Alabama (grant no. NA22NWS4320003).

This paper was edited by Silvia De Angeli and reviewed by two anonymous referees.