Numerical modelling is a reliable tool for flood simulations, but accurate solutions are computationally expensive. In recent years, researchers have explored data-driven methodologies based on neural networks to overcome this limitation. However, most models are only used for a specific case study and disregard the dynamic evolution of the flood wave. This limits their generalizability to topographies that the model was not trained on and their use in time-dependent applications. In this paper, we introduce the shallow water equation–graph neural network (SWE–GNN), a hydraulics-inspired surrogate model based on GNNs that can be used for rapid spatio-temporal flood modelling. The model exploits the analogy between finite-volume methods used to solve the SWEs and GNNs. For a computational mesh, we create a graph by considering finite-volume cells as nodes and connecting adjacent cells by edges. The inputs are determined by the topographical properties of the domain and the initial hydraulic conditions. The GNN then determines how fluxes are exchanged between cells via a learned local function. We overcome time-step constraints by stacking multiple GNN layers, which expand the considered space instead of increasing the time resolution. We also propose a multi-step-ahead loss function, along with a curriculum learning strategy, to improve stability and performance. We validate this approach using a dataset of two-dimensional dike breach flood simulations over randomly generated digital elevation models, produced with a high-fidelity numerical solver. The SWE–GNN model predicts the spatio-temporal evolution of the flood for unseen topographies with mean absolute errors in time of 0.04 m for water depths and 0.004 m

Accurate flood models are essential for risk assessment, early warning, and preparedness for flood events. Numerical models can characterize how floods evolve in space and time, with two-dimensional (2D) hydrodynamic models being the most popular

Data-driven alternatives speed up numerical solvers

To overcome this issue, the community is investigating the generalizability of deep-learning models to different study areas.

To overcome this limitation, we propose SWE–GNN, a deep-learning model merging graph neural networks (GNNs) with the finite-volume methods used to solve the SWEs. GNNs generalize convolutional neural networks to irregular domains such as graphs and have shown promising results for fluid dynamics

We tested our model on dike breach flood simulations due to their time-sensitive nature and the presence of uncertainties in topography and breach formation

We develop a new graph neural network model where the propagation rule and the inputs are taken from the shallow water equations. In particular, the hydraulic variables propagate based on their gradient across neighbouring finite-volume cells.

We improve the model's stability by training it via a multi-step-ahead loss function, which results in stable predictions up to 120 h ahead using only the information of the first hour as initial hydraulic input.

We show that the proposed model can serve as a surrogate for numerical solvers for spatio-temporal flood modelling in unseen topographies and unseen breach locations, with speed-ups of 2 orders of magnitude.

In this section, we describe the theory supporting our proposed model. First, we discuss numerical models for flood modelling; then, we present deep-learning models, focusing on graph neural networks. Throughout the paper, we use the standard vector notation, with

When assuming negligible vertical accelerations, floods can be modelled via the SWEs
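For reference, a standard conservative form of the 2D SWEs, with water depth $h$, depth-averaged velocity $\mathbf{u}=(u,v)$, bed elevation $z$, gravitational acceleration $g$, and Manning roughness coefficient $n$, reads as follows. This is one common formulation with a Manning friction source term; the paper's exact form and source terms may differ.

```latex
% Mass conservation
\frac{\partial h}{\partial t} + \nabla \cdot (h\mathbf{u}) = 0
% Momentum conservation with bed slope and Manning friction
\frac{\partial (h\mathbf{u})}{\partial t}
  + \nabla \cdot (h\,\mathbf{u} \otimes \mathbf{u})
  + g h \,\nabla (h + z)
  = -\,\frac{g n^2 \lVert \mathbf{u} \rVert\, \mathbf{u}}{h^{1/3}}
```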

The SWEs cannot be solved analytically unless simplifying assumptions are made. Thus, they are commonly solved via spatio-temporal numerical discretizations, such as the finite-volume method

Schematic representation of an arbitrary triangular volume mesh and its dual graph.

In numerical models with explicit time discretization, stability is enforced by satisfying the Courant–Friedrichs–Lewy (CFL) condition, which requires the numerical propagation speed to be higher than the physical one
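For the 2D SWEs, a common form of the CFL constraint is the following, where $C_r$ is the Courant number; the exact wave speed estimate and admissible $C_r$ are scheme-dependent:

```latex
\Delta t \;\le\; C_r \, \frac{\Delta x}{\lVert \mathbf{u} \rVert + \sqrt{g h}},
\qquad C_r \le 1
```

Consequently, refining the mesh (smaller $\Delta x$) also shrinks the admissible time step, which is what makes fine-resolution explicit solvers computationally expensive.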

Deep learning obtains non-linear high-dimensional representations from data via multiple levels of abstraction

The most general type of neural network is a multi-layer perceptron (MLP). It is formed by stacking linear models followed by a point-wise non-linearity (e.g. rectified linear unit, ReLU,
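Such a stack of affine maps with point-wise ReLUs can be sketched as follows; the layer sizes and the initialization scheme here are illustrative, not the paper's configuration.

```python
import numpy as np

def relu(x):
    """Point-wise non-linearity: rectified linear unit."""
    return np.maximum(x, 0.0)

class MLP:
    """Minimal multi-layer perceptron: stacked affine maps, each followed by
    a point-wise ReLU (no non-linearity after the final layer)."""

    def __init__(self, sizes, seed=0):
        rng = np.random.default_rng(seed)
        # He-style initialization (illustrative choice)
        self.weights = [rng.normal(0.0, np.sqrt(2.0 / m), size=(m, n))
                        for m, n in zip(sizes[:-1], sizes[1:])]
        self.biases = [np.zeros(n) for n in sizes[1:]]

    def __call__(self, x):
        for i, (W, b) in enumerate(zip(self.weights, self.biases)):
            x = x @ W + b
            if i < len(self.weights) - 1:
                x = relu(x)
        return x

mlp = MLP([4, 32, 32, 2])        # 4 inputs -> two hidden layers -> 2 outputs
y = mlp(np.ones((5, 4)))         # batch of 5 inputs, output shape (5, 2)
```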

GNNs use graphs as an inductive bias to tackle the curse of dimensionality. This bias is relevant for data represented via networks and meshes, as it allows these models to generalize to unseen graphs: the same model can be applied to different topographies discretized by different meshes. GNNs work by propagating features defined on the nodes, based on how the nodes are connected. The propagation rule is therefore essential for correctly modelling a physical system. However, standard GNNs do not include physics-based rules, meaning that their propagation may lead to unrealistic results.
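Schematically, one message-passing step sums messages from each node's neighbours and then updates the node features. In a trained GNN, `msg_fn` and `upd_fn` are learnable functions (e.g. MLPs); the gradient-like toy choices below are illustrative only.

```python
import numpy as np

def gnn_layer(h, edges, msg_fn, upd_fn):
    """One generic message-passing step.

    h:      (N, F) node feature matrix.
    edges:  iterable of directed (src, dst) pairs.
    msg_fn: message computed from the features of an edge's endpoints.
    upd_fn: node update combining old features with aggregated messages.
    """
    agg = np.zeros_like(h)
    for s, d in edges:
        agg[d] += msg_fn(h[s], h[d])   # sum-aggregate incoming messages
    return upd_fn(h, agg)

# Toy example: a 3-node path graph (0-1-2) with edges in both directions
h = np.array([[1.0], [0.0], [0.0]])
edges = [(0, 1), (1, 0), (1, 2), (2, 1)]
msg = lambda hs, hd: hs - hd           # gradient-like message
upd = lambda h, m: h + 0.5 * m         # explicit-Euler-like update
h1 = gnn_layer(h, edges, msg, upd)     # feature diffuses to the neighbour
```

With this particular (physics-flavoured) choice of message, the total feature "mass" is conserved; an arbitrary learned rule offers no such guarantee, which is precisely the concern raised above.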

Overview of the proposed SWE–GNN model. The model

We develop a graph neural network in which the computations are based on the shallow water equations. The proposed model takes as input both static and dynamic features that represent the topography of the domain and the hydraulic variables at time

SWE–GNN is an encoder–processor–decoder architecture inspired by

We employ three separate encoders for processing the static node features

As the processor, we employed an

The GNN's output represents an embedding of the predicted hydraulic variables at time

Symmetrically to the encoder, the decoder is composed of an MLP

We define input features on the nodes and edges based on the SWE terms (see Eq.
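As an illustration, node and edge features for the dual graph of a mesh might be assembled as follows. The specific features used here (cell elevation, water depth, inter-cell distance, and elevation gradient along each edge) are a hypothetical minimal set echoing the SWE terms, not the paper's exact list.

```python
import numpy as np

def build_graph_features(cell_centres, elevation, water_depth, edges):
    """Assemble node and edge features for the dual graph of a mesh.

    cell_centres: (N, 2) coordinates of the finite-volume cell centres.
    elevation:    (N,) static bed elevation per cell.
    water_depth:  (N,) dynamic hydraulic state per cell.
    edges:        list of (src, dst) pairs for adjacent cells.
    """
    # Node features: static topography + dynamic hydraulic variables
    node_x = np.column_stack([elevation, water_depth])
    # Edge features: geometry between adjacent cells (gradient-like terms)
    edge_x = []
    for s, d in edges:
        dist = np.linalg.norm(cell_centres[d] - cell_centres[s])
        edge_x.append([dist, (elevation[d] - elevation[s]) / dist])
    return node_x, np.array(edge_x)

# Toy 3-cell example with a uniform downhill slope
centres = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
elev = np.array([10.0, 9.0, 8.0])
depth = np.array([0.0, 0.5, 1.0])
node_x, edge_x = build_graph_features(centres, elev, depth, [(0, 1), (1, 2)])
```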

Example of auto-regressive prediction for

The model learns from input–output data pairs. To stabilize the output of the SWE–GNN over time, we employ a multi-step-ahead loss function
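A multi-step-ahead loss of this kind can be sketched as follows: the model is unrolled auto-regressively, its own predictions are fed back as inputs, and the per-step errors are averaged. Mean squared error is used here for illustration; the paper's exact loss may differ.

```python
import numpy as np

def multi_step_loss(model, x0, targets, horizon):
    """Unroll `model` auto-regressively for `horizon` steps from state x0
    and average the per-step mean squared errors against `targets`."""
    x, loss = x0, 0.0
    for t in range(horizon):
        x = model(x)                          # prediction fed back as input
        loss += np.mean((x - targets[t]) ** 2)
    return loss / horizon

# Toy example: a perfect "model" of exponential decay gives zero loss
decay = lambda x: 0.9 * x
x0 = np.ones(3)
targets = [0.9 ** (t + 1) * np.ones(3) for t in range(5)]
loss = multi_step_loss(decay, x0, targets, horizon=5)
```

Because errors at early steps degrade the inputs of later steps, minimizing this rollout loss directly penalizes the error accumulation that destabilizes purely one-step-ahead training.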

To improve the training speed and stability, we also employed a curriculum learning strategy (Algorithm 1). This consists of progressively increasing the prediction horizon in Eq. (

Curriculum learning strategy.

Update the parameters
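In schematic form, the curriculum strategy can be sketched as follows. The schedule used here (a fixed number of epochs per horizon) is an assumption; Algorithm 1 may use a different criterion for advancing to the next stage.

```python
def curriculum_training(train_step, max_horizon, epochs_per_stage):
    """Curriculum sketch: train with a one-step-ahead horizon first, then
    progressively lengthen the prediction horizon of the multi-step loss.

    train_step(h): assumed to run one training epoch with horizon h.
    Returns the sequence of horizons used, for inspection."""
    schedule = []
    for horizon in range(1, max_horizon + 1):
        for _ in range(epochs_per_stage):
            train_step(horizon)
            schedule.append(horizon)
    return schedule

# Toy run: record which horizon each epoch used
calls = []
schedule = curriculum_training(lambda h: calls.append(h),
                               max_horizon=3, epochs_per_stage=2)
```

Starting with short horizons gives the model easy, stable gradients early on; the harder long-rollout objective is introduced only once short-term dynamics are already learned.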

We considered 130 numerical simulations of dike breach floods run on randomly generated topographies over two square domains of sizes

Summary of the datasets employed for training (TR), validation (VA), and testing (TE). The uncertainty accounts for the variability across the different simulations in each dataset.

We generated random digital elevation models using the Perlin noise generator
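A minimal 2D Perlin (gradient) noise implementation is sketched below for illustration; the actual generator, octave count, and elevation scaling used to produce the DEMs are not reproduced here.

```python
import numpy as np

def fade(t):
    """Perlin's quintic smoothstep, giving C2-continuous interpolation."""
    return t * t * t * (t * (6 * t - 15) + 10)

def perlin2d(nx, ny, cells=4, seed=0):
    """Minimal 2D Perlin (gradient) noise on an nx-by-ny raster, with
    `cells` gradient cells per side. Returns a smooth random field."""
    rng = np.random.default_rng(seed)
    ang = rng.uniform(0, 2 * np.pi, (cells + 1, cells + 1))
    grad = np.stack([np.cos(ang), np.sin(ang)], axis=-1)  # unit gradients
    out = np.zeros((nx, ny))
    for i in range(nx):
        for j in range(ny):
            x, y = i / nx * cells, j / ny * cells
            x0, y0 = int(x), int(y)
            dx, dy = x - x0, y - y0
            # Dot products with the four corner gradients
            d00 = grad[x0, y0] @ (dx, dy)
            d10 = grad[x0 + 1, y0] @ (dx - 1, dy)
            d01 = grad[x0, y0 + 1] @ (dx, dy - 1)
            d11 = grad[x0 + 1, y0 + 1] @ (dx - 1, dy - 1)
            u, v = fade(dx), fade(dy)
            out[i, j] = ((d00 * (1 - u) + d10 * u) * (1 - v)
                         + (d01 * (1 - u) + d11 * u) * v)
    return out

dem = perlin2d(32, 32)  # smooth random elevation field, to be scaled to metres
```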

We employed a high-fidelity numerical solver, Delft3D-FM, which solves the full shallow water equations using an implicit scheme on staggered grids and adaptive time steps

We created three datasets with different area sizes and breach locations as summarized in Table

The first dataset consists of 100 DEMs over a square domain of

The second dataset consists of 20 DEMs over a square domain of

The third dataset consists of 10 DEMs over a square domain of

Distribution of the breach locations (red crosses) for datasets 2 and 3.

Unless otherwise mentioned, we selected a temporal resolution of

We trained all models via the Adam optimization algorithm

We trained all the models using Pytorch (version 1.13.1)

We evaluated the performance using the multi-step-ahead RMSE (Eq.
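The evaluation metrics can be sketched as follows; the 0.05 m water-depth threshold used to binarize the flooded area for the CSI is an assumption, not necessarily the paper's value.

```python
import numpy as np

def rmse(pred, true):
    """Root mean squared error over all cells."""
    return np.sqrt(np.mean((pred - true) ** 2))

def csi(pred, true, threshold=0.05):
    """Critical success index on the flooded area.

    Cells are counted as 'flooded' when water depth exceeds `threshold`;
    CSI = hits / (hits + misses + false alarms), so CSI = 1 means the
    predicted flood extent matches the reference exactly."""
    p, t = pred > threshold, true > threshold
    hits = np.sum(p & t)
    misses = np.sum(~p & t)
    false_alarms = np.sum(p & ~t)
    return hits / (hits + misses + false_alarms)

# Toy example: one hit, one miss, one false alarm -> CSI = 1/3
score = csi(np.array([0.1, 0.0, 0.2, 0.0]), np.array([0.1, 0.1, 0.0, 0.0]))
```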

The proposed SWE–GNN model is compared with other deep-learning methods, including the following.

CNN: encoder–decoder convolutional neural network based on U-Net

GAT: graph attention network

GCN: graph convolutional neural network

SWE–GNN

Performance of the deep-learning models over test dataset 1. The provided uncertainty estimates account for the variability across the different simulations in the dataset. Bold results indicate the best performances considering a statistical significance with a

In Table

Comparison of the proposed SWE–GNN model against the CNN for two examples in test datasets 2

We further tested the already trained models on datasets 2 and 3, with unseen topographies, unseen breach locations, larger domain sizes, and longer simulation times, as described in Table

Performance of the deep-learning models over test datasets 2 and 3, composed, respectively, of unseen domains with unseen breach locations and of unseen domains four times larger than the training ones, also with unseen breach locations. The provided uncertainty estimates account for the variability across different simulations. Bold results indicate the best performances, considering a statistical significance with a

Table

SWE–GNN model predictions for water depth

In Fig.

Over the entire test part of dataset 1, the model achieves MAEs of 0.04 m for water depth and 0.004 m

Temporal evolution of CSI scores, MAE, and RMSE for test dataset 1. The confidence bands refer to 1 standard deviation from the mean.

Relationship between the number of GNN layers and different temporal resolutions in terms of the validation RMSE and validation CSI. As the temporal resolution coarsens, i.e. as the time step increases, the optimal number of GNN layers for a given performance level also increases.

Pareto fronts (red-dotted lines) in terms of speed-ups, RMSE, and CSI for a varying number of parameters, for both CPUs and GPUs, for a temporal resolution of

We illustrate the spatio-temporal performance of the model on a test sample in Fig.

We also observe the average performance of the different metrics over time, for the whole test dataset 1, in Fig.

Next, we analysed the relationship between the number of GNN layers and the temporal resolution to validate the hypothesis that the required number of layers correlates with the time-step size. Following the CFL condition, we can expand the computational domain by increasing the number of GNN layers in the model instead of decreasing the time step. We considered several models with an increasing number of GNN layers targeting temporal resolutions of
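The hypothesis can be made concrete with a back-of-the-envelope CFL-style estimate: each message-passing layer extends the receptive field by roughly one cell, so the number of layers needed to cover the distance a wave travels in one time step grows linearly with the time step. This is a heuristic, not the paper's exact rule.

```python
import numpy as np

def min_gnn_layers(wave_speed, dt, dx):
    """Heuristic minimum number of GNN layers so that the receptive field
    (about one cell per layer) covers the distance travelled by the wave
    in one time step: L >= c * dt / dx."""
    return int(np.ceil(wave_speed * dt / dx))

# Hypothetical numbers: ~1 m deep flow (c ~ sqrt(9.81 * 1) ~ 3.1 m/s),
# dt = 1800 s, dx = 100 m cells -> ceil(56.4) = 57 layers
layers = min_gnn_layers(np.sqrt(9.81 * 1.0), 1800, 100)
```

Doubling the time step doubles this estimate, which matches the observed trend that coarser temporal resolutions demand deeper models.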

Finally, we explored different model complexity combinations, expressed by the number of GNN layers and the latent space size, to determine a Pareto front for validation loss and speed-up, which results in a trade-off between fast and accurate models. Figure

Finally, we performed a sensitivity analysis of the role of the multi-step-ahead loss function (see Eq.

Influence of

Figure

We proposed a deep-learning model for rapid flood modelling, called SWE–GNN, inspired by shallow water equations (SWEs) and graph neural networks (GNNs). The model takes the same inputs as a numerical model, i.e. the spatial discretization of the domain, elevation, slopes, and initial values of the hydraulic variables, and predicts their evolution in time in an auto-regressive manner. The results show that the SWE–GNN can correctly predict the evolution of water depth and discharges with mean absolute errors in time of 0.04 m and 0.004 m

In line with the hypothesis, GNNs proved to be a valuable tool for spatio-temporal surrogate modelling of floods. The analogy with finite-volume methods is relevant for three reasons. First, it improves the deep-learning model's interpretability, as the weights in the graph propagation rule can be interpreted as an approximate Riemann solver and multiple GNN layers can be seen as intermediate steps of a multi-step method such as Runge–Kutta. Second, the analogy also provides an existing framework to include conservation laws in the model and links two fields that can benefit from each other's advances. For example, multiple spatial and temporal resolutions could be jointly used in place of a fixed one, similarly to

The current analysis was carried out under a constant breach inflow as a boundary condition. Further research should extend the analysis to time-varying boundary conditions to better represent complex real-world scenarios. One solution is to employ ghost cells typical of numerical models

Future research should investigate the new modelling approach in flood risk assessment and emergency preparation. This implies creating ensembles of flood simulations to reflect uncertainties, issuing flood warnings and predicting extreme events, and exploring adaptive modelling during floods by incorporating real-time observations. The model should also be validated in real case studies featuring linear elements, such as secondary dikes and roads, typical of polder areas. Further work could also address breach uncertainty in terms of timing, size, growth, and number of breaches. Moreover, future work should aim at improving the model's Pareto front. To improve the speed-up, one promising research direction would be to employ multi-scale methods that reduce the number of message-passing operations while maintaining the same interaction range

In this Appendix, we further detail the different inputs and outputs, the hyperparameters, and the models' architectures used in Sect.

Figure

Table

We compared the proposed model against two benchmark GNNs that employ different propagation rules. Since those models, unlike the SWE–GNN, cannot process static and dynamic attributes independently, we stacked the node inputs into a single node feature matrix

The GCN employs the normalized Laplacian connectivity matrix to define the edge weights

GAT employs an attention-based mechanism to define the edge weights

The encoder–decoder convolutional neural network is an architecture composed of two parts (Fig.

Detailed inputs and outputs used in the paper considering a regular mesh,

Summary of the hyperparameters and related value ranges employed for the different deep-learning models. The bold values indicate the best configuration in terms of validation loss.

U-Net-based CNN architecture employed in the experiments, with the first embedding dimension of 64 and three encoding blocks. Each block is composed of one convolutional layer, followed by a batch normalization layer, a PReLU activation function, another convolutional layer, and finally a pooling layer. All the blocks with the same dimensions are connected by residual connections indicated by the horizontal lines.

We employed the models trained with different combinations of the number of GNN layers and embedding sizes (Sect.

Pareto fronts on test dataset 3 (red-dotted lines) in terms of speed-ups, RMSE, and CSI for a varying number of parameters for a temporal resolution of

The employed dataset can be found at

The simulations on test datasets 1, 2, and 3, run with the presented model, can be found at

RB: conceptualization, methodology, software, validation, data curation, writing – original draft preparation, visualization, writing – review and editing. EI: supervision, methodology, writing – review and editing, funding acquisition. SNJ: supervision, writing – review and editing. RT: conceptualization, supervision, writing – review and editing, funding acquisition, project administration.

The contact author has declared that none of the authors has any competing interests.

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors.

This work is supported by the TU Delft AI Initiative programme. We thank Ron Bruijns for providing the dataset to carry out the preliminary experiments. We thank Deltares for providing the license for Delft3D to run the numerical simulations.

This paper was edited by Albrecht Weerts and reviewed by Daniel Klotz and one anonymous referee.