the Creative Commons Attribution 4.0 License.
Enhancing generalizability of data-driven urban flood models by incorporating contextual information
Abstract. Fast urban pluvial flood models are necessary for a range of applications, such as near real-time flood nowcasting or processing large rainfall ensembles for uncertainty analysis. Data-driven models can help overcome the long computational time of traditional flood simulation models, and the state-of-the-art models have shown promising accuracy. Yet the lack of generalizability of urban pluvial flood data-driven models to both terrain and rainfall events still limits their application. These models usually adopt a patch-based framework to overcome multiple bottlenecks, such as data availability and computational and memory constraints. However, this approach does not incorporate contextual information of the terrain surrounding the small image patch (typically 256 m x 256 m). We propose a new deep-learning model that maintains the high-resolution information of the local patch and incorporates a larger context to increase the visual field of the model with the aim of enhancing the generalizability of urban pluvial flood data-driven models. We trained and tested the model in the city of Zurich (Switzerland), at a spatial resolution of 1 m, for 1-hour rainfall events at 5 min temporal resolution. We demonstrate that our model can faithfully represent flood depths for a wide range of rainfall events, with peak rainfall intensities ranging from 42.5 mm h-1 to 161.4 mm h-1. Then, we assessed the model’s terrain generalizability in distinct urban settings, namely Luzern (Switzerland) and Singapore. The model accurately identifies locations of water accumulation, which constitutes an improvement compared to other deep-learning models. Using transfer learning, the model was successfully retrained in the new cities, requiring only a single rainfall event to adapt the model to new terrains while preserving adaptability across diverse rainfall conditions. 
Our results indicate that by incorporating contextual terrain information into the local patches, our proposed model effectively generates high-resolution urban pluvial flood maps, demonstrating applicability across varied terrains and rainfall events.
Status: open (until 10 May 2024)
RC1: 'Comment on hess-2024-63', Anonymous Referee #1, 19 Mar 2024
The authors propose an improved version of a patch-based CNN that can include the terrain features at multiple scales, to predict the maximum inundation depths for several catchments and rainfall events.
Moreover, they show that the model benefits from transfer learning, even with little data.
The paper is overall well presented and with some interesting and original conclusions.
However, there are also several minor changes that I believe would further improve the strengths of the paper and that must be addressed before publication.

General comments:
In the introduction, there is no mention of other DL models for flood prediction.
I would add a longer section that covers recent developments, not only for CNN-based models with terrain generalizability.
You could, for example, add some of the following: (Fraehr et al., 2023; Bentivoglio et al., 2023; Berkhahn and Neuweiler, 2024; Liao et al., 2023; Burrichter et al., 2023; He et al., 2023).

In the introduction, you argue that there's a lack of generalizability to unseen case studies, but then you also cite several papers that tackle this issue. While I agree that generalizability to unseen case studies is still an area of research, I would not frame it exactly as a gap in the research.
This applies also to the generalizability for both terrain and rainfall that you mention in the abstract (lines 3-5) as there are already papers that deal with it, such as do Lago et al. (2023).
Same goes for the idea of "contextual information". I would argue that this idea only/mainly applies to case studies with higher spatial resolution, as you have in your experiments.
Thus, it does not seem to fit as a gap in the literature. I would try to reformulate this gap by arguing that high-spatial-resolution domains would benefit from including information at lower spatial resolutions, because there are certain patterns in the topography that cannot be captured by patch-based models that use only high-resolution data.

The paper misses all formulas employed in the model. Despite the sufficiently clear Figures 2 and S1, I would recommend adding the key equations employed by your model and also some equations for the metrics that you employ.
All testing metrics are reported via the MSE, which makes the physical interpretation unclear. I would suggest changing the testing metrics to either root mean squared error or mean absolute error, which can both be in meters instead of meters squared.
It's also not clear to me why you consider the MSE only for water depths larger than 0.1 m. This would make more sense for determining a spatial metric, which might be strongly influenced by those small values. But I would keep the regression metric (whichever you end up considering) evaluated over the whole domain, without thresholds.
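For concreteness, the two options could be sketched as follows in NumPy (the function names and the use of a depth threshold are illustrative, not the manuscript's actual implementation):

```python
import numpy as np

def rmse(pred, true, threshold=None):
    """Root mean squared error in metres.

    If threshold is given, the error is evaluated only on cells whose
    simulated depth exceeds it (the thresholded variant discussed above);
    otherwise the whole domain is used.
    """
    pred, true = np.asarray(pred, float), np.asarray(true, float)
    if threshold is not None:
        mask = true > threshold
        pred, true = pred[mask], true[mask]
    return float(np.sqrt(np.mean((pred - true) ** 2)))

def mae(pred, true):
    """Mean absolute error in metres, over the whole domain (no threshold)."""
    pred, true = np.asarray(pred, float), np.asarray(true, float)
    return float(np.mean(np.abs(pred - true)))
```

Both are expressed in metres rather than metres squared, which keeps the physical interpretation direct.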
Moreover, you should include some metric to assess the spatial accuracy, for example with the critical success index, which was used in several previous studies (e.g., Löwe et al., 2021; do Lago et al., 2023; Bentivoglio et al., 2023).
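As a sketch, the critical success index on binarised wet/dry maps could look like this (assuming a wet/dry depth threshold such as 0.1 m; names and toy values are illustrative):

```python
import numpy as np

def critical_success_index(pred_depth, true_depth, wet_threshold=0.1):
    """Critical success index (CSI) for binary wet/dry flood maps.

    Cells are classed as "wet" when depth exceeds wet_threshold (m).
    CSI = hits / (hits + misses + false alarms).
    """
    pred_wet = np.asarray(pred_depth) > wet_threshold
    true_wet = np.asarray(true_depth) > wet_threshold
    hits = np.sum(pred_wet & true_wet)
    misses = np.sum(~pred_wet & true_wet)
    false_alarms = np.sum(pred_wet & ~true_wet)
    denom = hits + misses + false_alarms
    return float(hits / denom) if denom > 0 else float("nan")

# Toy example: 3 hits, 1 miss, 1 false alarm -> CSI = 3/5 = 0.6
true_d = np.array([0.2, 0.3, 0.5, 0.2, 0.0, 0.0])
pred_d = np.array([0.2, 0.4, 0.3, 0.05, 0.2, 0.0])
```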
Please also define all the metrics you employ before you discuss the results.

I think sections S1 and S2 can be merged into the main text, since they both include some useful information and they are not that long.
I also think Table S1 should be included in the main manuscript as it shows a valuable comparison with another model, which is generally lacking in the rest of the paper.
In this regard, I believe your paper would benefit from a comparison with the study by Guo et al. (2021), which you cite several times. You also state that you outperform this model (line 350), so I would add a longer analysis if you want to make that claim, comparing for example different metrics over the whole test dataset for both models.

Specific comments:
Line 6: it is not clear what you refer to with contextual information. This becomes more clear only later on in the paper, but since it's one of your main improvements I would try to clarify it better also in the abstract.
This idea of "context" also emerges when you describe your model in section 2. In a similar fashion, I would clarify better what you mean by context.
For example, in line 76 "... extract and combine the information from the high-resolution local patch, its context and the rainfall time series to emulate the corresponding flood map", the term context seems very generic. You could use the notion of multi-scale spatial features (which you use in line 79) to help clarify (or substitute) your notion of context.

Line 54: I would also add the CGAN from do Lago et al. (2023) to the patch-based methods.
line 71: I would add at least a reference here.
End of introduction: not necessarily needed but you could add a paragraph that specifies how the rest of the paper is structured.
Line 103: you mention that you include an RNN because it allows you to model rainfall events of different durations, yet you only consider events of 1 hour. Moreover, depending on the type of RNN you are using (which is not clear from the architecture), you might have a number of outputs that depends on the length of your input sequence, although it is true that your RNN in theory works with any input length. How would you deal with input hyetographs that have different durations?

Figure 2: I would add a reference to the more complete version of the architecture in the supplementary material, as it helps clarify how the scaled dot-product attention and the multi-context fusion work.
I would also specify somewhere (and not only in the manuscript) that you don't take only the DEM as input, but also several other inputs (although they are derived from the DEM).

line 107: I think that section S1 can be merged with section 2.5, since it is quite short and would help understanding right away what the normalized accumulated rainfall is.
Section S1: it is not clear if Pmin refers to the event with the least accumulated rainfall for all simulations (training and testing) or just the training ones.
line 108: adding the adjective "locality-aware contextual" to the output of the attention makes it seem like there is another operation in between. Consider removing it for clarity.
line 110: please define what you consider as upsampling layer.
line 135: I assume that the upscaling is done via a sort of mean pooling, but in the text it is not clear whether you are using another strategy.
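If the upscaling is indeed plain block averaging, it would amount to something like the following (this is my assumption about the procedure, not the authors' documented method):

```python
import numpy as np

def block_mean(grid, factor):
    """Coarsen a 2-D grid by averaging non-overlapping factor x factor blocks
    (e.g. a 1 m DEM to 4 m resolution with factor=4)."""
    h, w = grid.shape
    assert h % factor == 0 and w % factor == 0, "grid must tile evenly"
    return grid.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

# Toy 4x4 grid coarsened to 2x2; each output cell is the mean of a 2x2 block
dem = np.arange(16, dtype=float).reshape(4, 4)
coarse = block_mean(dem, 2)
```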
line 147: I suppose you mean that the model should be equivariant to rotations, i.e., a rotation of the input should result in an equivalent rotation of the output.
It would be interesting to also see the effect of adding this data augmentation to your training dataset (either in the results section or in the supplementary material).

line 156: normalization and min-max scaling are not equivalent. Min-max scaling is a type of normalization. Please clarify this in the text.
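To make the distinction explicit, min-max scaling and z-score standardization are different transformations (an illustrative sketch; the function names are mine):

```python
import numpy as np

def min_max_scale(x, x_min, x_max):
    """Min-max scaling: maps values linearly into [0, 1] given fixed bounds.
    This is one particular kind of normalization."""
    return (np.asarray(x, float) - x_min) / (x_max - x_min)

def standardize(x, mean, std):
    """Z-score standardization: zero mean and unit variance under the given
    statistics. A different normalization, not interchangeable with min-max."""
    return (np.asarray(x, float) - mean) / std
```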
sections 4 and 5.1 can probably be merged with section 3.
lines 210-211: the shape notation seems to range from 1 to 9 instead of from 1 to 3, cf. Fig. 3. In general, the notation $P_ms$ seems confusing at times.
Figure 5: I am not convinced by the overlapping of pie charts and violin plots. I would either separate them in two figures so that the pie chart becomes clearer or improve the legend for the pie chart.
lines 230-237: I think Table S1 should be included in the main manuscript as it shows a valuable comparison with another model, which is generally lacking in the rest of the paper.
line 245: you mention that the resolution for Singapore is 2 m. Does this mean that the larger scales are at 4 and 8 m now? Shouldn't this affect the performance of your deep learning model, since you are capturing different processes now?
Section 6.1.1: did you also keep the same learning rate? Generally a high learning rate might cause your model's weights to deviate substantially from the pre-trained model.
Figure 7: I would specify that this figure is using the transferred model.
lines 367-368: I got a bit confused with the term different topographical features. You could maybe use some different term such as "a broader variety of topographical features".
lines 368-369: do you have any supporting data for this claim? I don't recall seeing in the manuscript any analysis on the amount of overlapping or over-sampling.
lines 386-392: I am not sure this analysis adds much to the discussion, though I agree that there is a lack of a common benchmark dataset.
Maybe consider also merging discussions and conclusions since there are some overlaps and the conclusions themselves are not too long.
Technical corrections:
line 43: there seems to be an extra "("
References:

Bentivoglio, R., Isufi, E., Jonkman, S. N., and Taormina, R.: Rapid spatio-temporal flood modelling via hydraulics-based graph neural networks, Hydrology and Earth System Sciences, 27, 4227–4246, https://doi.org/10.5194/hess-27-4227-2023, 2023.

Berkhahn, S. and Neuweiler, I.: Data driven real-time prediction of urban floods with spatial and temporal distribution, Journal of Hydrology X, 22, 100167, 2024.

Burrichter, B., Hofmann, J., Koltermann da Silva, J., Niemann, A., and Quirmbach, M.: A Spatiotemporal Deep Learning Approach for Urban Pluvial Flood Forecasting with Multi-Source Data, Water, 15, 1760, 2023.

do Lago, C. A., Giacomoni, M. H., Bentivoglio, R., Taormina, R., Gomes, M. N., and Mendiondo, E. M.: Generalizing rapid flood predictions to unseen urban catchments with conditional generative adversarial networks, Journal of Hydrology, 129276, https://doi.org/10.1016/j.jhydrol.2023.129276, 2023.

Fraehr, N., Wang, Q. J., Wu, W., and Nathan, R.: Supercharging hydrodynamic inundation models for instant flood insight, Nature Water, 1, 835–843, 2023.

He, J., Zhang, L., Xiao, T., Wang, H., and Luo, H.: Deep learning enables super-resolution hydrodynamic flooding process modeling under spatiotemporally varying rainstorms, Water Research, 239, 120057, 2023.

Liao, Y., Wang, Z., Chen, X., and Lai, C.: Fast simulation and prediction of urban pluvial floods using a deep convolutional neural network model, Journal of Hydrology, 624, 129945, https://doi.org/10.1016/j.jhydrol.2023.129945, 2023.

Löwe, R., Böhm, J., Jensen, D. G., Leandro, J., and Rasmussen, S. H.: U-FLOOD – Topographic deep learning for predicting urban pluvial flood water depth, Journal of Hydrology, 603, 126898, https://doi.org/10.1016/j.jhydrol.2021.126898, 2021.

Citation: https://doi.org/10.5194/hess-2024-63-RC1
AC1: 'Reply on RC1', Tabea Cache, 21 Mar 2024
We would like to thank the reviewer for dedicating their time and effort to evaluate our manuscript, and for their favorable assessment. Additionally, we highly appreciate the constructive feedback provided, including the suggestions for improvements. We will incorporate the suggested revisions into the revised manuscript and address them point-by-point when submitting the revised text. We would like to share in advance how we plan to address some of the key comments from the reviewer:
- Recent developments in other DL models for flood predictions, and their generalizability: We agree with the reviewer that the manuscript currently lacks the presentation of recent developments in DL models for flood predictions, especially for non-CNN-based models. A paragraph will be added accordingly to the revised manuscript and will also include a discussion on their generalizability potential and limitations.
- Contextual information in high spatial resolution areas: Thank you for highlighting that the presented contextual information framework mainly applies to case studies that require high spatial resolution. The text will be revised to clarify the necessity of high spatial resolution in the context of flood mapping in urban areas.
- Formulas and metrics: We thank the reviewer for raising this point and agree with the suggestion. We will add the key equations used in the model and some equations for the metrics. Additionally, we will replace the MSE with the RMSE, as suggested by the reviewer, to make the physical interpretation clearer. Regarding the assessment of spatial accuracy, this was measured by the precision score. However, we agree with the reviewer that the use of a metric that is common across various studies could be valuable. We will therefore replace the precision score with the critical success index.
Citation: https://doi.org/10.5194/hess-2024-63-AC1