Enhancing generalizability of data-driven urban flood models by incorporating contextual information
Abstract. Fast urban pluvial flood models are necessary for a range of applications, such as near real-time flood nowcasting or processing large rainfall ensembles for uncertainty analysis. Data-driven models can help overcome the long computational time of traditional flood simulation models, and the state-of-the-art models have shown promising accuracy. Yet the lack of generalizability of urban pluvial flood data-driven models to both terrain and rainfall events still limits their application. These models usually adopt a patch-based framework to overcome multiple bottlenecks, such as data availability and computational and memory constraints. However, this approach does not incorporate contextual information of the terrain surrounding the small image patch (typically 256 m × 256 m). We propose a new deep-learning model that maintains the high-resolution information of the local patch and incorporates a larger context to increase the visual field of the model with the aim of enhancing the generalizability of urban pluvial flood data-driven models. We trained and tested the model in the city of Zurich (Switzerland), at a spatial resolution of 1 m, for 1-hour rainfall events at 5 min temporal resolution. We demonstrate that our model can faithfully represent flood depths for a wide range of rainfall events, with peak rainfall intensities ranging from 42.5 mm h⁻¹ to 161.4 mm h⁻¹. Then, we assessed the model’s terrain generalizability in distinct urban settings, namely Luzern (Switzerland) and Singapore. The model accurately identifies locations of water accumulation, which constitutes an improvement compared to other deep-learning models. Using transfer learning, the model was successfully retrained in the new cities, requiring only a single rainfall event to adapt the model to new terrains while preserving adaptability across diverse rainfall conditions. Our results indicate that by incorporating contextual terrain information into the local patches, our proposed model effectively generates high-resolution urban pluvial flood maps, demonstrating applicability across varied terrains and rainfall events.
Status: closed
- RC1: 'Comment on hess-2024-63', Anonymous Referee #1, 19 Mar 2024
The authors propose an improved version of a patch-based CNN that can include the terrain features at multiple scales, to predict the maximum inundation depths for several catchments and rainfall events.
Moreover, they show that the model benefits from transfer learning, even with little data.
The paper is overall well presented and with some interesting and original conclusions.
However, there are also several minor changes that I believe would further improve the strengths of the paper and that must be addressed before publication.
General comments:
In the introduction, there is no mention of other DL models for flood prediction.
I would add a longer section that includes the recent developments not only for CNN-based models with terrain generalizability.
You could, for example, add some of the following: (Fraehr et al., 2023; Bentivoglio et al., 2023; Berkhahn and Neuweiler, 2024; Liao et al., 2023; Burrichter et al., 2023; He et al., 2023).
In the introduction, you argue that there's a lack of generalizability to unseen case studies but then you also cite several papers that tackle this issue. While I agree that generalizability to unseen case studies is still an area of research, I would not frame it exactly as a gap in the research.
This applies also to the generalizability for both terrain and rainfall that you mention in the abstract (lines 3-5) as there are already papers that deal with it, such as do Lago et al. (2023).
Same goes for the idea of "contextual information". I would argue that this idea only/mainly applies to case studies with higher spatial resolution, as you have in your experiments.
Thus it does not seem to fit as a gap in the literature. I would try to reformulate this gap by arguing that high spatial resolution domains would benefit from including information at lower spatial resolutions, because there are certain patterns in the topography that cannot be captured with patch-based models that use only high-resolution data.
The paper misses all formulas employed in the model. Despite the sufficiently clear Figures 2 and S1, I would recommend adding the key equations employed by your model and also some equations for the metrics that you employ.
All testing metrics are reported via the MSE, which makes the physical interpretation unclear. I would suggest changing the testing metrics to either root mean squared error or mean absolute error, which can both be in meters instead of meters squared.
It's also not clear to me why you consider the MSE only for water depths larger than 0.1 m. This would make more sense for determining a spatial metric, which might be strongly influenced by those small values. But I would keep the regression metric (whichever you end up considering) evaluated over the whole domain, without thresholds.
Moreover, you should include some metric to assess the spatial accuracy, for example with the critical success index, which was used in several previous studies (e.g., Löwe et al., 2021; do Lago et al., 2023; Bentivoglio et al., 2023).
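For reference, the usual definitions of these metrics (with $h_i$ and $\hat{h}_i$ the simulated and predicted water depths over $N$ cells, and TP, FP, FN the counts of hits, false alarms and misses of wet cells above a chosen depth threshold) would be:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\hat{h}_i - h_i\right)^2}, \qquad \mathrm{CSI} = \frac{TP}{TP + FP + FN}$$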
Please also define all the metrics you employ before you discuss the results.
I think sections S1 and S2 can be merged into the main text, since they both include some useful information and they are not that long.
I also think Table S1 should be included in the main manuscript as it shows a valuable comparison with another model, which is generally lacking in the rest of the paper.
In this regard, I believe your paper would benefit from comparing with the study from Guo et al. 2021, which you cite several times. You also state that you outperform this model (line 350), so I would add a longer analysis if you want to claim that, comparing for example different metrics over the whole test dataset for both models.
Specific comments:
Line 6: it is not clear what you refer to with contextual information. This becomes more clear only later on in the paper, but since it's one of your main improvements I would try to clarify it better also in the abstract.
This idea of "context" also emerges when you describe your model in section 2. In a similar fashion, I would clarify better what you mean by context.
For example, in line 76 "... extract and combine the information from the high-resolution local patch, its context and the rainfall times series to emulate the corresponding flood map" the term context seems very generic. You could use the notion of multi-scale spatial features (that you use in line 79) to help you clarify (or substitute) your notion of context.
Line 54: I would also add the CGAN from do Lago et al. (2023) to the patch-based methods.
line 71: I would add at least a reference here.
End of introduction: not necessarily needed but you could add a paragraph that specifies how the rest of the paper is structured.
Line 103: you mention that you include an RNN because it allows you to model rainfall events of different durations, yet you only consider events of 1 hour. Moreover, depending on the type of RNN you are using (which is not clear from the architecture), you might have a number of outputs that depends on the length of your input sequence, even though it is true that your RNN in theory works with any input length. How would you deal with input hyetographs that have different durations? (A minimal illustration follows below.)
Figure 2: I would add a reference to the more complete version of the architecture that is in the supplementary material, as it helps clarify how the scaled dot-product attention and the multi-context fusion work.
I would also specify somewhere (and not only in the manuscript) that you don't take only the DEM as input, but also several other inputs (albeit derived from the DEM).
line 107: I think that section S1 can be merged with section 2.5, since it is quite small and it would help the reader understand right away what the normalized accumulated rainfall is.
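To illustrate the Line 103 point about hyetographs of different durations, a minimal sketch (assuming, purely for illustration, a GRU encoder whose final hidden state is passed on; the paper's actual RNN may differ):

```python
import torch
import torch.nn as nn

class RainfallEncoder(nn.Module):
    """Encode a rainfall hyetograph of arbitrary length into a fixed-size vector."""
    def __init__(self, hidden_size: int = 64):
        super().__init__()
        self.gru = nn.GRU(input_size=1, hidden_size=hidden_size, batch_first=True)

    def forward(self, rainfall: torch.Tensor) -> torch.Tensor:
        # rainfall: (batch, timesteps, 1); the number of timesteps may vary per event
        _, h_last = self.gru(rainfall)
        return h_last.squeeze(0)  # (batch, hidden_size), independent of the event duration

encoder = RainfallEncoder()
print(encoder(torch.rand(4, 12, 1)).shape)  # 1 h event at 5 min resolution -> torch.Size([4, 64])
print(encoder(torch.rand(4, 24, 1)).shape)  # 2 h event -> same output size
```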
Section S1: it is not clear if Pmin refers to the event with the least accumulated rainfall for all simulations (training and testing) or just the training ones.
line 108: adding the adjective "locality-aware contextual" to the output of the attention makes it seem like there is another operation in between. Consider removing it for clarity.
line 110: please define what you consider as upsampling layer.
line 135: I assume that the upscaling is done via a sort of mean pooling, but from the text it's not clear whether you are using another strategy.
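As an illustration of the mean-pooling interpretation (my assumption, not something stated in the manuscript), the coarser context rasters could be built by block-averaging the 1 m raster:

```python
import numpy as np

def coarsen(dem: np.ndarray, factor: int) -> np.ndarray:
    """Coarsen a raster by block-averaging (one possible way to build the context inputs)."""
    h, w = dem.shape
    trimmed = dem[: h - h % factor, : w - w % factor]  # drop edge rows/cols not divisible by factor
    return trimmed.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

coarse = coarsen(np.random.rand(256, 256), 4)  # hypothetical 1 m -> 4 m aggregation
print(coarse.shape)                            # (64, 64)
```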
line 147: I suppose you mean that the model should be equivariant to rotations, i.e., a rotation of the input should result in an equivalent rotation of the output.
It would be interesting to also see the effect of adding this data augmentation to your training dataset (either in the results section or in the supplementary material).
line 156: normalization and min-max scaling are not equivalent: min-max scaling is a type of normalization. Please clarify this in the text.
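For clarity: min-max scaling maps each feature $x$ to a fixed range, whereas "normalization" is often used loosely and may also refer to z-score standardization (with mean $\mu$ and standard deviation $\sigma$), so the text should state explicitly which transformation is applied:

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \in [0, 1], \qquad z = \frac{x - \mu}{\sigma}$$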
sections 4 and 5.1 can probably be merged with section 3.
lines 210-211: the shape notation seems to range from 1 to 9 instead of from 1 to 3, cf. Fig. 3. In general, the notation $P_ms$ seems confusing at times.
Figure 5: I am not convinced by the overlapping of pie charts and violin plots. I would either separate them in two figures so that the pie chart becomes clearer or improve the legend for the pie chart.
lines 230-237: I think Table S1 should be included in the main manuscript as it shows a valuable comparison with another model, which is generally lacking in the rest of the paper.
line 245: you mention that the resolution for Singapore is 2 m. Does this mean that the large scales are at 4 and 8 meters now? Shouldn't this affect the performance of your deep learning model, since you are capturing different processes now?
Section 6.1.1: did you also keep the same learning rate? Generally a high learning rate might cause your model's weights to deviate substantially from the pre-trained model.
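For illustration only (this is not the authors' setup), transfer learning is often done by loading the pre-trained weights and fine-tuning with a much smaller learning rate, e.g.:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the pre-trained flood model; the real architecture differs.
model = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.Conv2d(8, 1, 3, padding=1))
# model.load_state_dict(torch.load("zurich_pretrained.pt"))  # load the Zurich-trained weights

# Fine-tune with a learning rate well below the one used for training from scratch,
# so the weights stay close to the pre-trained solution.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)
```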
Figure 7: I would specify that this figure is using the transferred model.
lines 367-368: I got a bit confused with the term different topographical features. You could maybe use some different term such as "a broader variety of topographical features".
lines 368-369: do you have any supporting data for this claim? I don't recall seeing in the manuscript any analysis on the amount of overlapping or over-sampling.
lines 386-392: I am not sure this analysis adds much to the discussion, though I agree that there is a lack of a common benchmark dataset.
Maybe consider also merging discussions and conclusions since there are some overlaps and the conclusions themselves are not too long.
Technical corrections:
line 43: there seems to be an extra "("
References:
Bentivoglio, R., Isufi, E., Jonkman, S. N., and Taormina, R.: Rapid spatio-temporal flood modelling via hydraulics-based graph neural networks, Hydrology and Earth System Sciences, 27, 4227–4246, https://doi.org/10.5194/hess-27-4227-2023, 2023.
Berkhahn, S. and Neuweiler, I.: Data driven real-time prediction of urban floods with spatial and temporal distribution, Journal of Hydrology X, 22, 100167, 2024.
Burrichter, B., Hofmann, J., Koltermann da Silva, J., Niemann, A., and Quirmbach, M.: A Spatiotemporal Deep Learning Approach for Urban Pluvial Flood Forecasting with Multi-Source Data, Water, 15, 1760, 2023.
do Lago, C. A., Giacomoni, M. H., Bentivoglio, R., Taormina, R., Gomes, M. N., and Mendiondo, E. M.: Generalizing rapid flood predictions to unseen urban catchments with conditional generative adversarial networks, Journal of Hydrology, 129276, https://doi.org/10.1016/j.jhydrol.2023.129276, 2023.
Fraehr, N., Wang, Q. J., Wu, W., and Nathan, R.: Supercharging hydrodynamic inundation models for instant flood insight, Nature Water, 1, 835–843, 2023.
He, J., Zhang, L., Xiao, T., Wang, H., and Luo, H.: Deep learning enables super-resolution hydrodynamic flooding process modeling under spatiotemporally varying rainstorms, Water Research, 239, 120057, 2023.
Liao, Y., Wang, Z., Chen, X., and Lai, C.: Fast simulation and prediction of urban pluvial floods using a deep convolutional neural network model, Journal of Hydrology, 624, 129945, https://doi.org/10.1016/j.jhydrol.2023.129945, 2023.
Löwe, R., Böhm, J., Jensen, D. G., Leandro, J., and Rasmussen, S. H.: U-FLOOD – Topographic deep learning for predicting urban pluvial flood water depth, Journal of Hydrology, 603, 126898, https://doi.org/10.1016/j.jhydrol.2021.126898, 2021.
Citation: https://doi.org/10.5194/hess-2024-63-RC1
- AC1: 'Reply on RC1', Tabea Cache, 21 Mar 2024
We would like to thank the reviewer for dedicating their time and effort to evaluate our manuscript, and for their favorable assessment. Additionally, we highly appreciate the constructive feedback provided, including the suggestions for improvements. We will incorporate the suggested revisions into the revised manuscript and address them point-by-point when submitting the revised text. We would like to share in advance how we plan to address some of the key comments from the reviewer:
- Recent developments in other DL models for flood predictions, and their generalizability: We agree with the reviewer that the manuscript currently lacks the presentation of recent developments in DL models for flood predictions, especially for non-CNN-based models. A paragraph will be added accordingly to the revised manuscript and will also include a discussion on their generalizability potential and limitations.
- Contextual information in high spatial resolution areas: Thank you for highlighting that the presented contextual information framework mainly applies to case studies that require high spatial resolution. The text will be revised to clarify the necessity of high spatial resolution in the context of flood mapping in urban areas.
- Formulas and metrics: We thank the reviewer for raising this point and agree with the suggestion. We will add the key equations used in the model and some equations for the metrics. Additionally, we will replace the MSE with the RMSE, as suggested by the reviewer, to make the physical interpretation clearer. Regarding the assessment of spatial accuracy, this was measured by the precision score. However, we agree with the reviewer that the use of a metric that is common across various studies could be valuable. We will therefore replace the precision score with the critical success index.
Citation: https://doi.org/10.5194/hess-2024-63-AC1
- RC2: 'Comment on hess-2024-63', Anonymous Referee #2, 11 May 2024
This paper proposes a new machine learning architecture for predicting pluvial flood maps with the aim of achieving better transferability of the models between catchments. The authors also explore the value of transfer learning when applying the model to a new city.
The paper is generally interesting and the suggested approaches are quite innovative. I do however have several major criticisms that I think should be addressed before publishing. These are:
- The paper makes a number of methodological innovations, but their value is not demonstrated anywhere. How do we know whether, e.g., the combination of three encoders or the RNN time series encoder with normalisation is responsible for the more spatially balanced prediction errors compared to Guo 2021? I think some comparisons between the old and new model architectures that help us understand what has actually improved and why should be included in the paper. The transfer learning results could be shortened to make space in the paper.
- The use case of the approach is not fully clear to me. I can agree with climate impact studies, while I think that the value of urbanisation studies has not been demonstrated. I think the latter would require implementing some changes in the terrain or the runoff behaviour of the same city and testing if the model can still predict the flood maps. This can be addressed by either clarifying the framing in the introduction/discussion, or by including additional results.
- Like R1, I also think that some of the supporting information clearly belongs in the paper (Section S2, Table S1)
Detailed comments
Line 58-66: Another approach to including context information is to engineer features that provide this information at the pixel level, for example providing flow accumulation or similar as input. I'm lacking a sentence on why you think this approach is not good enough, and some references to related work (e.g. Pham 2020, Zhao 2020; there might be newer work that provides better examples).
Line 108: Section S1 belongs in the paper. Why is scaling applied to the RNN output (which will be in some latent space) and not the input?
Line 142: the reason for including imperviousness in some of the other models is that it enables us to distinguish that some pixels generate more runoff than others. That is not related to the terrain features that you describe here. In fact, as far as I can see, you are relying on the "glass surface assumption" when calculating runoff. It would be good to clarify this.
Line 153: Actually, the UFlood paper makes quite a big effort to avoid overlaps between training and validation patches. You can criticise it for using the validation data for both test and validation, but the validation data were fully independent from the training data
Line 159: This is very interesting. It implies that the convolution kernels must be learning differences between the features, not the absolute values. I think this should be elaborated more in the paper. Do we also see better performance in the results for Zurich? Could we get the same or better results by using localized terrain data (terrain minus minimum elevation in the patch)?
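As a minimal sketch of the "localized terrain" idea (hypothetical preprocessing, not taken from the manuscript):

```python
import numpy as np

dem_patch = 400.0 + 50.0 * np.random.rand(256, 256)  # synthetic 256 m x 256 m patch, elevation in m a.s.l.
localized = dem_patch - dem_patch.min()              # remove the absolute elevation offset within the patch
print(localized.min(), localized.max())              # starts at 0 m; relative relief is preserved
```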
Line 214: How do you define wet cells? Based on the labels, predictions or both?
Line 302:
Don't we see in the figures that the models even after transfer learning perform quite a bit worse than in Zurich, and therefore we have not yet succeeded in creating models that can be applied in other cities?
In addition, in Figure 7 I am missing the results for Luzern and Singapore without transfer learning, so that we can see the impact of the transfer.
References:
Pham, B.T., Luu, C., Phong, T. Van, Trinh, P.T., Shirzadi, A., Renoud, S., Asadi, S., Le, H. Van, von Meding, J., Clague, J.J., 2020. Can deep learning algorithms outperform benchmark machine learning algorithms in flood susceptibility modeling? J. Hydrol. 592, 125615. https://doi.org/10.1016/j.jhydrol.2020.125615
Zhao, G., Pang, B., Xu, Z., Peng, D., Zuo, D., 2020. Urban flood susceptibility assessment based on convolutional neural networks. J. Hydrol. 590, 125235. https://doi.org/10.1016/j.jhydrol.2020.125235
Citation: https://doi.org/10.5194/hess-2024-63-RC2
- AC2: 'Reply on RC2', Tabea Cache, 14 May 2024
We would like to thank the reviewer for the thoughtful evaluation of our manuscript and the constructive feedback provided. We believe the suggestions will allow us to improve the manuscript, and we will address them in detail in the revised version. Below, we outline our planned revisions to the major and detailed comments raised by the reviewer.
Major comments:
- Demonstration of the value of methodological innovations: We thank the reviewer for raising this point and agree that the manuscript currently lacks a detailed explanation of the individual methodological innovations. Specifically, the multi-resolution encoders were introduced to increase the visual field of the model relative to terrain, while the normalization of the RNN output was introduced after testing the model for zero-padded rainfall events. This will be explained in the revised version of the manuscript and a thoughtful comparison of the results with and without the RNN output normalization will be included.
- Clarifying use case: We appreciate the reviewer bringing this issue to our attention. By including additional simulations to demonstrate the performance of the model following urban modifications, the manuscript may become overly complicated. We will therefore clarify the framing in the introduction/discussion of the revised manuscript, as suggested by the reviewer.
- Incorporating key supporting information into the paper: We agree with the added value of incorporating some information currently in the supporting material into the manuscript. Specifically, we will move sections S1 and S2 into the main text in the revised version of the manuscript.
Detailed comments:
We thank the reviewer for pointing these out and will revise the manuscript accordingly.
Citation: https://doi.org/10.5194/hess-2024-63-AC2