All models are wrong, but are they useful? Assessing reliability across multiple sites to build trust in urban drainage modelling
Agnethe Nedergaard Pedersen
Annette Brink-Kjær
Peter Steen Mikkelsen
Download
- Final revised paper (published on 24 Nov 2022)
- Supplement to the final revised paper
- Preprint (discussion started on 25 Jul 2022)
- Supplement to the preprint
Interactive discussion
Status: closed
- RC1: 'Comment on egusphere-2022-615', Anonymous Referee #1, 02 Aug 2022
In their paper, Pedersen et al. present an approach for assessing the performance of urban drainage models at the local level (i.e. at different sites) based on a variety of criteria. The manuscript is well-written and structured, and it is easy to follow. There are many other papers that deal with model validation at local scales in rural and urban areas, so I found the novelty of the paper to be a bit weak. My rating for the paper's suitability for HESS is "medium", as I am not sure its content will be of interest to most of the journal's readers. Apart from that, I have no major concerns with the paper's content, as both the methods and the analysis are sound. Below you will find some specific comments.
Figure 2. This is a cross-section, I assume. Include labels to indicate ground level, pipe borders, circle dashed lines, etc.
Table 1. Units should be added to the variables.
Figure 4. Figure 4 is not very informative. Could you please elaborate on the different symbols in the figure caption? This figure should also be moved to the supplementary material - it does not add any value to the article.
Figure 5. The 1:1 line should be in a different color or presented as a solid line.
Figure 6. Can also be transferred to the supplementary material.
Site names. I suggest simplifying the names of the sites in the text (e.g. in line 336) and in the figures (e.g. Figure 7). For example, it would be easier to read "Site A" instead of “F64F46Y”.
Lines 467-472 and Figure 12. From these statistics, what can we learn? Does it tell us anything about the model's capabilities? I suggest removing this part from the manuscript if not.
Section 4.4 and Figure 13. I don't understand this section. The figure shows what exactly? I cannot follow the rationale for plotting based on physical properties, signatures, and slopes here.
Conclusion section. Currently, it's a summary, mirroring the abstract a bit. I suggest shortening it to one paragraph, summarizing the main findings.
Citation: https://doi.org/10.5194/egusphere-2022-615-RC1
- AC1: 'Reply on RC1', Agnethe Nedergaard Pedersen, 17 Oct 2022
The original reviewer comments are included in italics and sequentially provided with numbers for easy cross-referencing.
Authors’ responses are written in normal style, and the line numbers and Figure numbers refer to the original manuscript.
Reviewer RC1:
A.1. In their paper, Pedersen et al. present an approach for assessing the performance of urban drainage models at the local level (i.e. at different sites) based on a variety of criteria. The manuscript is well-written and structured, and it is easy to follow. There are many other papers that deal with model validation at local scales in rural and urban areas, so I found the novelty of the paper to be a bit weak. My rating for the paper's suitability for HESS is "medium", as I am not sure its content will be of interest to most of the journal's readers. Apart from that, I have no major concerns with the paper's content, as both the methods and the analysis are sound. Below you will find some specific comments.
Thank you very much for your review of the manuscript and your comments, which have given us ideas for improving the clarity of the manuscript.
We are aware that many papers in the urban drainage modelling field have dealt with model validation (e.g. Annus et al., 2021; Tscheikner-Gratl et al., 2016; Vonach et al., 2019). These, however, focus on calibration of models, i.e. reducing the uncertainty contribution related to (lumped) model parameters based on relatively few measurement sites and events, and they use classical statistical indicators. The approach we develop here focuses on reducing uncertainty contributions related to (spatially distributed and detailed) system attributes, using hydrological and hydraulic signatures for the statistical evaluations while accounting for the input uncertainty stemming from spatially distributed rainfall (through weighting of individual events in the statistical evaluation). We furthermore suggest that there is novelty in the way we focus on several model objectives, use measurement data from multiple sites and multiple events, and display the results graphically in map format, in a way that can be systematically up-scaled (and automated) for use with hundreds of measurement sites and very large models in an operational digital twin environment. We will make this clearer in the Introduction as well as the Conclusion, to better highlight the novelty of the paper.
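For illustration only, the core of this multi-event evaluation can be sketched as follows (a minimal sketch with hypothetical function names and numbers, not the actual implementation): per-event signature deviations at a measurement site are combined into one score using weights that down-weight events for which the point-gauge rainfall is judged unrepresentative of the catchment.

```python
# Minimal sketch (hypothetical names and values, not the actual implementation):
# combine per-event signature deviations at one site into a single weighted score,
# down-weighting events with poor spatial representativeness of the rain gauge input.

def weighted_site_score(deviations, weights):
    """Weighted mean of absolute simulated-minus-observed signature deviations."""
    if len(deviations) != len(weights):
        raise ValueError("one weight per event is required")
    return sum(abs(d) * w for d, w in zip(deviations, weights)) / sum(weights)

# Example: three events; event 2 had very uneven rainfall over the catchment
# and therefore gets a low weight in the evaluation.
deviations = [0.05, 0.30, -0.10]   # e.g. relative errors on a water-level signature
weights = [1.0, 0.2, 1.0]
print(round(weighted_site_score(deviations, weights), 3))  # 0.095
```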
We will do our best to accommodate your suggestions for changes to the manuscript, see our replies to your comments below.
A.2. Figure 2. This is a cross-section, I assume. Include labels to indicate ground level, pipe borders, circle dashed lines, etc.
Thank you for your comments, we will update this figure in the revised manuscript to the version presented below (and update the figure caption accordingly):
A.3. Table 1. Units should be added to the variables.
We will include units in a new column in the table.
A.4. Figure 4. Figure 4 is not very informative. Could you please elaborate on the different symbols in the figure caption? This figure should also be moved to the supplementary material - it does not add any value to the article.
We realize that Figure 4 is wrongly placed and too briefly introduced in section 2.3.2 (about event weighting methods), making its value to the manuscript unclear. However, this figure is designed to illustrate the structured uncertainty framework we rely on in an easy-to-refer-to manner, so that discussions related to model uncertainty can become clearer and more structured. We thus prefer keeping it in the manuscript.
Reviewer RC2 furthermore asks for further elaboration of the uncertainty framework used (RC2, introductory remarks and comment B.6). We suggest moving this figure forward to section 2.1 (where the framework for model adequacy assessment is introduced) and explaining it thoroughly there. Please see also our response to Reviewer RC2.
A.5. Figure 5. The 1:1 line should be in a different color or presented as a solid line.
We will change the 1:1 line to a solid grey line in Figures 5, 6 and 9.
A.6. Figure 6. Can also be transferred to the supplementary material.
We believe this is one of the essential figures in the paper, explaining the three methods we use to evaluate model performance, and we already refer to it several times in the text of the Methods chapter (lines 233-235, 258, 275). We thus suggest keeping it in the manuscript while also inserting references back to it when later explaining details in the Results chapter.
A.7. Site names. I suggest simplifying the names of the sites in the text (e.g. in line 336) and in the figures (e.g. Figure 7). For example, it would be easier to read "Site A" instead of “F64F46Y”.
We agree that the many different site names can appear confusing. However, the naming structure is made by the utility with a direct link to the asset database. This same naming is also used in the open dataset (Pedersen et al., 2021) that we previously published, containing part of the data and models used here. To ensure transparency and maintain the possibility that others can replicate our results and contribute to further developments in the field, we therefore suggest keeping the names as they are.
The naming structure is systematic and includes a reference to where the manhole is located (characters 1-3), the system type (character 4; F = combined system, R = rainwater system), a consecutive number (characters 5-6), and the structure type (character 7; B = basin, R = regulated, Y = overflow structure). In this paper a further suffix is added when needed to indicate where the site sits within a structure. We will include these descriptions in the revised manuscript.
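For readers unfamiliar with the convention, decoding a site name can be sketched as follows (our own illustrative snippet with a hypothetical helper function, not utility software):

```python
# Illustrative sketch (hypothetical helper, not part of the utility's systems):
# decode a site name such as "F64F46Y" according to the structure described above.

SYSTEM_TYPE = {"F": "combined system", "R": "rainwater system"}
STRUCTURE_TYPE = {"B": "basin", "R": "regulated", "Y": "overflow structure"}

def decode_site_name(name: str) -> dict:
    return {
        "location": name[0:3],                 # characters 1-3: where the manhole is located
        "system": SYSTEM_TYPE[name[3]],        # character 4: system type
        "number": name[4:6],                   # characters 5-6: consecutive number
        "structure": STRUCTURE_TYPE[name[6]],  # character 7: structure type
        "suffix": name[7:] or None,            # optional suffix used in the paper
    }

print(decode_site_name("F64F46Y"))
# {'location': 'F64', 'system': 'combined system', 'number': '46',
#  'structure': 'overflow structure', 'suffix': None}
```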
A.8. Lines 467-472 and Figure 12. From these statistics, what can we learn? Does it tell us anything about the model's capabilities? I suggest removing this part from the manuscript if not.
Figures 12 and 13 (see A.9 below) represent our first efforts to investigate possible patterns in the scorings presented in tabular form in Figures 10 and 11 as well as in tables in the Supplementary Material. We wanted to present the results in the paper to stimulate others to potentially contribute to work in this field and improve the analysis based on our open data set. We agree, however, that this preliminary analysis does not produce clear results and thus suggest moving the figures to the Supplementary Material and only mentioning them briefly in the manuscript.
A.9. Section 4.4 and Figure 13. I don't understand this section. The figure shows what exactly? I cannot follow the rationale for plotting based on physical properties, signatures, and slopes here.
Again (as for A.8 above), we agree that this section is confusing, and we thus suggest moving it to the Supplementary Material and only mentioning it briefly in the manuscript.
A.10. Conclusion section. Currently, it's a summary, mirroring the abstract a bit. I suggest shortening it to one paragraph, summarizing the main findings.
We will shorten the conclusion in the revised manuscript.
References
Annus, I., Vassiljev, A., Kändler, N., Kaur, K., 2021. Automatic calibration module for an urban drainage system model. Water (Switzerland) 13. https://doi.org/10.3390/w13101419
Pedersen, A.N., Pedersen, J.W., Vigueras-Rodriguez, A., Brink-Kjær, A., Borup, M., Mikkelsen, P.S., 2021. The Bellinge data set: open data and models for community-wide urban drainage systems research. Earth Syst Sci Data 13, 4779–4798. https://doi.org/10.5194/essd-13-4779-2021
Tscheikner-Gratl, F., Zeisl, P., Kinzel, C., Rauch, W., Kleidorfer, M., Leimgruber, J., Ertl, T., 2016. Lost in calibration: Why people still do not calibrate their models, and why they still should - A case study from urban drainage modelling. Water Science and Technology 74, 2337–2348. https://doi.org/10.2166/wst.2016.395
Vonach, T., Kleidorfer, M., Rauch, W., Tscheikner-Gratl, F., 2019. An Insight to the Cornucopia of Possibilities in Calibration Data Collection. Water Resources Management 33, 1629–1645. https://doi.org/10.1007/s11269-018-2163-6
Citation: https://doi.org/10.5194/egusphere-2022-615-AC1
- RC2: 'Comment on egusphere-2022-615', Anonymous Referee #2, 29 Aug 2022
The manuscript “All models are wrong, but are they useful? Assessing reliability across multiple sites to build trust in urban drainage modelling” by Pedersen et al. introduces a well-devised and clearly described framework to assess the reliability of urban drainage models. The manuscript is on a high level, and I think it will be of interest to the readers. However, there are some weaknesses that should be addressed. One point is the link to existing uncertainty assessment frameworks, be it in urban drainage or other fields. Throughout the paper the link to uncertainty is observable but is never clearly made. There is a need to elaborate on this, also including more literature on the topic, and to define the links and boundaries of this study. A second point is the statement about the missing studies on spatial variability of rain events. I would encourage a more thorough look into that, also in view of radar data. Some of the figures could also use a bit of work to clarify them for the reader (see also detailed comments). Finally, based on the somewhat provocative title, I would have expected more suggestions for possible ways forward and a discussion of the framework's applicability in the conclusion rather than a mere summary of the results.
Detailed comments:
Line 50 – 51: There is more to say about quantifying uncertainties than GLUE.
Line 59 – 64: The assumption of only future low-cost level meters may not be the full picture, when also low-cost flow meters may occur (e.g. image based).
Line 91 – 91: Spatial variability of rain events and their impact on urban drainage models has been investigated several times.
Figure 2: I think the figure could be improved. Now it is a bit unsure why the cross sections are needed.
Figure 4: Locations of uncertainty are defined in several papers. Why did you use the one shown here?
Table 3: What is the reasoning to put these exact boundaries between “green”, “yellow” and “red”? Can that be changed depending on the objective or subjective factors?
Figure 7: Legend is quite small when printed.
Figure 10 and 11: More tables than figures I would say.
Figure 12: Coloring of the figure is very light.
Figure 13: The figure needs more explanation in the text. As it is at present, it is quite confusing.
Citation: https://doi.org/10.5194/egusphere-2022-615-RC2
- AC2: 'Reply on RC2', Agnethe Nedergaard Pedersen, 17 Oct 2022
The original reviewer comments are included in italics and sequentially provided with numbers for easy cross-referencing.
Authors’ responses are written in normal style, and the line numbers and Figure numbers refer to the original manuscript.
Reviewer RC2
B.1. The manuscript “All models are wrong, but are they useful? Assessing reliability across multiple sites to build trust in urban drainage modelling” by Pedersen et al. introduces a well-devised and clearly described framework to assess the reliability of urban drainage models. The manuscript is on a high level, and I think it will be of interest to the readers. However, there are some weaknesses that should be addressed. One point is the link to existing uncertainty assessment frameworks, be it in urban drainage or other fields. Throughout the paper the link to uncertainty is observable but is never clearly made. There is a need to elaborate on this, also including more literature on the topic, and to define the links and boundaries of this study. A second point is the statement about the missing studies on spatial variability of rain events. I would encourage a more thorough look into that, also in view of radar data. Some of the figures could also use a bit of work to clarify them for the reader (see also detailed comments). Finally, based on the somewhat provocative title, I would have expected more suggestions for possible ways forward and a discussion of the framework's applicability in the conclusion rather than a mere summary of the results.
Thank you very much for your review of the manuscript and for your comments, which have given us ideas for improving the clarity of the manuscript.
We understand your concern with regard to linking to existing uncertainty assessment frameworks, which in the submitted manuscript is not as clear as it could be. We suggest addressing this more explicitly in the Introduction and moving Figure 4 (which in the submitted manuscript is only briefly referred to in section 2.3.2 about event weighting methods) forward to section 2.1 (where the framework for model adequacy assessment is introduced) and explaining it thoroughly there. See also our reply to your comment B.6.
Regarding the spatial variability of rainfall, we thank you for pointing out this oddly formulated sentence. It sends a different message than intended and will thus be modified; see our reply to your comment B.4.
As a response to the provocative title, we will include a section at the end of Chapter 4 elaborating on the usefulness of the models, and what needs to be investigated and discussed in the future within this field.
B.2. Line 50 – 51: There is more to say about quantifying uncertainties than GLUE.
Yes, you are completely right. We will expand this part of the manuscript, referring also to other uncertainty quantification methods than GLUE.
B.3. Line 59 – 64: The assumption of only future low-cost level meters may not be the full picture, when also low-cost flow meters may occur (e.g. image based).
Yes, you are right. With new technologies emerging, such as image-based flow measurements from camera footage (e.g. Meier et al., 2022), low-cost flow meters will probably also be installed. We will adjust the text accordingly.
B.4. Line 91 – 91: Spatial variability of rain events and their impact on urban drainage models has been investigated several times.
We see that this sentence was written oddly and thus suggest changing it from
“Uncertainty of input data, such as unrealistic representation of rain events due to spatial variability, has not to the authors awareness been investigated”
to
“Although many studies have shown that rainfall varies spatially at scales significant to urban drainage modelling (e.g. Gregersen et al., 2013; Thomassen et al., 2022), a method that accounts for unrealistic representation of rainfall spatial variability when using rainfall data from point gauges in model assessment has not to the authors’ awareness been investigated”.
B.5. Figure 2: I think the figure could be improved. Now it is a bit unsure why the cross sections are needed.
The cross-sections are needed to make it easier to understand the other parts of the figure. Reviewer RC1 also asks for improvements to this figure; please see our response to RC1, comment A.2.
B.6. Figure 4: Locations of uncertainty are defined in several papers. Why did you use the one shown here?
In our experience, the urban drainage research community has to some extent forgotten to consider the different locations of uncertainty, focusing mostly on rain input uncertainty and parameter uncertainty, with the main purpose of developing methods for auto-calibration of models. In utility companies (two of the three authors of this paper are from a utility company) it is, however, well known that the asset databases are not always correct. As the utility companies grow and the urban drainage systems become more complex, the overview of the validity of the asset databases can be lost. With the increasing application of digital twins, we however get the opportunity to look closer at the uncertainties in the model structure, including the uncertainties caused by imperfect information about the system attributes.
We are aware that model uncertainty assessment is a large field and that many frameworks have been suggested in the scientific literature, which are however not always perfectly aligned across modelling fields and purposes. In our prior work (Pedersen et al., 2022) we combined the content of two well-cited papers, Walker et al. (2003) (focusing on uncertainty locations) and Gupta et al. (2012) (focusing on uncertainty in model structure), into a unified framework explaining the locations of uncertainty present in the semi-distributed ‘integrated urban drainage models’ that our work focuses on (i.e. a lumped-conceptual rainfall-runoff module that calculates runoff to a distributed, physics-based high-fidelity pipe flow module). Figure 4 is designed to illustrate this framework in an easy-to-refer-to manner, so that discussions related to model uncertainty can become clearer and more structured.
We realise that this message has not come across well in the submitted manuscript. We will thus address this more explicitly in the Introduction and move Figure 4 (which in the submitted manuscript is only briefly referred to in section 2.3.2 about event weighting methods) forward to section 2.1 (where the framework for model adequacy assessment is introduced) and introduce it thoroughly there, along with the explanations above.
B.7. Table 3: What is the reasoning to put these exact boundaries between “green”, “yellow” and “red”? Can that be changed depending on the objective or subjective factors?
As explained in Lines 302-306 (the caption to Table 3), the boundaries between the categories are based on subjective choices that we made based on the utility company’s experience. We are well aware that this could be improved, but as we are not yet experienced with the methods, we do not have the competencies or experience to categorize these better. We will highlight this further in the manuscript text. The criteria can of course be different depending on the objective; we will include this as well.
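As an illustration of how such a categorization could be automated, consider the following minimal sketch (hypothetical threshold values, not the actual boundaries in Table 3):

```python
# Minimal sketch (hypothetical thresholds, not the actual values in Table 3):
# map the absolute deviation between simulated and observed signature values
# to the "green"/"yellow"/"red" adequacy categories used in the paper.

def adequacy_category(deviation, green_max=0.10, yellow_max=0.25):
    """Boundaries are subjective choices and can be adapted to the modelling objective."""
    d = abs(deviation)
    if d <= green_max:
        return "green"
    if d <= yellow_max:
        return "yellow"
    return "red"

print(adequacy_category(0.07))   # green
print(adequacy_category(-0.18))  # yellow
print(adequacy_category(0.40))   # red
```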
B.8. Figure 7: Legend is quite small when printed.
We will enlarge the legend text.
B.9. Figure 10 and 11: More tables than figures I would say.
Agree. We will change this in the updated manuscript.
B.10. Figure 12: Coloring of the figure is very light.
We agree and will darken the colours. At the request of reviewer RC1, we will furthermore move this figure to the Supplementary Material.
B.11. Figure 13: The figure needs more explanation in the text. As it is at present, it is quite confusing.
Reviewer RC1 also commented on this part, and we consequently suggest moving both Figures 12 and 13 to the Supplementary Material and only mentioning them briefly in the manuscript. Please see also our responses to reviewer RC1, comments A.8 and A.9.
References
Gupta, H.V., Clark, M.P., Vrugt, J.A., Abramowitz, G., Ye, M., 2012. Towards a comprehensive assessment of model structural adequacy. Water Resour Res 48, 1–16. https://doi.org/10.1029/2011WR011044
Meier, R., Tscheikner-Gratl, F., Steffelbauer, D.B., Makropoulos, C., 2022. Flow Measurements Derived from Camera Footage Using an Open-Source Ecosystem. Water (Switzerland) 14. https://doi.org/10.3390/w14030424
Pedersen, A.N., Borup, M., Brink-Kjær, A., Christiansen, L.E., Mikkelsen, P.S., 2021. Living and Prototyping Digital Twins for Urban Water Systems: Towards Multi-Purpose Value Creation Using Models and Sensors. Water (Basel) 13, 592. https://doi.org/10.3390/w13050592
Pedersen, A.N., Pedersen, J.W., Borup, M., Brink-Kjær, A., Christiansen, L.E., Mikkelsen, P.S., 2022. Using multi-event hydrologic and hydraulic signatures from water level sensors to diagnose locations of uncertainty in integrated urban drainage models used in living digital twins. Water Science and Technology 85, 1981–1998. https://doi.org/10.2166/wst.2022.059
Walker, W., Harremoës, P., Rotmans, J., van der Sluijs, J.P., van Asselt, M., Janssen, P., Krayer von Krauss, M., 2003. Defining Uncertainty: A Conceptual Basis for Uncertainty Management. Integrated Assessment 4, 5–17. https://doi.org/10.1076/iaij.4.1.5.16466
Citation: https://doi.org/10.5194/egusphere-2022-615-AC2