the Creative Commons Attribution 4.0 License.
Lack of robustness of hydrological models: A large-sample diagnosis and an attempt to identify the hydrological and climatic drivers
Abstract. The transferability of hydrological models over contrasted climate conditions, also identified as model robustness, has been the subject of much research in recent decades. The occasional lack of robustness identified in such models is not only an operational challenge – since it affects the confidence that can be placed in projections of climate change impact – but it also hints at possible deficiencies in the structure of these models. This paper presents a large-scale application of the robustness assessment test (RAT) for three hydrological models with different levels of complexity: GR6J, HYPE and MIKE SHE. The dataset comprises 352 catchments located in Denmark, France and Sweden. Our aim is to evaluate how robustness varies over the dataset and between models and whether the lack of robustness can be linked to some hydrological and/or climatic characteristics of the catchments (thus providing a clue as to where to focus model improvement efforts). We show that although the tested models are very different, they encounter similar robustness issues over the dataset. However, models do not necessarily lack robustness on the same catchments and are not sensitive to the same hydrological characteristics. This work highlights the applicability of the RAT regardless of model type and its ability to provide a detailed diagnostic evaluation of model robustness issues.
Status: final response (author comments only)
RC1: 'Comment on hess-2024-80', Anonymous Referee #1, 12 Jul 2024
General comments
This paper presents an application of the robustness assessment test to a large sample of catchments across France, Denmark and Sweden. The analysis uses results from three hydrological models and the paper analyses how the robustness varies across the dataset in relation to a selection of hydrological and climatic characteristics. Overall the paper is easy to follow and results are presented clearly, but the manuscript could do with a little more synthesis to bring it all together at the end.
Specific comments
Section 1.3: There could be a little more detail in this section. For example, instead of mentioning that you use a ‘large set of catchments spanning various climate conditions in three European countries’ perhaps you could mention how many catchments are simulated, across which conditions and in which countries.
L91 Did you consider using any metrics other than the model bias to assess the differences between observed and simulated flows? Why did you choose this one and do you think your results are sensitive to this choice?
L131 could you quantify how many rivers are affected by hydropower production? It would be interesting to know how many catchments in this sample are affected by this.
L129 I understand that perhaps it does not dictate the hydrology as heavily, but since the geology is discussed for France and Denmark is it worth describing the Swedish geology as well?
Tables 1 and 2: Although these tables are useful for listing all the signatures used in this study, I do not think the quantiles are particularly easy to digest. Is there a more visual way that this information could be displayed? I like the maps in the supplementary material but understand that there are probably too many to include in the main paper.
L167: Could the runoff ratios exceeding 100% be related to the hydropower? Often water is imported to support these schemes.
L231: on L432 you mention that there is a regulation module in HYPE, could this be briefly described here?
Figure 3: # of stations isn’t a particularly intuitive label, perhaps you could instead write ‘Reactive catchments: ‘
L282: instead of saying ‘especially numerous’ could you instead quantify how many catchments are reactive in France?
L385 which is section 0?
L410: Have you thought about using any signatures which describe the degree of flow regulation by reservoirs/hydropower? This might help to identify whether the flaws in the GR6J model are linked to this and could be included as a signature in Figure 10.
L453: could you elaborate on what you mean when you say the calibration of S-HYPE could be responsible for the seemingly random reactivity? Perhaps this could be done on L533.
L472: what do you mean by differs from the rest of the dataset? If the same can be said for the Swedish catchments then do you just mean that the results differ from those associated with the French data? This seems to be contradicted by L504.
Table 4: Could you perhaps shade the last three columns so that we can see the patterns visually?
L528: Again, perhaps you could consider using a signature to quantify the degree of dam regulation in each catchment to confirm or reject this hypothesis.
L564 what did you do with the catchments where the KGE was less than 0.7?
Section 5.1: This section feels like a lot of repetition of results/ discussion and doesn’t really feel like it achieves much synthesis. It would be good to make the implications of your work clearer here. The start of the paper makes it clear that this work is useful for understanding the implications of using models such as HYPE, GR6J and MIKE SHE for climate change applications, but I don’t feel like you ever quite bring together your findings here and discuss what your results mean for using these models for climate change applications. ‘Our analysis pointed out flaws in the models in terms of robustness to changing climate.’. Although I can see that the idea is that you use the results from catchments with different climatic conditions as proxies for how the models will perform under climate change, it would be good to make this link clearer.
It would be good to also have some discussion surrounding how transferable your results are to other hydrological models. Are your findings only relevant for the models used in this study? Or is it likely that your findings will be relevant for other models used in other countries too?
Citation: https://doi.org/10.5194/hess-2024-80-RC1
RC2: 'Comment on hess-2024-80', Anonymous Referee #2, 15 Jul 2024
The study by Santos et al. explores the application of the robustness assessment test (RAT) for three hydrological models of varying complexity. They tested the RAT in 352 catchments across Denmark, France, and Sweden. The topic is very interesting and indeed worth studying. The methodology is well-explained, and the writing is clear. However, my main concern is the comparability of the results, since these three models were calibrated separately, each at different times and by different research institutes. Additionally, some of the explanations for the results appear somewhat strained and lack adequate data support; for example, linking robustness issues to dam regulation. If the authors adequately address these issues, I believe this paper is suitable for publication in the HESS journal. My detailed comments can be found below.
Detailed comments:
Line 88: This line says that at least 30 years of data are needed, yet Figure 1 is labelled ‘> 20 years’. So what is the minimum requirement for data?
Line 116: Could you use Köppen-Geiger classes as a background map in Figure 2? This would give readers a more intuitive understanding of the climate zone to which each watershed belongs.
Line 222-223: What do you mean by ‘free parameters’? Please clarify.
Line 234: Does this ‘ca.’ represent the catchment area?
Table 3: Please explain what you mean by ‘OF’.
Table 3: You mentioned in the discussion that these 3 models were calibrated over different time periods. Could you add to this table the specific time period over which each of these models was calibrated?
Line 385: Can you clarify what you mean by ‘Sect. 0’?
Line 430: I don’t think the large river catchments will necessarily be higher than the average level. This sentence is not rigorous. Please correct.
Line 523-525: Can you provide the details of these tests on the choice of the evaporation formula in the supplement file?
Line 527: Can you add the location of these dams in one of your figures? It would be helpful and more convincing to see the spatial distribution of both the dam locations and the GR6J robustness issues before drawing this conclusion. Moreover, the presence of dams in a catchment does not necessarily mean that the catchment is impacted by them. So how do you know that the streamflow of these catchments is actually affected by the dams?
Line 534-539: I’m a bit concerned that different calibration strategies were used and that the models were calibrated over different time periods. Adopting different calibration methods may introduce uncertainty. I’m not sure whether the results of models calibrated with different methods are comparable. More justification is needed here.
Citation: https://doi.org/10.5194/hess-2024-80-RC2