the Creative Commons Attribution 4.0 License.
Hybrid Hydrological Modeling for Large Alpine Basins: A Distributed Approach
Abstract. Large alpine basins provide abundant water resources crucial for hydropower generation, irrigation, and daily life. It is therefore important to develop high-performance hydrological models for water resources management in large alpine basins. Recently, hybrid hydrological models have come to the forefront, synergizing the exceptional learning capacity of deep learning with the interpretability and physical consistency of process-based models. These models exhibit considerable promise in achieving precision in hydrological simulations. However, a notable limitation of existing hybrid models lies in their failure to incorporate spatial information within the basin and describe alpine hydrological processes, which restricts their applicability in hydrological modeling in large alpine basins. To address this issue, we develop a set of hybrid distributed hydrological models by employing a distributed process-based model as the backbone, and utilizing embedded neural networks (ENNs) to parameterize and replace different internal modules. The proposed models are tested on three large alpine basins on the Tibetan Plateau. Results are compared to those obtained from hybrid lumped models, a state-of-the-art distributed hydrological model, and purely deep learning (DL) models. A climate perturbation method is further used to test the applicability of the hybrid models to analyze the hydrological sensitivities to climate change in large alpine basins. Results indicate that the proposed hybrid hydrological models can perform well in predicting runoff processes and simulating runoff component contributions in large alpine basins. The optimal hybrid model, with Nash-Sutcliffe efficiency coefficients (NSEs) higher than 0.87, shows comparable performance to state-of-the-art DL models. The hybrid distributed model also exhibits remarkable capability in simulating hydrological processes at ungauged sites within the basin, markedly surpassing traditional distributed models.
The results also show reasonable patterns in the analysis of the hydrological sensitivities to climate change. Runoff exhibits an amplification effect in response to precipitation changes, with a 10 % precipitation change resulting in a 15–20 % runoff change in large alpine basins. An increase in temperature enhances evaporation capacity and changes the redistribution of rainfall and snowfall and the timing of snowmelt. It further leads to a decrease in the total runoff, the contributions of snowmelt runoff, and the intra-annual variability of runoff. Overall, this study provides a high-performance tool enriched with explicit hydrological knowledge for hydrological prediction and improves our understanding of the hydrological sensitivities to climate change in large alpine basins.
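The amplification figure quoted in the abstract can be restated as a precipitation elasticity of runoff, that is, the ratio of the relative runoff change to the relative precipitation change. The sketch below is only an illustration: the abstract does not describe how the climate perturbation is applied, so `perturb_forcing` assumes a simple delta-change scheme (multiplicative for precipitation, additive for temperature), and both function names are hypothetical.

```python
def perturb_forcing(precip, temp, precip_factor=1.0, temp_delta=0.0):
    """Return perturbed copies of the forcing series.

    Assumption: a delta-change scheme, scaling precipitation by a factor
    and shifting temperature by a constant. The preprint's actual
    perturbation method may differ.
    """
    scaled_precip = [p * precip_factor for p in precip]
    shifted_temp = [t + temp_delta for t in temp]
    return scaled_precip, shifted_temp


def runoff_elasticity(runoff_change_pct, precip_change_pct):
    """Relative runoff change per unit relative precipitation change."""
    return runoff_change_pct / precip_change_pct


# Figures taken from the abstract: a 10 % precipitation change
# produces a 15-20 % runoff change.
elasticity_low = runoff_elasticity(15.0, 10.0)   # 1.5
elasticity_high = runoff_elasticity(20.0, 10.0)  # 2.0
```

An elasticity above 1 is what the abstract calls an amplification effect: runoff changes proportionally more than the precipitation that drives it.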
Status: open (until 07 Jun 2024)
CC1: 'Comment on hess-2024-54', Kunlong He, 16 Apr 2024
Mixing machine learning and physical models holds great promise, and this manuscript is interesting. I'm interested in taking a distributed process-based model as the backbone and leveraging embedded neural networks (ENNs) to parameterize and replace different internal modules. How is this parameterization and substitution implemented? The applicability of the mixed model in the analysis of hydrological sensitivity to climate change in large alpine basins is verified by the climate perturbation method. What is the method of climate perturbation? And most importantly, your NSE is over 0.87, so is it time dependent? What is it in hours, days and months? Has it changed much? Is the performance still better than other models? I think that's important.
Please point out the key points: how do you make sure that you are replacing the correct part with ENNs, or that you are replacing the key part?
What is the parameterization? How do you ensure that the implementation is the most accurate? I mean, if you replace things that aren't critical, the end result doesn't really matter much. That's my guess, of course.
Why not use a modified KGE indicator instead of just NSE, as studies have shown that NSE is not necessarily accurate?
Citation: https://doi.org/10.5194/hess-2024-54-CC1
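For readers comparing the two metrics raised in this comment, the following is a minimal sketch of NSE and the modified Kling-Gupta efficiency (KGE'), which combines correlation, bias ratio, and the ratio of coefficients of variation. It is not taken from the manuscript. The helper `block_means` is a hypothetical stand-in for temporal aggregation (for example, daily to coarser steps), assuming equal-length blocks rather than calendar months, so that the metrics can be recomputed at different time scales.

```python
import math


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / spread


def modified_kge(obs, sim):
    """Modified KGE: 1 - sqrt((r-1)^2 + (beta-1)^2 + (gamma-1)^2)."""
    n = len(obs)
    mu_o, mu_s = sum(obs) / n, sum(sim) / n
    sd_o = math.sqrt(sum((o - mu_o) ** 2 for o in obs) / n)
    sd_s = math.sqrt(sum((s - mu_s) ** 2 for s in sim) / n)
    cov = sum((o - mu_o) * (s - mu_s) for o, s in zip(obs, sim)) / n
    r = cov / (sd_o * sd_s)                # linear correlation
    beta = mu_s / mu_o                     # bias ratio
    gamma = (sd_s / mu_s) / (sd_o / mu_o)  # ratio of coefficients of variation
    return 1.0 - math.sqrt((r - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)


def block_means(series, size):
    """Average consecutive blocks of `size` steps (drops an incomplete tail)."""
    full = len(series) - len(series) % size
    return [sum(series[i:i + size]) / size for i in range(0, full, size)]
```

Applying `nse` and `modified_kge` to `block_means(obs, k)` and `block_means(sim, k)` for several values of `k` is one simple way to check whether a reported score depends on the evaluation time step.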
RC1: 'Comment on hess-2024-54', Anonymous Referee #1, 23 Apr 2024
Review of HESS Manuscript
“Hybrid Hydrological Modeling for Large Alpine Basins: A Distributed Approach”
Dear Editor,
Please find attached my review of the manuscript.
1. Scope
The scope of the paper is well suited for HESS.
2. Summary
The authors based their study on the hybrid model proposed by Li et al. (2023b), in which different modules of a conceptual hydrological model are substituted by Neural Networks. The authors then proposed a distributed version of the model, in which they subdivide the basins of interest into smaller subbasins and apply the hybrid models to each of these subbasins.
The authors then compare the performance of the lumped and distributed hybrid models against purely data-driven techniques (LSTM and CNN-LSTM) and show all models achieved similar performance. They also test the performance of the hybrid models to predict discharges in some of the predefined subbasins (untrained gauges). In the last sections, the authors run some experiments looking at the behaviour of the models when boundary conditions are modified (changes in precipitation and temperature).
3. Evaluation
Overall, the manuscript has the potential to be a good contribution; however, certain aspects raised in the comments below should be addressed before moving on to the next steps.
3.1 Major comments:
- The code is not published, and the authors indicate that it will be opened once the manuscript is accepted. I strongly recommend that the editor ask for the code to be made open during the review process, as this increases the transparency of the study. I also tried to look for the code of the previous study (Li et al., 2023b); however, I was not able to find it.
- They also indicate that the discharge information is not publicly available due to privacy reasons. Even though this reason is valid and outside of the capabilities of the authors, it automatically makes the study non-reproducible, which is especially important when machine learning methods are being proposed.
- The printing quality of all figures should be improved. When I zoom in, I cannot see the details. I suggest the authors print the figures in 300 dpi.
- The authors do not show the subbasins they used to create the distributed model. I encourage them to include this information. Also, I was not able to find information on the number of subbasins they used.
- One major concern is that they are not considering any routing method. Consequently, even with a distributed model they just sum up the discharge coming from each subbasin. Moreover, the authors are working with large basins (over 90 000 km² according to the manuscript) in which the routing processes can become highly relevant. Is there a reason why no routing is being used?
- The good performance of the distributed models can also be attributed to the fact that one has a more flexible model. More flexible models can get a better fit to the data, but this is not directly related to having a distributed version. One way to test this hypothesis is to use the same number of models you used in the distributed version but let them receive the same data (similar to Feng, 2022). If your models performed better, then you can say that the distributed nature of the models is beneficial, and the improvement is not just because of the increase in flexibility. If the performance is the same, then it would mean that the distributed version (especially without any routing) does not give an advantage.
- In Section 3.2 the authors present the results of the evaluation on untrained gauges. One can see that for the MT subbasin the performance of three models completely drops when compared to the performance in the original basin (TNH). Why? In the text it was mentioned that “the models show comparatively poorer performance in runoff modeling at the MT station” but it was never explained why. Also, the discharge range (Figure 5) of the subbasins is similar, so is there a reason for the big differences in performance?
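The routing concern above can be illustrated with a toy comparison between direct summation of subbasin runoff and a pure lag-and-sum scheme. This is only a sketch under stated assumptions: the lag-and-sum scheme is one of the simplest possible routing methods and is not proposed in the manuscript or the review, and the function names, runoff values, and per-subbasin lags are all hypothetical.

```python
def sum_without_routing(subbasin_runoff):
    """Outlet discharge as the plain sum of subbasin runoff at each step."""
    n_steps = len(subbasin_runoff[0])
    return [sum(series[t] for series in subbasin_runoff) for t in range(n_steps)]


def lag_and_sum(subbasin_runoff, lags):
    """Outlet discharge when each subbasin's runoff is delayed by a fixed lag.

    Assumption: a pure translation with an integer lag (in time steps) per
    subbasin, with no attenuation. Water that would arrive after the end of
    the series is dropped.
    """
    n_steps = len(subbasin_runoff[0])
    outlet = [0.0] * n_steps
    for series, lag in zip(subbasin_runoff, lags):
        for t, q in enumerate(series):
            if t + lag < n_steps:
                outlet[t + lag] += q
    return outlet


# Hypothetical example: two subbasins produce a pulse at the same time step,
# but the second one is two steps of travel time away from the outlet.
runoff = [[1.0, 0.0, 0.0, 0.0],
          [2.0, 0.0, 0.0, 0.0]]
unrouted = sum_without_routing(runoff)      # [3.0, 0.0, 0.0, 0.0]
routed = lag_and_sum(runoff, lags=[0, 2])   # [1.0, 0.0, 2.0, 0.0]
```

Both series carry the same volume, but the peak timing and magnitude at the outlet differ, which is why neglecting routing can matter more as basin size and travel times grow.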
3.2 Minor comments:
Line 25: Is the word ‘almost’ a typo?
Line 32: Saying that process-based models can be used to understand the entire hydrological system including all internal processes is an overstatement, especially if you are referring to conceptual models. Conceptual models are mostly based on parameterized (empirical) relationships that somehow account for our understanding of the system, however the physics behind them is not much.
Line 122: It is not clear what the authors mean. Are the parameters the same for all basins or do they change?
Line 145: It would be better to create a table to specify the characteristics of each model.
Line 158: Hochreiter and Schmidhuber, 1997 created the LSTM architecture, but in this line, it sounds like they proposed the architecture for hydrological modelling. This paper should of course be cited, but not mixed with the other papers of hydrological applications.
Figure 2. How was the spatial discretization of the basins? This should be included in the paper.
Line 195: What does suites of experiments mean? Shouldn't it be set of experiments?
Line 242-244: The PFAB values show a difference, but the changes of 0.01 in NSE are not significant. This can be just because of the initialization of the model. I do not agree that there is evidence to support that one model has an augmented ability to simulate overall runoff process.
Figure 3: The size of the figure should be increased, right now it is hard to see the details of the hydrographs. Also, the authors should use the same line width for all models. Right now, some lines look thicker than others, which gives a bias to the figure. The last hydrograph does not have values on the y-axis.
Line 263: The authors say that when one includes air temperature in the ENN there is an evident enhancement of the model performance. The PFAB values vary a bit more, but the differences in NSE values are extremely small (0.01 for 4 cases, 0.02 for 1 case and 0.03 for another). This is just one metric summarizing more than 5 years of data. I do not agree that NSE values show an evident enhancement of the model performance.
Figure 5e: How are the attributes being normalized? I do not understand how the area of the subbasins is comparable with the area of the entire basin. Also, why is the figure showing a range when referring to static attributes? This figure is not explained in the text.
Figure 7. The figure title indicates that the grey and yellow shading indicate annual and monthly responses. However, there are no shadings in the figure.
Figure 8. I suggest the authors use a proper name for the figure and not just refer to another figure.
Citation: https://doi.org/10.5194/hess-2024-54-RC1