This work is distributed under the Creative Commons Attribution 4.0 License.
Hybrid Hydrological Modeling for Large Alpine Basins: A Distributed Approach
Abstract. Large alpine basins provide abundant water resources crucial for hydropower generation, irrigation, and daily life. It is therefore essential to develop high-performance hydrological models for water resources management in large alpine basins. Recently, hybrid hydrological models have come to the forefront, synergizing the exceptional learning capacity of deep learning (DL) with the interpretability and physical consistency of process-based models. These models show considerable promise for accurate hydrological simulation. However, a notable limitation of existing hybrid models lies in their failure to incorporate spatial information within the basin and to describe alpine hydrological processes, which restricts their applicability to hydrological modeling in large alpine basins. To address this issue, we develop a set of hybrid distributed hydrological models by employing a distributed process-based model as the backbone and utilizing embedded neural networks (ENNs) to parameterize and replace different internal modules. The proposed models are tested on three large alpine basins on the Tibetan Plateau. Results are compared to those obtained from hybrid lumped models, a state-of-the-art distributed hydrological model, and purely data-driven DL models. A climate perturbation method is further used to test the applicability of the hybrid models for analyzing hydrological sensitivities to climate change in large alpine basins. Results indicate that the proposed hybrid hydrological models perform well in predicting runoff processes and simulating runoff component contributions in large alpine basins. The optimal hybrid model, with Nash-Sutcliffe efficiency coefficients (NSEs) higher than 0.87, shows performance comparable to state-of-the-art DL models. The hybrid distributed model also exhibits remarkable capability in simulating hydrological processes at ungauged sites within the basin, markedly surpassing traditional distributed models. The results also show reasonable patterns in the analysis of hydrological sensitivities to climate change. Runoff exhibits an amplification effect in response to precipitation changes, with a 10 % precipitation change resulting in a 15–20 % runoff change in large alpine basins. An increase in temperature enhances evaporation capacity and changes the partitioning of rainfall and snowfall and the timing of snowmelt, which further leads to a decrease in total runoff, in the contribution of snowmelt runoff, and in the intra-annual variability of runoff. Overall, this study provides a high-performance tool enriched with explicit hydrological knowledge for hydrological prediction and improves our understanding of hydrological sensitivities to climate change in large alpine basins.
Status: closed
CC1: 'Comment on hess-2024-54', Kunlong He, 16 Apr 2024
Mixing machine learning and physical models holds great promise, and this manuscript is interesting. I'm interested in taking a distributed process-based model as the backbone and leveraging embedded neural networks (ENNs) to parameterize and replace different internal modules. How is this parameterization and substitution implemented? The applicability of the mixed model in the analysis of hydrological sensitivity to climate change in large alpine basins is verified by the climate perturbation method. What is the method of climate perturbation? And most importantly, your NSE is over 0.87, so is it time dependent? What is it in hours, days and months? Has it changed much? Is the performance still better than other models? I think that's important.
Please point out the key points, how do you make sure that you are replacing the correct part with ENNs, or that you are replacing the key part?
What is the parameterization? How to ensure that the implementation is the most accurate. I mean, if you replace things that aren't critical, the end result doesn't really matter much. That's my guess, of course.
Why not use a modified KGE indicator instead of just NSE, as studies have shown that NSE is not necessarily accurate.
Citation: https://doi.org/10.5194/hess-2024-54-CC1
AC1: 'Reply on CC1', Bu Li, 23 May 2024
We thank the participant in the community discussion for the contribution and for providing constructive feedback. Please find our replies in italic.
- Mixing machine learning and physical models holds great promise, and this manuscript is interesting. I'm interested in taking a distributed process-based model as the backbone and leveraging embedded neural networks (ENNs) to parameterize and replace different internal modules. How is this parameterization and substitution implemented? The applicability of the mixed model in the analysis of hydrological sensitivity to climate change in large alpine basins is verified by the climate perturbation method. What is the method of climate perturbation? And most importantly, your NSE is over 0.87, so is it time dependent? What is it in hours, days and months? Has it changed much? Is the performance still better than other models? I think that's important.
Response: Thanks for your comment. We develop a hybrid distributed hydrological model by employing a distributed process-based model as the backbone and utilizing embedded neural networks (ENNs) to parameterize and replace different internal modules. First, we use the ENNs to calculate the parameters of the physical models. Different from calibration in physical models, static sub-basin variables are used as the inputs of the ENNs for parameter calculation. Besides, we also employ different ENNs to replace internal modules of the physical models. These ENNs use similar inputs and the same outputs as the original internal modules, while adding static sub-basin variables as additional inputs. We will highlight this in the revised manuscript.
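For illustration, a minimal sketch of these two coupling modes, assuming a PyTorch implementation; the layer sizes, parameter names (a degree-day factor and a storage capacity), value ranges, and attribute lists are hypothetical and do not reproduce the exact model structure used in the manuscript:

```python
import torch
import torch.nn as nn

class ParameterENN(nn.Module):
    """Maps static sub-basin attributes to physical-model parameters (illustrative ranges)."""
    def __init__(self, n_static, hidden=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_static, hidden), nn.Tanh(),
                                 nn.Linear(hidden, 2), nn.Sigmoid())

    def forward(self, static_attrs):
        p = self.net(static_attrs)             # values in (0, 1)
        f_melt = p[..., 0] * 10.0              # e.g. degree-day factor, 0-10 mm/degC/day (assumed range)
        s_max = 10.0 + p[..., 1] * 1490.0      # e.g. storage capacity, 10-1500 mm (assumed range)
        return f_melt, s_max

class FluxENN(nn.Module):
    """Replaces an internal flux module: same output as the original module,
    similar dynamic inputs plus static sub-basin attributes as extra inputs."""
    def __init__(self, n_dynamic, n_static, hidden=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_dynamic + n_static, hidden), nn.Tanh(),
                                 nn.Linear(hidden, 1), nn.Softplus())  # non-negative flux

    def forward(self, dynamic_inputs, static_attrs):
        x = torch.cat([dynamic_inputs, static_attrs], dim=-1)
        return self.net(x).squeeze(-1)
```

In a hybrid setup of this kind, both ENNs would be trained end-to-end through the differentiable process-based backbone against observed discharge, rather than calibrated separately.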
Furthermore, this study uses the climate perturbation method to test the sensitivity of the hybrid models to changes in daily precipitation and air temperature. Specifically, we modify the precipitation and air temperature in the hybrid model inputs and feed them into the trained hybrid models to analyze the changes in the simulated hydrological processes.
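As a minimal sketch of such a perturbation run (function and variable names are placeholders, and the trained model's inference call is only assumed):

```python
import numpy as np

def perturb_forcing(precip, temp, dp=0.10, dt=1.0):
    """Apply a relative precipitation change dp and an additive temperature change dt (degC)."""
    return precip * (1.0 + dp), temp + dt

# e.g. +10 % precipitation and +1 degC, fed to an already trained hybrid model:
# p_pert, t_pert = perturb_forcing(precip, temp, dp=0.10, dt=1.0)
# q_pert = trained_model.simulate(p_pert, t_pert, static_attrs)   # placeholder interface
# runoff_sensitivity = (q_pert.mean() - q_base.mean()) / q_base.mean()
```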
Finally, due to the limitations of the observed runoff data and the applicability of the hybrid models, the model timestep is set to 1 day, and all models are tested at the daily timestep. The evaluation at the daily scale is more rigorous than that at the monthly scale, so the evaluation at the monthly scale was not carried out in this study.
- Please point out the key points, how do you make sure that you are replacing the correct part with ENNs, or that you are replacing the key part? What is the parameterization? How to ensure that the implementation is the most accurate. I mean, if you replace things that aren't critical, the end result doesn't really matter much. That's my guess, of course.
Response: The forms and inputs of the embedded neural networks (ENNs) are important for hydrological modeling. While extensive testing and design were conducted in the preliminary stages of this study to obtain relatively optimal ENN structures, it cannot be guaranteed that the ENNs used in this study are the absolute best. Therefore, the main contribution of this study is to develop a hybrid model framework that demonstrates the enhancement ENNs bring to hydrological modeling, rather than to develop a model with the highest hydrological accuracy.
- Why not use a modified KGE 'indicator instead of just NSE, as studies have shown that NSE is not necessarily accurate.
Response: We agree that only using NSE is not enough to evaluate the performance of hydrological models, so we employ three metrics, including NSE, mNSE, and PFAB, to test the models. These three metrics have been used in many studies to fully evaluate model performance (Höge et al. 2022, Jiang et al. 2020).
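For reference, a minimal sketch of the first two metrics for daily simulated and observed discharge arrays; the mNSE shown uses absolute deviations (one common "modified NSE" form, assumed here), and PFAB is omitted because its exact formulation is not spelled out in this discussion:

```python
import numpy as np

def nse(q_sim, q_obs):
    """Nash-Sutcliffe efficiency: 1 - sum of squared errors / variance of observations."""
    q_sim, q_obs = np.asarray(q_sim, float), np.asarray(q_obs, float)
    return 1.0 - np.sum((q_obs - q_sim) ** 2) / np.sum((q_obs - q_obs.mean()) ** 2)

def mnse(q_sim, q_obs):
    """Modified NSE with absolute deviations, less dominated by peak flows (assumed variant)."""
    q_sim, q_obs = np.asarray(q_sim, float), np.asarray(q_obs, float)
    return 1.0 - np.sum(np.abs(q_obs - q_sim)) / np.sum(np.abs(q_obs - q_obs.mean()))
```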
Höge, M., Scheidegger, A., Baity-Jesi, M., et al. (2022). Improving hydrologic models for predictions and process understanding using neural ODEs. Hydrology and Earth System Sciences 26(19), 5085-5102.
Jiang, S., Zheng, Y. and Solomatine, D. (2020). Improving AI System Awareness of Geoscience Knowledge: Symbiotic Integration of Physical Approaches and Deep Learning. Geophysical Research Letters 47(13).
Citation: https://doi.org/10.5194/hess-2024-54-AC1
CC3: 'Reply on AC1', Kunlong He, 23 May 2024
When the time is opportune, examining the source code could illuminate this mechanism. I've employed KGE as a preliminary indicator to assess the performance; using KGE might yield varied outcomes. This, naturally, is my speculation.
Thank you very much for your efforts in this field.
References:
Althoff, D., & Rodrigues, L.N. (2021). Goodness-of-fit criteria for hydrological models: Model calibration and performance assessment. Journal of Hydrology, 600, 126674.
Kling, H., Fuchs, M., & Paulin, M. (2012). Runoff conditions in the upper Danube basin under an ensemble of climate change scenarios. Journal of Hydrology, 424-425, 264-277.
Citation: https://doi.org/10.5194/hess-2024-54-CC3
RC1: 'Comment on hess-2024-54', Anonymous Referee #1, 23 Apr 2024
Review of HESS Manuscript
“Hybrid Hydrological Modeling for Large Alpine Basins: A Distributed Approach”
Dear Editor,
Please find attached my review of the manuscript.
1. Scope
The scope of the paper is well suited for HESS.
2. Summary
The authors based their study on the hybrid model proposed by Li et al. (2023b), in which different modules of a conceptual hydrological model are substituted by Neural Networks. The authors then proposed a distributed version of the model, in which they subdivide the basins of interest into smaller subbasins and apply the hybrid models to each of these subbasins.
The authors then compare the performance of the lumped and distributed hybrid models against purely data-driven techniques (LSTM and CNN-LSTM) and show all models achieved similar performance. They also test the performance of the hybrid models to predict discharges in some of the predefined subbasins (untrained gauges). In the last sections, the authors run some experiments looking at the behaviour of the models when boundary conditions are modified (changes in precipitation and temperature).
3. Evaluation
Overall, the manuscript has the potential to be a good contribution, however, there are certain aspects mentioned in the questions below that should be taken into account before moving on to the next steps.
3.1 Major comments:
- The code is not published, and the authors indicate that it will be opened once the manuscript is accepted. I strongly recommend the editor to ask for an open code during the review process, as it increases the transparency of the study. I also tried to look for the code of the previous study (Li et al., 2023b) however I was not able to find it.
- They also indicate that the discharge information is not publicly available due to privacy reasons. Even though this reason is valid and outside of the capabilities of the authors, it automatically makes the study non-reproducible, which is especially important when machine learning methods are being proposed.
- The printing quality of all figures should be improved. When I zoom in, I cannot see the details. I suggest the authors print the figures in 300 dpi.
- The authors do not show the subbasins they used to create the distributed model. I encourage them to include this information. Also, I was not able to find information on the number of subbasins they used.
- One major concern is that they are not considering any routing method. Consequently, even with a distributed model they just sum up the discharge coming from each subbasin. Moreover, the authors are working with large basins (over 90 000 km2 according to the manuscript) in which the routing processes can become highly relevant. Is there a reason why no routing is being used?
- The good performance of the distributed models can also be attributed to the fact that one has a more flexible model. More flexible models can get a better fit to the data, but this is not directly related to having a distributed version. One way to test this hypothesis is to use the same number of models you used in the distributed version but let them receive the same data (similar to Feng, 2022). If your models performed better, then you can say that the distributed nature of the models is beneficial, and the improvement is not just because of the increase in flexibility. If the performance is the same, then it would mean that the distributed version (especially without any routing) does not give an advantage.
- In Section 3.2 the authors present the results of the evaluation on untrained gauges. One can see that for the MT subbasin the performance of three models completely drops when compared to the performance in the original basin (TNH). Why? In the text it was mentioned that “the models show comparatively poorer performance in runoff modeling at the MT station” but it was never explained why. Also, the discharge range (Figure 5) of the subbasins is similar, so is there a reason for the big differences in performance?
3.2 Minor comments:
Line 25: Is the word ‘almost’ a typo?
Line 32: Saying that process-based models can be used to understand the entire hydrological system including all internal processes is an overstatement, especially if you are referring to conceptual models. Conceptual models are mostly based on parameterized (empirical) relationships that somehow account for our understanding of the system, however the physics behind them is not much.
Line 122: It is not clear what the authors mean. Are the parameters the same for all basins or do they change?
Line 145: It would be better to create a table to specify the characteristics of each model.
Line 158: Hochreiter and Schmidhuber, 1997 created the LSTM architecture, but in this line, it sounds like they proposed the architecture for hydrological modelling. This paper should of course be cited, but not mixed with the other papers of hydrological applications.
Figure 2. How was the spatial discretization of the basins? This should be included in the paper.
Line 195: What does suites of experiments mean? Shouldn't it be set of experiments?
Line 242-244: The PFAB values show a difference, but the changes of 0.01 in NSE are not significant. This can be just because of the initialization of the model. I do not agree that there is evidence to support that one model has an augmented ability to simulate the overall runoff process.
Figure 3: The size of the figure should be increased, right now it is hard to see the details of the hydrographs. Also, the authors should use the same line width for all models. Right now, some lines look thicker than others, which gives a bias to the figure. The last hydrograph does not have values on the y-axis.
Line 263: The authors say that when one includes air temperature in the ENN there is an evident enhancement of the model performance. The PFAB values vary a bit more, but the differences in NSE values are extremely small (0.01 for 4 cases, 0.02 for 1 case and 0.03 for another). This is just one metric summarizing more than 5 years of data. I do not agree that NSE values show an evident enhancement of the model performance.
Figure 5e: How are the attributes being normalized? I do not understand how the area of the subbasins is comparable with the area of the entire basin. Also, why is the figure showing a range when referring to static attributes? This figure is not explained in the text.
Figure 7. The figure title indicates that the grey and yellow shading indicate annual and monthly responses. However, there are no shadings in the figure.
Figure 8. I suggest the authors use a proper name for the figure and not just refer to another figure.
Citation: https://doi.org/10.5194/hess-2024-54-RC1
AC3: 'Reply on RC1', Bu Li, 31 May 2024
The comment was uploaded in the form of a supplement: https://hess.copernicus.org/preprints/hess-2024-54/hess-2024-54-AC3-supplement.pdf
AC4: 'Reply on RC1', Bu Li, 03 Jun 2024
The comment was uploaded in the form of a supplement: https://hess.copernicus.org/preprints/hess-2024-54/hess-2024-54-AC4-supplement.pdf
RC2: 'Comment on hess-2024-54', Anonymous Referee #2, 12 May 2024
The paper developed a hybrid framework that integrates a distributed process-based hydrological model and embedded neural networks (ENNs) for streamflow modeling in large alpine basins. The distributed EXP-Hydro model uses multiple mathematical equations to describe hydrological systems, including precipitation, snowmelt, runoff, and baseflow, which can be replaced by neural networks. The hybrid framework performs well in both gauged and ungauged basins across three large alpine basins. My major concerns are as follows:
- I suggest the authors rewrite the abstract, as it is too long. Some sentences should be moved to the introduction or results sections of the manuscript.
- The differences between the distributed models and the corresponding lumped models are unclear. From the manuscript, it appears that the only difference is that the lumped model simulates discharge for the entire basin, while the distributed model simulates discharge for each subbasin, and then summarizes the discharge for all the subbasins. Runoff routing is an important process in distributed hydrological models, which is also crucial for large basins. Please explain why river routing is missing.
- Please demonstrate the importance of using subbasins in alpine basins due to the significant variability of precipitation and temperature in space. Additionally, the sensitivity of the area threshold for the subbasins is not discussed in the manuscript. While the authors may have experience defining the threshold in Tibetan basins, it is unclear how this applies to other basins.
- The significance of model performance is not discussed in the manuscript. For example, DMθ-Q-T and DMθ-QSM-T have very close NSE values in the Yellow River and Lancang River. If the authors only trained the model once, it is unclear if the differences are statistically significant.
- The authors conducted a series of sensitivity tests of runoff to climate change. However, it is difficult to explain the internal structure of a neural network and how we can trust the extrapolated results. For example, the model was not trained on a 20% increase in precipitation, meaning the perturbed scenarios are extrapolations. It would be more accurate to refer to this as model sensitivity to dynamic inputs rather than concluding runoff sensitivities to climate change.
- The improvement in streamflow estimation is important. However, it would be interesting to investigate when and where these improvements occur. Please analyze the spatial differences between the deep learning models and the EXP-Hydro model in simulated discharge.
- I found it hard to follow many sentences; please polish the language. Some examples are listed below.
Minor comments:
Line 25: Alpine basins are important water sources, playing a crucial role in various aspects of human life and the environment, such as domestic water supply, irrigation, hydropower generation, and climate regulation. Please rewrite the sentence.
Line 26: The performance of a hydrological model can be accurate, to describe the model, use reliable could be better.
Line 27: shorten the sentence and use ‘climate change and adaptation’.
Line 31: These models depend on physical laws and empirical knowledge.
Line 32-34: The sentence is too long. In addition, are these hydrological models sufficient to understand all hydrological processes?
Line 41: streamflow/discharge forecasting, snow water equivalent modeling, and groundwater level mapping. Please rewrite the sentence.
Figure 2. Please add some subplots to show the spatial variability of precipitation and temperature, which is the main reason for using the distributed schemes. Please show the subbasins and indicate the number of subbasins.
Line 86: …the proposed models…
Line 87-88: Can the ENNs produce optimal parameters?
Line 203: The training period is 26 years and the evaluation/testing period is only 6 years. Is this setting reasonable? Why not set the same length for the training and testing? Please explain.
Line 247: I don’t think an improvement of NSE from 0.06 to 0.09 is a substantial improvement. Please rewrite the sentence.
Citation: https://doi.org/10.5194/hess-2024-54-RC2
AC5: 'Reply on RC2', Bu Li, 04 Jun 2024
The comment was uploaded in the form of a supplement: https://hess.copernicus.org/preprints/hess-2024-54/hess-2024-54-AC5-supplement.pdf
CC2: 'Comment on hess-2024-54', John Ding, 13 May 2024
On the optimum NSE (Nash-Sutcliffe efficiency) value
During a recent development of a dual NSE scale, one byproduct we've produced is a hydrograph slope-based (DQ/Dt), 1-step forecast function called AR2, e.g., Cinkus et al. (2023, AC2), Ding (2023). This is expressed as follows:
Qar2[t+1] = Qobs[t] + (Qobs[t] - Qobs[t-1]) = 2Qobs[t] - Qobs[t-1].
The acceleration-based model, i.e., a physics one, is known as a second-order autoregressive process, AR(2,0,2,-1), or AR2 for short.
The AR2 is preset, as there are no free parameters to calibrate against observations. For an observed and an AR2-simulated hydrographs, an AR2_NSE can be computed (Eq. 1.)
I encourage the authors to compute the AR2_NSE for, as an example, the Yangtze River at ZMD gauge for the streamflow data they have, but not publicly available (RC1, Sect. 3.1, Bullet 2). I, for one, will be interested in how the alternate baseline compares with their optimal NSE value of 0.87 for the Yangtze (Lines 14-15, Fig. 3, and Table 1).
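For reference, a minimal sketch of that baseline for a single daily discharge series (the array name and the handling of the first two timesteps are illustrative):

```python
import numpy as np

def ar2_baseline_nse(q_obs):
    """NSE of the parameter-free AR2 forecast Q[t+1] = 2*Q[t] - Q[t-1] against observations."""
    q_obs = np.asarray(q_obs, dtype=float)
    q_ar2 = 2.0 * q_obs[1:-1] - q_obs[:-2]   # forecasts for timesteps 2 .. n-1
    target = q_obs[2:]
    return 1.0 - np.sum((target - q_ar2) ** 2) / np.sum((target - target.mean()) ** 2)
```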
References
Cinkus, G., Mazzilli, N., Jourde, H., Wunsch, A., Liesch, T., Ravbar, N., Chen, Z., and Goldscheider, N.: When best is the enemy of good – critical evaluation of performance criteria in hydrological models, Hydrol. Earth Syst. Sci., 27, 2397–2411, https://doi.org/10.5194/hess-27-2397-2023, 2023.
Ding, J. Y. (2023). “A dual Nash–Sutcliffe model efficiency scale: Introducing a simplest second order autoregressive process AR2 as an alternate benchmark model.” Manuscript # WRE4867 in: Conference Program & Abstract Proceedings, The 9th International Conference on Water Resource and Environment (WRE2023), November 21-24, 2023, Matsue, Japan, published by I-Shou University, Taiwan.
Citation: https://doi.org/10.5194/hess-2024-54-CC2
AC2: 'Reply on CC2', Bu Li, 24 May 2024
On the optimum NSE (Nash-Sutcliffe efficiency) value
During a recent development of a dual NSE scale, one byproduct we've produced is a hydrograph slope-based (DQ/Dt), 1-step forecast function called AR2, e.g., Cinkus et al. (2023, AC2), Ding (2023). This is expressed as follows:
Qar2[t+1] = Qobs[t] + (Qobs[t] - Qobs[t-1]) = 2Qobs[t] - Qobs[t-1].
The acceleration-based model, i.e., a physics one, is known as a second-order autoregressive process, AR(2,0,2,-1), or AR2 for short.
The AR2 is preset, as there are no free parameters to calibrate against observations. For an observed and an AR2-simulated hydrographs, an AR2_NSE can be computed (Eq. 1.)
I encourage the authors to compute the AR2_NSE for, as an example, the Yangtze River at ZMD gauge for the streamflow data they have, but not publicly available (RC1, Sect. 3.1, Bullet 2). I, for one, will be interested in how the alternate baseline compares with their optimal NSE value of 0.87 for the Yangtze (Lines 14-15, Fig. 3, and Table 1).
Response: Thanks for your comment. Similar to other relevant studies (Höge et al. 2022, Jiang et al. 2020, Li et al. 2023), we utilized three common evaluation metrics, NSE, mNSE, and PFAB, to test the performance of the proposed models. These three metrics measure the overall goodness of fit, the fit to low flows, and the bias in peak flows between the simulated and observed data, respectively. Therefore, these three metrics can be used to systematically evaluate model performance, and employing additional metrics to test the model performance is not the main objective of this contribution. Besides, we plan to provide our simulation results in a revised document for readers to calculate different evaluation metrics.
Höge, M., Scheidegger, A., Baity-Jesi, M., et al. (2022). Improving hydrologic models for predictions and process understanding using neural ODEs. Hydrology and Earth System Sciences 26(19), 5085-5102.
Jiang, S., Zheng, Y. and Solomatine, D. (2020). Improving AI System Awareness of Geoscience Knowledge: Symbiotic Integration of Physical Approaches and Deep Learning. Geophysical Research Letters 47(13).
Li, B., Sun, T., Tian, F., et al. (2023). Enhancing process-based hydrological models with embedded neural networks: A hybrid approach. Journal of Hydrology, 130107.
Citation: https://doi.org/10.5194/hess-2024-54-AC2