the Creative Commons Attribution 4.0 License.
Knowledge-Informed Deep Learning for Hydrological Model Calibration: An Application to Coal Creek Watershed in Colorado
Pin Shuai
Alexandar Sun
Maruti K. Mudunuru
Xingyuan Chen
Abstract. Deep learning (DL)-assisted inverse mapping has shown promise in hydrological model calibration by directly estimating parameters from observations. However, the increasing computational demand for running the state-of-the-art hydrological model limits sufficient ensemble runs for its calibration. In this work, we present a novel knowledge-informed deep learning method that can efficiently conduct the calibration using a few hundred realizations. The method involves two steps. First, we determine decisive model parameters from a complete parameter set based on the mutual information (MI) between model responses and each parameter computed by a limited number of realizations (~50). Second, we perform more ensemble runs (e.g., several hundred) to generate the training sets for the inverse mapping, which selects informative model responses for estimating each parameter using MI-based parameter sensitivity. We applied this new DL-based method to calibrate a process-based integrated hydrological model, the Advanced Terrestrial Simulator (ATS), at Coal Creek Watershed, CO. The calibration is performed against observed stream discharge (Q) and remotely sensed evapotranspiration (ET) from the water year 2016 to 2018. Preliminary MI analysis on 50 realizations resulted in a down-selection of seven out of fourteen ATS model parameters. Then, we performed a complete MI analysis on 396 realizations and constructed the inverse mapping from informative responses to each of the selected parameters using a deep neural network. Compared with calibration using all observations, the new inverse mapping improves parameter estimations, thus enhancing the performance of ATS forward model runs. The Nash-Sutcliffe efficiency (NSE) of streamflow predictions increases from 0.65 to 0.80 when calibrating against Q alone. 
Using ET observations, on the other hand, yields little improvement in ATS performance, mainly because of the uncertainty of the remotely sensed product and the insufficient coverage of the observations by the model ET ensemble. Using observed Q only, we further performed a multi-year analysis and show that Q is best simulated (NSE: 0.85) when the calibration includes the dry-year flow dynamics, which are more sensitive to subsurface characteristics than those of the wet years. Our success highlights the importance of leveraging data-driven knowledge in DL-assisted hydrological model calibration.
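The two quantities the abstract relies on, MI-based parameter screening and the Nash-Sutcliffe efficiency, can be illustrated with a minimal sketch. It is not the authors' implementation: the equal-width histogram (plug-in) MI estimator, the scalar response, the function names, and the `threshold` value are all assumptions made for illustration, and the page does not state which estimator or cutoff the paper uses. Plug-in estimates are also biased upward for small ensembles, so a cutoff of this kind would need care with only ~50 realizations.

```python
import math
from collections import Counter


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - (sum of squared errors) /
    (sum of squared deviations of the observations from their mean)."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst


def _bin_indices(values, n_bins):
    """Assign each value to one of n_bins equal-width bins."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant series: everything falls in a single bin
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(x, y, n_bins=5):
    """Plug-in histogram estimate of MI between two samples, in nats.
    (Assumed estimator, chosen for simplicity.)"""
    bx, by = _bin_indices(x, n_bins), _bin_indices(y, n_bins)
    n = len(x)
    joint, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum(
        (c / n) * math.log(c * n / (px[i] * py[j]))
        for (i, j), c in joint.items()
    )


def screen_parameters(param_samples, response, threshold=0.1, n_bins=5):
    """Keep parameters whose MI with the response exceeds a cutoff.
    `param_samples` maps a parameter name to its sampled values across
    realizations; `threshold` is a hypothetical value, not from the paper."""
    scores = {
        name: mutual_information(values, response, n_bins)
        for name, values in param_samples.items()
    }
    kept = [name for name, mi in scores.items() if mi > threshold]
    return kept, scores
```

In the workflow described above, the response would be a time series rather than a scalar, so a score of this kind would be computed per time step (or per time window) for each parameter.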
Peishi Jiang et al.
Status: closed
-
RC1: 'Comment on hess-2022-282', Anonymous Referee #1, 19 Sep 2022
General comments:
This study showcases a deep learning optimization method for a high-resolution hydrologic model supported by information theory. I appreciate the honest evaluation of the methodology, in-depth reasoning of the deteriorating model performance for ET, and examination of results and conclusions aligned with earlier studies. In general, this paper is well-written with a novel contribution. However, I think the paper would be stronger if the authors can address the following comments.
- Model validation for climate sensitivity: Currently, the model validation period overlaps with the period for calibrating ATS parameters. I am curious whether the optimized parameters would be able to capture the climate sensitivity on flow and ET, i.e., improving the flow/ET performance outside of the calibrating period (2016-2019). It would strongly support this tool’s eligibility in climate change studies.
- ET from flux tower: In this study, the authors have demonstrated that worse ET performance results from poor quality of MODIS ET products. In this study region, is there ET data from the flux tower that could be used for implementing this workflow? Even though the flux tower ET data has less spatial coverage, the data quality can be better, which might be more useful than MODIS ET when calibrating hydrologic parameters.
Specific comments:
L158: Can the authors elaborate on what five soil types and four geological types are?
L160: A 1000-year spin-up is extremely long. Can the authors briefly explain the reason for this long spin-up even if it might be explained in Shuai et al 2022?
L162: Could the authors briefly explain how they preselected the parameters in this study?
L208: Does the MI have to be zero? If the MI between a parameter and the model responses is small enough, is it possible to neglect that parameter? What would be a proper threshold for it?
L208-210: Interesting! Great summary!
L215: When training using different combinations of years, why do the authors only look at Q, not ET?
L249-250: Given the narrowed list, it seems that the authors eliminated the parameters with small MI (not zero), which slightly contradicts the previous statement where only parameters with zero MI would be eliminated (L208). It would be helpful to clarify the threshold of MI below which the parameters will be eliminated.
L286-287: Please clarify whether the extrapolation issue partially or solely contributes to the worse MI-informed results.
L320: Very interesting results!
Logistics:
Author name: Should the third author be Alexander?
Citation: https://doi.org/10.5194/hess-2022-282-RC1
- AC1: 'Reply on RC1', Peishi Jiang, 28 Dec 2022
-
RC2: 'Comment on hess-2022-282', Anonymous Referee #2, 10 Oct 2022
This study aims at basin-scale parameter calibration for a physical hydrologic model (ATS) using a DL-based inverse method. The authors leveraged mutual information (MI) for global sensitivity analysis to identify the relation between parameters and model simulations, which was later applied to the input selection of an MLP parameter inverse model. They executed different groups of simulations and analyses to comprehensively evaluate the proposed framework. The MS is well-written, with an overall structure that is easy to follow. I provide my suggestions below for clarifying several points; hopefully they can be useful in further improving the quality of this study.
As I understand this study, the “knowledge-informed DL” of the title is mainly represented by the MI sensitivity analysis used in the input selection for the subsequent inverse modeling. Knowledge-informed learning, in my mind, generally means applying physical laws or constraints to the data-driven model based on domain knowledge. To bridge the proposed MI and physical processes and to strengthen the headline of this study, I suggest the authors try to link the MI results with the physical processes of the study area and give some physical explanations of the sensitivity analysis results. This can further highlight the physical representations of this study.
I am still confused about the details of how the inverse framework is set up and trained. My understanding is that you first run some simulations with ATS (how are the parameters first initialized here?) and use the simulations and parameters to train an inverse mapping with inputs selected by MI, and then replace ATS simulations with real observations to estimate parameters. Do the “responses” mentioned throughout the paper mean the simulated ATS discharge and ET? What are the training targets, and how do you develop the structure, tune the hyperparameters, and train the DL framework? How are the training and testing datasets separated? Maybe I didn’t understand some parts very well, but I do expect the authors to clarify their methodology and results so that readers can understand this work more easily.
I didn’t understand the result of Figure 7 well and hope the authors can give more explanation. Which variables are the NSE and mKGE calculated on: estimated parameters or model simulations? If they are simulation metrics, are these simulations from forward model runs with parameters estimated from real observations (Q & MODIS ET inverse)? For each individual parameter evaluation, how do you set the values of the other parameters in the ATS forward runs? The caption notes that the performance is reported on testing data, but I didn’t see how the authors divide testing and training data.
I have a question about the multiple-year training versus one-year training discussed in section 3.3. For multiple years, do you increase the number of input neurons, or keep the one-year structure unchanged and use the multiple years of data as additional training samples? I think the latter could be more beneficial, because inputting a three-year time series at once would require a large number of parameters in the input layer, which can be inefficient and prone to overfitting on small training data.
Another point I would be interested in is whether the authors have tried adding meteorological forcings to the inputs of inverse modeling. I feel the forcing-hydrologic response pair is very important to inform the characteristics of basin processes reflected in model parameters. I am expecting the paired input may bring more benefits to this study.
Specific comments
Line 76 Do you intend to discuss the overfitting problem here? Large number of weights and limited realizations as training data may cause overfitting with a complicated model.
Line 177 Please also give explanations for H(Y|X) to help readers’ understanding.
Lines 258 and 259 How can the authors safely conclude “improves the MI estimations” and “the parameters are falsely considered” from the differences between the preliminary and full analyses? Additionally, is it possible that some parameters are not identified in the preliminary analysis but would actually prove sensitive if they were included in the full MI analysis?
Figure 8 Are the inputs to the inverse model here real observations or simulated responses?
Citation: https://doi.org/10.5194/hess-2022-282-RC2
- AC2: 'Reply on RC2', Peishi Jiang, 28 Dec 2022
-
RC3: 'Comment on hess-2022-282', Anonymous Referee #3, 04 Dec 2022
This paper proposes a knowledge-informed deep learning method that can reduce the computational demand required by the calibration of the computationally expensive environmental model. I like the proposed MI sensitivity analysis best because it is able to disclose the sensitivity of parameters varying along with time which traditional sensitivity analysis is not capable of. Please see the comments in the attached PDF file for suggestions and questions.
- AC3: 'Reply on RC3', Peishi Jiang, 28 Dec 2022