Simulation-Based Inference for Parameter Estimation of Complex Watershed Simulators

Hull, Robert; Leonarduzzi, Elena; De La Fuente, Luis; Viet Tran, Hoang; Bennett, Andrew; Melchior, Peter; Maxwell, Reed M.; Condon, Laura E.

doi:https://doi.org/10.5194/hess-2023-264

08 Jan 2024

Status: a revised version of this preprint was accepted for the journal HESS.

Abstract. High-resolution, spatially distributed process-based (PB) simulators are widely employed in the study of complex watershed processes and their responses to a changing climate. However, calibrating these simulators to observed data remains a significant challenge due to several persistent issues including: (1) intractability stemming from the computational demands and complex responses of simulators, which renders infeasible calculation of the conditional probability of parameters and data, and (2) uncertainty stemming from the choice of simplified model representations of complex natural hydrologic processes. Here we demonstrate how Simulation-Based Inference (SBI) can help address both these challenges for parameter estimation. SBI uses a learned mapping between parameter space and observed data to estimate parameters for generation of calibrated model simulations. To demonstrate the potential of SBI in hydrologic modelling, we conduct a set of synthetic experiments to infer two common physical parameters, Manning's coefficient and hydraulic conductivity, using a representation of a snowmelt-dominated catchment in Colorado, USA. We introduce novel deep learning (DL) components to the SBI approach, including an 'emulator' as a surrogate for the process-based simulator to rapidly explore parameter responses. We also employ a density-based neural network to represent the joint probability of parameters and data without strong assumptions about its functional form. While addressing intractability, we also show that where uncertainty about model structure is significant, SBI can yield unreliable parameter estimates. Approaches to adopting the SBI framework to cases where model structure(s) may not be adequate are introduced using a performance-weighting approach.
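
As a rough illustration of the general SBI idea summarized in the abstract (draw parameters from a prior, run a simulator, and retain the parameters whose simulations resemble the observation), the following Python sketch uses plain rejection sampling. It is not the authors' method: the one-parameter recession curve `toy_simulator`, the uniform prior bounds, the RMSE distance, and the acceptance fraction are all hypothetical choices made for this example, and rejection sampling stands in for the LSTM emulator and neural density estimator described in the paper.

```python
import numpy as np


def toy_simulator(k, t, rng, noise_sd=0.02):
    """Hypothetical stand-in for a watershed simulator.

    Returns a noisy recession curve q(t) = exp(-k * t); k is the only parameter.
    """
    return np.exp(-k * t) + rng.normal(0.0, noise_sd, size=t.shape)


def rejection_sbi(observed, t, rng, n_sims=5000, keep_frac=0.02, prior=(0.1, 5.0)):
    """Approximate posterior samples of k by rejection sampling.

    The uniform prior bounds and the acceptance fraction are assumptions
    chosen for illustration only.
    """
    # 1. Draw candidate parameters from the (assumed) uniform prior.
    thetas = rng.uniform(prior[0], prior[1], size=n_sims)
    # 2. Run the simulator once per candidate.
    sims = np.stack([toy_simulator(k, t, rng) for k in thetas])
    # 3. Measure how far each simulation is from the observation (RMSE).
    dist = np.sqrt(np.mean((sims - observed) ** 2, axis=1))
    # 4. Keep the closest candidates as approximate posterior samples.
    n_keep = max(1, int(keep_frac * n_sims))
    return thetas[np.argsort(dist)[:n_keep]]


rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
true_k = 2.0  # "true" parameter used to generate the synthetic observation
observed = toy_simulator(true_k, t, rng)

posterior = rejection_sbi(observed, t, rng)
print(f"posterior mean = {posterior.mean():.3f}, sd = {posterior.std():.3f}")
```

Rejection sampling of this kind needs many simulator runs, which is one reason a fast emulator and a learned density estimator, as described in the abstract, become attractive for expensive simulators.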

Received: 09 Nov 2023 – Discussion started: 08 Jan 2024

Status: final response (author comments only)

RC1:
'Comment on hess-2023-264', Anonymous Referee #1, 13 Feb 2024

The paper "Simulation-Based Inference for Parameter Estimation of Complex Watershed Simulators" introduces a method utilizing SBI combined with deep learning techniques to improve the calibration of process-based simulators, focusing on Manning's coefficient and hydraulic conductivity in a snowmelt-dominated catchment. It aims to address two main challenges: the computational intractability of simulating complex watershed processes for parameter estimation and the uncertainty arising from simplified model representations of these complex processes. The study performed a series of synthetic experiments to investigate the performance of the SBI approach. The study is generally well-designed and the manuscript provides a detailed explanation of the methods, experiments, and results. Overall, I have no major concerns regarding the methodology of the paper. However, I believe that some points need further clarification.
1) One of the basic assumptions of the presented research is the efficacy of LSTM networks as surrogates for process-based models. While the study acknowledges that LSTMs may not perfectly reproduce the behaviors, I think the LSTM model in this case study is still ideal (based on Table 3, the worst KGE can be as high as 0.77, which indicates generally good performance of the LSTM in mimicking the process-based model). This is understandable because the basin is dominated by snowmelt, which the LSTM does well due to its strong memory capacity. However, the broader applicability of the approach still needs to be discussed, especially given the known challenges LSTMs face in accurately representing hydrologic behavior in arid regions (e.g., https://doi.org/10.1029/2019WR026793).
2) My second concern is with the terminology used throughout the description of the experimental settings, particularly the interchangeable use of the terms "observation", "simulation", and "synthetic observation". I understand that "observation" mostly refers to the ParFlow simulation, but sometimes the distinction is not so clear, forcing me to rely on context to understand the intended meaning. This ambiguity is further complicated by the term "simulator", which is used variably to refer to both "ParFlow" and the "LSTM" model (see my specific comments).
3) A related point is: When I read the text discussing model misspecification, particularly lines 82-90, I initially interpreted the discussion as addressing uncertainties arising from the ParFlow model. This interpretation was influenced by the preceding context, which focuses primarily on the challenges associated with PB models. However, as I read through the manuscript, it became clear that the uncertainties referred to might actually be related to the LSTM model used in the SBI framework, since ensemble modeling is applied to LSTM models instead of ParFlow. Perhaps I misunderstood something, but it has indeed caused some confusion. If this is the case, the statement "calibrating these simulators to observed data remains a significant challenge due to several persistent issues including: .... uncertainty stemming from the choice of simplified model representations of complex natural hydrologic processes" in the Abstract would also be misleading, as the manuscript primarily addresses uncertainties related to the LSTM surrogate simulator without adequately discussing the uncertainties inherent in the process-based (ParFlow) model itself.
Specific comments are below.
Abstract: "watershed" and "catchment" are used interchangeably here and throughout the text (along with "basin"). I am not sure if the author wants to distinguish between them, otherwise please consider a consistent word.

l62-64: I think the two sentences are contradictory if "watershed prediction" is identical to "streamflow prediction". Perhaps watershed prediction here refers to the prediction of hydrologic states and fluxes other than streamflow; if so, please clarify.

l67: It would be helpful to state what the difference is between this study and Tsai's study in terms of how DL helps with parameter calibration of PB models.

l74: I would say that it is not always true that DL can "preserve fidelity" unless the surrogate has sufficient predictive power (see my major comment).

l88 (and Sec 2.4): Which model do you mean here: ParFlow, the LSTM, or the neural density estimator? This is quite confusing.

Sec 2.2: It seems that SBI itself includes deep learning (l68), but here it looks like SBI can be independent of deep learning. Please clarify.

l279: What prior distribution is used in the study? Please clarify and justify.

l299-l304: Which simulator do you mean here? I assume it is the surrogate simulator (LSTM), because you take the result of the ParFlow simulator as an observation. Please clarify and unify.

l316: It was not until I read this line that I realized you are referring to the uncertainty from the surrogate simulator.

l380: How was the 14 days determined?

l391: It would be nice to show the NSE or KGE value here.

l391: I am curious about the applicability of the SBI approach to other basins in case the surrogate simulator does not capture the flow behavior well.

l431: Which simulator do you mean here?

l471: It is not clear to me what model structure you are referring to.

l542: I find it strange to call the LSTM result a "synthetic observation".

Figure 4: Since you have defined many "observations", please clarify which observation you mean here.

l559: please consider not using e-style scientific notation here (see inconsistency with l581 and l582).

Sec 4.1.1: Consider removing the header if there is no "Sec 4.1.2".

l630: please justify why different settings of LSTM were used than in exp #2.
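
For readers unfamiliar with the NSE and KGE scores referred to in the comments above (Table 3 and l391), the following Python sketch shows how they are commonly computed. It assumes the standard Nash-Sutcliffe efficiency and the original 2009 formulation of the Kling-Gupta efficiency; the manuscript may use a different variant, and the function names and example series are hypothetical.

```python
import numpy as np


def nse(sim, obs):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)


def kge(sim, obs):
    """Kling-Gupta efficiency (assumed 2009 form): 1 is a perfect fit."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    r = np.corrcoef(sim, obs)[0, 1]   # linear correlation
    alpha = sim.std() / obs.std()     # variability ratio
    beta = sim.mean() / obs.mean()    # bias ratio
    return 1.0 - np.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)


# Made-up example series, for illustration only.
obs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
sim_doubled = 2.0 * obs  # perfectly correlated but biased simulation

print(nse(obs, obs), kge(obs, obs))                  # both approximately 1 for a perfect match
print(nse(sim_doubled, obs), kge(sim_doubled, obs))  # both negative for the biased case
```

The doubled series shows why the two scores can disagree in magnitude: the correlation is perfect, so KGE is penalized only through the variability and bias terms, while NSE is penalized through the squared errors directly.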

Citation: https://doi.org/10.5194/hess-2023-264-RC1
- AC1: 'Reply on RC1', Robert Hull, 11 Apr 2024
  
  We thank the reviewer for their detailed reading and comments, and are delighted they find merit in the manuscript. Please see our complete responses in the attached document, highlighted in red.
  
  Citation: https://doi.org/10.5194/hess-2023-264-AC1
RC2:
'Comment on hess-2023-264', Uwe Ehret, 05 Mar 2024

Dear Editor, dear Authors,
Please see my comments in the attachment.
Yours sincerely, Uwe Ehret

Citation: https://doi.org/10.5194/hess-2023-264-RC2
- AC2: 'Reply on RC2', Robert Hull, 11 Apr 2024
  
  We thank the reviewer for their detailed reading and comments, and are delighted they find merit in the manuscript. Please see our complete responses in the attached document, highlighted in red.
  
  Citation: https://doi.org/10.5194/hess-2023-264-AC2

Model code and software

Repository for SBI in the Taylor River Basin, Robert Hull, https://github.com/rhull21/sbi_taylor

Viewed

Total article views: 512 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
325	148	39	512	34	27

Views and downloads (calculated since 08 Jan 2024)

Month	HTML	PDF	XML	Total
Jan 2024	150	44	6	200
Feb 2024	39	14	5	58
Mar 2024	39	31	3	73
Apr 2024	41	17	12	70
May 2024	23	15	3	41
Jun 2024	19	16	2	37
Jul 2024	14	11	8	33

Latest update: 26 Jul 2024

Short summary

Large-scale hydrologic models are a needed tool to explore complex watershed processes and how they may evolve under a changing climate. However, calibrating them can be difficult because they are costly to run and have many unknown parameters. We implement a state-of-the-art approach to model calibration with a set of experiments in the Upper Colorado River Basin.


Total:	0
HTML:	0
PDF:	0
XML:	0