the Creative Commons Attribution 4.0 License.
Neural networks in catchment hydrology: A comparative study of different algorithms in an ensemble of ungauged basins in Germany
Abstract. This study presents a comparative analysis of different neural network models, including Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks, in predicting discharge within ungauged basins in Hesse, Germany. All models were trained on 54 catchments with 28 years of daily meteorological data, either including or excluding 11 static catchment attributes. The training process of each model-scenario combination was repeated 100 times, using a Latin hypercube sampler for hyperparameter optimisation with batch sizes of 256 and 2048. The evaluation was carried out using data from 35 additional catchments (6 years) to ensure predictions in basins that were not part of the training data. This evaluation assesses predictive accuracy and computational efficiency under varying batch sizes and input configurations, and includes a sensitivity analysis of various hydrological and meteorological inputs. The findings indicate that all examined artificial neural networks demonstrate significant predictive capabilities, with the CNN model exhibiting slightly superior performance, closely followed by the LSTM and GRU models. The integration of static features was found to improve performance across all models, highlighting the importance of feature selection. Furthermore, models using larger batch sizes displayed reduced performance. The analysis of computational efficiency revealed that the GRU model is 41 % faster than the CNN and 59 % faster than the LSTM model. Despite a modest disparity in performance among the models (<3.9 %), the GRU model's advantageous computational speed renders it an optimal compromise between predictive accuracy and computational demand.
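The abstract mentions that hyperparameters were drawn with a Latin hypercube sampler across 100 repetitions. A minimal sketch of that sampling strategy using `scipy.stats.qmc` is shown below; the three hyperparameters and their ranges are hypothetical, not the study's actual search space:

```python
import numpy as np
from scipy.stats import qmc

# Latin hypercube sample over a 3-D hyperparameter space
# (dimensions and ranges are invented for illustration)
sampler = qmc.LatinHypercube(d=3, seed=0)
unit_samples = sampler.random(n=100)  # 100 configurations, one per training repetition

# Scale the unit cube to ranges: log10(learning rate), hidden units, dropout
lower = [-4.0, 16, 0.0]
upper = [-2.0, 256, 0.5]
configs = qmc.scale(unit_samples, lower, upper)

learning_rates = 10.0 ** configs[:, 0]
hidden_units = configs[:, 1].round().astype(int)
dropouts = configs[:, 2]
```

Unlike independent random draws, the Latin hypercube stratifies each dimension, so the 100 samples cover every parameter range evenly.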
Status: open (until 12 Sep 2024)
CC1: 'Comment on hess-2024-183', John Ding, 23 Jul 2024
On the NSE metric, R squared, and coefficient of determination
In the discussion paper, Table 5 summarizes performances of three ANN models based on the NSE, R squared, and three other metrics. Using their top NSE value as a key, this is distilled below:
Model, NSE, R squared
CNN, 0.76, 0.82
LSTM, 0.75, 0.82
GRU, 0.72, 0.79

This shows that R squared establishes, empirically, an upper limit of the NSE. This is an interesting finding for interpreting the NSE. It helps answer a puzzling question, how high can an NSE value go, below a score of 1 for a perfect model being the observation. Kratzert et al. (2024, Figure 5) achieve a top median NSE value of ~0.79 for 531 CAMELS basins for LSTM model. In terms of achieving a highest possible NSE value which remains elusive, "even these 531 basins are most likely not enough to train optimal LSTM models for streamflow." (ibid., Lines 81-84).
But to be clear, is the authors' Coefficient of Determination (R squared) in Lines 330-331 the square of Pearson or linear correlation coefficient, gamma, defined in Equation 1 for the KGE metric and Line 240?
An earliest known NSE variant, called NDE (Nash-Ding efficiency, 1974), was recently recovered by Duc and Sawada (2023, Eq. 3 and Figure 2). NDE as well as NSE are variance, not correlation-based metrics. Figure 2 therein shows that the NDE, which happened to have been called R squared, establishes, statistically, an upper limit of the NSE.
Reference
Duc, L. and Sawada, Y.: A signal-processing-based interpretation of the Nash–Sutcliffe efficiency, Hydrol. Earth Syst. Sci., 27, 1827–1839, https://doi.org/10.5194/hess-27-1827-2023, 2023.
Kratzert, F., Gauch, M., Klotz, D., and Nearing, G.: HESS Opinions: Never train an LSTM on a single basin, Hydrol. Earth Syst. Sci. Discuss. [preprint], https://doi.org/10.5194/hess-2023-275, in review, 2024.
Citation: https://doi.org/10.5194/hess-2024-183-CC1
CC2: 'Reply on CC1', Max Weißenborn, 26 Jul 2024
We would like to thank John Ding for his interesting comment on our manuscript. Regarding the points he raised (quoted below):
“This shows that R squared establishes, empirically, an upper limit of the NSE. This is an interesting finding for interpreting the NSE. It helps answer a puzzling question, how high can an NSE value go, below a score of 1 for a perfect model being the observation. Kratzert et al. (2024, Figure 5) achieve a top median NSE value of ~0.79 for 531 CAMELS basins for LSTM model.”
The relationship between NSE and R squared is indeed an interesting aspect, which we will consider in the revised version of this manuscript.
“In terms of achieving a highest possible NSE value which remains elusive, "even these 531 basins are most likely not enough to train optimal LSTM models for streamflow." (ibid., Lines 81-84).“
John is right here; indeed, the basin data used for training might be inadequate. However, it could also be that the observed data itself comes with errors and uncertainties. We will discuss this point in the revised version of the manuscript.
“But to be clear, is the authors' Coefficient of Determination (R squared) in Lines 330-331 the square of Pearson or linear correlation coefficient, gamma, defined in Equation 1 for the KGE metric and Line 240?”

Lines 330-331 refer to the square of the Pearson product-moment correlation coefficient (https://numpy.org/doc/stable/reference/generated/numpy.corrcoef.html).
The variable r, defined in Equation 1 for the KGE metric in line 240, is calculated in the same way but is not squared.

“An earliest known NSE variant, called NDE (Nash-Ding efficiency, 1974), was recently recovered by Duc and Sawada (2023, Eq. 3 and Figure 2). NDE as well as NSE are variance, not correlation-based metrics. Figure 2 therein shows that the NDE, which happened to have been called R squared, establishes, statistically, an upper limit of the NSE.”
We will consider this recent publication in the revised version of the manuscript.
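To make the distinction between the two quantities concrete, the following sketch computes the NSE and the squared Pearson correlation on synthetic discharge data (the series and coefficients are invented for illustration; this is not the manuscript's evaluation code):

```python
import numpy as np

# Synthetic observed and simulated daily discharge (hypothetical values,
# only to illustrate the metrics discussed above)
rng = np.random.default_rng(42)
obs = rng.gamma(2.0, 5.0, size=365)            # "observed" discharge
sim = 0.8 * obs + rng.normal(0.0, 2.0, 365)    # biased, noisy "simulation"

# Nash-Sutcliffe efficiency: 1 - sum of squared errors / variance of observations
nse = 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)

# Pearson correlation via numpy.corrcoef (the r of Equation 1 in the KGE),
# squared to give the coefficient of determination R squared
r = np.corrcoef(obs, sim)[0, 1]
r_squared = r ** 2

# The decomposition NSE = r^2 - (alpha - r)^2 - beta_n^2 (Gupta et al., 2009)
# implies R squared is always an upper bound on the NSE
print(f"NSE = {nse:.3f}, R^2 = {r_squared:.3f}")
```

The decomposition noted in the final comment also makes the empirical pattern in Table 5 exact: the squared Pearson correlation can only equal or exceed the NSE.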
Citation: https://doi.org/10.5194/hess-2024-183-CC2