Hydrological Concept Formation inside Long Short-Term Memory (LSTM) networks
- 1School of Geography and the Environment, University of Oxford, South Parks Road, Oxford, United Kingdom, OX1 3QY
- 2Department of Engineering, University of Oxford, Oxford, United Kingdom
- 3LIT AI Lab & Institute for Machine Learning, Johannes Kepler University Linz, Linz, Austria
- 4UK Centre for Ecology and Hydrology, Maclean Building, Crowmarsh Gifford, Wallingford, United Kingdom, OX10 8BB
- 5International Institute for Applied Systems Analysis (IIASA), Laxenburg, Austria
- 6Google Research, Vienna, Austria
- 7Institute for Environmental Studies, VU University, De Boelelaan 1087, 1081HV, Amsterdam, The Netherlands
Abstract. Neural networks have been shown to be extremely effective rainfall-runoff models, where the river discharge is predicted from meteorological inputs. However, the question remains, what have these models learned? Is it possible to extract information about the learned relationships that map inputs to outputs? And do these mappings represent known hydrological concepts? Small-scale experiments have demonstrated that the internal states of Long Short-Term Memory Networks (LSTMs), a particular neural network architecture predisposed to hydrological modelling, can be interpreted. By extracting the tensors which represent the learned translation from inputs (precipitation, temperature) to outputs (discharge), this research seeks to understand what information the LSTM captures about the hydrological system. We assess the hypothesis that the LSTM replicates real-world processes and that we can extract information about these processes from the internal states of the LSTM. We examine the cell-state vector, which represents the memory of the LSTM, and explore the ways in which the LSTM learns to reproduce stores of water, such as soil moisture and snow cover. We use a simple regression approach to map the LSTM state-vector to our target stores (soil moisture and snow). Good correlations (R² > 0.8) between the probe outputs and the target variables of interest provide evidence that the LSTM contains information that reflects known hydrological processes comparable with the concept of variable-capacity soil moisture stores.
The implications of this study are threefold: 1) LSTMs reproduce known hydrological processes. 2) While conceptual models have theoretical assumptions embedded in the model a priori, the LSTM derives these from the data. These learned representations are interpretable by scientists. 3) LSTMs can be used to gain an estimate of intermediate stores of water such as soil moisture. While machine learning interpretability is still a nascent field, and our approach reflects a simple technique for exploring what the model has learned, the results are robust to different initial conditions and to a variety of benchmarking experiments. We therefore argue that deep learning approaches can be used to advance our scientific goals as well as our predictive goals.
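The probing idea described in the abstract can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' code: the array `cell_states`, the 64-cell width, the synthetic `soil_moisture` target, and the helper names `fit_linear_probe` and `r_squared` are all assumptions, and ridge regression is used here only as a simple stand-in for the regression applied in the manuscript.

```python
import numpy as np


def fit_linear_probe(states, target, ridge=1e-3):
    """Fit target ~ states @ w + b with ridge-regularised least squares."""
    # Centre both sides so the intercept can be recovered after solving.
    s_mean = states.mean(axis=0)
    t_mean = target.mean()
    S = states - s_mean
    t = target - t_mean
    # Closed-form ridge solution: (S^T S + ridge * I) w = S^T t
    w = np.linalg.solve(S.T @ S + ridge * np.eye(S.shape[1]), S.T @ t)
    b = t_mean - s_mean @ w
    return w, b


def r_squared(y_true, y_pred):
    """Coefficient of determination of the probe output against the target."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in for extracted LSTM cell states (assumption: 64 cells).
rng = np.random.default_rng(0)
n_steps, n_cells = 2000, 64
cell_states = rng.normal(size=(n_steps, n_cells))

# Synthetic "soil moisture" that depends linearly on a few cells plus noise.
true_w = np.zeros(n_cells)
true_w[:5] = [0.8, -0.5, 0.3, 0.2, -0.1]
soil_moisture = cell_states @ true_w + 0.1 * rng.normal(size=n_steps)

# Fit the probe on the first part of the record, evaluate on the remainder.
train, test = slice(0, 1500), slice(1500, None)
w, b = fit_linear_probe(cell_states[train], soil_moisture[train])
probe_r2 = r_squared(soil_moisture[test], cell_states[test] @ w + b)
print(f"held-out probe R^2: {probe_r2:.3f}")
```

Evaluating the probe on time steps that were not used for fitting is one way to check that a high R² reflects information in the states rather than overfitting of the probe itself.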
Thomas Lees et al.
Status: closed
-
CC1: 'Comment on hess-2021-566', John Ding, 24 Nov 2021
S-hydrograph vs. the unit hydrograph
The LSTM uses a hyperbolic tangent (tanh) as the activation function for the cell input g[t] and the recurrent input h[t]. This is not discussed in the manuscript but is described elsewhere (e.g., Frame et al., 2021, Fig. A1). The tanh curve is similar in shape to a summation or S-curve hydrograph in unit hydrograph theory, e.g., Chow (1964, Fig. 14-5(a)).
To map some hydrologic realism onto the LSTM, I suggest the authors consider, as an alternative, using a kernel, i.e. a unit hydrograph model, of which my work (Ding, 1974, Figs. 1 & 4) is but one example. Since a kernel typically has a unimodal distribution, a new LSTM-kernel variant will have to track whether the discharge is rising, falling, or remaining steady.
References
Chow, V. T., 1964. Handbook of applied hydrology, Section 14 - Runoff, McGraw-Hill, New York, ISBN 07-010774-2.
Ding, J. Y., 1974. Variable unit hydrograph. J. Hydrol., 22: 53-69.
-
CC2: 'Supplement on CC1', John Ding, 07 Dec 2021
A kernel or impulse response function, u[t], typically has a rising limb, a peak, and a falling limb.
At time step t-1, the discharge Q[t-1] had a corresponding u[t-1]. But at time step t, Q[t] has two possible u[t] values, depending on the sign of dQ = Q[t] - Q[t-1], i.e. whether the discharge is rising (+), falling (-), or steady (0). During the rising stage, u[t] > u[t-1]; during the falling stage, u[t] < u[t-1]; otherwise, u[t] = u[t-1].
In LSTM (long short-term memory) networks, one need only know u[t-1] and u[t], i.e. the short(est)-term memory, and nothing further back.
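The kernel view in this comment can be illustrated with a short sketch. It is only a toy example of unit hydrograph convolution, not an LSTM variant and not a method from the manuscript: the kernel ordinates in `u`, the rainfall series `rain`, and the variable names are hypothetical values chosen for illustration.

```python
import numpy as np

# Hypothetical unimodal kernel u[t]: rising limb, peak, falling limb.
# The ordinates sum to 1 so that water volume is preserved.
u = np.array([0.125, 0.25, 0.375, 0.1875, 0.0625])

# The S-curve (summation hydrograph) is the running sum of the kernel,
# i.e. the response to a sustained unit input.
s_curve = np.cumsum(u)

# Illustrative effective rainfall series (arbitrary units).
rain = np.array([0.0, 8.0, 0.0, 0.0, 4.0, 0.0, 0.0, 0.0])

# Discharge as the discrete convolution of rainfall with the kernel,
# truncated to the length of the rainfall record.
Q = np.convolve(rain, u)[: len(rain)]

# Sign of dQ = Q[t] - Q[t-1]: +1 rising, -1 falling, 0 steady.
stage = np.sign(np.diff(Q))

print("S-curve:", s_curve)
print("Q:      ", Q)
print("stage:  ", stage)
```

The `stage` series is the quantity the comment says a kernel-based variant would need to track: it depends only on two consecutive discharge values.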
-
AC1: 'Reply on CC1', Thomas Lees, 16 Feb 2022
We thank John Ding for his comments on our manuscript. Ultimately, our study was not focused on altering the LSTM architecture but rather on how we can best interpret the architecture that has already been shown to exhibit state-of-the-art performance for rainfall-runoff modelling. Other work has explored alterations to the LSTM architecture, and of particular interest is the Mass-Conserving LSTM (Hoedt et al., 2021), which some of the authors of this paper were involved in. That being said, the lead author, Thomas Lees, would be more than happy to discuss this further if you would like to reach out to him and arrange a time to talk. Thank you again for your interest in our work.
Hoedt, Pieter-Jan, et al. "MC-LSTM: Mass-Conserving LSTM." arXiv preprint arXiv:2101.05186 (2021).
-
RC1: 'Comment on hess-2021-566', Lukas Gudmundsson, 15 Dec 2021
The paper submitted by Lees et al. aims at advancing the interpretability of neural networks used for rainfall-runoff modelling, focussing in particular on Long Short-Term Memory (LSTM) architectures. LSTMs (a special type of neural network) have in recent years been popularized for rainfall-runoff modelling. The resulting models perform very well but lack a stringent physical interpretation. An interesting property of LSTMs is that they contain internal states – similar to internal states (storages) in classical hydrological models. Lees et al. use statistical “probes” to link these LSTM states to independent estimates of soil moisture and snow storage. The analysis shows that LSTM states mimic soil moisture and snow storage dynamics although these quantities were not used for model calibration.
This paper is, to my knowledge, the first systematic attempt to interpret the internals of an LSTM used for rainfall-runoff modelling, although unsystematic examples have been emerging in the literature. The application of LSTMs for rainfall-runoff modelling is state of the art, and the use of “probes” based on linear regression makes intuitive sense. Besides this, I personally value the authors' effort to assess the robustness of their results by (i) applying the “probes” to both reanalysis-based and remote-sensing-based soil moisture estimates and (ii) randomization-based statistical testing.
The objective of the study is clearly stated, i.e. to test whether internal states of an LSTM used for rainfall-runoff modelling at many catchments at once can be linked to independent estimates of soil moisture and snow. The paper is well structured to meet this objective, and the data and methods are chosen accordingly.
Nonetheless, the study leaves some open questions which the authors may want to explore further:
- An obvious limitation of the proposed approach is that it relies on independent estimates of soil-moisture and snow (or other variables). Therefore, it does not allow for a self-contained interpretation of LSTM states.
- It would be interesting to know how much of the variance of the internal state vectors is captured in the resulting soil moisture estimates.
- How many independent signals are present in the 64 states? (e.g. estimated as the number of dominant principal components)
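The reviewer's question about the number of independent signals could be explored along the following lines. This is a hedged sketch on synthetic data rather than an analysis of the actual model: the helper `n_dominant_components`, the 95 % variance threshold, and the construction of `states` from three hidden signals are all assumptions made for illustration.

```python
import numpy as np


def n_dominant_components(states, var_threshold=0.95):
    """Count principal components needed to reach a variance threshold."""
    centred = states - states.mean(axis=0)
    # Squared singular values of the centred matrix are proportional to the
    # variance explained by each principal component.
    sing = np.linalg.svd(centred, compute_uv=False)
    explained = sing**2 / np.sum(sing**2)
    cumulative = np.cumsum(explained)
    # First index at which the cumulative share reaches the threshold.
    k = int(np.searchsorted(cumulative, var_threshold) + 1)
    return k, explained


# Synthetic stand-in: 64 "cell states" driven by 3 hidden signals plus noise.
rng = np.random.default_rng(1)
n_steps, n_cells = 1000, 64
latent = rng.normal(size=(n_steps, 3))
mixing = np.zeros((3, n_cells))
mixing[0, :20] = 1.0    # cells 0-19 follow signal 0
mixing[1, 20:40] = 1.0  # cells 20-39 follow signal 1
mixing[2, 40:] = 1.0    # cells 40-63 follow signal 2
states = latent @ mixing + 0.01 * rng.normal(size=(n_steps, n_cells))

k, explained = n_dominant_components(states)
print("components needed for 95 % of variance:", k)
```

The threshold is a judgment call; inspecting the full `explained` spectrum is usually more informative than a single count.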
Minor issues:
- Inconsistency: Equations 1-2 use i_t as the input, whereas Equations 3-4 use x_t.
- The paper requires the reader to be familiar with how LSTMs work. I acknowledge the authors' choice not to repeat the LSTM definition, also since it is available in many other publications. Nonetheless, this made it a bit more difficult to fully understand the paper.
- Elastic net: I assume that the description of the elastic net regularisation might be quite cryptic to readers who are not familiar with this tool (I am). Also, given the large number of samples (# catchments x # time steps) and the relatively low number of predictors, I wondered whether a plain linear regression would perform equally well.
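For readers unfamiliar with the elastic net, one common way of writing its objective is given below. The notation is an assumption, since the manuscript's own parameterisation is not reproduced on this page: X is the matrix of LSTM states (n samples), y the target store, w the probe weights, α ≥ 0 the overall penalty strength, and ρ ∈ [0, 1] the mix between the L1 and L2 penalties.

```latex
\hat{w} \;=\; \arg\min_{w}\;
  \frac{1}{2n}\,\lVert y - Xw \rVert_2^2
  \;+\; \alpha\,\rho\,\lVert w \rVert_1
  \;+\; \frac{\alpha\,(1-\rho)}{2}\,\lVert w \rVert_2^2
```

Setting α = 0 removes both penalties and recovers ordinary least squares, which is the comparison raised in the comment above.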
- AC2: 'Reply on RC1', Thomas Lees, 16 Feb 2022
-
RC2: 'Comment on hess-2021-566', Anonymous Referee #2, 17 Jan 2022
Lees et al. adapted the “probe”, a novel method used in Natural Language Processing, to examine the internal function of the Long Short-Term Memory (LSTM) model in rainfall-runoff predictions. Their results over 669 catchments in Great Britain show a good correlation between the LSTM internal states and re-analysis and independent soil moisture and snow cover products.
I agree with the authors that this paper could be a stepping stone to a myriad of interesting explorations in the field of hydrology. I also appreciate the authors' effort in providing additional analysis in the appendices. However, I have some minor comments about some parts of the manuscript, mostly about the clarity and the tone toward traditional hydrologic models.
- I found the structure of the Introduction a bit difficult to follow and somewhat redundant. I could not get the logical flow here. I found the main objective was stated at both the beginning and the end of the Introduction. Why do we need a separate and long paragraph about interpreting machine learning in other fields? This paragraph disrupts my focus on LSTM interpretability.
- I think the authors don’t have to state that LSTM is the best rainfall-runoff model multiple times in the paper (Introduction and Conclusion). While this statement is still debatable, in my opinion, each rainfall-runoff model has its place in the modeling world. LSTM is increasing in popularity because of its robustness, computational efficiency and accuracy. Period. There is no need to bash one model in favour of another.
- Section 2.3 ERA5-Land Data: there is an imbalance between the descriptions of soil moisture and snow depth. I would expect to see more information about snow depth and its accuracy over GB.
- Figure 2: no y label
- Figure 5: no y label
- Line 249: I thought there are only two meteorological drivers (temperature and precipitation (line 6))?
- Line 268: See the second opinion
- Line 282: See the second opinion
- Line 319: I found a recent paper (Tran et al., Development of a Deep Learning Emulator for a Distributed Groundwater–Surface Water Model: ParFlow-ML. Water. 2021) in which spatial information is included in the LSTM architecture. Do the authors think the probing technique could be used in this architecture? Can the probing technique map between predicted and observed spatially distributed soil moisture?
- AC3: 'Reply on RC2', Thomas Lees, 16 Feb 2022