Articles | Volume 25, issue 5
https://doi.org/10.5194/hess-25-2685-2021
© Author(s) 2021. This work is distributed under
the Creative Commons Attribution 4.0 License.
A note on leveraging synergy in multiple meteorological data sets with deep learning for rainfall–runoff modeling
LIT AI Lab & Institute for Machine Learning, Johannes Kepler University Linz, Linz, Austria
Daniel Klotz
LIT AI Lab & Institute for Machine Learning, Johannes Kepler University Linz, Linz, Austria
Sepp Hochreiter
LIT AI Lab & Institute for Machine Learning, Johannes Kepler University Linz, Linz, Austria
Grey S. Nearing
CORRESPONDING AUTHOR
Google Research, Mountain View, CA, United States
Land, Air and Water Resources Department, University of California Davis, Davis, CA, USA
Related authors
Martin Gauch, Frederik Kratzert, Daniel Klotz, Grey Nearing, Deborah Cohen, and Oren Gilon
EGUsphere, https://doi.org/10.5194/egusphere-2025-1224, 2025
Short summary
Missing input data are one of the most common challenges when building deep learning hydrological models. We present and analyze different methods that can produce predictions when certain inputs are missing during training or inference. Our proposed strategies provide high accuracy while allowing for more flexible data handling and being robust to outages in operational scenarios.
Eduardo Acuña Espinoza, Frederik Kratzert, Daniel Klotz, Martin Gauch, Manuel Álvarez Chaves, Ralf Loritz, and Uwe Ehret
Hydrol. Earth Syst. Sci., 29, 1749–1758, https://doi.org/10.5194/hess-29-1749-2025, 2025
Short summary
Long short-term memory (LSTM) networks have demonstrated state-of-the-art performance for rainfall-runoff hydrological modelling. However, most studies focus on predictions at a daily scale, limiting the benefits of sub-daily (e.g. hourly) predictions in applications like flood forecasting. In this study, we introduce a new architecture, multi-frequency LSTM (MF-LSTM), designed to use inputs of various temporal frequencies to produce sub-daily (e.g. hourly) predictions at a moderate computational cost.
Eduardo Acuña Espinoza, Ralf Loritz, Frederik Kratzert, Daniel Klotz, Martin Gauch, Manuel Álvarez Chaves, and Uwe Ehret
Hydrol. Earth Syst. Sci., 29, 1277–1294, https://doi.org/10.5194/hess-29-1277-2025, 2025
Short summary
Data-driven techniques have shown the potential to outperform process-based models in rainfall–runoff simulations. Hybrid models, combining both approaches, aim to enhance accuracy and maintain interpretability. Expanding the set of test cases to evaluate hybrid models under different conditions, we test their generalization capabilities for extreme hydrological events.
Claudia Färber, Henning Plessow, Simon Mischel, Frederik Kratzert, Nans Addor, Guy Shalev, and Ulrich Looser
Earth Syst. Sci. Data Discuss., https://doi.org/10.5194/essd-2024-427, 2024
Revised manuscript accepted for ESSD
Short summary
Large-sample datasets are essential in hydrological science to support modelling studies and advance process understanding. Caravan is a community initiative to create a large-sample hydrology dataset of meteorological forcing data, catchment attributes, and discharge data for catchments around the world. This dataset is a subset of hydrological discharge data and station-based watersheds from the Global Runoff Data Centre (GRDC), which are covered by an open data policy.
Frederik Kratzert, Martin Gauch, Daniel Klotz, and Grey Nearing
Hydrol. Earth Syst. Sci., 28, 4187–4201, https://doi.org/10.5194/hess-28-4187-2024, 2024
Short summary
Recently, a special type of neural-network architecture became increasingly popular in hydrology literature. However, in most applications, this model was applied as a one-to-one replacement for hydrology models without adapting or rethinking the experimental setup. In this opinion paper, we show how this is almost always a bad decision and how using these kinds of models requires the use of large-sample hydrology data sets.
Andreas Auer, Martin Gauch, Frederik Kratzert, Grey Nearing, Sepp Hochreiter, and Daniel Klotz
Hydrol. Earth Syst. Sci., 28, 4099–4126, https://doi.org/10.5194/hess-28-4099-2024, 2024
Short summary
This work examines the impact of temporal and spatial information on the uncertainty estimation of streamflow forecasts. The study emphasizes the importance of data updates and global information for precise uncertainty estimates. We use conformal prediction to show that recent data enhance the estimates, even if only available infrequently. Local data yield reasonable average estimations but fall short for peak-flow events. The use of global data significantly improves these predictions.
Daniel Klotz, Martin Gauch, Frederik Kratzert, Grey Nearing, and Jakob Zscheischler
Hydrol. Earth Syst. Sci., 28, 3665–3673, https://doi.org/10.5194/hess-28-3665-2024, 2024
Short summary
The evaluation of model performance is essential for hydrological modeling. Using performance criteria requires a deep understanding of their properties. We focus on a counterintuitive aspect of the Nash–Sutcliffe efficiency (NSE) and show that if we divide the data into multiple parts, the overall performance can be higher than all the evaluations of the subsets. Although this follows from the definition of the NSE, the resulting behavior can have unintended consequences in practice.
Grey S. Nearing, Daniel Klotz, Jonathan M. Frame, Martin Gauch, Oren Gilon, Frederik Kratzert, Alden Keefe Sampson, Guy Shalev, and Sella Nevo
Hydrol. Earth Syst. Sci., 26, 5493–5513, https://doi.org/10.5194/hess-26-5493-2022, 2022
Short summary
When designing flood forecasting models, it is necessary to use all available data to achieve the most accurate predictions possible. This manuscript explores two basic ways of ingesting near-real-time streamflow data into machine learning streamflow models. The point we want to make is that when working in the context of machine learning (instead of traditional hydrology models that are based on bio-geophysics), it is not necessary to use complex statistical methods for injecting sparse data.
Sella Nevo, Efrat Morin, Adi Gerzi Rosenthal, Asher Metzger, Chen Barshai, Dana Weitzner, Dafi Voloshin, Frederik Kratzert, Gal Elidan, Gideon Dror, Gregory Begelman, Grey Nearing, Guy Shalev, Hila Noga, Ira Shavitt, Liora Yuklea, Moriah Royz, Niv Giladi, Nofar Peled Levi, Ofir Reich, Oren Gilon, Ronnie Maor, Shahar Timnat, Tal Shechter, Vladimir Anisimov, Yotam Gigi, Yuval Levin, Zach Moshe, Zvika Ben-Haim, Avinatan Hassidim, and Yossi Matias
Hydrol. Earth Syst. Sci., 26, 4013–4032, https://doi.org/10.5194/hess-26-4013-2022, 2022
Short summary
Early flood warnings are one of the most effective tools to save lives and goods. Machine learning (ML) models can improve flood prediction accuracy but their use in operational frameworks is limited. The paper presents a flood warning system, operational in India and Bangladesh, that uses ML models for forecasting river stage and flood inundation maps and discusses the models' performances. In 2021, more than 100 million flood alerts were sent to people near rivers over an area of 470 000 km².
Juliane Mai, Hongren Shen, Bryan A. Tolson, Étienne Gaborit, Richard Arsenault, James R. Craig, Vincent Fortin, Lauren M. Fry, Martin Gauch, Daniel Klotz, Frederik Kratzert, Nicole O'Brien, Daniel G. Princz, Sinan Rasiya Koya, Tirthankar Roy, Frank Seglenieks, Narayan K. Shrestha, André G. T. Temgoua, Vincent Vionnet, and Jonathan W. Waddell
Hydrol. Earth Syst. Sci., 26, 3537–3572, https://doi.org/10.5194/hess-26-3537-2022, 2022
Short summary
Model intercomparison studies are carried out to test various models and compare the quality of their outputs over the same domain. In this study, 13 diverse model setups using the same input data are evaluated over the Great Lakes region. Various model outputs – such as streamflow, evaporation, soil moisture, and amount of snow on the ground – are compared using standardized methods and metrics. The basin-wise model outputs and observations are made available through an interactive website.
Jonathan M. Frame, Frederik Kratzert, Daniel Klotz, Martin Gauch, Guy Shalev, Oren Gilon, Logan M. Qualls, Hoshin V. Gupta, and Grey S. Nearing
Hydrol. Earth Syst. Sci., 26, 3377–3392, https://doi.org/10.5194/hess-26-3377-2022, 2022
Short summary
The most accurate rainfall–runoff predictions are currently based on deep learning. There is a concern among hydrologists that deep learning models may not be reliable in extrapolation or for predicting extreme events. This study tests that hypothesis. The deep learning models remained relatively accurate in predicting extreme events compared with traditional models, even when extreme events were not included in the training set.
Thomas Lees, Steven Reece, Frederik Kratzert, Daniel Klotz, Martin Gauch, Jens De Bruijn, Reetik Kumar Sahu, Peter Greve, Louise Slater, and Simon J. Dadson
Hydrol. Earth Syst. Sci., 26, 3079–3101, https://doi.org/10.5194/hess-26-3079-2022, 2022
Short summary
Despite the accuracy of deep learning rainfall-runoff models, we are currently uncertain of what these models have learned. In this study we explore the internals of one deep learning architecture and demonstrate that the model learns about intermediate hydrological stores of soil moisture and snow water, despite never having seen data about these processes during training. Therefore, we find evidence that the deep learning approach learns a physically realistic mapping from inputs to outputs.
Daniel Klotz, Frederik Kratzert, Martin Gauch, Alden Keefe Sampson, Johannes Brandstetter, Günter Klambauer, Sepp Hochreiter, and Grey Nearing
Hydrol. Earth Syst. Sci., 26, 1673–1693, https://doi.org/10.5194/hess-26-1673-2022, 2022
Short summary
This contribution evaluates distributional runoff predictions from deep-learning-based approaches. We propose a benchmarking setup and establish four strong baselines. The results show that accurate, precise, and reliable uncertainty estimation can be achieved with deep learning.
Martin Gauch, Frederik Kratzert, Daniel Klotz, Grey Nearing, Jimmy Lin, and Sepp Hochreiter
Hydrol. Earth Syst. Sci., 25, 2045–2062, https://doi.org/10.5194/hess-25-2045-2021, 2021
Short summary
We present multi-timescale Long Short-Term Memory (MTS-LSTM), a machine learning approach that predicts discharge at multiple timescales within one model. MTS-LSTM is significantly more accurate than the US National Water Model and computationally more efficient than an individual LSTM model per timescale. Further, MTS-LSTM can process different input variables at different timescales, which is important as the lead time of meteorological forecasts often depends on their temporal resolution.
Sanika Baste, Daniel Klotz, Eduardo Acuña Espinoza, Andras Bardossy, and Ralf Loritz
EGUsphere, https://doi.org/10.5194/egusphere-2025-425, 2025
Short summary
This study evaluates the extrapolation performance of Long Short-Term Memory (LSTM) networks in rainfall-runoff modeling, specifically under extreme conditions. The findings reveal that the LSTM cannot predict discharge values beyond a theoretical limit, which is well below the extremity of its training data. This behavior results from the LSTM's gating structures rather than saturation of cell states alone.
Daniel Klotz, Peter Miersch, Thiago V. M. do Nascimento, Fabrizio Fenicia, Martin Gauch, and Jakob Zscheischler
Earth Syst. Sci. Data Discuss., https://doi.org/10.5194/essd-2024-450, 2025
Preprint under review for ESSD
Short summary
Data availability is central to hydrological science. It is the basis for advancing our understanding of hydrological processes, building prediction models, and anticipatory water management. We present a data-driven daily runoff reconstruction product for natural streamflow. We name it EARLS: European aggregated reconstruction for large-sample studies. The reconstructions represent daily simulations of natural streamflow across Europe and cover the period from 1953 to 2020.
Gab Abramowitz, Anna Ukkola, Sanaa Hobeichi, Jon Cranko Page, Mathew Lipson, Martin G. De Kauwe, Samuel Green, Claire Brenner, Jonathan Frame, Grey Nearing, Martyn Clark, Martin Best, Peter Anthoni, Gabriele Arduini, Souhail Boussetta, Silvia Caldararu, Kyeungwoo Cho, Matthias Cuntz, David Fairbairn, Craig R. Ferguson, Hyungjun Kim, Yeonjoo Kim, Jürgen Knauer, David Lawrence, Xiangzhong Luo, Sergey Malyshev, Tomoko Nitta, Jerome Ogee, Keith Oleson, Catherine Ottlé, Phillipe Peylin, Patricia de Rosnay, Heather Rumbold, Bob Su, Nicolas Vuichard, Anthony P. Walker, Xiaoni Wang-Faivre, Yunfei Wang, and Yijian Zeng
Biogeosciences, 21, 5517–5538, https://doi.org/10.5194/bg-21-5517-2024, 2024
Short summary
This paper evaluates land models – computer-based models that simulate ecosystem dynamics; land carbon, water, and energy cycles; and the role of land in the climate system. It uses machine learning and AI approaches to show that, despite the complexity of land models, they do not perform nearly as well as they could given the amount of information they are provided with about the prediction problem.
Louise J. Slater, Louise Arnal, Marie-Amélie Boucher, Annie Y.-Y. Chang, Simon Moulds, Conor Murphy, Grey Nearing, Guy Shalev, Chaopeng Shen, Linda Speight, Gabriele Villarini, Robert L. Wilby, Andrew Wood, and Massimiliano Zappa
Hydrol. Earth Syst. Sci., 27, 1865–1889, https://doi.org/10.5194/hess-27-1865-2023, 2023
Short summary
Hybrid forecasting systems combine data-driven methods with physics-based weather and climate models to improve the accuracy of predictions for meteorological and hydroclimatic events such as rainfall, temperature, streamflow, floods, droughts, tropical cyclones, or atmospheric rivers. We review recent developments in hybrid forecasting and outline key challenges and opportunities in the field.
Cited articles
Addor, N., Newman, A. J., Mizukami, N., and Clark, M. P.: The CAMELS data set: catchment attributes and meteorology for large-sample studies, Hydrol. Earth Syst. Sci., 21, 5293–5313, https://doi.org/10.5194/hess-21-5293-2017, 2017a. a
Addor, N., Newman, A. J., Mizukami, N., and Clark, M. P.: Catchment attributes
for large-sample studies, Boulder, CO, UCAR/NCAR,
https://doi.org/10.5065/D6G73C3Q, 2017b. a
Addor, N., Nearing, G., Prieto, C., Newman, A. J., Le Vine, N., and Clark,
M. P.: A Ranking of Hydrological Signatures Based on Their
Predictability in Space, Water Resour. Res., 54, 8792–8812,
https://doi.org/10.1029/2018WR022606, 2018. a, b
Alemohammad, S. H., McColl, K. A., Konings, A. G., Entekhabi, D., and Stoffelen, A.: Characterization of precipitation product errors across the United States using multiplicative triple collocation, Hydrol. Earth Syst. Sci., 19, 3489–3503, https://doi.org/10.5194/hess-19-3489-2015, 2015. a
Beck, H. E., Vergopolan, N., Pan, M., Levizzani, V., van Dijk, A. I. J. M., Weedon, G. P., Brocca, L., Pappenberger, F., Huffman, G. J., and Wood, E. F.: Global-scale evaluation of 22 precipitation datasets using gauge observations and hydrological modeling, Hydrol. Earth Syst. Sci., 21, 6201–6217, https://doi.org/10.5194/hess-21-6201-2017, 2017. a
Clausen, B. and Biggs, B.: Flow variables for ecological studies in temperate
streams: groupings based on covariance, J. Hydrol., 237, 184–197,
https://doi.org/10.1016/S0022-1694(00)00306-1, 2000. a, b
Court, A.: Measures of streamflow timing, J. Geophys. Res., 67,
4335–4339, https://doi.org/10.1029/JZ067i011p04335, 1962. a
Duan, Q., Ajami, N. K., Gao, X., and Sorooshian, S.: Multi-model ensemble
hydrologic prediction using Bayesian model averaging, Adv. Water
Resour., 30, 1371–1386, 2007. a
Frame, J., Nearing, G., Kratzert, F., and Rahman, M.: Post processing the US
National Water Model with a Long Short-Term Memory network, J. Am. Water Resour. As.,
https://doi.org/10.31223/osf.io/4xhac, in review, 2020. a
Gers, F. A., Schmidhuber, J., and Cummins, F.: Learning to forget: continual prediction
with LSTM, Neural Comput., 12, 2451–2471, 2000. a
Henn, B., Newman, A. J., Livneh, B., Daly, C., and Lundquist, J. D.: An
assessment of differences in gridded precipitation datasets in complex
terrain, J. Hydrol., 556, 1205–1219, 2018. a
Hochreiter, S.: Untersuchungen zu dynamischen neuronalen Netzen, Diploma,
Technische Universität München, München, 91, 1991. a
Hochreiter, S. and Schmidhuber, J.: Flat minima, Neural Comput., 9, 1–42,
1997a. a
Hochreiter, S. and Schmidhuber, J.: Long short-term memory, Neural Comput.,
9, 1735–1780, 1997b. a
Houska, T., Kraft, P., Chamorro-Chavez, A. and Breuer, L.: SPOTting Model
Parameters Using a Ready-Made Python Package, PLoS ONE, 10, e0145180,
https://doi.org/10.1371/journal.pone.0145180, 2015. a, b
Hoyer, S. and Hamman, J.: xarray: N-D labeled arrays and datasets in
Python, Journal of Open Research Software, 5, p. 10, https://doi.org/10.5334/jors.148, 2017. a
Hunter, J. D.: Matplotlib: A 2D graphics environment, Comput. Sci.
Eng., 9, 90–95, 2007. a
Kingma, D. P. and Ba, J.: Adam: A method for stochastic optimization, arXiv
[preprint], arXiv:1412.6980, 2014. a
Kratzert, F.: Extended NLDAS forcings, HydroShare, https://doi.org/10.4211/hs.0a68bfd7ddf642a8be9041d60f40868c, 2019. a
Kratzert, F.: Extended Maurer forcings, HydroShare, https://doi.org/10.4211/hs.17c896843cf940339c3c3496d0c1c077, 2019b. a
Kratzert, F., Klotz, D., Brenner, C., Schulz, K., and Herrnegger, M.: Rainfall–runoff modelling using Long Short-Term Memory (LSTM) networks, Hydrol. Earth Syst. Sci., 22, 6005–6022, https://doi.org/10.5194/hess-22-6005-2018, 2018. a
Kratzert, F., Klotz, D., Herrnegger, M., Sampson, A. K., Hochreiter, S., and
Nearing, G. S.: Toward Improved Predictions in Ungauged Basins: Exploiting
the Power of Machine Learning, Water Resour. Res., 55, 11344–11354,
https://doi.org/10.1029/2019WR026065, 2019a. a, b, c, d
Kratzert, F., Klotz, D., Shalev, G., Klambauer, G., Hochreiter, S., and Nearing, G.: Towards learning universal, regional, and local hydrological behaviors via machine learning applied to large-sample datasets, Hydrol. Earth Syst. Sci., 23, 5089–5110, https://doi.org/10.5194/hess-23-5089-2019, 2019b. a, b, c, d, e, f, g
Kratzert, F., Klotz, D., Hochreiter, S., and Nearing, G. S.: Benchmark models, HydroShare, https://doi.org/10.4211/hs.474ecc37e7db45baa425cdb4fc1b61e1, 2019c. a
Kratzert, F., Klotz, D., Hochreiter, S., and Nearing, G. S.: Pre-trained models, Zenodo [data set],
https://doi.org/10.5281/zenodo.4670268, 2021. a
Ladson, A., Brown, R., Neal, B., and Nathan, R.: A standard approach to
baseflow separation using the Lyne and Hollick filter, Australian Journal of Water Resources, 17, , 25–34, 2013. a
Lundquist, J., Hughes, M., Gutmann, E., and Kapnick, S.: Our skill in modeling
mountain rain and snow is bypassing the skill of our observational networks,
B. Am. Meteorol. Soc., 100, 2473–2490, https://doi.org/10.1175/BAMS-D-19-0001.1, 2019. a
Madadgar, S. and Moradkhani, H.: Improved B ayesian multimodeling: Integration
of copulas and B ayesian model averaging, Water Resour. Res., 50,
9586–9603, 2014. a
Maurer, E. P., Wood, A., Adam, J., Lettenmaier, D. P., and Nijssen, B.: A
long-term hydrologically based dataset of land surface fluxes and states for
the conterminous United States, J. Climate, 15, 3237–3251, 2002. a
McColl, K. A., Vogelzang, J., Konings, A. G., Entekhabi, D., Piles, M., and
Stoffelen, A.: Extended triple collocation: Estimating errors and
correlation coefficients with respect to an unknown target: EXTENDED
TRIPLE COLLOCATION, Geophys. Res. Lett., 41, 6229–6236,
https://doi.org/10.1002/2014GL061322, 2014. a
McKinney, W.: Data Structures for Statistical Computing in Python, Proceedings
of the 9th Python in Science Conference, Austin, Texas, 28 June–3 July, 1697900, 51–56, 2010. a
Nash, J. E. and Sutcliffe, J. V.: River flow forecasting through conceptual
models part I – A discussion of principles, J. Hydrol., 10,
282–290, 1970. a
Newman, A., Sampson, K., Clark, M., Bock, A., Viger, R., and Blodgett, D.: A
large-sample watershed-scale hydrometeorological dataset for the contiguous
USA, Boulder, CO: UCAR/NCAR, https://doi.org/10.5065/D6MW2F4D, 2014. a, b
Newman, A. J., Clark, M. P., Longman, R. J., and Giambelluca, T. W.:
Methodological intercomparisons of station-based gridded meteorological
products: Utility, limitations, and paths forward, J.
Hydrometeorol., 20, 531–547, 2019. a
Olden, J. D. and Poff, N. L.: Redundancy and the choice of hydrologic indices
for characterizing streamflow regimes, River Res. Appl., 19,
101–121, https://doi.org/10.1002/rra.700, 2003. a, b
Parkes, B., Higginbottom, T. P., Hufkens, K., Ceballos, F., Kramer, B., and
Foster, T.: Weather dataset choice introduces uncertainty to estimates of
crop yield responses to climate variability and change, Environ.
Res. Lett., 14, 124089, https://doi.org/10.1088/1748-9326/ab5ebb, 2019. a
Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., DeVito, Z., Lin, Z.,
Desmaison, A., Antiga, L., and Lerer, A.: PyTorch: an imperative style, high-performance deep learning library, in:
Advances in Neural Information Processing Systems, 32,
8024–8035, 2017. a
Pearl, J.: Embracing causality in default reasoning, Artificial Intelligence,
35, 259–271, 1988. a
Sankarasubramanian, A., Vogel, R. M., and Limbrunner, J. F.: Climate elasticity
of streamflow in the United States, Water Resour. Res., 37,
1771–1781, https://doi.org/10.1029/2000WR900330, 2001. a
Sawicz, K., Wagener, T., Sivapalan, M., Troch, P. A., and Carrillo, G.: Catchment classification: empirical analysis of hydrologic similarity based on catchment function in the eastern USA, Hydrol. Earth Syst. Sci., 15, 2895–2911, https://doi.org/10.5194/hess-15-2895-2011, 2011. a, b
Scipal, K., Dorigo, W., and de Jeu, R.: Triple collocation – A new tool to
determine the error structure of global soil moisture products, in: 2010 IEEE
International Geoscience and Remote Sensing Symposium, Honolulu, HI, USA, 25–30 July 2010, 4426–4429, IEEE,
2010. a
Shrikumar, A., Greenside, P., Shcherbina, A., and Kundaje, A.: Not just a black
box: Learning important features through propagating activation differences,
arXiv [preprint], arXiv:1605.01713, 2016. a
Stoffelen, A.: Toward the true near-surface wind speed: Error modeling and
calibration using triple collocation, J. Geophys. Res.-Oceans, 103, 7755–7766, 1998. a
Sutton, R.: The bitter lesson, Incomplete Ideas (blog), available at: http://www.incompleteideas.net/IncIdeas/BitterLesson.html (last access: 13 May 2020), 2019. a
Thornton, P. E., Running, S. W., and White, M. A.: Generating surfaces of
daily meteorological variables over large regions of complex terrain, J.
Hydrol., 190, 214–251, 1997. a
Timmermans, B., Wehner, M., Cooley, D., O'Brien, T., and Krishnan, H.: An
evaluation of the consistency of extremes in gridded precipitation data sets,
Clim. Dynam., 52, 6651–6670, 2019. a
Upstream-Tech:
SACSMA-SNOW17, available at: https://github.com/Upstream-Tech/SACSMA-SNOW17.git, last access: 11 July 2020. a
Tolson, B. A. and Shoemaker, C. A.: Dynamically dimensioned search algorithm
for computationally efficient watershed model calibration, Water Resour. Res., 43, W01413, https://doi.org/10.1029/2005WR004723, 2007. a
Van Der Walt, S., Colbert, S. C., and Varoquaux, G.: The NumPy array: A
structure for efficient numerical computation, Comput. Sci.
Eng., 13, 22–30, 2011. a
van Rossum, G.: Python tutorial, Technical Report CS-R9526,
Centrum voor Wiskunde en Informatica (CWI), Amsterdam, 1995. a
Virtanen, P., Gommers, R., Oliphant, T. E., Haberland, M., Reddy, T.,
Cournapeau, D., Burovski, E., Peterson, P., Weckesser, W., Bright,
J., van der Walt, S. J., Brett, M., Wilson, J., Jarrod Millman, K.,
Mayorov, N., Nelson, A. R. J., Jones, E., Kern, R., Larson, E.,
Carey, C., Polat, İ., Feng, Y., Moore, E. W., VanderPlas, J.,
Laxalde, D., Perktold, J., Cimrman, R., Henriksen, I., Quintero,
E. A., Harris, C. R., Archibald, A. M., Ribeiro, A. H., Pedregosa,
F., van Mulbregt, P., and SciPy 1.0 Contributors: SciPy 1.0: Fundamental
Algorithms for Scientific Computing in Python, Nat. Methods, 17, 261–272,
https://doi.org/10.1038/s41592-019-0686-2, 2020. a, b
Wellman, M. P. and Henrion, M.: Explaining 'explaining away', IEEE T.
Pattern Anal., 15, 287–292, 1993. a
Westerberg, I. K. and McMillan, H. K.: Uncertainty in hydrological signatures, Hydrol. Earth Syst. Sci., 19, 3951–3968, https://doi.org/10.5194/hess-19-3951-2015, 2015. a, b, c, d
Xia, Y., Mitchell, K., Ek, M., Sheffield, J., Cosgrove, B., Wood, E., Luo, L., Alonge, C., Wei, H., Meng, J., Livneh, B., Lettenmaier, D., Koren, V., Duan,
Q., Mo, K., Fan, Y., and Mocko, D.: Continental-scale water and energy
flux analysis and validation for the North American Land Data Assimilation
System project phase 2 (NLDAS-2): 1. Intercomparison and application of model
products, J. Geophys. Res.-Atmos., 117, D03109, https://doi.org/10.1029/2011JD016048, 2012. a
Yilmaz, K. K., Hogue, T. S., Hsu, K.-L., Sorooshian, S., Gupta, H. V., and
Wagener, T.: Intercomparison of rain gauge, radar, and satellite-based
precipitation estimates with emphasis on hydrologic forecasting, J.
Hydrometeorol., 6, 497–517, 2005. a
Yilmaz, K. K., Gupta, H. V., and Wagener, T.: A process-based diagnostic
approach to model evaluation: Application to the NWS distributed hydrologic
model, Water Resour. Res., 44, W09417, https://doi.org/10.1029/2007WR006716, 2008. a, b, c
Short summary
We investigate how deep learning models use different meteorological data sets in the task of (regional) rainfall–runoff modeling. We show that performance can be significantly improved when using different data products as input and further show how the model learns to combine these meteorological inputs differently across time and space. The results are carefully benchmarked against classical approaches, showing that the presented approach outperforms them.