Articles | Volume 29, issue 3
https://doi.org/10.5194/hess-29-785-2025
Research article | 13 Feb 2025

A diversity-centric strategy for the selection of spatio-temporal training data for LSTM-based streamflow forecasting

Everett Snieder and Usman T. Khan

Related authors

Resampling and ensemble techniques for improving ANN-based high-flow forecast accuracy
Everett Snieder, Karen Abogadil, and Usman T. Khan
Hydrol. Earth Syst. Sci., 25, 2543–2566, https://doi.org/10.5194/hess-25-2543-2021, 2021

Short summary
Improving the accuracy of flood forecasts is paramount to minimising flood damage. Machine learning (ML) models are increasingly being applied to flood forecasting. Such models are typically trained on large historical hydrometeorological datasets. In this work, we evaluate methods for selecting training datasets that maximise the spatio-temporal diversity of the represented hydrological processes. Empirical results demonstrate the importance of hydrological diversity in training ML models.