How well do process-based and data-driven hydrological models learn from limited discharge data?

Staudinger, Maria; Herzog, Anna; Loritz, Ralf; Houska, Tobias; Pool, Sandra; Spieler, Diana; Wagner, Paul D.; Mai, Juliane; Kiesel, Jens; Thober, Stephan; Guse, Björn; Ehret, Uwe

doi:https://doi.org/10.5194/hess-29-5005-2025

Articles | Volume 29, issue 19

© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.

Research article | 08 Oct 2025

Supplement

https://doi.org/10.5194/hess-29-5005-2025-supplement

Data sets

MariStau/IMPRO_infotheory_Data_Code: Data and code used to calculate conditional entropy values Maria Staudinger and Uwe Ehret https://doi.org/10.5281/zenodo.14938050

CORINE land use CLMS https://doi.org/10.2909/960998c1-1870-4e82-8051-6485205ebbac

Model code and software

MariStau/IMPRO_infotheory_Data_Code: Data and code used to calculate conditional entropy values Maria Staudinger and Uwe Ehret https://doi.org/10.5281/zenodo.14938050

Short summary

We compare three process-based and four data-driven hydrological models using different training data. Process-based models perform better with small datasets but stop learning early, while data-driven models keep learning for longer. The study highlights the importance of memory in the data and the effect of different data sampling methods on model performance. The direct comparison of these models is novel and clarifies how they perform under various data conditions.