Articles | Volume 27, issue 13
https://doi.org/10.5194/hess-27-2397-2023
© Author(s) 2023. This work is distributed under
the Creative Commons Attribution 4.0 License.
When best is the enemy of good – critical evaluation of performance criteria in hydrological models
Guillaume Cinkus
CORRESPONDING AUTHOR
HydroSciences Montpellier (HSM), CNRS, IRD, Univ. Montpellier, 34090 Montpellier, France
Naomi Mazzilli
UMR 1114 EMMAH (AU-INRAE), Université d'Avignon, 84000 Avignon, France
Hervé Jourde
HydroSciences Montpellier (HSM), CNRS, IRD, Univ. Montpellier, 34090 Montpellier, France
Andreas Wunsch
Institute of Applied Geosciences, Karlsruhe Institute of Technology (KIT), Kaiserstr. 12, 76131 Karlsruhe, Germany
Tanja Liesch
Institute of Applied Geosciences, Karlsruhe Institute of Technology (KIT), Kaiserstr. 12, 76131 Karlsruhe, Germany
Nataša Ravbar
Karst Research Institute, ZRC SAZU, Titov trg 2, 6230 Postojna, Slovenia
Zhao Chen
Institute of Groundwater Management, Technical University of Dresden, 01062 Dresden, Germany
Nico Goldscheider
Institute of Applied Geosciences, Karlsruhe Institute of Technology (KIT), Kaiserstr. 12, 76131 Karlsruhe, Germany
Short summary
The Kling–Gupta Efficiency (KGE) is a performance criterion extensively used to evaluate hydrological models. We conduct a critical study of the KGE and its variants to examine counterbalancing errors. Results show that, when assessing a simulation, concurrent over- and underestimation of discharge can lead to a higher overall criterion score without an associated increase in model relevance. We recommend choosing performance criteria carefully and using scaling factors.
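The counterbalancing effect described in the summary can be illustrated with a minimal sketch. It assumes the original 2009 formulation of the criterion, KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2), where r is the linear correlation, alpha the ratio of standard deviations and beta the ratio of means between simulated and observed discharge. The series `obs`, `sim_under` and `sim_both`, the offset of 0.2, and the helper names `kge` and `mae` are hypothetical and chosen only for illustration; this is not the experimental setup of the study.

```python
import math


def kge(obs, sim):
    """KGE (2009 form): 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2)."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_sim = sum(sim) / n
    # Population standard deviations and covariance (divide by n).
    std_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in obs) / n)
    std_sim = math.sqrt(sum((s - mean_sim) ** 2 for s in sim) / n)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim)) / n
    r = cov / (std_obs * std_sim)   # correlation term
    alpha = std_sim / std_obs       # variability term
    beta = mean_sim / mean_obs      # bias term
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


def mae(obs, sim):
    """Mean absolute error, used here as an independent check."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)


# Toy "observed" discharge: two identical periods (illustrative values only).
obs = [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0]

# Simulation A: exact in the first period, underestimates by 0.2 in the second.
sim_under = obs[:4] + [o - 0.2 for o in obs[4:]]

# Simulation B: overestimates by 0.2 in the first period and
# underestimates by 0.2 in the second, so the two biases cancel in the mean.
sim_both = [o + 0.2 for o in obs[:4]] + [o - 0.2 for o in obs[4:]]

kge_under, kge_both = kge(obs, sim_under), kge(obs, sim_both)
mae_under, mae_both = mae(obs, sim_under), mae(obs, sim_both)

print(f"A (under only): KGE = {kge_under:.3f}, MAE = {mae_under:.3f}")
print(f"B (over+under): KGE = {kge_both:.3f}, MAE = {mae_both:.3f}")
```

In this toy case simulation B is wrong at every time step and has twice the mean absolute error of simulation A, yet it scores a higher KGE (roughly 0.98 against 0.96) because its opposite errors cancel in the bias term. Whether and how strongly this happens depends on the size of the errors relative to the variability of the series, so the numbers should not be read as general.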