Articles | Volume 29, issue 19
https://doi.org/10.5194/hess-29-4761-2025
© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.
Can causal discovery lead to a more robust prediction model for runoff signatures?
Hossein Abbasizadeh
CORRESPONDING AUTHOR
Faculty of Environmental Sciences, Czech University of Life Sciences Prague, Prague, Czech Republic
Petr Maca
Faculty of Environmental Sciences, Czech University of Life Sciences Prague, Prague, Czech Republic
Martin Hanel
Faculty of Environmental Sciences, Czech University of Life Sciences Prague, Prague, Czech Republic
Mads Troldborg
The James Hutton Institute, Invergowrie, Dundee DD2 5DA, Scotland, UK
Amir AghaKouchak
Department of Civil and Environmental Engineering, University of California, Irvine, CA, USA
United Nations University Institute for Water, Environment and Health, Hamilton, ON, Canada
Department of Earth System Science, University of California, Irvine, CA, USA
Related authors
Vishal Thakur, Yannis Markonis, Rohini Kumar, Johanna Ruth Thomson, Mijael Rodrigo Vargas Godoy, Martin Hanel, and Oldrich Rakovec
Hydrol. Earth Syst. Sci., 29, 4395–4416, https://doi.org/10.5194/hess-29-4395-2025, 2025
Short summary
Understanding changes in water movement on Earth is crucial for everyone, and several techniques exist to quantify it. We examined how different methods of estimating evaporation affect predictions of various types of water movement across Europe. We found that, while these methods generally agree on whether the changes are increases or decreases, they differ in magnitude. This means that selecting the right evaporation method is crucial for accurate predictions of water movement.
Jan Řehoř, Rudolf Brázdil, Oldřich Rakovec, Martin Hanel, Milan Fischer, Rohini Kumar, Jan Balek, Markéta Poděbradská, Vojtěch Moravec, Luis Samaniego, Yannis Markonis, and Miroslav Trnka
Hydrol. Earth Syst. Sci., 29, 3341–3358, https://doi.org/10.5194/hess-29-3341-2025, 2025
Short summary
We present a robust method for identification and classification of global land drought events (GLDEs) based on soil moisture. Two models were used to calculate soil moisture and delimit soil drought over global land from 1980–2022, with clusters of 775 and 630 GLDEs. Using four spatiotemporal and three motion-related characteristics, we categorized GLDEs into seven severity and seven dynamic categories. The frequency of GLDEs has generally increased in recent decades.
Yavar Pourmohamad, John T. Abatzoglou, Erin J. Belval, Erica Fleishman, Karen Short, Matthew C. Reeves, Nicholas Nauslar, Philip E. Higuera, Eric Henderson, Sawyer Ball, Amir AghaKouchak, Jeffrey P. Prestemon, Julia Olszewski, and Mojtaba Sadegh
Earth Syst. Sci. Data, 16, 3045–3060, https://doi.org/10.5194/essd-16-3045-2024, 2024
Short summary
The FPA FOD-Attributes dataset provides > 300 biological, physical, social, and administrative attributes associated with > 2.3 × 10^6 wildfire incidents across the US from 1992 to 2020. The dataset can be used to (1) answer numerous questions about the covariates associated with human- and lightning-caused wildfires and (2) support descriptive, diagnostic, predictive, and prescriptive wildfire analytics, including the development of machine learning models.
Mijael Rodrigo Vargas Godoy, Yannis Markonis, Oldrich Rakovec, Michal Jenicek, Riya Dutta, Rajani Kumar Pradhan, Zuzana Bešťáková, Jan Kyselý, Roman Juras, Simon Michael Papalexiou, and Martin Hanel
Hydrol. Earth Syst. Sci., 28, 1–19, https://doi.org/10.5194/hess-28-1-2024, 2024
Short summary
The study introduces a novel benchmarking method based on the water cycle budget for hydroclimate data fusion. Using this method and multiple state-of-the-art datasets to assess the spatiotemporal patterns of water cycle changes in Czechia, we found that differences in water availability distribution are dominated by evapotranspiration. Furthermore, while the most significant temporal changes in Czechia occur during spring, the median spatial patterns stem from summer changes in the water cycle.
Petr Kavka, Jiří Cajthaml, Adam Tejkl, and Martin Hanel
Abstr. Int. Cartogr. Assoc., 6, 120, https://doi.org/10.5194/ica-abs-6-120-2023, 2023
Heidi Kreibich, Kai Schröter, Giuliano Di Baldassarre, Anne F. Van Loon, Maurizio Mazzoleni, Guta Wakbulcho Abeshu, Svetlana Agafonova, Amir AghaKouchak, Hafzullah Aksoy, Camila Alvarez-Garreton, Blanca Aznar, Laila Balkhi, Marlies H. Barendrecht, Sylvain Biancamaria, Liduin Bos-Burgering, Chris Bradley, Yus Budiyono, Wouter Buytaert, Lucinda Capewell, Hayley Carlson, Yonca Cavus, Anaïs Couasnon, Gemma Coxon, Ioannis Daliakopoulos, Marleen C. de Ruiter, Claire Delus, Mathilde Erfurt, Giuseppe Esposito, Didier François, Frédéric Frappart, Jim Freer, Natalia Frolova, Animesh K. Gain, Manolis Grillakis, Jordi Oriol Grima, Diego A. Guzmán, Laurie S. Huning, Monica Ionita, Maxim Kharlamov, Dao Nguyen Khoi, Natalie Kieboom, Maria Kireeva, Aristeidis Koutroulis, Waldo Lavado-Casimiro, Hong-Yi Li, Maria Carmen LLasat, David Macdonald, Johanna Mård, Hannah Mathew-Richards, Andrew McKenzie, Alfonso Mejia, Eduardo Mario Mendiondo, Marjolein Mens, Shifteh Mobini, Guilherme Samprogna Mohor, Viorica Nagavciuc, Thanh Ngo-Duc, Huynh Thi Thao Nguyen, Pham Thi Thao Nhi, Olga Petrucci, Nguyen Hong Quan, Pere Quintana-Seguí, Saman Razavi, Elena Ridolfi, Jannik Riegel, Md Shibly Sadik, Nivedita Sairam, Elisa Savelli, Alexey Sazonov, Sanjib Sharma, Johanna Sörensen, Felipe Augusto Arguello Souza, Kerstin Stahl, Max Steinhausen, Michael Stoelzle, Wiwiana Szalińska, Qiuhong Tang, Fuqiang Tian, Tamara Tokarczyk, Carolina Tovar, Thi Van Thu Tran, Marjolein H. J. van Huijgevoort, Michelle T. H. van Vliet, Sergiy Vorogushyn, Thorsten Wagener, Yueling Wang, Doris E. Wendt, Elliot Wickham, Long Yang, Mauricio Zambrano-Bigiarini, and Philip J. Ward
Earth Syst. Sci. Data, 15, 2009–2023, https://doi.org/10.5194/essd-15-2009-2023, 2023
Short summary
As the adverse impacts of hydrological extremes increase in many regions of the world, a better understanding of the drivers of changes in risk and impacts is essential for effective flood and drought risk management. We present a dataset containing data of paired events, i.e. two floods or two droughts that occurred in the same area. The dataset enables comparative analyses and allows detailed context-specific assessments. Additionally, it supports the testing of socio-hydrological models.
Markéta Součková, Roman Juras, Kryštof Dytrt, Vojtěch Moravec, Johanna Ruth Blöcher, and Martin Hanel
Nat. Hazards Earth Syst. Sci., 22, 3501–3525, https://doi.org/10.5194/nhess-22-3501-2022, 2022
Short summary
Avalanches are natural hazards that threaten people and infrastructure. With climate change, avalanche activity is changing. We analysed the change in frequency and size of avalanches in the Krkonoše Mountains, Czechia, and detected important variables with machine learning tools from 1979–2020. Wet avalanches in February and March have increased, and slab avalanches have decreased and become smaller. The identified variables and their threshold levels may help in avalanche decision-making.
Sadaf Nasreen, Markéta Součková, Mijael Rodrigo Vargas Godoy, Ujjwal Singh, Yannis Markonis, Rohini Kumar, Oldrich Rakovec, and Martin Hanel
Earth Syst. Sci. Data, 14, 4035–4056, https://doi.org/10.5194/essd-14-4035-2022, 2022
Short summary
This article presents a 500-year reconstructed annual runoff dataset for several European catchments. Several data-driven and hydrological models were used to derive the runoff series using reconstructed precipitation and temperature and a set of proxy data. The simulated runoff was validated using independent observed runoff data and documentary evidence. The validation revealed a good fit between the observed and reconstructed series for 14 catchments, which are available for further analysis.
Sofia Hallerbäck, Laurie S. Huning, Charlotte Love, Magnus Persson, Katarina Stensen, David Gustafsson, and Amir AghaKouchak
The Cryosphere, 16, 2493–2503, https://doi.org/10.5194/tc-16-2493-2022, 2022
Short summary
Using unique data, some dating back to the 18th century, we show a significant trend towards shorter ice duration, later freeze, and earlier break-up dates across Sweden. In recent observations, the mean ice duration has decreased by 11–28 d and the chance of years with an extremely short ice cover duration (less than 50 d) has increased by 800 %. Results show that even a 1 °C increase in air temperature can result in a decrease in ice duration in Sweden of around 8–23 d.
Mads Troldborg, Zisis Gagkas, Andy Vinten, Allan Lilly, and Miriam Glendell
Hydrol. Earth Syst. Sci., 26, 1261–1293, https://doi.org/10.5194/hess-26-1261-2022, 2022
Short summary
Pesticides continue to pose a threat to surface water quality worldwide. Here, we present a spatial Bayesian belief network (BBN) for assessing inherent pesticide risk to water quality. The BBN was applied in a small catchment with limited data to simulate the risk of five pesticides and evaluate the likely effectiveness of mitigation measures. The probabilistic graphical model combines diverse data and explicitly accounts for uncertainties, which are often ignored in pesticide risk assessments.
Cited articles
Addor, N., Newman, A. J., Mizukami, N., and Clark, M. P.: The CAMELS data set: catchment attributes and meteorology for large-sample studies, Hydrol. Earth Syst. Sci., 21, 5293–5313, https://doi.org/10.5194/hess-21-5293-2017, 2017. a, b
AghaKouchak, A., Pan, B., Mazdiyasni, O., Sadegh, M., Jiwa, S., Zhang, W., Love, C., Madadgar, S., Papalexiou, S., Davis, S., Hsu, K., and Sorooshian, S.: Status and prospects for drought forecasting: Opportunities in artificial intelligence and hybrid physical–statistical forecasting, Philos. T. Roy. Soc. A, 380, 20210288, https://doi.org/10.1098/rsta.2021.0288, 2022. a
AghaKouchak, A., Huning, L. S., Sadegh, M., Qin, Y., Markonis, Y., Vahedifard, F., Love, C. A., Mishra, A., Mehran, A., Obringer, R., Hjelmstad, A., Pallickara, S., Jiwa, S., Hanel, M., Zhao, Y., Pendergrass, A. G., Arabi, M., Davis, S. J., Ward, P. J., Svoboda, M., Pulwarty, R., and Kreibich, H.: Toward impact-based monitoring of drought and its cascading hazards, Nature Reviews Earth & Environment, 4, 582–595, 2023. a
Aguilera, P. A., Fernandez, A., Fernandez, R., Rumi, R., and Salmeron, A.: Bayesian networks in environmental modelling, Environ. Modell. Softw., 26, 1376–1388, https://doi.org/10.1016/j.envsoft.2011.06.004, 2011. a
Arshad, A., Mirchi, A., Taghvaeian, S., and AghaKouchak, A.: Downscaled-GRACE data reveal anthropogenic and climate-induced water storage decline across the Indus Basin, Water Resour. Res., 60, e2023WR035882, https://doi.org/10.1029/2023WR035882, 2024. a
Bang, C. W. and Didelez, V.: Constraint-based causal discovery with tiered background knowledge and latent variables in single or overlapping datasets, arXiv [preprint], https://doi.org/10.48550/arXiv.2503.21526, 2025. a, b
Blöschl, G., Sivapalan, M., Wagener, T., Viglione, A., and Savenije, H.: Runoff prediction in ungauged basins: synthesis across processes, places and scales, Cambridge University Press, https://doi.org/10.1017/CBO9781139235761, 2013. a, b
Breiman, L.: Random forests, Mach. Learn., 45, 5–32, https://doi.org/10.1023/A:1010950718922, 2001. a
Chagas, V. B. P., Chaffe, P. L. B., and Bloeschl, G.: Regional Low Flow Hydrology: Model Development and Evaluation, Water Resour. Res., 60, e2023WR035063, https://doi.org/10.1029/2023WR035063, 2024. a
Ciulla, F. and Varadharajan, C.: A network approach for multiscale catchment classification using traits, Hydrol. Earth Syst. Sci., 28, 1617–1651, https://doi.org/10.5194/hess-28-1617-2024, 2024. a
Clark, M. P., Kavetski, D., and Fenicia, F.: Pursuing the method of multiple working hypotheses for hydrological modeling, Water Resour. Res., 47, W09301, https://doi.org/10.1029/2010WR009827, 2011. a
Clausen, B. and Biggs, B.: Flow variables for ecological studies in temperate streams: groupings based on covariance, J. Hydrol., 237, 184–197, https://doi.org/10.1016/S0022-1694(00)00306-1, 2000. a, b
Delforge, D., de Viron, O., Vanclooster, M., Van Camp, M., and Watlet, A.: Detecting hydrological connectivity using causal inference from time series: synthetic and real karstic case studies, Hydrol. Earth Syst. Sci., 26, 2181–2199, https://doi.org/10.5194/hess-26-2181-2022, 2022. a
Deng, J., Shan, K., Shi, K., Qian, S. S., Zhang, Y., Qin, B., and Zhu, G.: Nutrient reduction mitigated the expansion of cyanobacterial blooms caused by climate change in Lake Taihu according to Bayesian network models, Water Res., 236, 119946, https://doi.org/10.1016/j.watres.2023.119946, 2023. a
Deng, Y. and Ebert-Uphoff, I.: Weakening of atmospheric information flow in a warming climate in the Community Climate System Model, Geophys. Res. Lett., 41, 193–200, https://doi.org/10.1002/2013GL058646, 2014. a
Desai, S. and Ouarda, T. B. M. J.: Regional hydrological frequency analysis at ungauged sites with random forest regression, J. Hydrol., 594, 125861, https://doi.org/10.1016/j.jhydrol.2020.125861, 2021. a
Dey, P., Mathai, J., Sivapalan, M., and Mujumdar, P. P.: On the regional-scale variability in flow duration curves in Peninsular India, Hydrol. Earth Syst. Sci., 28, 1493–1514, https://doi.org/10.5194/hess-28-1493-2024, 2024. a
Dubos, V., Hani, I., Ouarda, T. B. M. J., and St-Hilaire, A.: Short-term forecasting of spring freshet peak flow with the Generalized Additive model, J. Hydrol., 612, 128089, https://doi.org/10.1016/j.jhydrol.2022.128089, 2022. a, b
Dutta, R. and Maity, R.: Temporal networks-based approach for nonstationary hydroclimatic modeling and its demonstration with streamflow prediction, Water Resour. Res., 56, e2020WR027086, https://doi.org/10.1029/2020WR027086, 2020. a
Ebert-Uphoff, I. and Deng, Y.: Causal Discovery for Climate Research Using Graphical Models, J. Climate, 25, 5648–5665, https://doi.org/10.1175/JCLI-D-11-00387.1, 2012. a
Falcone, J. A.: GAGES-II: Geospatial attributes of gages for evaluating streamflow, Tech. rep., US Geological Survey, https://doi.org/10.3133/70046617, 2011. a
Ficchi, A., Perrin, C., and Andreassian, V.: Hydrological modelling at multiple sub-daily time steps: Model improvement via flux-matching, J. Hydrol., 575, 1308–1327, https://doi.org/10.1016/j.jhydrol.2019.05.084, 2019. a
Gao, B., Yang, J., Chen, Z., Sugihara, G., Li, M., Stein, A., Kwan, M.-P., and Wang, J.: Causal inference from cross-sectional earth system data with geographical convergent cross mapping, Nat. Commun., 14, 5875, https://doi.org/10.1038/s41467-023-41619-6, 2023. a
Geiger, D. and Heckerman, D.: Learning gaussian networks, in: Uncertainty in Artificial Intelligence, Elsevier, 235–243, https://doi.org/10.1016/B978-1-55860-332-5.50035-3, 1994. a
Gentile, A., Canone, D., Ceperley, N., Gisolo, D., Previati, M., Zuecco, G., Schaefli, B., and Ferraris, S.: Towards a conceptualization of the hydrological processes behind changes of young water fraction with elevation: a focus on mountainous alpine catchments, Hydrol. Earth Syst. Sci., 27, 2301–2323, https://doi.org/10.5194/hess-27-2301-2023, 2023. a
Giuntoli, I., Renard, B., Vidal, J. P., and Bard, A.: Low flows in France and their relationship to large-scale climate indices, J. Hydrol., 482, 105–118, https://doi.org/10.1016/j.jhydrol.2012.12.038, 2013. a
Gleeson, T., Moosdorf, N., Hartmann, J., and van Beek, L. P. H.: A glimpse beneath earth's surface: GLobal HYdrogeology MaPS (GLHYMPS) of permeability and porosity, Geophys. Res. Lett., 41, 3891–3898, https://doi.org/10.1002/2014GL059856, 2014. a
Glymour, C., Zhang, K., and Spirtes, P.: Review of Causal Discovery Methods Based on Graphical Models, Frontiers in Genetics, 10, 524, https://doi.org/10.3389/fgene.2019.00524, 2019. a
Gnann, S. J., Woods, R. A., and Howden, N. J. K.: Is There a Baseflow Budyko Curve?, Water Resour. Res., 55, 2838–2855, https://doi.org/10.1029/2018WR024464, 2019. a
Gower, J.: A General Coefficient of Similarity and Some of Its Properties, Biometrics, 27, 857–871, https://doi.org/10.2307/2528823, 1971. a
Guzha, A. C., Rufino, M. C., Okoth, S., Jacobs, S., and Nobrega, R. L. B.: Impacts of land use and land cover change on surface runoff, discharge and low flows: Evidence from East Africa, Journal of Hydrology: Regional Studies, 15, 49–67, https://doi.org/10.1016/j.ejrh.2017.11.005, 2018. a
Hartmann, J. and Moosdorf, N.: The new global lithological map database GLiM: A representation of rock properties at the Earth surface, Geochem. Geophy. Geosy., 13, Q12004, https://doi.org/10.1029/2012GC004370, 2012. a
Hastie, T., Tibshirani, R., and Friedman, J. H.: The elements of statistical learning: data mining, inference, and prediction, Vol. 2, Springer, https://doi.org/10.1007/978-0-387-84858-7, 2009. a, b
Hausser, J. and Strimmer, K.: Entropy inference and the James-Stein estimator, with application to nonlinear gene association networks, J. Mach. Learn. Res., 10, 1469–1484, 2009. a
Heinze-Deml, C., Maathuis, M. H., and Meinshausen, N.: Causal Structure Learning, Annu. Rev. Stat. Appl., 5, 371–391, https://doi.org/10.1146/annurev-statistics-031017-100630, 2018a. a
Heinze-Deml, C., Peters, J., and Meinshausen, N.: Invariant Causal Prediction for Nonlinear Models, Journal of Causal Inference, 6, 20170016, https://doi.org/10.1515/jci-2017-0016, 2018b. a
Hennig, C. and Liao, T. F.: How to find an appropriate clustering for mixed-type variables with application to socio-economic stratification, J. Roy. Stat. Soc. C-Appl., 62, 309–369, https://doi.org/10.1111/j.1467-9876.2012.01066.x, 2013. a
Herman, J. D., Reed, P. M., Zeff, H. B., and Characklis, G. W.: How should robustness be defined for water systems planning under change?, J. Water Res. Plan. Man., 141, 04015012, https://doi.org/10.1061/(ASCE)WR.1943-5452.0000509, 2015. a
Hrachowitz, M., Fovet, O., Ruiz, L., Euser, T., Gharari, S., Nijzink, R., Freer, J., Savenije, H. H. G., and Gascuel-Odoux, C.: Process consistency in models: The importance of system signatures, expert knowledge, and process complexity, Water Resour. Res., 50, 7445–7469, https://doi.org/10.1002/2014WR015484, 2014. a
Jackson-Blake, L. A., Clayer, F., Haande, S., Sample, J. E., and Moe, S. J.: Seasonal forecasting of lake water quality and algal bloom risk using a continuous Gaussian Bayesian network, Hydrol. Earth Syst. Sci., 26, 3103–3124, https://doi.org/10.5194/hess-26-3103-2022, 2022. a
Jehn, F. U., Bestian, K., Breuer, L., Kraft, P., and Houska, T.: Using hydrological and climatic catchment clusters to explore drivers of catchment behavior, Hydrol. Earth Syst. Sci., 24, 1081–1100, https://doi.org/10.5194/hess-24-1081-2020, 2020. a
Kalisch, M. and Bühlmann, P.: Estimating high-dimensional directed acyclic graphs with the PC-algorithm, J. Mach. Learn. Res., 8, 613–636, 2007. a
Kalisch, M., Maechler, M., Colombo, D., Maathuis, M. H., and Buehlmann, P.: Causal Inference Using Graphical Models with the R Package pcalg, J. Stat. Softw., 47, 1–26, 2012. a
Kirchner, J. W.: Characterizing nonlinear, nonstationary, and heterogeneous hydrologic behavior using ensemble rainfall–runoff analysis (ERRA): proof of concept, Hydrol. Earth Syst. Sci., 28, 4427–4454, https://doi.org/10.5194/hess-28-4427-2024, 2024. a
Kook, L., Saengkyongam, S., Lundborg, A. R., Hothorn, T., and Peters, J.: Model-Based Causal Feature Selection for General Response Types, J. Am. Stat. Assoc., 120, 1090–1101, https://doi.org/10.1080/01621459.2024.2395588, 2024. a
Kretschmer, M., Coumou, D., Donges, J. F., and Runge, J.: Using Causal Effect Networks to Analyze Different Arctic Drivers of Midlatitude Winter Circulation, J. Climate, 29, 4069–4081, https://doi.org/10.1175/JCLI-D-15-0654.1, 2016. a
Kuentz, A., Arheimer, B., Hundecha, Y., and Wagener, T.: Understanding hydrologic variability across Europe through catchment classification, Hydrol. Earth Syst. Sci., 21, 2863–2879, https://doi.org/10.5194/hess-21-2863-2017, 2017. a
Laaha, G. and Bloeschl, G.: A comparison of low flow regionalisation methods – catchment grouping, J. Hydrol., 323, 193–214, https://doi.org/10.1016/j.jhydrol.2005.09.001, 2006. a
Ladson, A. R., Brown, R., Neal, B., and Nathan, R.: A standard approach to baseflow separation using the Lyne and Hollick filter, Australasian Journal of Water Resources, 17, 25–34, https://doi.org/10.7158/13241583.2013.11465417, 2013. a
Ley, R., Casper, M. C., Hellebrand, H., and Merz, R.: Catchment classification by runoff behaviour with self-organizing maps (SOM), Hydrol. Earth Syst. Sci., 15, 2947–2962, https://doi.org/10.5194/hess-15-2947-2011, 2011. a
Li, C. and Mahadevan, S.: Efficient approximate inference in Bayesian networks with continuous variables, Reliab. Eng. Syst. Safe., 169, 269–280, https://doi.org/10.1016/j.ress.2017.08.017, 2018. a
Li, J. and Wang, Z. J.: Controlling the False Discovery Rate of the Association/Causality Structure Learned with the PC Algorithm, J. Mach. Learn. Res., 10, 475–514, 2009. a
Love, C. A., Skahill, B. E., England, J. F., Karlovits, G., Duren, A., and AghaKouchak, A.: Integrating Climatic and Physical Information in a Bayesian Hierarchical Model of Extreme Daily Precipitation, Water, 12, 2211, https://doi.org/10.3390/w12082211, 2020. a
Marcot, B. G. and Penman, T. D.: Advances in Bayesian network modelling: Integration of modelling technologies, Environ. Modell. Softw., 111, 386–393, https://doi.org/10.1016/j.envsoft.2018.09.016, 2019. a
Matos, A. C. d. S. and Silva, F. E. O. e: Bayesian estimation of hydrological model parameters in the signature-domain: Aiming for a regional approach, J. Hydrol., 639, 131554, https://doi.org/10.1016/j.jhydrol.2024.131554, 2024. a
McMillan, H.: Linking hydrologic signatures to hydrologic processes: A review, Hydrol. Process., 34, 1393–1409, https://doi.org/10.1002/hyp.13632, 2020. a
McMillan, H. K., Gnann, S. J., and Araki, R.: Large Scale Evaluation of Relationships Between Hydrologic Signatures and Processes, Water Resour. Res., 58, e2021WR031751, https://doi.org/10.1029/2021WR031751, 2022. a, b, c
Meek, C.: Causal inference and causal explanation with background knowledge, arXiv [preprint], https://doi.org/10.48550/arXiv.1302.4972, 1995. a, b
Nanda, A., Sen, S., and McNamara, J. P.: How spatiotemporal variation of soil moisture can explain hydrological connectivity of infiltration-excess dominated hillslope: Observations from lesser Himalayan landscape, J. Hydrol., 579, 124146, https://doi.org/10.1016/j.jhydrol.2019.124146, 2019. a
Neri, M., Coulibaly, P., and Toth, E.: Similarity of catchment dynamics based on the interaction between streamflow and forcing time series: Use of a transfer entropy signature, J. Hydrol., 614, 128555, https://doi.org/10.1016/j.jhydrol.2022.128555, 2022. a
Newman, A. J., Clark, M. P., Sampson, K., Wood, A., Hay, L. E., Bock, A., Viger, R. J., Blodgett, D., Brekke, L., Arnold, J. R., Hopson, T., and Duan, Q.: Development of a large-sample watershed-scale hydrometeorological data set for the contiguous USA: data set characteristics and assessment of regional variability in hydrologic model performance, Hydrol. Earth Syst. Sci., 19, 209–223, https://doi.org/10.5194/hess-19-209-2015, 2015. a
Newman, A., Sampson, K., Clark, M., Bock, A., Viger, R. J., Blodgett, D., Addor, N., and Mizukami, M.: CAMELS: Catchment Attributes and MEteorology for Large-sample Studies, Version 1.2, UCAR/NCAR – GDEX [data set], https://doi.org/10.5065/D6MW2F4D, 2022. a
Nguyen, T.-T., Huu, Q. N., and Li, M. J.: Forecasting Time Series Water Levels on Mekong River Using Machine Learning Models, in: 2015 Seventh International Conference on Knowledge and Systems Engineering (KSE), Ho Chi Minh City, Vietnam, 8–10 October 2015. IEEE, 292–297, https://doi.org/10.1109/KSE.2015.53, 2015. a
Nojavan, F. A., Qian, S. S., and Stow, C. A.: Comparative analysis of discretization methods in Bayesian networks, Environ. Modell. Softw., 87, 64–71, https://doi.org/10.1016/j.envsoft.2016.10.007, 2017. a
Olden, J. and Poff, N.: Redundancy and the choice of hydrologic indices for characterizing streamflow regimes, River Res. Appl., 19, 101–121, https://doi.org/10.1002/rra.700, 2003. a, b
Olden, J. D., Kennard, M. J., and Pusey, B. J.: A framework for hydrologic classification with a review of methodologies and applications in ecohydrology, Ecohydrology, 5, 503–518, https://doi.org/10.1002/eco.251, 2012. a
Ombadi, M.: Causal Inference, Nonlinear Dynamics, and Information Theory Applications in Hydrometeorological Systems, Doctoral Dissertation, University of California, Irvine, 2021. a
Ombadi, M., Nguyen, P., Sorooshian, S., and Hsu, K.: Evaluation of Methods for Causal Discovery in Hydrometeorological Systems, Water Resour. Res., 56, e2020WR027251, https://doi.org/10.1029/2020WR027251, 2020. a
Ouali, D., Chebana, F., and Ouarda, T. B. M. J.: Fully nonlinear statistical and machine-learning approaches for hydrological frequency estimation at ungauged sites, J. Adv. Model. Earth Sy., 9, 1292–1306, https://doi.org/10.1002/2016MS000830, 2017. a
Ouarda, T. B. M. J., Charron, C., Hundecha, Y., St-Hilaire, A., and Chebana, F.: Introduction of the GAM model for regional low-flow frequency analysis at ungauged basins and comparison with commonly used approaches, Environ. Modell. Softw., 109, 256–271, https://doi.org/10.1016/j.envsoft.2018.08.031, 2018. a
Parascandolo, G., Kilbertus, N., Rojas-Carulla, M., and Scholkopf, B.: Learning Independent Causal Mechanisms, in: Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018, edited by: Dy, J. and Krause, A., PMLR, 80, 4036–4044, 2018. a
Pearl, J.: Probabilistic reasoning in intelligent systems: networks of plausible inference, Elsevier, 1988. a
Pearl, J.: Causality, Cambridge University Press, https://doi.org/10.1017/CBO9780511803161, 2009. a, b, c
Perez-Suay, A. and Camps-Valls, G.: Causal Inference in Geoscience and Remote Sensing From Observational Data, IEEE T. Geosci. Remote, 57, 1502–1513, https://doi.org/10.1109/TGRS.2018.2867002, 2019. a
Perković, E., Kalisch, M., and Maathuis, M. H.: Interpreting and using CPDAGs with background knowledge, arXiv [preprint], https://doi.org/10.48550/arXiv.1707.02171, 7 July 2017. a, b
Peters, J., Bühlmann, P., and Meinshausen, N.: Causal inference by using invariant prediction: identification and confidence intervals, J. Roy. Stat. Soc. B, 78, 947–1012, https://doi.org/10.1111/rssb.12167, 2016. a, b
Petersen, A. H., Osler, M., and Ekstrom, C. T.: Data-Driven Model Building for Life-Course Epidemiology, Am. J. Epidemiol., 190, 1898–1907, https://doi.org/10.1093/aje/kwab087, 2021. a, b, c, d
Pfister, N., Williams, E. G., Peters, J., Aebersold, R., and Bühlmann, P.: Stabilizing variable selection and regression, Ann. Appl. Stat., 15, 1220–1246, 2021. a
Pizarro, A. and Jorquera, J.: Advancing objective functions in hydrological modelling: Integrating knowable moments for improved simulation accuracy, J. Hydrol., 634, 131071, https://doi.org/10.1016/j.jhydrol.2024.131071, 2024. a
Pokhrel, P., Yilmaz, K. K., and Gupta, H. V.: Multiple-criteria calibration of a distributed watershed model using spatial regularization and response signatures, J. Hydrol., 418, 49–60, https://doi.org/10.1016/j.jhydrol.2008.12.004, 2012. a
Qian, S. S. and Miltner, R. J.: A continuous variable Bayesian networks model for water quality modeling: A case study of setting nitrogen criterion for small rivers and streams in Ohio, USA, Environ. Modell. Softw., 69, 14–22, https://doi.org/10.1016/j.envsoft.2015.03.001, 2015. a, b
Quiñonero-Candela, J., Sugiyama, M., Schwaighofer, A., and Lawrence, N. D.: Dataset Shift in Machine Learning, MIT Press, Cambridge, MA, ISBN 9780262170055, 2009. a
Kaufman, L. and Rousseeuw, P.: Finding groups in data: an introduction to cluster analysis, John Wiley & Sons, https://doi.org/10.1002/9780470316801, 1990. a
Reichstein, M., Camps-Valls, G., Stevens, B., Jung, M., Denzler, J., Carvalhais, N., and Prabhat: Deep learning and process understanding for data-driven Earth system science, Nature, 566, 195–204, https://doi.org/10.1038/s41586-019-0912-1, 2019. a, b
Rinderer, M., Ali, G., and Larsen, L. G.: Assessing structural, functional and effective hydrologic connectivity with brain neuroscience methods: State-of-the-art and research directions, Earth-Sci. Rev., 178, 29–47, https://doi.org/10.1016/j.earscirev.2018.01.009, 2018. a
Rubin, D.: Estimating causal effects of treatments in randomized and nonrandomized studies, J. Educ. Psychol., 66, 688–701, https://doi.org/10.1037/h0037350, 1974. a
Runge, J., Bathiany, S., Bollt, E., Camps-Valls, G., Coumou, D., Deyle, E., Glymour, C., Kretschmer, M., Mahecha, M. D., Munoz-Mari, J., van Nes, E. H., Peters, J., Quax, R., Reichstein, M., Scheffer, M., Schoelkopf, B., Spirtes, P., Sugihara, G., Sun, J., Zhang, K., and Zscheischler, J.: Inferring causation from time series in Earth system sciences, Nat. Commun., 10, 2553, https://doi.org/10.1038/s41467-019-10105-3, 2019a. a, b, c
Runge, J., Nowack, P., Kretschmer, M., Flaxman, S., and Sejdinovic, D.: Detecting and quantifying causal associations in large nonlinear time series datasets, Sci. Adv., 5, eaau4996, https://doi.org/10.1126/sciadv.aau4996, 2019b.
Runge, J., Gerhardus, A., Varando, G., Eyring, V., and Camps-Valls, G.: Causal inference for time series, Nature Reviews Earth & Environment, 4, 487–505, https://doi.org/10.1038/s43017-023-00431-y, 2023.
Sanchez-Romero, R., Ito, T., Mill, R. D., Hanson, S. J., and Cole, M. W.: Causally informed activity flow models provide mechanistic insight into network-generated cognitive activations, NeuroImage, 278, 120300, https://doi.org/10.1016/j.neuroimage.2023.120300, 2023.
Sankarasubramanian, A., Vogel, R., and Limbrunner, J.: Climate elasticity of streamflow in the United States, Water Resour. Res., 37, 1771–1781, https://doi.org/10.1029/2000WR900330, 2001.
Sawicz, K., Wagener, T., Sivapalan, M., Troch, P. A., and Carrillo, G.: Catchment classification: empirical analysis of hydrologic similarity based on catchment function in the eastern USA, Hydrol. Earth Syst. Sci., 15, 2895–2911, https://doi.org/10.5194/hess-15-2895-2011, 2011.
Schölkopf, B., Janzing, D., Peters, J., Sgouritsa, E., Zhang, K., and Mooij, J.: On causal and anticausal learning, arXiv [preprint], https://doi.org/10.48550/arXiv.1206.6471, 27 June 2012.
Schölkopf, B., Locatello, F., Bauer, S., Ke, N. R., Kalchbrenner, N., Goyal, A., and Bengio, Y.: Toward Causal Representation Learning, P. IEEE, 109, 612–634, https://doi.org/10.1109/JPROC.2021.3058954, 2021.
Scutari, M.: Learning Bayesian networks with the bnlearn R package, J. Stat. Softw., 35, 1–22, https://doi.org/10.18637/jss.v035.i03, 2010.
Scutari, M. and Nagarajan, R.: Identifying significant edges in graphical models of molecular networks, Artif. Intell. Med., 57, 207–217, https://doi.org/10.1016/j.artmed.2012.12.006, 2013.
Sendrowski, A. and Passalacqua, P.: Process connectivity in a naturally prograding river delta, Water Resour. Res., 53, 1841–1863, https://doi.org/10.1002/2016WR019768, 2017.
Seydi, S. T., Abatzoglou, J. T., AghaKouchak, A., Pourmohamad, Y., Mishra, A., and Sadegh, M.: Predictive understanding of links between vegetation and soil burn severities using physics-informed machine learning, Earth's Future, 12, e2024EF004873, https://doi.org/10.1029/2024EF004873, 2024.
Singh, R., Reed, P. M., and Keller, K.: Many-objective robust decision making for managing an ecosystem with a deeply uncertain threshold response, Ecol. Soc., 20, 12, https://doi.org/10.5751/ES-07687-200312, 2015.
Singh, S. K., McMillan, H., Bardossy, A., and Fateh, C.: Nonparametric catchment clustering using the data depth function, Hydrolog. Sci. J., 61, 2649–2667, https://doi.org/10.1080/02626667.2016.1168927, 2016.
Sivapalan, M.: Pattern, process and function: elements of a unified theory of hydrology at the catchment scale, in: Encyclopedia of hydrological sciences, edited by: Anderson, M. G. and McDonnell, J. J., John Wiley & Sons, Ltd, https://doi.org/10.1002/0470848944.hsa012, 2006.
Slater, L., Blougouras, G., Deng, L., Deng, Q., Ford, E., Hoek van Dijke, A., Huang, F., Jiang, S., Liu, Y., Moulds, S., Schepen, A., Yin, J., and Zhang, B.: Challenges and opportunities of ML and explainable AI in large-sample hydrology, Philos. T. Roy. Soc. A, 383, 20240287, https://doi.org/10.1098/rsta.2024.0287, 2025.
Spieler, D. and Schuetze, N.: Investigating the Model Hypothesis Space: Benchmarking Automatic Model Structure Identification With a Large Model Ensemble, Water Resour. Res., 60, e2023WR036199, https://doi.org/10.1029/2023WR036199, 2024.
Sultan, D., Tsunekawa, A., Tsubo, M., Haregeweyn, N., Adgo, E., Meshesha, D. T., Fenta, A. A., Ebabu, K., Berihun, M. L., and Setargie, T. A.: Evaluation of lag time and time of concentration estimation methods in small tropical watersheds in Ethiopia, Journal of Hydrology: Regional Studies, 40, 101025, https://doi.org/10.1016/j.ejrh.2022.101025, 2022.
Tárraga, J. M., Sevillano-Marco, E., Muñoz-Marí, J., Piles, M., Sitokonstantinou, V., Ronco, M., Miranda, M. T., Cerdà, J., and Camps-Valls, G.: Causal discovery reveals complex patterns of drought-induced displacement, iScience, 27, 110628, https://doi.org/10.1016/j.isci.2024.110628, 2024.
Todorovic, A., Grabs, T., and Teutschbein, C.: Improving performance of bucket-type hydrological models in high latitudes with multi-model combination methods: Can we wring water from a stone?, J. Hydrol., 632, 130829, https://doi.org/10.1016/j.jhydrol.2024.130829, 2024.
Vandenberg-Rodes, A., Moftakhari, H. R., AghaKouchak, A., Shahbaba, B., Sanders, B. F., and Matthew, R. A.: Projecting nuisance flooding in a warming climate using generalized linear models and Gaussian processes, J. Geophys. Res.-Oceans, 121, 8008–8020, 2016.
Verma, T. and Pearl, J.: Causal networks: Semantics and expressiveness, Mach. Intell. Patt. Rec., 9, 69–76, 1990.
Viger, R. and Bock, A.: GIS features of the geospatial fabric for national hydrologic modeling, US Geological Survey, 10, F7542KMD, https://doi.org/10.5066/F7542KMD, 2014.
Viglione, A., Parajka, J., Rogger, M., Salinas, J. L., Laaha, G., Sivapalan, M., and Blöschl, G.: Comparative assessment of predictions in ungauged basins – Part 3: Runoff signatures in Austria, Hydrol. Earth Syst. Sci., 17, 2263–2279, https://doi.org/10.5194/hess-17-2263-2013, 2013.
Wang, Y., Yang, J., Chen, Y., De Maeyer, P., Li, Z., and Duan, W.: Detecting the Causal Effect of Soil Moisture on Precipitation Using Convergent Cross Mapping, Scientific Reports, 8, 12171, https://doi.org/10.1038/s41598-018-30669-2, 2018.
Wood, S.: Mixed GAM computation vehicle with automatic smoothness estimation, R package version 1.8-12, Comprehensive R Archive Network (CRAN), https://doi.org/10.1201/9781315370279, 2018.
Woodward, J.: Invariance, modularity, and all that: Cartwright on causation, in: Nancy Cartwright's philosophy of science, Routledge, 210–249, https://doi.org/10.4324/9780203895467, 2008.
Yadav, M., Wagener, T., and Gupta, H.: Regionalization of constraints on expected watershed response behavior for improved predictions in ungauged basins, Adv. Water Resour., 30, 1756–1774, https://doi.org/10.1016/j.advwatres.2007.01.005, 2007.
Yang, M. and Olivera, F.: Classification of watersheds in the conterminous United States using shape-based time-series clustering and Random Forests, J. Hydrol., 620, 129409, https://doi.org/10.1016/j.jhydrol.2023.129409, 2023.
Zabaleta, A., Garmendia, E., Mariel, P., Tamayo, I., and Antigüedad, I.: Land cover effects on hydrologic services under a precipitation gradient, Hydrol. Earth Syst. Sci., 22, 5227–5241, https://doi.org/10.5194/hess-22-5227-2018, 2018.
Zachariah, M., Mondal, A., and AghaKouchak, A.: Probabilistic assessment of extreme heat stress on Indian wheat yields under climate change, Geophys. Res. Lett., 48, e2021GL094702, https://doi.org/10.1029/2021GL094702, 2021.
Zazo, S., Molina, J.-L., Ruiz-Ortiz, V., Vélez-Nicolás, M., and García-López, S.: Modeling river runoff temporal behavior through a hybrid causal–hydrological (HCH) method, Water, 12, 3137, https://doi.org/10.3390/w12113137, 2020.
Zhang, Y., Vaze, J., Chiew, F. H. S., Teng, J., and Li, M.: Predicting hydrological signatures in ungauged catchments using spatial interpolation, index model, and rainfall-runoff modelling, J. Hydrol., 517, 936–948, https://doi.org/10.1016/j.jhydrol.2014.06.032, 2014.
Zuk, O., Margel, S., and Domany, E.: On the number of samples needed to learn the correct structure of a Bayesian network, arXiv [preprint], https://doi.org/10.48550/arXiv.1206.6862, 27 June 2012.
Short summary
Here, we represented catchments as networks of variables connected by cause-and-effect relationships. We then compared statistical and machine learning methods for predicting runoff properties with and without causal information. The comparison showed that causal information can make models more robust: it reduces the drop in accuracy between the training and testing phases, improves interpretability, and mitigates overfitting, especially with small training samples.