30 Mar 2022
Status: this preprint is currently under review for the journal HESS.

Evaluation of water flux predictive models developed using eddy covariance observations and machine learning: a meta-analysis

Haiyang Shi1,2,4,5, Geping Luo1,2,3,5, Olaf Hellwich6, Mingjuan Xie1,2,4,5, Chen Zhang1,2, Yu Zhang1,2, Yuangang Wang1,2, Xiuliang Yuan1, Xiaofei Ma1, Wenqiang Zhang1,2,4,5, Alishir Kurban1,2,3,5, Philippe De Maeyer1,2,4,5, and Tim Van de Voorde4,5
  • 1State Key Laboratory of Desert and Oasis Ecology, Xinjiang Institute of Ecology and Geography, Chinese Academy of Sciences, Urumqi, Xinjiang, 830011, China
  • 2University of Chinese Academy of Sciences, 19 (A) Yuquan Road, Beijing, 100049, China
  • 3Research Centre for Ecology and Environment of Central Asia, Chinese Academy of Sciences, Urumqi, China
  • 4Department of Geography, Ghent University, Ghent 9000, Belgium
  • 5Sino-Belgian Joint Laboratory of Geo-Information, Ghent, Belgium and Urumqi, China
  • 6Department of Computer Vision & Remote Sensing, Technische Universität Berlin, 10587 Berlin, Germany

Abstract. With the rapid accumulation of water flux observations from global eddy-covariance flux sites, many studies have used data-driven approaches to model site-scale water fluxes using various predictors and machine learning algorithms. However, systematic evaluation of such models is still limited. We therefore performed a meta-analysis of 32 such studies, derived 139 model records, and evaluated the impact of various features on model accuracy throughout the modeling flow. SVM (average R-squared = 0.82) and RF (average R-squared = 0.81) outperformed the other evaluated algorithms in both cross-study and intra-study (with the same training dataset) comparisons. Models applied to arid regions had higher average accuracy than those applied to other climate classes. The average model accuracy was slightly lower for forest sites (average R-squared = 0.76) than for cropland and grassland sites (average R-squared = 0.80 and 0.79, respectively), but higher than for shrub sites (average R-squared = 0.67). Among the predictor variables, the use of net/solar radiation, precipitation, air temperature, and the fraction of absorbed photosynthetically active radiation improved model accuracy. Among the validation methods, random cross-validation yielded higher reported accuracy than spatial and temporal cross-validation, but spatial cross-validation is the more relevant test when water flux predictive models are used for spatial extrapolation. The findings of this study can help guide future research on such machine learning-based modeling.
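The difference between the three validation methods named in the abstract lies in how samples are assigned to training and test sets. The following is a minimal sketch, not the procedure of any of the reviewed studies: the site names, years, fold count, and all function names are hypothetical and chosen only for illustration. It uses leave-one-site-out as one possible form of spatial cross-validation and a single cutoff year as one possible form of temporal validation.

```python
import random
from collections import defaultdict


def random_folds(n_samples, k, seed=0):
    """Random k-fold: samples are shuffled, so one site may land in both train and test."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    return [(sorted(set(idx) - set(f)), sorted(f)) for f in folds]


def spatial_folds(sites):
    """Leave-one-site-out: each test fold holds all samples of exactly one site."""
    by_site = defaultdict(list)
    for i, s in enumerate(sites):
        by_site[s].append(i)
    splits = []
    for s in sorted(by_site):
        train = [i for i in range(len(sites)) if sites[i] != s]
        splits.append((train, by_site[s]))
    return splits


def temporal_split(times, cutoff):
    """Train on samples before the cutoff, test on samples at or after it."""
    train = [i for i, t in enumerate(times) if t < cutoff]
    test = [i for i, t in enumerate(times) if t >= cutoff]
    return train, test


def shared_sites(split, sites):
    """Sites that contribute samples to both the training and the test set."""
    train, test = split
    return {sites[i] for i in train} & {sites[i] for i in test}


# Hypothetical toy data: 3 flux sites, 4 yearly samples each.
sites = [s for s in ("site_A", "site_B", "site_C") for _ in range(4)]
years = [2015, 2016, 2017, 2018] * 3

rand = random_folds(len(sites), k=3)
spat = spatial_folds(sites)
temp = temporal_split(years, cutoff=2018)

for name, splits in (("random", rand), ("spatial", spat), ("temporal", [temp])):
    overlap = [sorted(shared_sites(s, sites)) for s in splits]
    print(name, "sites shared between train and test per fold:", overlap)
```

In the spatial scheme no site ever appears on both sides of a split, which is why it is the closer analogue of predicting fluxes at unobserved locations. Random folds usually mix samples from the same site into training and test sets, which is consistent with the higher accuracy reported under random cross-validation.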


Status: open (until 27 May 2022)

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • RC1: 'Comment on hess-2022-90', Anonymous Referee #1, 07 May 2022



Total article views: 470 (including HTML, PDF, and XML)
  • HTML: 352
  • PDF: 110
  • XML: 8
  • Total: 470
  • Supplement: 40
  • BibTeX: 2
  • EndNote: 4

Latest update: 26 May 2022
Short summary
There have been many machine learning simulation studies based on eddy-covariance observations for water flux and evapotranspiration. We performed a meta-analysis of such studies to clarify the impact of different algorithms, predictors, etc. on the reported prediction accuracy. The results can, to some extent, guide future global water flux modeling studies and help us better understand the terrestrial ecosystem water cycle.