10 May 2022
Status: this preprint is currently under review for the journal HESS.

Low flow estimation beyond the mean – expectile loss and extreme gradient boosting for spatio-temporal low flow prediction in Austria

Johannes Laimighofer1, Michael Melcher2, and Gregor Laaha1
  • 1University of Natural Resources and Life Sciences, Vienna, Department of Landscape, Spatial and Infrastructure Sciences, Institute of Statistics, Peter-Jordan-Strasse 82/I, 1190 Vienna, Austria
  • 2Institute of Information Management, FH JOANNEUM – University of Applied Sciences, Graz, Austria

Abstract. Accurate predictions of seasonal low flows are critical for a number of water management tasks that require inferences about water quality and the ecological status of water bodies. This paper proposes an extreme gradient tree boosting model (XGBoost) for predicting monthly low flow in ungauged catchments. Particular emphasis is placed on the lowest values (of the magnitude of annual low flows and below) by implementing the expectile loss function in the XGBoost model. For this purpose, we test expectile loss functions based on decreasing expectiles (from τ = 0.5 to 0.01) that give increasing weight to lower values. These are compared to common loss functions such as mean and median absolute loss. Model optimization and evaluation are conducted using a nested cross-validation approach that includes recursive feature elimination to promote parsimonious models. The methods are tested on a comprehensive dataset of 260 stream gauges in Austria covering a wide range of low flow regimes. Our results demonstrate that the expectile loss function can yield high prediction accuracy, but the performance drops sharply for low expectile models. With a median R² of 0.67, the 0.5 expectile yields the best performing model. The 0.3 and 0.2 expectiles perform slightly worse, but still outperform the common median and mean absolute loss functions. All expectile models include some stations with moderate and poor performance that can be attributed to some systematic error, while the seasonal and annual variability is well covered by the models. Results for the prediction of low extremes show an increasing performance in terms of R² for smaller expectiles (0.01, 0.025, 0.05), though at the cost of classifying too many events as extremes at each station. We found that the application of different expectiles leads to a trade-off between overall performance, prediction performance for extremes, and misclassification of extreme low flow events. Our results show that the 0.1 or 0.2 expectiles perform best with respect to all three criteria. The resulting extreme gradient tree boosting model covers seasonal and annual variability well and provides a viable approach for spatio-temporal modelling of a range of hydrological variables representing average conditions and extreme events.
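
To illustrate how lowering τ shifts weight toward low values, the following is a minimal sketch in Python. It assumes the commonly used asymmetric squared-error form of the expectile loss, |τ − 1(y < ŷ)| · (y − ŷ)², since the abstract does not state the exact formulation; it is not the authors' implementation. All function names are hypothetical. The gradient and Hessian are returned in the per-observation form that gradient boosting libraries typically expect from a custom objective.

```python
def expectile_weight(y, pred, tau):
    """Asymmetric weight: tau if the observation is at or above the
    prediction, 1 - tau if it lies below (assumed standard definition)."""
    return tau if y >= pred else 1.0 - tau


def expectile_loss(y_true, y_pred, tau):
    """Mean asymmetric squared error over all observations."""
    total = sum(
        expectile_weight(y, p, tau) * (y - p) ** 2
        for y, p in zip(y_true, y_pred)
    )
    return total / len(y_true)


def expectile_grad_hess(y_true, y_pred, tau):
    """Per-observation first and second derivatives of the loss with
    respect to the prediction."""
    grad = [
        -2.0 * expectile_weight(y, p, tau) * (y - p)
        for y, p in zip(y_true, y_pred)
    ]
    hess = [2.0 * expectile_weight(y, p, tau) for y, p in zip(y_true, y_pred)]
    return grad, hess


def sample_expectile(values, tau, n_iter=100, tol=1e-12):
    """Constant prediction minimising the expectile loss, found by
    iteratively reweighted averaging starting from the arithmetic mean."""
    m = sum(values) / len(values)
    for _ in range(n_iter):
        weights = [expectile_weight(v, m, tau) for v in values]
        new_m = sum(w * v for w, v in zip(weights, values)) / sum(weights)
        if abs(new_m - m) < tol:
            break
        m = new_m
    return m


if __name__ == "__main__":
    flows = [1.0, 2.0, 3.0, 10.0]  # made-up illustrative values, not study data
    for tau in (0.5, 0.2, 0.1):
        print(tau, sample_expectile(flows, tau))
```

With τ = 0.5 the weights are symmetric and the minimiser is the arithmetic mean; for smaller τ, observations below the current prediction receive the larger weight 1 − τ, so the fitted value is pulled toward the low end of the sample.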

Status: open (until 05 Jul 2022)



Total article views: 290 (including HTML, PDF, and XML)
  • HTML: 241
  • PDF: 44
  • XML: 5
  • Total: 290
  • BibTeX: 2
  • EndNote: 1

Viewed (geographical distribution)

Total article views: 263 (including HTML, PDF, and XML) Thereof 263 with geography defined and 0 with unknown origin.
Latest update: 26 May 2022
Short summary
Our study uses a statistical model for estimating low flows on a monthly basis, which can be applied to estimate low flows at sites without measurements. We use an extensive data set of 260 stream gauges in Austria for model development. As we are specifically interested in low flow events, our method gives specific weight to such events. We found that our method can improve the predictions of low flow events considerably and yields accurate estimates of the seasonal low flow variation.