https://doi.org/10.5194/hess-2022-377
21 Nov 2022
Status: this preprint is currently under review for the journal HESS.

Comparing machine learning and deep learning models for probabilistic post-processing of satellite precipitation-driven streamflow simulation

Yuhang Zhang, Aizhong Ye, Phu Nguyen, Bita Analui, Soroosh Sorooshian, Kuolin Hsu, and Yuxuan Wang

Abstract. Deep learning (DL) models are popular but computationally expensive, whereas machine learning (ML) models are old-fashioned but more efficient. How the two differ in hydrological probabilistic post-processing is not yet clear. This study conducts a systematic comparison between the quantile regression forest (QRF) model and the probabilistic long short-term memory (PLSTM) model as hydrological probabilistic post-processors. Specifically, we compare how these two models correct biased streamflow simulations driven by three kinds of satellite precipitation products in 522 sub-basins of the Yalong River basin in China. Model performance is assessed with a series of scoring metrics from both the probabilistic and the deterministic perspective. In general, the QRF model and the PLSTM model are comparable in terms of probabilistic prediction, and their performance is closely related to the flow accumulation area of the sub-basin. For sub-basins with a flow accumulation area of less than 60,000 km², the QRF model outperforms the PLSTM model in most sub-basins. For sub-basins with a flow accumulation area larger than 60,000 km², the PLSTM model has a clear advantage. In terms of deterministic prediction, the PLSTM model is preferable to the QRF model, especially when the raw streamflow is poorly simulated and used as an input. Setting model performance aside, the QRF model is more efficient in all cases, requiring half the time of the PLSTM model. This study can deepen our understanding of ML and DL models in hydrological post-processing and enable more appropriate model selection in practice.
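
The abstract does not name the specific scoring metrics used. Purely as an illustration of what scoring "from the probabilistic and deterministic perspectives" can look like, the sketch below assumes two common choices: the pinball (quantile) loss for a predicted quantile and the Nash-Sutcliffe efficiency (NSE) for a single-valued prediction. The function names and the sample numbers are hypothetical and are not taken from the preprint.

```python
def pinball_loss(observed, predicted, tau):
    """Mean pinball (quantile) loss of a predicted tau-quantile; lower is better."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, q in zip(observed, predicted):
        diff = y - q
        # Under-prediction is weighted by tau, over-prediction by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(observed)


def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance_sum = sum((o - mean_obs) ** 2 for o in observed)
    if variance_sum == 0:
        raise ValueError("NSE is undefined for constant observations")
    return 1.0 - squared_error / variance_sum


if __name__ == "__main__":
    # Hypothetical streamflow values (m^3/s), for demonstration only.
    obs = [120.0, 150.0, 310.0, 280.0]
    median_pred = [110.0, 160.0, 290.0, 300.0]
    print("pinball loss (tau=0.5):", pinball_loss(obs, median_pred, 0.5))
    print("NSE of the median prediction:", nash_sutcliffe(obs, median_pred))
```

A full probabilistic evaluation would typically average the pinball loss over several quantile levels, or use a score for the whole predictive distribution. Which scores the authors actually applied should be checked in the preprint itself.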

Yuhang Zhang et al.

Status: final response (author comments only)

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • RC1: 'Comment on hess-2022-377', Anonymous Referee #1, 15 Dec 2022
    • AC1: 'Reply on RC1', Aizhong Ye, 08 Feb 2023
    • AC2: 'Reply on RC1', Aizhong Ye, 08 Feb 2023
  • RC2: 'Comment on hess-2022-377', Anonymous Referee #2, 19 Dec 2022
    • AC4: 'Reply on RC2', Aizhong Ye, 08 Feb 2023
  • RC3: 'Comment on hess-2022-377', Anonymous Referee #3, 08 Jan 2023
    • AC3: 'Reply on RC3', Aizhong Ye, 08 Feb 2023

Viewed

Total article views: 883 (including HTML, PDF, and XML)
  • HTML: 647
  • PDF: 214
  • XML: 22
  • Total: 883
  • Supplement: 32
  • BibTeX: 9
  • EndNote: 7
Views and downloads (calculated since 21 Nov 2022)

Viewed (geographical distribution)

Total article views: 849 (including HTML, PDF, and XML), thereof 849 with geography defined and 0 with unknown origin.
Latest update: 25 May 2023
Short summary
We compared the probabilistic long short-term memory (PLSTM) model and the quantile regression forest (QRF) model. The results show that the QRF model is more efficient, taking only half the time of the PLSTM model to complete all the experiments. The two models are comparable in terms of probabilistic (multi-point) prediction: the QRF model performs better in small watersheds and the PLSTM model performs better in large watersheds.