Comparing machine learning and deep learning models for probabilistic post-processing of satellite precipitation-driven streamflow simulation
Abstract. Deep learning (DL) models are popular but computationally expensive, whereas machine learning (ML) models are old-fashioned but more efficient. Their differences in hydrological probabilistic post-processing are not yet clear. This study conducts a systematic comparison between the quantile regression forest (QRF) model and the probabilistic long short-term memory (PLSTM) model as hydrological probabilistic post-processors. Specifically, we compare how these two models handle biased streamflow simulations driven by three satellite precipitation products in 522 sub-basins of the Yalong River basin in China. Model performance is assessed with a series of scoring metrics from both probabilistic and deterministic perspectives. In general, the QRF and PLSTM models are comparable in terms of probabilistic prediction. Their performance is closely related to the flow accumulation area of the sub-basin. For sub-basins with a flow accumulation area of less than 60,000 km², the QRF model outperforms the PLSTM model in most sub-basins. For sub-basins with a flow accumulation area larger than 60,000 km², the PLSTM model has an indisputable advantage. In terms of deterministic predictions, the PLSTM model is preferable to the QRF model, especially when the raw streamflow is poorly simulated and used as an input. Setting model performance aside, the QRF model is more efficient in all cases, taking half the time of the PLSTM model. This study can deepen our understanding of ML and DL models in hydrological post-processing and enable more appropriate model selection in practice.
Yuhang Zhang et al.
Status: final response (author comments only)
RC1: 'Comment on hess-2022-377', Anonymous Referee #1, 15 Dec 2022
- AC1: 'Reply on RC1', aizhong ye, 08 Feb 2023
- AC2: 'Reply on RC1', aizhong ye, 08 Feb 2023
RC2: 'Comment on hess-2022-377', Anonymous Referee #2, 19 Dec 2022
- AC4: 'Reply on RC2', aizhong ye, 08 Feb 2023
RC3: 'Comment on hess-2022-377', Anonymous Referee #3, 08 Jan 2023
- AC3: 'Reply on RC3', aizhong ye, 08 Feb 2023
This study compares two post-processing methods of streamflow simulation obtained using different precipitation products based on satellite data. A comprehensive evaluation is performed on 522 sub-catchments located in China to assess the performances in terms of reliability, sharpness, and various hydrological skills. The paper is well-written and complete, the figures are clear and the interpretations of the results are convincing. My recommendation is that the paper can be accepted for publication after minor corrections which are listed below.
l.44-46: I strongly disagree with this statement. There is no evidence that satellite precipitation estimation is the most promising hydrological model input. As an example, ERA5 is mostly driven by satellite data, yet it is not able to reproduce most of the precipitation features at a high spatial resolution (Bandhauer et al., 2022; Reder et al., 2022), does not reproduce the strong relationships between precipitation characteristics and topography in mountainous areas, underestimates hourly and daily extreme values, and overestimates the number of wet days (Bandhauer et al., 2022). At high spatial and temporal resolutions, the assimilation of ground measurements and/or radar data is needed to reproduce extreme events (Reder et al., 2022). However, I agree that satellite precipitation estimation is valuable in regions where ground measurements are scarce.
l.75: A more recent application of the MOS method is provided by Bellier et al. (2018).
l.80: short memory: I guess that ‘term’ is missing between ‘short’ and ‘memory’.
l.123: serval -> several.
l.195: “so the model is reliable”. Is it possible to rephrase the sentence to indicate that this is an assumption and not your personal judgement? As the authors do not provide evidence that the model is able to reproduce the natural runoff process (I understand that it is not possible), it would be fairer.
l.247: Klotze -> Klotz.
l.255-256: The terms “single-model” and “multi-model” are a bit misleading, as I understand that the authors refer to precipitation products here. I suggest replacing them with “single-precipitation product” and “multi-precipitation product” or something similar.
l.348: Missing dot after “threshold”.
l.448: “Little precipitation events”: I was not sure whether the authors refer to localized precipitation events here, or to events with moderate intensities. Is it possible to be more specific?
Bandhauer, Moritz, Francesco Isotta, Mónika Lakatos, Cristian Lussana, Line Båserud, Beatrix Izsák, Olivér Szentes, Ole Einar Tveito, and Christoph Frei. 2022. “Evaluation of Daily Precipitation Analyses in E-OBS (V19.0e) and ERA5 by Comparison to Regional High-Resolution Datasets in European Regions.” International Journal of Climatology 42 (2): 727–47. https://doi.org/10.1002/joc.7269.
Bellier, Joseph, Isabella Zin, and Guillaume Bontron. 2018. “Generating Coherent Ensemble Forecasts After Hydrological Postprocessing: Adaptations of ECC-Based Methods.” Water Resources Research 54 (8): 5741–62. https://doi.org/10.1029/2018WR022601.
Reder, A., M. Raffa, R. Padulano, G. Rianna, and P. Mercogliano. 2022. “Characterizing Extreme Values of Precipitation at Very High Resolution: An Experiment over Twenty European Cities.” Weather and Climate Extremes 35 (March): 100407. https://doi.org/10.1016/j.wace.2022.100407.