the Creative Commons Attribution 4.0 License.
Deep learning based sub-seasonal precipitation and streamflow forecasting over the source region of the Yangtze River
Abstract. Hydrometeorological forecasting is crucial for managing water resources and mitigating the impacts of extreme hydrologic events. At sub-seasonal scales, readily available hydrometeorological forecast products often exhibit large uncertainties and insufficient accuracy to support decision making. We propose a deep learning based modelling framework for joint sub-seasonal precipitation and streamflow forecasts with a lead time of up to 30 days. This is achieved by coupling (1) a convolutional neural network (CNN) architecture with ResNet blocks for statistical downscaling of the raw ECMWF precipitation forecasts to (2) a hybrid hydrologic model integrating the conceptual Xin’anjiang model (XAJ) and the long short-term memory network (LSTM) for streamflow forecasting. The CNN incorporates a specialized loss function that combines the continuous form of the threat score with the mean absolute error. When the modelling framework is applied to the source region of the Yangtze River Basin, the CNN-based downscaling model exhibits ~13 % and ~10 % lower RMSE than the raw ECMWF forecasts and the quantile mapping (QM) forecasts, respectively, averaged over the 30-day lead time. Similarly, the CNN achieves ~2 % and ~5 % lower RMSE than the raw forecasts and QM, respectively, for precipitation events above the 90th percentile of historic daily precipitation. Using these precipitation forecasts as meteorological drivers for the hybrid XAJ-LSTM hydrologic model, we find that forecasted streamflow and flood peaks driven by CNN-based precipitation forecasts have 18 %–32 % lower relative errors and 13 %–22 % lower RMSE than those driven by raw forecasts. However, the standalone XAJ model shows marginal improvements, or in some cases no improvement at all, with the same enhanced precipitation forecasts. This highlights the importance of understanding the effectiveness of the hydrologic model as part of the sub-seasonal hydrometeorological modelling chain.
Our study is expected to inform the use of advanced AI techniques to improve sub-seasonal hydrometeorological forecasting accuracy and operational efficiency in support of effective water resources management and disaster preparedness.
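The abstract does not give the form of the loss function. As a rough illustration only, a threat score can be made continuous by replacing the hard forecast exceedance test with a sigmoid and then blended with the mean absolute error. The threshold, sharpness, weight and all function names below are assumptions made for this sketch, not the authors' formulation.

```python
import math

def soft_exceed(x, threshold, sharpness):
    """Sigmoid stand-in for the hard indicator x >= threshold (assumed form)."""
    z = sharpness * (x - threshold)
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def soft_threat_score(forecast, observed, threshold=1.0, sharpness=5.0):
    """Continuous threat score: hits / (hits + misses + false alarms)."""
    hits = misses = false_alarms = 0.0
    for f, o in zip(forecast, observed):
        pf = soft_exceed(f, threshold, sharpness)   # soft forecast event
        po = 1.0 if o >= threshold else 0.0         # hard observed event
        hits += pf * po
        misses += (1.0 - pf) * po
        false_alarms += pf * (1.0 - po)
    denom = hits + misses + false_alarms
    return hits / denom if denom > 0.0 else 0.0

def mean_absolute_error(forecast, observed):
    return sum(abs(f - o) for f, o in zip(forecast, observed)) / len(observed)

def combined_loss(forecast, observed, weight=0.5, threshold=1.0, sharpness=5.0):
    """Hypothetical blend: weight * (1 - soft TS) + (1 - weight) * MAE."""
    ts = soft_threat_score(forecast, observed, threshold, sharpness)
    mae = mean_absolute_error(forecast, observed)
    return weight * (1.0 - ts) + (1.0 - weight) * mae

# Example: a forecast matching the observations scores a lower loss.
obs = [0.0, 0.0, 5.0, 12.0, 0.2]
print(combined_loss(obs, obs), combined_loss([3.0, 3.0, 0.0, 0.0, 3.0], obs))
```

The sigmoid keeps the score differentiable with respect to the forecast, which is what allows a threat-score term to be used in gradient-based training at all; the MAE term keeps the magnitude of non-event values constrained.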
Status: final response (author comments only)
-
RC1: 'Comment on hess-2024-212', Anonymous Referee #1, 21 Aug 2024
This manuscript assesses a convolutional neural network architecture with ResNet blocks for statistically downscaling the raw ECMWF precipitation forecasts and uses a hybrid hydrologic model integrating the conceptual Xin’anjiang model and the long short-term memory network for streamflow forecasting. The results show that, in the source region of the Yangtze River Basin, the CNN-based downscaling model exhibits ~13% and ~10% lower RMSE than the raw ECMWF forecasts and the quantile mapping forecasts, respectively, and that, when the precipitation forecasts are used as meteorological drivers for the hybrid hydrologic model, forecasted streamflow and flood peaks driven by CNN-based precipitation forecasts have 18%-32% lower relative errors and 13%-22% lower RMSE than those driven by raw forecasts. The manuscript fits the scope of the journal and the conclusions are reasonable. However, some revisions are recommended before it can be considered for publication in this journal.
Major comments:
In the introduction, L45-70 describe the advantages and disadvantages of dynamical and statistical downscaling, which are well known, and merely illustrate a few deep learning methods. The passage fails to highlight the research focus of this paper, i.e., the progress of research on traditional statistical downscaling and on statistical downscaling combined with deep learning.
I do not understand the role of L70-75. They appear to state that weather prediction models such as Pangu and GraphCast, which are built entirely on deep learning, have demonstrated the potential to achieve forecast skill comparable to state-of-the-art numerical weather prediction systems. Is this relevant to the research in this paper?
Regarding the results, some of the conclusions seem too brief. L255-260 give no results for the relative error, the mean absolute error, or the relative error of the simulated maximum daily flow. The authors write "while the RMSE of EC-QM forecasts sees a relatively steady reduction over all lead times" (L280-285), but Fig. 4 clearly shows that EC-QM has a higher RMSE than EC at lead times of 23-24 days and 27 days. Can the authors explain the reason? Please also plot the spatial distribution of the bias of EC-CNN, EC-QM and EC, because it is mentioned in Line 310.
In Chapter 4.3, the authors state that "the EC-QM (EC-CNN) forecasts reduce the relative error of the raw forecasts by 12% (20%), 16% (24%) and 9% (21%), respectively, and reduces the relative error of maximum daily flow by 27% (29%), 16% (32%) and 11% (18%)". Both EC-QM and EC-CNN thus show a larger decrease in RMSE for lead times of 11-20 days than for 1-10 days, whereas we would expect better results at shorter lead times. Can the authors give a reasonable explanation?
Since the article analyses individual cases (Figure 8), please provide quantitative indicators to support the description.
Minor comments:
- L38: "S. Zhu et al., 2020" should be modified.
- L60: "Wilby, R. L., et al., 2004; Vrac, M., & Friederichs, P., 2015" should be modified.
- L87-88: "For example, Humphrey et al. (2016) achieved improved streamflow forecast skills by combining Bayesian artificial neural networks with traditional models of GR4J. " What's the conclusion?
- L142-143: "streamflow and flood peak forecasts of XAJ-LSTM and standalone XAJ driven by EC-CNN forecasts are then quantitatively evaluated against those driven by raw and QM-based forecasts using a series of metrics. " Please write the full name for the first occurrence.
- L262-263: "The results indicate that the daily Nash-Sutcliffe Efficiency (NSE)", abbreviations have already been mentioned.
Citation: https://doi.org/10.5194/hess-2024-212-RC1 -
AC1: 'Reply on RC1', Ningpeng Dong, 01 Sep 2024
The comment was uploaded in the form of a supplement: https://hess.copernicus.org/preprints/hess-2024-212/hess-2024-212-AC1-supplement.pdf
-
RC2: 'Comment on hess-2024-212', Anonymous Referee #2, 12 Sep 2024
Review comments on “Deep Learning based sub-seasonal precipitation and streamflow forecasting over the source region of the Yangtze River”
In the submitted manuscript, the authors apply CNN-ResNet to statistically downscale S2S precipitation forecasts from the European Centre for Medium-Range Weather Forecasts (ECMWF) at the source region of the Yangtze River Basin for streamflow predictions over a 30-day forecast horizon. Specifically, the S2S precipitation forecasts are downscaled, incorporating 19 additional forecast variables, to a 0.25-degree spatial resolution over the study region. The downscaled S2S precipitation forecasts are then run through the physically-based Xin’anjiang (XAJ) model and a hybrid model that combines XAJ with LSTM (XAJ-LSTM) to generate two sets of streamflow predictions. The results demonstrate that the proposed CNN-ResNet improves S2S precipitation forecast skill compared to the baseline Quantile Mapping (QM) approach. Consequently, the CNN-ResNet corrected S2S precipitation leads to more skillful streamflow predictions. A comparison between XAJ and the hybrid XAJ-LSTM indicates that XAJ-LSTM better translates the improved S2S precipitation forecast skill into streamflow predictions. Overall, the reviewer finds the manuscript to be well-written and of high potential for publication in the Journal of HESS. However, the reviewer has a few comments and suggestions that should be addressed. Therefore, it is recommended that the submitted manuscript undergo some substantial revisions. Detailed comments are listed below:
1. Methodology (QM and the corresponding evaluations):
Although QM is a widely applied technique and should be familiar to the general audience of HESS, the reviewer believes that some key information or references may need to be incorporated. Precipitation data is known to be highly skewed, making the selection of the distribution function critically important for the effectiveness of QM. However, such details appear to be missing in the current manuscript. Additionally, the reviewer is curious about the specific implementation of QM. Was seasonality or the variation in forecast lead times considered in the QM-based bias removal process? More detailed documentation on this aspect should be included in the manuscript.
Furthermore, given that QM is primarily designed for bias removal rather than enhancing the temporal correspondence between forecast time series and observations, the reviewer suggests that the authors also evaluate the resulting forecasts (both precipitation and streamflow) in terms of their bias. Specifically, the overall CDF of precipitation forecasts generated by different statistical downscaling methods should be compared. Given that these downscaled precipitation forecasts eventually run through lumped hydrologic models, CDF of the areal averaged precipitation forecasts is perhaps a good way to demonstrate the bias condition at all percentiles across the study region. While the proposed DL technique improves the predictive skill of S2S precipitation, it would be valuable to see whether it also reduces forecast bias compared to QM.
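To make the questions in this comment concrete, the following is a minimal sketch of empirical quantile mapping together with the empirical CDF the reviewer proposes comparing. It assumes a single pooled climatology with no stratification by season or lead time and no parametric distribution fit, which are exactly the choices the reviewer asks the authors to document; it is not the manuscript's implementation, and all names are illustrative.

```python
from bisect import bisect_right

def empirical_cdf(sample, x):
    """Fraction of sample values that are <= x."""
    xs = sorted(sample)
    return bisect_right(xs, x) / len(xs)

def quantile(sample, p):
    """Linearly interpolated quantile of a sample for 0 <= p <= 1."""
    xs = sorted(sample)
    pos = min(max(p, 0.0), 1.0) * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    frac = pos - lo
    return xs[lo] * (1.0 - frac) + xs[hi] * frac

def quantile_map(value, fcst_clim, obs_clim):
    """Map a forecast value to the observed value at the same rank.

    Both climatologies need at least two values. Values outside the
    forecast climatology are clipped to its lowest or highest rank.
    """
    xs = sorted(fcst_clim)
    rank = bisect_right(xs, value) - 1
    p = min(max(rank / (len(xs) - 1), 0.0), 1.0)
    return quantile(obs_clim, p)

# Toy areal-mean climatologies (mm/day): forecasts are half the observations.
fcst_clim = [0.0, 1.0, 2.0, 3.0, 4.0]
obs_clim = [0.0, 2.0, 4.0, 6.0, 8.0]
corrected = [quantile_map(v, fcst_clim, obs_clim) for v in fcst_clim]
print(corrected)
print(empirical_cdf(corrected, 4.0), empirical_cdf(obs_clim, 4.0))
```

Comparing `empirical_cdf` of the corrected and observed areal means at several thresholds is one simple way to show bias across percentiles. Because precipitation series contain many zeros, a real implementation would also need an explicit rule for ties at zero.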
2. Methodology (Statistical downscaling of S2S precipitation forecasts):
The reviewer feels that the description of the employed statistical downscaling techniques is unclear in general. It appears that the proposed CNN-ResNet generates a single precipitation prediction value while using multiple spatially distributed forecast variables as inputs (Figure 2). If this is indeed the case, the proposed framework seems more like an "upscaling" than a "downscaling" technique. This also raises questions about how the authors produced the spatially distributed precipitation climatology plot (Figure 6). Additionally, given that CNN-based structures typically produce square-shaped outputs, were any masks applied during the training of the proposed CNN-ResNet?
Similarly, is QM conducted at each pixel across the study watershed? If so, does the raw spatial resolution of the S2S precipitation forecast match that of the reference precipitation? These questions are particularly relevant considering the employed hydrologic models are lumped. It is important to clarify for the audience at which specific technical step(s) the spatially distributed forecast variables are converted into area averages.
In general, it is recommended that the entire methodology section be revised to avoid potential confusion and to ensure clarity on the steps involved in the downscaling process.
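One way the step from gridded fields to a lumped model input could look is sketched below: a basin mask selects grid cells, and each cell is weighted by the cosine of its latitude to approximate cell area on a regular latitude-longitude grid. This is an assumption about a common practice, not a description of what the authors did, and the function and argument names are hypothetical.

```python
import math

def masked_areal_mean(field, lats, mask):
    """Latitude-weighted mean of a gridded field over masked cells.

    field[i][j]: value at row i, column j
    lats[i]:     latitude of row i in degrees
    mask[i][j]:  truthy where the cell lies inside the basin
    """
    total = 0.0
    weight_sum = 0.0
    for row, lat, mask_row in zip(field, lats, mask):
        weight = math.cos(math.radians(lat))  # proxy for cell area
        for value, inside in zip(row, mask_row):
            if inside:
                total += weight * value
                weight_sum += weight
    if weight_sum == 0.0:
        raise ValueError("mask selects no grid cells")
    return total / weight_sum

# Toy 2 x 2 grid with one cell outside the basin.
field = [[1.0, 2.0], [3.0, 4.0]]
lats = [0.0, 60.0]
mask = [[1, 0], [1, 1]]
print(masked_areal_mean(field, lats, mask))
```

The same mask could in principle be applied inside a training loss so that cells outside the watershed do not contribute, which is the situation the reviewer's question about masks refers to.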
3. Seasonal and spatial skills and variabilities:
The reviewer suggests conducting additional seasonal and spatial analysis to better highlight the strengths and weaknesses of the proposed CNN-ResNet downscaling technique. First, the reviewer notes that the study watershed covers a broad area (around 10 degrees, or approximately 1000 km, in both the north-south and east-west directions). This suggests significant spatial and seasonal variability in terms of precipitation generation mechanisms, magnitude, frequency, etc. However, this variability is not discussed in the manuscript, limiting the audience's understanding of the study watershed.
Building on this, it would be interesting to examine whether the proposed framework is equally effective across different seasons and geospatial locations, or if its performance varies. The reviewer believes such an analysis would be crucial in further enhancing the quality of the manuscript. Consequently, it is recommended that the authors evaluate the post-processed precipitation both spatially and seasonally. Since the proposed method is a statistical downscaling technique, it is important to demonstrate its skill over such a large study region. This additional analysis would provide valuable insights into the effectiveness of the method across different conditions.
4. Probabilistic and ensemble forecasts:
If the potential workload is manageable, the reviewer strongly recommends that the authors utilize the entire ensemble of S2S precipitation forecasts from ECMWF in their experiments, rather than focusing on the ensemble means. The primary reason for this suggestion is that neither precipitation forecasts nor the corresponding streamflow predictions can be applied deterministically at a subseasonal timescale due to limited skills at longer forecast lead times.
At this timescale, probabilistic forecasts are typically constructed using multiple predictions (i.e., ensemble forecasts). While the proposed framework appears effective and interesting, the reviewer believes its full potential can be better demonstrated with a revised experimental design that aligns more closely with real-world needs (i.e., ensemble predictions).
Following this suggestion, the reviewer recommends that the authors incorporate additional probabilistic evaluation metrics, such as CRPS or CRPSS, for a more comprehensive assessment of the framework's performance for both the post-processed precipitation forecasts and the corresponding streamflow predictions.
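For reference, the CRPS of an ensemble forecast can be estimated directly from the members with the standard energy form, CRPS = E|X - y| - 0.5 E|X - X'|, and the skill score follows from a reference forecast such as climatology. The sketch below uses illustrative function names and treats all members as equally weighted, which is an assumption.

```python
def ensemble_crps(members, observation):
    """CRPS estimate for one forecast: E|X - y| - 0.5 * E|X - X'|."""
    m = len(members)
    if m == 0:
        raise ValueError("ensemble is empty")
    term_obs = sum(abs(x - observation) for x in members) / m
    term_spread = sum(abs(xi - xj) for xi in members for xj in members) / (m * m)
    return term_obs - 0.5 * term_spread

def crpss(crps_forecast, crps_reference):
    """Skill score: 1 is perfect, 0 matches the reference, < 0 is worse."""
    return 1.0 - crps_forecast / crps_reference

# A single-member ensemble reduces to the absolute error.
print(ensemble_crps([3.0], 5.0))
# A two-member ensemble bracketing the observation.
print(ensemble_crps([1.0, 3.0], 2.0))
```

In practice the per-forecast values would be averaged over all issue dates and lead times before computing the skill score.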
Other specific comments:
Line 126: What is the native spatial resolution of the collected S2S precipitation forecasts from ECMWF?
Line 142: EC-CNN is referenced here for the first time in the manuscript, but without a clear explanation.
Line 250: It seems a standardized metric (i.e., NSE) is employed here to evaluate the hydrologic model calibration. The reviewer wonders why the authors switch to RMSE and other metrics for the later evaluation of streamflow predictive skill. While RMSE is a widely applied metric in many fields, standardized metrics such as NSE and KGE might be more familiar to researchers in the hydrology community.
Line 354: Perhaps “forecast issue date” is more appropriate for the titles of the different panels in Figure 8. Also, it would be interesting to see these examples where the proposed framework delivers more accurate streamflow predictions. An overall skill evaluation would still be more informative in general. Perhaps these figures could be moved to the supplementary material so that the previously suggested additional evaluation and analysis can be included in the main manuscript.
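The metrics contrasted in the Line 250 comment can be placed side by side with their standard definitions. The sketch below uses the original 2009 form of KGE with the variability ratio taken as the ratio of standard deviations; function names are illustrative, and the series are assumed to be non-constant with a non-zero observed mean.

```python
import math

def _mean(values):
    return sum(values) / len(values)

def rmse(sim, obs):
    """Root mean square error, in the units of the data."""
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(obs))

def nse(sim, obs):
    """Nash-Sutcliffe Efficiency: 1 is perfect, 0 matches the observed mean."""
    mean_obs = _mean(obs)
    num = sum((s - o) ** 2 for s, o in zip(sim, obs))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den

def kge(sim, obs):
    """Kling-Gupta Efficiency from correlation, variability and bias ratios."""
    mean_sim, mean_obs = _mean(sim), _mean(obs)
    dev_sim = math.sqrt(sum((s - mean_sim) ** 2 for s in sim))
    dev_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in obs))
    cov = sum((s - mean_sim) * (o - mean_obs) for s, o in zip(sim, obs))
    r = cov / (dev_sim * dev_obs)   # linear correlation
    alpha = dev_sim / dev_obs       # ratio of standard deviations
    beta = mean_sim / mean_obs      # ratio of means
    return 1.0 - math.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)

# Toy series: the simulation doubles every observed flow.
obs = [1.0, 2.0, 3.0, 4.0]
sim = [2.0, 4.0, 6.0, 8.0]
print(rmse(sim, obs), nse(sim, obs), kge(sim, obs))
```

Unlike RMSE, NSE and KGE are dimensionless, so they can be compared across basins and flow regimes, which is the reviewer's point about standardized metrics.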
Citation: https://doi.org/10.5194/hess-2024-212-RC2 -
AC2: 'Reply on RC2', Ningpeng Dong, 10 Oct 2024
The comment was uploaded in the form of a supplement: https://hess.copernicus.org/preprints/hess-2024-212/hess-2024-212-AC2-supplement.pdf