Statistical post-processing of precipitation forecasts using circulation classifications and spatiotemporal deep neural networks
- 1College of Hydrology and Water Resources, Hohai University, Nanjing 210098, China
- 2CMA-HHU Joint Laboratory for HydroMeteorological Studies, Nanjing, Jiangsu
Abstract. Statistical post-processing techniques are widely used to reduce systematic biases and quantify forecast uncertainty in numerical weather prediction (NWP). In this study, we propose a method to correct raw daily precipitation forecasts by combining large-scale circulation patterns with local spatiotemporal information such as topography and meteorological factors. Specifically, we first use a self-organizing map (SOM) to classify large-scale circulation patterns for each season, then build a convolutional neural network (CNN) to extract spatial information (e.g., elevation, specific humidity, and mean sea level pressure) and a long short-term memory (LSTM) network to extract temporal information (e.g., t, t-1, t-2), and finally correct local precipitation for each circulation pattern separately. The proposed method (SOM-CNN-LSTM) is compared with benchmark methods (i.e., CNN, LSTM, and CNN-LSTM) in the Huaihe River basin for lead times of up to 15 days from 2007 to 2021. The results show that the proposed SOM-CNN-LSTM post-processing method outperforms the benchmark methods for all lead times and each season, with the largest correlation coefficient improvement (32.30 %) and root mean square error reduction (26.58 %). Moreover, the proposed method can effectively capture the westward and northward movement of the western Pacific subtropical high (WPSH), which affects the basin's summer rainfall. The results illustrate that combining large-scale circulation patterns with local spatiotemporal information is a feasible and effective post-processing approach for improving forecast skill, which would benefit hydrological forecasts and other applications.
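The circulation-classification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`train_som`, `best_matching_unit`, `group_days_by_pattern`), the learning-rate and neighbourhood schedules, and the toy input are all assumptions made for illustration. Each sample stands for one day's flattened large-scale circulation field (which variable is used is also an assumption), and the 2×3 node grid follows the configuration mentioned in the referee comments. Days that share a best-matching node form one circulation pattern, for which a separate correction model could then be trained.

```python
import math
import random


def best_matching_unit(nodes, x):
    """Return the grid position of the node whose weights are closest to x."""
    return min(nodes, key=lambda pos: sum((a - b) ** 2 for a, b in zip(nodes[pos], x)))


def train_som(samples, rows=2, cols=3, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small self-organizing map (hypothetical settings, linear decay)."""
    rng = random.Random(seed)
    # Initialise every node with a copy of a randomly chosen sample.
    nodes = {(r, c): list(rng.choice(samples)) for r in range(rows) for c in range(cols)}
    n_steps = epochs * len(samples)
    step = 0
    for _ in range(epochs):
        order = list(samples)
        rng.shuffle(order)
        for x in order:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                  # decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.1)  # decaying neighbourhood radius
            bmu = best_matching_unit(nodes, x)
            for pos, w in nodes.items():
                grid_d2 = (pos[0] - bmu[0]) ** 2 + (pos[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                # Pull the node towards the sample, weighted by grid distance to the BMU.
                for i in range(len(w)):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return nodes


def group_days_by_pattern(nodes, samples):
    """Map each node position to the indices of the days assigned to it."""
    groups = {}
    for day, x in enumerate(samples):
        groups.setdefault(best_matching_unit(nodes, x), []).append(day)
    return groups


if __name__ == "__main__":
    # Toy stand-in for flattened daily circulation fields (not real data).
    rng = random.Random(1)
    fields = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(60)]
    som = train_som(fields)
    for pattern, days in sorted(group_days_by_pattern(som, fields).items()):
        print(pattern, len(days))
```

In a setting like the one described, `group_days_by_pattern` would be applied per season, and the resulting day indices would select the training samples for each pattern-specific CNN-LSTM model.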
Tuantuan Zhang et al.
Status: open (until 27 Feb 2023)
RC1: 'Comment on hess-2022-432', Anonymous Referee #1, 17 Jan 2023
Overall comment:
This manuscript proposed the SOM-CNN-LSTM post-processing method to correct the raw daily forecast precipitation by combining large-scale circulation patterns with local spatiotemporal information. The proposed method showed better performance than other benchmark methods (i.e., CNN, LSTM, CNN-LSTM). The paper is very interesting, well written and well structured. We highly recommend the paper for publication with moderate revision.
Major comments:
- I think it would be good for the readers if the authors could briefly add the meaning of four-fold cross-validation in this study.
- As for predictors, why didn’t the authors consider using reanalysis data as predictors to establish the post-processing model? Based on my experience, reanalysis data (e.g., ERA5) are more accurate than forecast data. Meanwhile, in addition to the predictors mentioned in the paper, vertical velocity, which affects precipitation, is also worth considering.
- Is the circulation pattern the same in each lead time? This point is not clear.
Minor issues:
- Line 102. “Study area and datasets” should be “Methodology”.
- Line 123. “” should be “”.
- Line 188-189. “southeast” and “southeastern” should be consistent.
- Line 232, “each season” is more appropriate than “every season” here.
- Line 305, “we compare the method” should be “We compare the method”
RC2: 'Comment on hess-2022-432', Anonymous Referee #2, 28 Jan 2023
The authors introduced a new statistical post-processing method that incorporates large-scale circulation patterns with local spatiotemporal information, which is valuable for Hydrology and Earth System Sciences. However, some questions remain, and the manuscript needs revision before publication.
(1) Section 3 study area and datasets: The title is the same as section 2. Check the title carefully.
(2) Section 3.1 SOM model: Equation (1) may be incorrect, please check all equations to make sure all of them are correct.
(3) Section 3.1 SOM model: How was the larger domain (95–135°E, 12–53°N) for circulation classification determined? What is the impact of the watershed in China on circulation classification?
(4) Section 3.2 CNN-LSTM model: How is spatial information considered in the CNN model? It is not clear.
(5) Section 3.2 CNN-LSTM model: In data preparation, the authors took summer precipitation as an example for explanation, so it might be better to add “Take summer precipitation as an example” before the sentence “First, each predictor is normalized…”
(6) Section 3.2 CNN-LSTM model: The authors selected 14 predictors as the input of the CNN-LSTM model, and these are shown in Figure 2, but it may be better to add a table listing the 14 predictors with corresponding descriptions.
(7) Section 5 discussion: The following references may be helpful to discuss the violent rain.
Chen G, Wang W C. Short‐Term Precipitation Prediction for Contiguous United States Using Deep Learning[J]. Geophysical Research Letters, 2022, 49(8): e2022GL097904.
Li J, Sharma A, Evans J, et al. Addressing the mischaracterization of extreme rainfall in regional climate model simulations–A synoptic pattern based bias correction approach[J]. Journal of Hydrology, 2018, 556: 901-912.
(8) L218: “Once the four post-processing” should be “once the four post-processing”. Please check the manuscript to avoid similar errors.
(9) L307 & L316: “SHAP” should be “WPSH”.
(10) L320: Is “can” more accurate than “could”?
RC3: 'Comment on hess-2022-432', Anonymous Referee #3, 29 Jan 2023
Review on “Statistical post-processing of precipitation forecasts using circulation classifications and spatiotemporal deep neural networks”
In this manuscript, the authors have proposed a statistical post-processing method that can simultaneously take into account the effects of large-scale circulation patterns and local spatiotemporal information to calibrate the ECMWF forecast dataset for the Huaihe River basin. The study is well developed and the expected results have been achieved. The new model proposed by the authors has the best calibration capability for different seasons, lead times and precipitation intensities. Overall, the study is innovative and has a high degree of completion which deserves to be published, but some issues still need to be corrected or further clarified.
Major comments:
- In the construction of the SOM-CNN-LSTM post-processing methodology, the SOM model was used to identify and classify different large-scale circulation patterns. In selecting the SOM nodes, the authors state that their tests show the 2×3 configuration to be physically interpretable. It should be explained what a node refers to in the SOM model and what its role is. Also explain why the 2×3 configuration is interpretable.
- The authors used three statistical metrics in their study to evaluate the prediction skill and the ability of the correction, but only one of them was used to evaluate and present the results in each of the relevant experiments shown in Figures 7 to 9. Consideration could be given to including the results of all evaluation metrics from the relevant experiments in the supporting material to demonstrate the features and advantages of the SOM-CNN-LSTM method more fully.
- The study focuses on the Huaihe River basin in China. The application and development of similar research in the region should be described in the manuscript to further highlight the main purpose and innovation of this study.
Minor comments:
- L82,305 ‘we’ => ‘We’.
- L95 ‘contains’ => ‘contain’.
- L102 The title of section 3 is wrong.
- L123 The formula is incomplete.
- Change the use of color table in Figure 8. The authors use only one color table in Figure 8 to represent two types of data, correlations and changes in correlations, which can be confusing. Also, this color table is more appropriate to represent the variation between positive and negative values, which is not the case for the two variables in this figure.
- In Figure 9, the conclusion the authors most wanted to convey appears to be the difference between the precipitation predictions for different years, but they also point out that the SOM-CNN-LSTM method performs the best. However, the color table used and the type of Figure 9 make the latter conclusion very unclear, at least compared to the other figures in the paper. Also, the correspondence between color table and value is not fixed. Therefore, the authors should consider a more appropriate way of presenting the relevant conclusions.
- L307 Is 'SHAP' used incorrectly here? If not, the abbreviation needs to be clarified.