GDHPM: A Geostatistical Disaggregation approach for generating hourly Precipitation in Mountainous regions preserving complex temporal patterns
Abstract. Accurate precipitation estimation at high temporal resolution is crucial for monitoring and predicting natural hazards in mountainous regions. While rain gauges are a reliable source of precipitation data, they rarely provide continuous fine-resolution records at the locations of interest, such as avalanche and landslide sites. In this context, temporal disaggregation approaches can be used to obtain continuous hourly precipitation time series that address the issues observed in mountainous regions, namely (i) filling gaps in the data, (ii) capturing fine-resolution statistical properties from the nearest available station record, and (iii) extending historical records for better hindcasting. Multiple Point Geostatistics (MPS) approaches are known to mimic spatial patterns from observed physical reality. This study introduces GDHPM, a temporal disaggregation approach that investigates the ability of MPS to search for and generate complex temporal patterns. The objective is to simulate hourly precipitation time series from observed daily precipitation at multiple avalanche sites. Moreover, combinations of auxiliary time series from different locations and in varying numbers are tested as covariates. The results reveal that GDHPM produces hourly precipitation ensembles with realistic temporal patterns over a complex and extensive mountain terrain, which can improve avalanche and landslide forecasting.
Status: final response (author comments only)
RC1: 'Comment on hess-2024-155', Anonymous Referee #1, 12 Jul 2024
The manuscript proposes a temporal disaggregation approach to downscale precipitation time series from daily to hourly resolution. The approach is applied to 20 sites in the Indian Himalayan Region, and the study compares the downscaling performance of different covariate selection and averaging techniques. While the choice of the mountainous region is interesting, the innovative contribution of this study is limited. The proposed method, although integrated with several covariate selection techniques, is essentially an analog method that samples historical data to generate future predictions. Previous studies of similar techniques or concepts can be found in Pierce et al., 2014 and Gutmann et al., 2022, albeit with different application scenarios. Therefore, I am not sure if the current manuscript’s novelty and contribution are sufficient to be published in this journal.
Comments:
- One potential improvement is to analyze the influence of elevation on downscaling performance. As the study site is in a mountainous region, you may discuss how much orography (elevation) influences the analysis. It would be more interesting to see analysis and discussion related to orographic precipitation, beyond comparison between different statistical techniques.
- Another analysis that can improve this study is to explore using covariates other than precipitation. Numerical weather models may have lower biases in simulating other atmospheric variables (e.g., water vapor, vertical wind shear, updraft velocity) compared to precipitation. Investigating the performance of using other covariates can provide insights into covariate-based precipitation simulation in mountainous regions.
- The methodology section needs to be revised to clarify the procedure of the proposed approach.
- Line 110: This overview paragraph needs to be revised. It contains many terms such as “training data”, “conditioning data”, “simulation grid”, “auxiliary data”, etc. Readers can get confused since they haven’t read the following content. I suggest simplifying this overview or beginning directly with the description of your methodology instead.
- Line 117-118: “Simulation grid” feels like you are describing a spatial feature, but essentially this refers to a time series, which is confusing. Terms such as “auxiliary variable” and “informed time step” need to be clarified or replaced by more common names.
- Line 120 “Target variable simulation is carried out in non-consecutive time steps along a random path”: What does that mean? Do you mean randomly picking a time step? Please simplify the statement.
- Line 125: Why do you need to use both a radius R and the N nearest time steps? You can directly select the N nearest time steps without specifying a radius.
- Line 139 “for all informed variable”: does “informed variable” here mean precipitation? This is confusing.
- Line 139 “Otherwise, the procedure is repeated from step 1 to 4 until a suitable…”: This is confusing, please clarify.
- Line 160-165: Avoid mixing the description of the approach and the data processing procedure. It is hard to follow.
- Line 167-169: Simplify the data processing statement here.
- Overall, I think the methodology section is overcomplicated. It is basically an analog method or a sampling method based on historical data. Please simplify it.
- Line 156: “Training data” and “Conditioning data”: Consider using more common terms such as training/testing data.
- Line 173: “Covariates” here are daily precipitation from one or more nearby sites. But the daily precipitation input in the training data and conditioning data (which is called auxiliary variable I think) is also a covariate, right?
- Figure 1a: The bottom panel, Figure 1c, should be placed after Figure 2 to follow the description order of your methodology section.
- Figure 2: This figure needs to be revised. The Simulation Grid appears twice, which seems redundant. There is a lot of blank space in this figure that does not provide useful information. Consider using conceptual time series instead of real rainfall data for better visualization. Also, highlight that it is Xt1 that is being processed.
- Figure 4: How are the 50 ensembles generated? Please clarify it in the methodology.
- Figure 12: Plotting the histogram alongside the points might be helpful in showing the distributions.
References:
Pierce, D. W., Cayan, D. R., and Thrasher, B. L.: Statistical Downscaling Using Localized Constructed Analogs (LOCA), J. Hydrometeorol., 15, 2558–2585, https://doi.org/10.1175/JHM-D-14-0082.1, 2014.
Gutmann, E. D., Hamman, J. J., Clark, M. P., Eidhammer, T., Wood, A. W., and Arnold, J. R.: En-GARD: A Statistical Downscaling Framework to Produce and Test Large Ensembles of Climate Projections, J. Hydrometeorol., 23, 1545–1561, https://doi.org/10.1175/JHM-D-21-0142.1, 2022.
Citation: https://doi.org/10.5194/hess-2024-155-RC1
AC1: 'Reply on RC1', Sanjeev Kumar Jha, 21 Sep 2024
Response to the comments of Reviewer #1
Overview:
The manuscript proposes a temporal disaggregation approach to downscale precipitation time series from daily to hourly resolution. The approach is applied to 20 sites in the Indian Himalayan Region, and the study compares the downscaling performance of different covariate selection and averaging techniques. While the choice of the mountainous region is interesting, the innovative contribution of this study is limited. The proposed method, although integrated with several covariate selection techniques, is essentially an analog method that samples historical data to generate future predictions. Previous studies of similar techniques or concepts can be found in Pierce et al., 2014 and Gutmann et al., 2022, albeit with different application scenarios. Therefore, I am not sure if the current manuscript’s novelty and contribution are sufficient to be published in this journal.
Response: Thank you for the comment. We agree with the Reviewer that the techniques from the mentioned studies (i.e., Pierce et al., 2014 and Gutmann et al., 2022) share some mathematical principles with our approach, as they fall under the same umbrella of data resampling. Nevertheless, the above literature describes spatial downscaling, while we present a temporal disaggregation technique. Moreover, both studies mentioned by the Reviewer used datasets with very low spatial resolution, focusing mainly on generating gridded forecasts with bias correction. This type of approach is unreliable in mountainous terrain with steep slopes and complex topography. For these reasons, we developed, for the first time to our knowledge, a resampling method adapted to mountainous regions, focused on single sites that are critical for disaster impact. One main novelty of the proposed approach is the choice and testing of a specific set of covariates, including multi-site data. Moreover, the DS parameterization allows data patterns of variable configuration, with a high degree of flexibility that changes during the disaggregation process. This makes it possible to capture complex temporal structures and nonlinear relationships more effectively than traditional resampling methods.
In the revised manuscript, we will add a paragraph in the introduction stressing the novelty of the DS algorithm and of the application proposed here. We will further highlight the novelty of this study.
General Comments:
Comment 1. One potential improvement is to analyze the influence of elevation on downscaling performance. As the study site is in a mountainous region, you may discuss how much orography (elevation) influences the analysis. It would be more interesting to see analysis and discussion related to orographic precipitation, beyond comparison between different statistical techniques.
Response: We thank the Reviewer for pointing this out. We agree that adding an analysis based on orography may provide better insight into the results. We will compare the results from a selected experimental run by adding an elevation vs. RMSE plot for each site. If orography has a significant effect on the results, we will add the plot to the revised manuscript, together with a section in the results explaining the elevation vs. RMSE plot. The related discussion and conclusions will be added to the discussion and conclusion sections.
Comment 2. Another analysis that can improve this study is to explore using covariates other than precipitation. Numerical weather models may have lower biases in simulating other atmospheric variables (e.g., water vapor, vertical wind shear, updraft velocity) compared to precipitation. Investigating the performance of using other covariates can provide insights into covariate-based precipitation simulation in mountainous regions.
Response: Thank you for the comment. The study is based on using the DS algorithm in one dimension, which simulates precipitation time steps. We agree with the Reviewer that adding 3D atmospheric variables such as water vapour, vertical wind shear, and updraft velocity could provide information about vertical atmospheric processes occurring at different elevations in the mountains. However, increasing the dimensionality of the model would lead to a more complex set-up, which can produce worse results (the data pattern to search becomes too complex) and lower applicability (more input data required). Nevertheless, we are testing our approach with one atmospheric variable as an added covariate. The selection of the atmospheric variable will be based on its correlation with precipitation from the HAR-v2 dataset, which provides atmospheric variables such as water vapour mixing ratio, vertical water vapour flux, pressure, horizontal wind speed on mass grid points, and cloud fraction. This new simulation run will be compared with the one based on precipitation only. A more extensive study with climate covariates at multiple sites will be the object of our future research.
Response to comments on methodology:
The methodology section needs to be revised to clarify the procedure of the proposed approach.
Comment 1. Line 110: This overview paragraph needs to be revised. It contains many terms such as “training data”, “conditioning data”, “simulation grid”, “auxiliary data”, etc. Readers can get confused since they haven’t read the following content. I suggest simplifying this overview or beginning directly with the description of your methodology instead.
Response: We will simplify the overview and define the terms regarding the algorithm in the methodology section in lines 109 to 115.
Comment 2. Line 117-118: “Simulation grid” feels like you are describing a spatial feature, but essentially this refers to a time series, which is confusing. Terms such as “auxiliary variable” and “informed time step” need to be clarified or replaced by more common names.
Response: We agree with the Reviewer that a simulation grid can be confused with a spatial feature; these terms were chosen in part to keep continuity with previous literature. We will replace the term ‘simulation grid’ with ‘simulated time series’ and ‘time steps’, and the phrase ‘auxiliary variable’ with ‘covariates’. Regarding ‘informed time steps’: in the Direct Sampling algorithm, the training data contains both coarse- and fine-scale temporal information for a particular period, and the time steps in that period are called ‘informed time steps’. The conditioning data, in contrast, contains only coarse-scale information, so the time steps in the conditioning period are called ‘uninformed time steps’. We will clarify this in the revised manuscript at lines 117-118.
Comment 3. Line 120 “Target variable simulation is carried out in non-consecutive time steps along a random path”: What does that mean? Do you mean randomly picking a time step? Please simplify the statement.
Response: We will clarify this principle in the revised manuscript. In the DS algorithm, the time steps are simulated in random order and sparsely at first, so that the time axis is covered progressively and uniformly. As a consequence, the approach first generates patterns of sparse data distant in time; the simulated time steps then become progressively denser, and local temporal patterns are completed.
Comment 4. Line 125: Why do you need to use both a radius R and the N nearest time steps? You can directly select the N nearest time steps without specifying a radius.
Response: Following Oriani et al. (2016), and for the principle explained in the previous answer, the DS algorithm first considers sparse data patterns, then progressively denser local patterns in time. In this mechanism, N controls the maximum number of data points used to define the patterns, chosen as the closest data available in the simulated time series, while R limits the distance in time within which data are considered, i.e., data beyond the radius R are too far in time to be considered.
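To illustrate the interplay of R and N, here is a minimal sketch (our own illustrative names, not the authors' code) of the neighbour selection in one dimension:

```python
import numpy as np

def neighbours(t, informed, R, N):
    """Select up to N informed time steps within radius R of step t.

    Early in a random-path simulation the informed steps are sparse, so
    the selected neighbours are far apart (large-scale patterns); later
    they become dense and local (small-scale patterns)."""
    informed = np.asarray(informed)
    near = informed[np.abs(informed - t) <= R]   # R caps the distance in time
    near = near[np.argsort(np.abs(near - t))]    # closest steps first
    return near[:N]                              # N caps the pattern size

# toy illustration (hypothetical values)
print(neighbours(t=500, informed=[5, 120, 480, 530, 900], R=100, N=10))
# -> [480 530]: steps beyond R=100 are excluded, even though N=10 allows more
```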
Comment 5. Line 139 “for all informed variable”: does “informed variable” here mean precipitation? This is confusing.
Response: In line 139, ‘for all uninformed variables’ refers to the variable to be simulated; in the conditioning data, only the fine-scale precipitation is uninformed. The word ‘variables’ creates the impression that multiple variables are used in the simulation. Hence, we will replace ‘for all uninformed variables’ in line 139 with ‘for all uninformed time steps’.
Comment 6. Line 139 “Otherwise, the procedure is repeated from step 1 to 4 until a suitable…”: This is confusing, please clarify.
Response: This line means that, when searching the training data for an analog to time step ti, if no pattern with a mismatch below the acceptance threshold t is found, the search is repeated until the mismatch falls below the given threshold. This process can be understood following Figure 2 of Dembélé et al. (2019).
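For illustration, a minimal sketch of this accept-or-continue search, assuming the usual DS rule of accepting the first candidate whose mismatch is below the threshold and otherwise keeping the best candidate scanned (all names are ours):

```python
import numpy as np

def find_analog(mismatch, candidates, threshold, rng):
    """Scan training-data positions in random order; accept the first one
    whose mismatch is below the threshold t, otherwise fall back to the
    best candidate seen during the scan."""
    best_pos, best_d = None, np.inf
    for pos in rng.permutation(candidates):
        d = mismatch(pos)                    # distance between data events
        if d < best_d:
            best_pos, best_d = pos, d
        if d <= threshold:                   # suitable analog found: stop
            break
    return best_pos

# toy usage with a hypothetical mismatch function
rng = np.random.default_rng(0)
series = rng.random(1000)
mismatch = lambda pos: abs(series[pos] - 0.5)    # match the target value 0.5
print(find_analog(mismatch, np.arange(len(series)), threshold=0.01, rng=rng))
```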
Comment 7. Line 160-165: Avoid mixing the description of the approach and the data processing procedure. It is hard to follow.
Response: We agree with the reviewer. The text can be modified as below:
“The DS algorithm requires the input datasets in the form of training data (TD) and conditioning data (CD). The TD is used to train the model using the DS algorithm, and the CD is used as a reference to generate the simulations. The time period of the data is split into TD and CD such that the CD contains only the years to be simulated (target years). To simplify, suppose we have n years of data and nt target years; then the TD contains (n-nt) years of data. The DS algorithm can choose any year in the CD as the target year, which means a simulation can be generated for any year in the available time period. The TD contains both coarse- and fine-resolution data, while the CD contains only coarse-resolution data. The fine- and coarse-resolution data, i.e., P_hourly and P_daily respectively, are provided to the algorithm coherently by replicating P_daily 24 times in the TD and CD. Repeating the same value 24 times in P_daily gives the algorithm a flat time series, leading to uncertainty in the simulation of hourly precipitation. Hence, we applied a moving mean with 3-hour and 6-hour windows to the replicated P_daily values, sliding across neighbouring daily values to smooth the time series.
Furthermore, the TD and CD also include covariates to assist the algorithm in a broader sampling of patterns from the TD. Hence, we provided precipitation from the nearest available sites to the algorithm as covariates. We used single and multiple covariates in different experiments, taking into account their correlation with the site to be simulated.”
The current explanation of the arrangement of training and conditioning data (lines 157 to 166) can be moved to Section 3.4 (experimental design), after line 216.
Comment 8. Line 167-169: Simplify the data processing statement here.
Response: The lines can be simplified as below in the revised manuscript.
“The fine- and coarse-resolution data, i.e., P_hourly and P_daily respectively, are provided to the algorithm coherently by replicating P_daily 24 times in the TD and CD. Repeating the same value 24 times in P_daily gives the algorithm a flat time series, leading to uncertainty in the simulation of hourly precipitation. Hence, we applied a moving mean with 3-hour and 6-hour windows to the replicated P_daily values, sliding across neighbouring daily values to smooth the time series.”
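For illustration, a minimal sketch of this replication-and-smoothing step, assuming the moving mean runs over the replicated hourly series with a centred 3-hour or 6-hour window (function and variable names are ours):

```python
import numpy as np

def replicate_and_smooth(p_daily, window_hours=3):
    """Replicate daily precipitation to hourly steps, then apply a centred
    moving mean so the covariate is not a flat 24-h staircase."""
    hourly = np.repeat(np.asarray(p_daily, dtype=float), 24)
    kernel = np.ones(window_hours) / window_hours
    return np.convolve(hourly, kernel, mode="same")   # 3-h or 6-h window

# toy usage (hypothetical daily totals): compare 3-h and 6-h smoothing
p_daily = [0.0, 4.2, 12.5, 1.1, 0.0]
p3 = replicate_and_smooth(p_daily, window_hours=3)
p6 = replicate_and_smooth(p_daily, window_hours=6)
print(p3[22:26], p6[22:26])   # transitions between days are now smoothed
```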
Comment 9. Overall, I think the methodology section is overcomplicated. It is basically an analog method or a sampling method based on historical data. Please simplify it.
Response: With Direct Sampling (DS), the resampling workflow is more complex than a simple analog method, owing to two added factors: (i) the use of a random simulation path, and (ii) a flexible conditioning scheme based on the N closest neighbours of each simulated time step. These two factors allow the model to account for large-scale patterns early in the simulation and to gradually focus on more detailed, small-scale patterns as the simulation progresses. For instance, with R = 100 and N = 10, the model uses 10 or fewer widely spaced time steps for conditioning at the start, but as the simulation advances, it uses 10 much closer time steps. This creates a varying time dependence, capturing statistical patterns across multiple scales without the need for a complex statistical model. Additionally, when multiple variables are simulated together, their relationships are preserved, ensuring realistic results.
This explanation will be added as a paragraph in the methodology section to distinguish the approach from other analog approaches.
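To make these two factors concrete, here is a minimal one-dimensional sketch combining the random path, the R/N neighbour selection, and the threshold-based scan; it is an illustrative reconstruction under our assumptions (a single daily covariate, absolute-difference mismatch), not the authors' implementation:

```python
import numpy as np

def ds_disaggregate(td_hourly, td_daily, cd_daily, R=100, N=10,
                    threshold=0.05, seed=0):
    """Minimal 1-D Direct Sampling sketch.

    td_hourly, td_daily : training data (hourly target; daily covariate
                          replicated to hourly steps).
    cd_daily            : conditioning data (daily covariate only).
    Returns one stochastic hourly realization for the conditioning period."""
    rng = np.random.default_rng(seed)
    n = len(cd_daily)
    sim = np.full(n, np.nan)                  # target: all steps uninformed
    for t in rng.permutation(n):              # random, non-consecutive path
        informed = np.flatnonzero(~np.isnan(sim))
        near = informed[np.abs(informed - t) <= R]          # radius R
        near = near[np.argsort(np.abs(near - t))][:N]       # N closest
        lags = near - t                       # sparse early, dense later
        best_s, best_d = None, np.inf
        for s in rng.permutation(np.arange(R, len(td_hourly) - R)):
            d = abs(td_daily[s] - cd_daily[t])          # covariate mismatch
            if lags.size:                               # + pattern mismatch
                d += np.mean(np.abs(td_hourly[s + lags] - sim[near]))
            if d < best_d:
                best_s, best_d = s, d
            if d <= threshold:                # suitable analog found: stop
                break
        sim[t] = td_hourly[best_s]            # paste the analog's hourly value
    return sim

# toy usage with synthetic data (hypothetical values)
rng = np.random.default_rng(1)
td_hourly = rng.gamma(0.3, 2.0, 24 * 60)                  # 60 training days
td_daily = np.repeat(td_hourly.reshape(-1, 24).mean(axis=1), 24)
cd_daily = td_daily[: 24 * 5]                             # disaggregate 5 days
realization = ds_disaggregate(td_hourly, td_daily, cd_daily)
```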
General comments:
Comment 3. Line 156: “Training data” and “Conditioning data”: Consider using more common terms such as training/testing data.
Response: Thank you for the suggestion; the terms will be introduced more clearly in the method section. In the context of the Direct Sampling multiple-point geostatistics (MPS) technique, the terms "training data" and "conditioning data" have specific roles that differ slightly from the traditional training/testing terminology. "Training" is not meant here in the classical sense of model fitting, but rather denotes a dataset that guides the simulation: in DS, the training data is a set of historical data used to find patterns and similarities for generating new data points. The conditioning data is the set of variables or previously simulated time steps on which the simulation is based; the "conditioning" happens when the algorithm searches the historical data for patterns similar to these. We prefer this terminology to keep continuity with previous MPS studies (Mariethoz et al., 2010; Meerschman et al., 2013; Oriani et al., 2014; Dembélé et al., 2019).
Comment 4. Line 173: “Covariates” here are daily precipitation from one or more nearby sites. But the daily precipitation input in the training data and conditioning data (which is called auxiliary variable, I think) is also a covariate, right?
Response: Yes, in this study we have used the term ‘auxiliary variable’ as a synonym for ‘covariate’. In previous MPS-related studies, such as Oriani et al. (2016) and Singhal et al. (2023), different auxiliary variables are used, each assigned a multivariate weight. In this study, we used only precipitation from other sites as covariates, giving the same weight to all covariates in the simulation. Hence, agreeing with the Reviewer, we will use the name ‘covariate’ throughout the manuscript to avoid confusion.
Comment 5. Figure 1a: The bottom panel, Figure 1c, should be placed after Figure 2 to follow the description order of your methodology section.
Response: We agree with the Reviewer, and we will move Figure 1(c) after Figure 2.
Comment 6. Figure 2: This figure needs to be revised. The Simulation Grid appears twice, which seems redundant. There is a lot of blank space in this figure that does not provide useful information. Consider using conceptual time series instead of real rainfall data for better visualization. Also, highlight that it is Xt1 that is being processed.
Response: Thank you for the suggestion. We will modify the figure so that the simulation grid will not appear twice. Also, we will replace the real rainfall time series with the conceptual time series for better visualization.
Comment 7. Figure 4: How are the 50 ensembles generated? Please clarify it in the methodology.
Response: Thank you for the comment. The 50 ensembles are generated stochastically by simulating the time steps in random order and by randomly sampling the training dataset. This ensures that each realization is slightly different from the others, capturing the variability and uncertainty inherent in precipitation. In the revised manuscript, this information will be added in section 3.1 as point 7 in the methodology description.
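For illustration, under the same assumptions as the hypothetical ds_disaggregate sketch given in the methodology responses above, an ensemble is obtained simply by re-running the stochastic simulation with different seeds; each seed produces a new random path and a new random scan of the training data:

```python
import numpy as np

# assumes ds_disaggregate, td_hourly, td_daily and cd_daily from the
# illustrative sketch above; 50 seeds -> 50 distinct realizations
ensemble = np.stack([ds_disaggregate(td_hourly, td_daily, cd_daily, seed=k)
                     for k in range(50)])
spread = ensemble.std(axis=0)    # member-to-member spread reflects the
print(spread.mean())             # uncertainty in the hourly precipitation
```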
Comment 8. Figure 12: Plotting the histogram alongside the points might be helpful in showing the distributions.
Response: We would like to clarify that Figure 12 plots the rainfall values greater than the 95th percentile from three experimental runs in comparison to the reference data from HAR. Plotting histograms of values above the 95th percentile will not show any relevant distribution. We believe the Reviewer is suggesting adding box plots, where the quartiles and median will be visible. We will prepare both box plots and histograms and include whichever provides better information.
References:
Pierce, D. W., Cayan, D. R., and Thrasher, B. L.: Statistical Downscaling Using Localized Constructed Analogs (LOCA), J. Hydrometeorol., 15, 2558–2585, https://doi.org/10.1175/JHM-D-14-0082.1, 2014.
Gutmann, E. D., Hamman, J. J., Clark, M. P., Eidhammer, T., Wood, A. W., and Arnold, J. R.: En-GARD: A Statistical Downscaling Framework to Produce and Test Large Ensembles of Climate Projections, J. Hydrometeorol., 23, 1545–1561, https://doi.org/10.1175/JHM-D-21-0142.1, 2022.
Dembélé, M., Oriani, F., Tumbulto, J., Mariéthoz, G., and Schaefli, B.: Gap-filling of daily streamflow time series using Direct Sampling in various hydroclimatic settings, J. Hydrol., 569, 573–586, https://doi.org/10.1016/j.jhydrol.2018.11.076, 2019.
Mariethoz, G., Renard, P., and Straubhaar, J.: The direct sampling method to perform multiple-point geostatistical simulations, Water Resour. Res., 46, 1–14, https://doi.org/10.1029/2008WR007621, 2010.
Meerschman, E., Pirot, G., Mariethoz, G., Straubhaar, J., Van Meirvenne, M., and Renard, P.: A practical guide to performing multiple-point statistical simulations with the Direct Sampling algorithm, Comput. Geosci., 52, 307–324, https://doi.org/10.1016/j.cageo.2012.09.019, 2013.
Oriani, F., Straubhaar, J., Renard, P., and Mariethoz, G.: Simulation of rainfall time series from different climatic regions using the direct sampling technique, Hydrol. Earth Syst. Sci., 18, 3015–3031, https://doi.org/10.5194/hess-18-3015-2014, 2014.
Oriani, F., Borghi, A., Straubhaar, J., Mariethoz, G., and Renard, P.: Missing data simulation inside flow rate time-series using multiple-point statistics, Environ. Model. Softw., 86, 264–276, https://doi.org/10.1016/j.envsoft.2016.10.002, 2016.
Singhal, A., Cheriyamparambil, A., Samal, N., and Jha, S. K.: Relating forecast and satellite precipitation to generate future skillful ensemble forecasts over the northwest Himalayas at major avalanche and glacier sites, J. Hydrol., 616, 128795, https://doi.org/10.1016/j.jhydrol.2022.128795, 2023.
Citation: https://doi.org/10.5194/hess-2024-155-AC1
RC2: 'Comment on hess-2024-155', Anonymous Referee #2, 14 Aug 2024
The authors claim to present a novel approach named GDHPM, which utilizes a geostatistical disaggregation method to generate hourly precipitation data in mountainous regions. The approach leverages Multiple Point Statistics (MPS) and the Direct Sampling (DS) algorithm to simulate fine temporal resolution precipitation data from coarse daily data. The study focuses on the Indian Himalayan Region, aiming to improve avalanche and landslide forecasting by providing more accurate high-resolution precipitation data. The authors conducted several experiments with different configurations and covariates to test the effectiveness of their approach. However, several major points need to be clarified.
Major Comments:
The application of MPS-based DS for temporal disaggregation significantly contributes to the field, particularly in the context of hydrological modeling in complex terrains like the Himalayas. The approach addresses a critical need for high-resolution precipitation data, which is crucial for disaster risk management in mountainous regions. However, a high-resolution precipitation dataset already exists in the study area, i.e., the High Asia Refined analysis version 2 (HAR-v2) data at daily and hourly time scales. The authors seem to link the relationship between HAR-v2 and station observations; however, the observational precipitation dataset is not presented. Further, the authors just correct the biases of HAR-v2, which falls far short of the authors’ stated ambitions.
The authors have provided a detailed account of the DS algorithm and its application in this context. The use of various covariate selection techniques (nearest neighbor, Pearson correlation coefficient, complex networks) adds robustness to the study. However, the manuscript would benefit from a more detailed explanation of how the fixed and varied parameters were selected, and of the sensitivity of the model results to these parameters. This information would help in understanding the generalizability and potential limitations of the proposed approach. Moreover, how the covariate data are obtained is not presented in the current manuscript.
The results demonstrate the effectiveness of the GDHPM approach in generating hourly precipitation data that closely matches observed values. The use of multiple error metrics (RMSE, MAE, BIAS%) provides a clear evaluation of model performance. However, the comparative results between the derived new dataset and the HAR-v2 are not clear.
The manuscript could be improved by providing a clearer discussion on the physical significance of the differences observed between the different experimental setups. For instance, why did the 3-hour moving average generally perform better than the 6-hour moving average? Additionally, the authors should discuss the implications of the overestimations and false simulations observed in some experiments.
The discussion section provides a good summary of the results, but it could be strengthened by comparing the GDHPM approach with other existing methods in the literature. How does this method perform relative to traditional disaggregation techniques? Are there specific scenarios where GDHPM is particularly advantageous or, conversely, less effective?
The conclusions are sound and highlight the potential applications of the GDHPM approach. However, the manuscript would benefit from a more explicit discussion of the limitations and possible future improvements, particularly in the context of rare extreme events where the model showed some difficulties.
Citation: https://doi.org/10.5194/hess-2024-155-RC2
AC2: 'Reply on RC2', Sanjeev Kumar Jha, 21 Sep 2024
Response to the Comments from Reviewer #2
The authors claim to present a novel approach named GDHPM, which utilizes a geostatistical disaggregation method to generate hourly precipitation data in mountainous regions. The approach leverages Multiple Point Statistics (MPS) and the Direct Sampling (DS) algorithm to simulate fine temporal resolution precipitation data from coarse daily data. The study focuses on the Indian Himalayan Region, aiming to improve avalanche and landslide forecasting by providing more accurate high-resolution precipitation data. The authors conducted several experiments with different configurations and covariates to test the effectiveness of their approach. However, several major points need to be clarified.
Major Comments:
Comment 1. The application of MPS-based DS for temporal disaggregation significantly contributes to the field, particularly in the context of hydrological modeling in complex terrains like the Himalayas. The approach addresses a critical need for high-resolution precipitation data, which is crucial for disaster risk management in mountainous regions. However, a high-resolution precipitation dataset already exists in the study area, i.e., the High Asia Refined analysis version 2 (HAR-v2) data at daily and hourly time scales. The authors seem to link the relationship between HAR-v2 and station observations; however, the observational precipitation dataset is not presented. Further, the authors just correct the biases of HAR-v2, which falls far short of the authors’ stated ambitions.
Response: We thank the Reviewer for acknowledging the contribution of our research. We understand that a high-resolution precipitation dataset exists in the study area, HAR-v2, at 10 km × 10 km spatial resolution. Our primary goal was to demonstrate that the GDHPM approach can produce hourly precipitation from daily precipitation, particularly in regions where ground observations or station data are sparse and high-resolution datasets may have biases or inaccuracies. This is especially important in disaster risk management for landslide and avalanche forecasting, where precise localized precipitation data is crucial. This is explained in the introduction from lines 30 to 36, and the objectives are mentioned in lines 69 to 74. Further, we have not linked HAR-v2 to station observations in this study; we have only used HAR-v2 data at the daily temporal scale to produce precipitation at the hourly temporal scale. We wanted to focus on the fact that the proposed GDHPM approach can produce future hourly precipitation using historical daily and hourly precipitation at any avalanche site. For demonstration, the results with multiple realizations are compared with the existing HAR-v2 hourly data to show that the GDHPM approach is capable of producing hourly precipitation that also captures the uncertainty. Regarding the bias correction of HAR-v2, our primary contribution is the development and application of the MPS-based DS method to generate high-resolution hourly precipitation from coarse daily data. We will further clarify that bias correction was not an aim of the study.
Comment 2: The authors have provided a detailed account of the DS algorithm and its application in this context. The use of various covariate selection techniques (nearest neighbor, Pearson correlation coefficient, complex networks) adds robustness to the study. However, the manuscript would benefit from a more detailed explanation of how the fixed and varied parameters were selected, and of the sensitivity of the model results to these parameters. This information would help in understanding the generalizability and potential limitations of the proposed approach. Moreover, how the covariate data are obtained is not presented in the current manuscript.
Response: Thank you for the comment; we will improve the methods section according to the following remarks. For setting up the model, we used parameters similar to those in Oriani et al. (2014), which provides standard values for the DS setup for the simulation of rainfall time series. The parameters were set to optimize the accuracy and computational time of the simulations; in particular, the values of the maximum search distance (R) and of the number of neighbouring points (N) were adjusted. Fixing the same R and N values for all experimental runs makes it easier to study the effect of covariate selection and demonstrates the generalizability of the approach. We will explain the parameter sensitivity in the revised manuscript.
Regarding the covariate data, we assume the Reviewer is referring to how the simulation covariates are selected, i.e., the additional variables used to guide the simulation of the target one. Here, we have used precipitation from the nearest site in a pool of selected sites as a covariate. For selecting the covariate, we used three techniques: nearest neighbour, correlation coefficient, and complex networks (explained in lines 170 to 192). The nearest neighbour is determined by the Euclidean distance between the latitudes and longitudes of the sites. The Pearson correlation coefficient is calculated among all the precipitation time series to find the most correlated series. In complex networks, the correlation coefficient together with a correlation threshold determines the sites to be used as covariates.
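As an illustration of the three selection techniques, here is a minimal sketch with synthetic data (hypothetical names; the complex-network step is reduced to its essence, i.e., correlation links above a threshold):

```python
import numpy as np

def select_covariates(coords, precip, target, r_thresh=0.5):
    """Rank candidate covariate sites for a target site.

    coords : (n_sites, 2) array of site latitude/longitude.
    precip : (n_sites, n_days) array of daily precipitation series."""
    dist = np.linalg.norm(coords - coords[target], axis=1)
    dist[target] = np.inf
    nearest = int(np.argmin(dist))             # nearest-neighbour rule

    corr = np.array([np.corrcoef(precip[target], p)[0, 1] for p in precip])
    corr[target] = -np.inf
    most_correlated = int(np.argmax(corr))     # Pearson-correlation rule

    network = np.flatnonzero(corr >= r_thresh) # complex-network links: sites
    return nearest, most_correlated, network   # correlated above the threshold

# toy usage with 20 synthetic sites (hypothetical values)
rng = np.random.default_rng(0)
coords = rng.uniform(30.0, 35.0, size=(20, 2))   # lat/lon
precip = rng.gamma(0.4, 3.0, size=(20, 365))     # one year of daily data
print(select_covariates(coords, precip, target=0))
```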
Comment 3: The results demonstrate the effectiveness of the GDHPM approach in generating hourly precipitation data that closely matches observed values. The use of multiple error metrics (RMSE, MAE, BIAS%) provides a clear evaluation of model performance. However, the comparative results between the derived new dataset and the HAR-v2 are not clear.
Response: Thank you for the comment. The results section is divided into three parts describing the three experiments: no covariate (Section 4.1), one-site covariate (Section 4.2), and multi-site covariates (Section 4.4). However, we note that the ‘Section 4.3’ heading at line 343 is a typing error. The numbering errors observed in Section 4 and their corrections are as follows:
Line number | Existing section number | Correct section number
---|---|---
343 | 4.3 | 4.2.2
371 | 4.4 | 4.3
372 | 4.4.1 | 4.3.1
390 | 4.5 | 4.3.2
412 | 4.6 | 4.3.3
422 | 4.7 | 4.4
In the revised manuscript, we will renumber the sections to correct the order of the result presentation. Also, we will clarify the interpretation of the comparative results through error statistics in each experiment.
Comment 4: The manuscript could be improved by providing a clearer discussion on the physical significance of the differences observed between the different experimental setups. For instance, why did the 3-hour moving average generally perform better than the 6-hour moving average? Additionally, the authors should discuss the implications of the overestimations and false simulations observed in some experiments.
Response: Thank you for your valuable feedback. We recognise the need to better explain the performance of the 3-hour moving average compared to the 6-hour moving average. In the revised manuscript, we will add the following key points to the discussion after line number 462:
i) In complex mountainous regions like the Himalayas, precipitation can vary significantly over short distances and time periods due to localized convective events and orographic lifting. The 3-hr window allows more accurate tracking of these short-term, localized precipitation patterns, leading to improved simulation results. The 6-hr moving average, however, may smooth out important short-term precipitation events: by averaging over a longer time period, the model loses its ability to capture rapid changes in precipitation intensity.
ii) Systematic overestimation is not observed in any experiment. Some individual ensemble members show overestimation, which helps to cover the uncertainty in the hourly precipitation.
iii) False simulations are seen in some experiments with no covariates or with nearest-site covariates. In the no-covariate experiment, unrealistic simulations can be generated by aligning with the daily precipitation at every hour. Unrealistic simulations can also occur in the nearest-site covariate experiment due to misalignment between precipitation at the nearest site and at the target site. In highly variable terrains like the Himalayas, precipitation patterns may not be spatially uniform, and relying on neighbouring sites can sometimes result in incorrect predictions.
Comment 5: The discussion section provides a good summary of the results, but it could be strengthened by comparing the GDHPM approach with other existing methods in the literature. How does this method perform relative to traditional disaggregation techniques? Are there specific scenarios where GDHPM is particularly advantageous or, conversely, less effective?
Response: Thank you for the comment. To our knowledge, we are the first to implement temporal disaggregation of precipitation in mountainous regions using multiple-point geostatistics. However, considering the Reviewer’s comment, we have found some relevant studies that can be added to the discussion section of the revised manuscript to provide a comparison. Nourani and Farboudfam (2019) implemented hybrid wavelet-artificial intelligence methods to disaggregate rainfall time series in mountainous regions and reported RMSE values up to 0.9 for certain sites, whereas the GDHPM approach reduces the RMSE to 0.1–0.4 at different sites.
Artificial intelligence techniques can be valid candidates for comparison with DS, but they rely on different training conditions, typically larger and more redundant training data. An extensive comparative study focusing on different data-availability scenarios and mountain settings, rather than on optimizing the algorithm setup, would be necessary to ensure a fair comparison. That could constitute a study on its own in future research.
Moreover, a paragraph comparing GDHPM with other analog approaches will be added to the introduction of the revised manuscript to distinguish it from them. A limitation of this study can be noted by mentioning Acharya et al. (2022), who proposed a temporal disaggregation approach that assimilates observed datasets and produces better accuracy in the simulations; this limitation will be added to the discussion.
Comment 6: The conclusions are sound and highlight the potential applications of the GDHPM approach. However, the manuscript would benefit from a more explicit discussion of the limitations and possible future improvements, particularly in the context of rare extreme events where the model showed some difficulties.
Response: Thank you for the comment. We will add a paragraph in the conclusion after line 502, discussing the following limitations of the study:
i) Recent rare extreme events occurring in the Himalayan region due to climate change (e.g., flash floods) may not be well represented in the historical training dataset, which can lead to difficulties in simulating such events accurately.
ii) In the case of rare extreme events, using nearby site precipitation as covariates could lead to over- or underestimations when neighboring sites are not experiencing the same extreme conditions.
In addition to the already mentioned future improvements, we will discuss the following points in the revised manuscript:
i) The DS can be set up to incorporate atmospheric variables as covariates, such as vertical wind, water vapour, and updraft velocity, which can give insights into the vertical movement of water vapour and help improve precipitation simulations.
ii) The study can incorporate observation data at specific avalanche sites, which can help in data assimilation or bias correction of the simulated data to provide more accurate precipitation results.
References:
Acharya, S. C., Nathan, R., Wang, Q. J., and Su, C. H.: Temporal disaggregation of daily rainfall measurements using regional reanalysis for hydrological applications, J. Hydrol., 610, 127867, https://doi.org/10.1016/j.jhydrol.2022.127867, 2022.
Nourani, V. and Farboudfam, N.: Rainfall time series disaggregation in mountainous regions using hybrid wavelet-artificial intelligence methods, Environ. Res., 168, 306–318, https://doi.org/10.1016/j.envres.2018.10.012, 2019.
Citation: https://doi.org/10.5194/hess-2024-155-AC2