The value of hydroclimatic teleconnections for snow-based seasonal streamflow forecasting in central Asia

Umirbekov, Atabek; Peña-Guerrero, Mayra Daniela; Didovets, Iulii; Apel, Heiko; Gafurov, Abror; Müller, Daniel

doi:https://doi.org/10.5194/hess-29-3055-2025

Articles | Volume 29, issue 14

© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.

Research article | Highlight paper | 18 Jul 2025

Download

Final revised paper (published on 18 Jul 2025)
Supplement to the final revised paper
Preprint (discussion started on 26 Jul 2024)

Interactive discussion

Status: closed

RC1:
'Comment on hess-2024-174', Anonymous Referee #1, 16 Aug 2024

Review of ‘The value of hydroclimatic teleconnections for snow-based seasonal streamflow forecasting,’ Umirbekov et al., HESS Discussions
This submission summarizes the motivation for, and development, implementation, and performance of, a data-driven model for seasonal river runoff volume forecasting in Central Asia. The method uses as predictors a combination of SWE data from existing large-scale operational remote sensing and land surface modeling products and indices of various atmosphere-ocean circulation patterns; it employs a multi-model ensemble model structure, which has some machine learning elements; and it is intended to serve as an actual operational forecasting tool for directly supporting water management decision-making.
My overall recommendation is for publication pending minor revisions.
The article will be an excellent contribution to HESS. The paper is succinct and well-organized. The study location considered has been historically understudied, though admittedly not so severely as some other regions of the developing world. The candidate predictor data combinations are geophysically sensible, but also to some degree original in the particular way they are used here. Though it is possible to quibble with certain aspects of the predictor selection process used, it is a reasonable and defensible approach for the task at hand. Its multi-model ensemble philosophy is fully consistent with a large body of evidence demonstrating its value, yet it is only one of a tiny handful of hydrologic modeling examples where several separate data-driven/statistical/machine learning modeling systems are used and their results pooled to form a best estimate, and it is also novel as implemented here. Unlike some research papers claiming to present a forecast model, this article makes a point of clearly confirming that the predictor datasets considered are available going forward on a near-real-time basis, a necessity if a modeling system is actually going to be useful for real-world operational forecasting. The submission also clearly identifies (e.g., lines 400 to 410) where, when, and why each of the candidate predictor datasets are or are not useful for water supply forecasting, which is crucially important for physical credibility of the forecasts and the systems and data generating them, and which is sometimes overlooked in research around data-driven models.
That said, I think a few improvements will be needed before the paper is ready for publication. I hope the following comments will be helpful to the authors if they move forward with submitting a revised manuscript:
1. Though in general the article is well-crafted, some passages are written poorly enough that their meaning is unclear. For example, the wrong word is used, or words are used incorrectly, or elaborate vocabulary or phrasing is used when simpler wording would do.
2. The literature review is not quite adequate. While there seems to be sufficient reference to prior work in the study area, this article is not just a case study, and HESS is an international journal. More broadly, the methods described here need to be placed in the wider and deeper context of previous work, not just locally but globally, in order for readers to understand its contributions and wider implications – the methods used here may be applicable in entirely different regions of the world. The wider literature does not need to be discussed in detail, nor do the methods used in this submission need to be compared against them, but the paper does need to leave some clues for readers about relevant prior publications. What stood out for me is that prior research (and practice) around seasonal water supply forecasting in western Canada and the western US, directly relevant to this study, has not been adequately acknowledged. Some points of particular note are the following (full citations are provided at the end of this review):
2.a. It would probably be helpful to note in the article that the type of seasonal river discharge volume forecast modeling considered here is widely referred to as “water supply forecasting” (WSF) in the western North American operational hydrology and water management communities. They don't need to use the term throughout the article, but just pointing out at the start of the paper that they're working on what is commonly called WSF will help readers connect the study to a large existing body of prior research and practice.
2.b. Contrary to what seems to be implied in this article, given the way certain passages are phrased and the sparseness of literature citations, combining teleconnection indices with snow data as inputs to statistical seasonal water supply forecasting models is neither new nor rare. It appears to have first been implemented decades ago (Garen, 1998) in the large-scale (hundreds of forecast locations) operational forecasting systems of the US Department of Agriculture’s Natural Resources Conservation Service (NRCS) (Perkins et al., 2009), predictions from which are a staple for water managers across the American West. These principal component regression models (Garen, 1992) have used a combination of SWE, accumulated precipitation, and in some cases antecedent streamflow and El Niño-Southern Oscillation indices (Garen, 1998) as predictors of seasonal river flow volumes; these methods have since been adopted widely across western North America by other operational forecast agencies. Furthermore, applied R&D on the combined use of snowpack observations and teleconnection indices as input variables to statistical seasonal discharge forecast models has been continued by many, such as Gobena et al. (2013) in western Canada, and Moradkhani and Meier (2010) and Regonda et al. (2006a, 2006b) in the western US, to give just a few examples. It has also been extended to discovering new climate prediction skill using nonlinear methods or new indices in areas where conventional linear teleconnections are weak, such as southern Oregon and northern California (Kennedy et al., 2009; see also Fleming and Dahlke, 2014).
2.c. Though the computational model presented here appears to be novel, major elements of its philosophy and structure are strongly reminiscent of other recent advances in data-driven predictive modeling of seasonal river discharge volumes. Of note here is the multi-model machine-learning metasystem (M4), which was developed for and is currently being operationally implemented by the US Department of Agriculture NRCS as its new western US-wide seasonal river discharge volume forecast model. This system has been run using in-situ SWE, precipitation, and antecedent streamflow data, as well as combinations of in-situ and remotely sensed snow data, as predictors (Fleming et al., 2021, 2024). It uses a multi-model ensemble approach in which six data-driven (statistical and machine learning) forecast systems are run independently and the results are pooled to form a best estimate, closely analogous to the modeling philosophy used in this HESS contribution. There are also significant differences between M4 and the method used in this submission, but citing M4 will better place this HESS article’s contributions in the larger research and applications literature, and provide literature support to the methods the submitted paper uses. By the same token, some related exploratory work on methods for combining outputs from multiple data-driven seasonal river discharge forecast models by Najafi and Moradkhani (2016) should be cited in this regard as well.
2.d. The foregoing are just some examples I happen to be familiar with. I’d suggest that the authors scour the literature for other prior work, including work in other regions globally, that ought to be at least briefly cited in their revised paper.
3. The following are a few additional suggestions for improvements:
3.a. Line 35 and elsewhere: this paper distinguishes between what it calls “dynamic” vs. “statistical” approaches. This jargon tends to be used more in other (broadly related) disciplines like regional climate modeling, with “process-based” vs. “data-driven” being more common in the operational hydrology literature. Also, it’s usually “dynamical” not “dynamic”, and “data-driven” also tends to be preferable to “statistical” today because of the increasing popularity of machine learning techniques (including this submission). The authors can use “dynamical” and “statistical” if they like, but to better orient readers, including the operational water resource forecasting community, to which this article seems to be in part addressed, please provide some synonyms where the terms are first introduced (line 35). It could read something like “generated using either dynamical (process-based, physics-oriented) or statistical (data-driven including machine learning and conventional statistical) modeling approaches” or something similar.
3.b. Line 46, “statistical forecasts of seasonal streamflow often rely solely on accumulated snowpack.” Yes and no. Yes, data on winter-spring seasonal snowpack provides the primary source of predictive skill in data-driven forecast models of spring-summer river runoff volume in snowmelt-dominated rivers. But these models, in both the research literature and (in particular) in operational practice, at least in western North America, also almost always use additional predictor data types. Examples include wintertime accumulated precipitation, early-season precipitation, and at some locations, antecedent streamflow and/or El Niño indices. See point 2.b above.
3.c. Line 50: excellent point!
3.d. Lines 52-53 and elsewhere: if the authors want to call the Apr-Sep target period the “vegetation season,” that’s fine I suppose, but it’s not standard nomenclature. Typically this would be called either the “growing season,” looking at it from an agricultural water supply or broader ecological perspective, or the “runoff season”, looking at it from a hydrological perspective. And given that they call Nov-Mar the “cold season” rather than the “snowpack accumulation season”, it might also be more consistent to simply call Apr-Sept the “warm season.” Overall, “growing season” seems like it might be the best fit here?
3.e. Figure 1: this figure is good, but for a wide international readership, please provide an additional map showing the location of the study area within the larger geographic context of Eurasia.
3.f. Line 124: predictand, not predicant
3.g. Line 125: An 18 year data record – in other words, 18 samples - is pretty short; it’s enough to defensibly create one of these models, but just barely. Commensurate limitations to the authors’ ability to train model parameters and validate model predictions could be viewed as a source of uncertainty in this study; the counterargument, of course, is that with rapid climate change in mountain regions such as this study area, the statistical nonstationarity in a longer data record would have reduced its value anyway. This might be worth a sentence or two here. A brief explanation of why the record doesn’t go back further or continue to the present could be helpful to readers as well. My understanding of the political history of this region isn’t great, but I think this was part of the Soviet Union, which (its grave misdeeds notwithstanding) wasn’t too bad at keeping streamflow records, so one might have been forgiven for guessing that there might be some usable historical data here?
3.h. Lines 129-130: excellent point re: near-real time input data availability – this is a prerequisite for an operational forecasting model, and it’s sometimes overlooked in research articles.
3.i. Lines 173-175: a little more information about the constituent models (“base models”) is needed here. What link function was used in the GLM? And why were linear kernels used in the GP and SVR models? Does this imply that most of the base models are essentially variants of standard, multiple linear regression? If so, what are the pros and cons? Note that work in the western US has shown that the relationships between winter-spring hydroclimatic forcing and spring-summer runoff response in data-driven WSF models range from nearly linear to moderately nonlinear, with clear physical explanations for these inferred functional forms (see Fleming et al., 2021).
3.j. Lines 181-185: in defense of their methodological choice, which has no literature citations attached to it in the submission, the authors might wish to note that LOOCV is standard practice in western US WSF modeling; see references in point 2.b above.
3.k. Lines 187-190: to improve accessibility to a broad readership which may not be uniformly well-versed in machine learning, it might be helpful to add just a sentence or two, with an additional reference or two, explaining the concept of a meta-learner. It might also be helpful, in terms of connecting this concept to prior work in data-driven WSF, to refer to the work of Najafi and Moradkhani (2016) on exploring different methods for creating multi-model ensembles from the predictions of several data-driven models.
3.l. Line 205: in the context of operational hydrologic prediction models, data “assimilation” has a very specific connotation: formal methods for using new observational data, such as observed snowpack, to update the internal states, such as predicted snowpack, of a process-based (dynamical, physics-oriented) streamflow simulation model, often using fairly complex methods like ensemble Kalman filtering. It is not normally used to refer to the use of some particular data type, such as snow data, as an input predictor variable in a data-driven (statistical or machine-learning) streamflow model.
3.m. Lines 287-288: excellent point. The authors might wish to cite literature that backs up this result, such as the excellent overview article of Hagedorn et al. (2005) and the multi-model ensemble WSF modeling article of Fleming et al. (2021).
3.n. Figure 6: this is a great illustration! I do have one question though: are all the base models used for the Vaksh and Kashkadarya rivers? It’s hard to tell from the figure panels.
3.o. Line 348, might suggest rephrasing this in a more specific way, such as “suggest that useful near-real time SWE estimates, suitable for operational seasonal river discharge volume forecasting, can be effectively”
3.p. Line 350: “and enlarge during the snow ablation phase” – confusing wording
3.q. Lines 365-370: the entire paragraph (except for the excellent final sentence) is muddled. Please rewrite more simply and clearly.
3.r. Line 373: “confirms this assumption” – what assumption?
3.s. Lines 398, “is assumingly reasoned by their compensation” – this is meaningless, please rewrite.
3.t. Lines 400-410: excellent points.
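As an illustrative aside on points 2.c and 3.j, the idea of running several base models independently under leave-one-out cross-validation and pooling their hindcasts can be sketched as below. This is a minimal sketch under stated assumptions, not the authors' stacking scheme and not M4: the function names `fit_line` and `loocv_pooled`, the choice of two single-predictor linear models, the equal-weight average, and the synthetic numbers are all hypothetical.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b

def loocv_pooled(swe, index, flow):
    """Leave-one-out hindcasts from the equal-weight mean of two base models.

    Each base model is a one-predictor linear regression (an assumption made
    for brevity); the held-out year is never used to fit either model.
    """
    n = len(flow)
    hindcasts = []
    for k in range(n):
        train = [i for i in range(n) if i != k]
        y_train = [flow[i] for i in train]
        preds = []
        for predictor in (swe, index):
            a, b = fit_line([predictor[i] for i in train], y_train)
            preds.append(a + b * predictor[k])
        hindcasts.append(mean(preds))  # pool the base-model forecasts
    return hindcasts

# Synthetic, made-up values for six years (not data from the manuscript).
swe = [120.0, 95.0, 150.0, 80.0, 130.0, 110.0]
index = [0.4, -0.2, 0.9, -0.6, 0.5, 0.1]
flow = [310.0, 260.0, 380.0, 225.0, 335.0, 290.0]
print(loocv_pooled(swe, index, flow))
```

An equal-weight mean is the simplest pooling rule; a meta-learner, as used in the submission, would instead learn the weights from the cross-validated base-model outputs.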
References:
Fleming SW, Dahlke HE. 2014. Parabolic northern-hemisphere river flow teleconnections to El Niño-Southern Oscillation and the Arctic Oscillation. Environmental Research Letters, 9, 104007, doi:10.1088/1748-9326/9/10/104007.
Fleming SW et al. 2021. Assessing the new Natural Resources Conservation Service water supply forecast model for the American West: a challenging test of explainable, automated, ensemble artificial intelligence. Journal of Hydrology, 602, 126782.
Fleming SW et al. 2024. Leveraging next-generation satellite remote sensing-based snow data to improve seasonal water supply predictions in a practical machine learning-driven river forecast system. Water Resources Research, 60, e2023WR035785, https://doi.org/10.1029/2023WR035785.
Garen DC. 1992. Improved techniques in regression-based streamflow volume forecasting. Journal of Water Resources Planning and Management, 118, 654-669.
Garen DC. 1998. ENSO indicators and long-range climate forecasts: usage in seasonal streamflow volume forecasting in the western United States, American Geophysical Union Fall Conference, San Francisco, CA.
Gobena AK et al. 2013. The role of large-scale climate modes in regional streamflow variability and implications for water supply forecasting: a case study of the Canadian Columbia Basin. Atmosphere-Ocean, 51, 380-391.
Hagedorn R et al. 2005. The rationale behind the success of multi-model ensembles in seasonal forecasting – I. Basic concept. Tellus, 57A, 219-233.
Moradkhani H, Meier M. 2010. Long-lead water supply forecast using large-scale climate predictors and independent component analysis. Journal of Hydrologic Engineering, 15, 744-762.
Najafi MR, Moradkhani H. 2016. Ensemble combination of seasonal streamflow forecasts. Journal of Hydrologic Engineering, 21, 10.1061/(ASCE)HE.1943-5584.0001250.
Kennedy AM, Garen DC, Koch RW. 2009. The association between climate teleconnection indices and Upper Klamath seasonal streamflow: Trans-Niño index. Hydrological Processes, 23, 973-984.
Perkins TR et al. 2009. Innovative operational seasonal water supply forecasting technologies. Journal of Soil and Water Conservation, 64, 15-17.
Regonda SK et al. 2006a. A multimodel ensemble forecast framework: application to spring seasonal flows in the Gunnison River Basin. Water Resources Research, 42, doi:10.1029/2005WR004653.
Regonda SK et al. 2006b. A new method to produce categorical streamflow forecasts. Water Resources Research, 42, doi:10.1029/2006WR004984.

Citation: https://doi.org/10.5194/hess-2024-174-RC1
- AC1: 'Reply on RC1', Atabek Umirbekov, 18 Oct 2024
  
  Dear Reviewer,
  Thank you very much for your review and valuable comments. Please find our point-by-point responses attached.
  With regards,
  Atabek Umirbekov, on behalf of all authors
  
  Citation: https://doi.org/10.5194/hess-2024-174-AC1
RC2:
'Comment on hess-2024-174', Anonymous Referee #2, 02 Sep 2024
In this manuscript, the authors explore the relative contribution of large-scale climate oscillation predictors and snow water equivalent on the quality of April-September seasonal streamflow forecasts in eight catchments located across the Pamir and Tian-Shan mountains (central Asia). To this end, the authors first examine the correlation between climate modes of variability and (i) catchment-scale precipitation over the peak precipitation season (February-July), and (ii) April-September seasonal streamflow. Then, the authors adjust 16 models resulting from the combination of four statistical models and four SWE products, using SWE (at four forecast initialization times) as one of the predictors, and large scale climate indices as additional predictors. The total sample size (i.e., 18 points obtained from 18 years with data) is split into a sample of 15 points for cross-validation, and the remaining points are used for additional testing. The authors conclude that their technique is “a novel way to reduce uncertainties in seasonal discharge predictions in data-scarce snowmelt-dominated catchments”.
This is basically a seasonal hindcasting study, generally well written and concisely presented. Nevertheless, my main criticisms of this work are (1) the overselling, especially in the title, abstract and conclusions, (2) the lack of forecast uncertainty characterization (which is highlighted by the authors as a key contribution), and (3) the limited sample size, and the way the authors address this problem in their analyses. Therefore, I think that the manuscript needs major revisions before being considered for publication in HESS.
Major comments
1. Title, abstract and conclusions: it is well known that the value of hydroclimatic teleconnections on seasonal streamflow forecasts is huge in snowmelt-driven catchments, especially during the preceding Fall season, when initial hydrologic conditions have not been fully developed (e.g., Mendoza et al., 2017) – as the authors write in L22-24, and conclude in L403-404. There is a long history on the use of large-scale climate information for seasonal streamflow forecasting (e.g., Piechota et al., 1998), and what the authors state in L20-21 and other parts of the manuscript was neatly shown nearly two decades ago using custom-based climate indices in two western US catchments (see Figure 8 in Grantz et al., 2005; and also Regonda et al., 2006; Opitz-Stapleton et al., 2007; Bracken et al., 2010; Mendoza et al., 2014, etc.). Additionally, the use of simulated catchment-averaged SWE as a predictor to feed statistical models (L105-106) is not new either (e.g., Rosenberg et al., 2011; Mendoza et al., 2017). In other words, the findings reported by the authors are not novel and, based on this, I think that they should refine the title, abstract and conclusions to make them more specific to their actual contribution to the existing literature.
2. L25: the authors declare that their approach “provides a novel way to reduce uncertainties in seasonal discharge prediction”. Do they refer to the spread of seasonal forecasts? Although they describe an ensemble stacking framework to produce a final forecast, only deterministic evaluation metrics (coefficient of determination and normalized mean absolute error) are reported, and no characterizations of hydrological prediction uncertainties are presented. A popular way to do so is through ensembles (Georgakakos et al., 2004; also, see publications produced by the HEPEX community on this topic), analyzing, for example, the statistical consistency of seasonal forecasts with graphical devices like rank histograms (Hamill, 2001) or Q-Q plots (Renard et al., 2010), complemented with ensemble verification metrics (e.g., De Lannoy et al., 2006). Therefore, I recommend that the authors take advantage of the multiple models developed to characterize forecast uncertainty or, alternatively, delete any references to “forecast uncertainty” from their manuscript (which I think would diminish the quality of their research).
3. Sample size (L126-127): this is a major issue in seasonal streamflow forecasting, since only one training/verification point is available per year. Therefore:
In my opinion, the sample size is not large enough to support – being extremely generous – more than three predictor variables in their models (the authors report up to five predictors in Figure 5 for the Chu River basin), given the high risk of overfitting (see Wilks, 2011 or any other book on Statistics). Hence, I think that the authors should revisit their statistical models, removing combinations of predictors that may introduce multicollinearity.

I do not think it is appropriate to split their sample of points (n = 18) into a smaller sample for leave-one-out cross validation (with n = 15), and another sample for verification that contains three (L314) or even two points. I recommend that the authors use the entire sample to perform cross-validation and compute verification metrics. Further, they should characterize the impact of sampling uncertainty, which could be done by adding confidence intervals created through bootstrapping with replacement (see section 5.5 in Araya et al., 2023). This is a critical point that the authors should address, given the very small sample size.
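To make the bootstrapping suggestion concrete, the sketch below computes a percentile confidence interval for the coefficient of determination by resampling hindcast years with replacement. It is a minimal sketch, not the procedure of Araya et al. (2023): the function names `r_squared` and `bootstrap_ci`, the definition of R² as 1 − SSE/SST, the percentile method, the number of resamples, and the example values are all assumptions.

```python
import random

def r_squared(obs, sim):
    """Coefficient of determination, computed as 1 - SSE/SST (an assumption)."""
    m = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - m) ** 2 for o in obs)
    return 1.0 - sse / sst

def bootstrap_ci(obs, sim, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for R^2, resampling years with replacement."""
    rng = random.Random(seed)
    n = len(obs)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        o = [obs[i] for i in idx]
        if max(o) == min(o):
            continue  # degenerate resample: SST would be zero
        stats.append(r_squared(o, [sim[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Synthetic, made-up observed and cross-validated seasonal flows for 18 years.
obs = [310, 260, 380, 225, 335, 290, 300, 270, 350, 240,
       320, 280, 365, 255, 330, 295, 305, 275]
sim = [300, 270, 360, 240, 340, 280, 310, 260, 345, 250,
       310, 290, 350, 265, 325, 300, 295, 285]
print(r_squared(obs, sim), bootstrap_ci(obs, sim))
```

With only 18 pairs the interval is typically wide, which is the point of reporting it alongside the single-sample metric.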

Specific comments
4. L13: The authors use the term “predictions”, which is an excessively broad word for what they actually do. In this line, I recommend that the authors use the word “forecasts”, and consider using the words “hindcasts” and “hindcasting” in the remainder of the manuscript, especially when describing their methods and results (please see section 3 in Beven and Young, 2013).
5. L30: This population estimate is almost ten years old. I suggest updating the number and the reference.
6. L35: Sometimes you use “dynamic”, and sometimes “dynamical”. Please pick one term and be consistent.
7. L36-37: This sentence is incorrect. Climate forecasts are not used until the IHCs have been produced by running a model with a historical meteorological dataset up to the forecast initialization time.
8. L39: I disagree with the authors’ statement, since computational demand depends on model complexity and, therefore, a model simulation might take from seconds (e.g., GR4J, SAC-SMA) to several minutes (e.g., VIC, SUMMA) in a home PC.
9. L40: Note that meteorological variables obtained from numerical climate models ARE prone to uncertainties.
10. L45-46: I think that the authors should cite more papers when referring to the relevance of SWE as a predictor in mountainous catchments (e.g., Garen, 1992; Rosenberg et al., 2011; Mendoza et al., 2014). In general, I recommend that the authors strengthen the literature review in this paragraph.
11. L46: “statistical forecasts of seasonal streamflow often rely solely on accumulated snowpack”. I disagree with this statement. The current operational systems managed by the NRCS for the western US and the DGA for Chile use, besides SWE, in situ measurements of precipitation, air temperature and streamflow measured in the preceding months.
12. L74: Are the authors referring to hydrological droughts? I think that any paper by Anne Van Loon (e.g., Van Loon, 2015) may be useful to clarify this point.
13. L88-89: This approach was proposed and tested more than two decades ago (e.g., Piechota et al., 1998).
14. L97: It would be good to clarify here that SWE can be directly obtained from reanalysis, or estimated by combining satellite remotely sensed snow depth and a snow density model.
15. L99-101: Please note that ensemble techniques have been used for decades in seasonal streamflow forecasting (e.g., Twedt et al., 1977; Day, 1985; Regonda et al., 2006; Wang et al., 2011; Arnal et al., 2018; Emerton et al., 2018; Lucatero et al., 2018; Girons Lopez et al., 2021; Araya et al., 2023).
16. Table 1: I suggest adding the period used to compute the variables and more hydroclimatic descriptors, like mean annual runoff (mm/yr), mean annual runoff ratio and aridity index. Please change the units of seasonal discharge to mm/yr.
17. L173: what link function did you use in your GLM?
18. L189: Looks like the SVR works as a post-processor, right?
19. L190-191: given the small sample size, I recommend deleting this step from your workflow (see comment #3).
20. L205, L206, L297, L351 and L353 and everywhere else: the authors use the term “assimilate” when referring to the use of modeled SWE as a predictor in their statistical model. Nevertheless, this term is typically used when referring to a family of techniques that combine imperfect models with uncertain observations to improve dynamical model estimates (e.g., Liu and Gupta, 2007; Reichle, 2008; Kumar et al., 2016; Smyth et al., 2022). Since the authors do not refer to this concept anywhere in the manuscript, I suggest deleting the words “assimilate” or “assimilation”.
21. L221: what do you mean by the word “underperforming”?
22. L221-222: I think that this sentence contradicts the previous one. Also, if ERA5-L and MSWX are better, why don't you just pick one of these products for subsequent analyses? Some of your subsequent figures are unnecessarily complicated.
23. Section 5.2 and Figure 4: since your target variable is seasonal streamflow, you could show correlation results between this variable and climate indices here, and move the correlation results with precipitation to supplementary material.
24. Figure 5: I do not think you can support more than three predictors with a sample size n = 18 (see comment #3).
25. L295: Do you mean winner among statistical models? Can you please be more specific?
26. L301-302: I do not think that the authors are quantifying uncertainty (see comment #2).
27. Figure 6 is quite difficult to read. Since the focus of the paper is on the relevance of climate information in seasonal streamflow forecasting, why don't you just show the best-performing statistical model, with the best SWE product? Further, you should include the assessment period in each figure caption.
28. L313: This is not true for all catchments. See, for example, the red bars for the Kashkadarya and Chu basins.
29. L322: Do you mean larger errors? Are you comparing against the results obtained with SWE and climate information? In that case, I really think you should define a Skill Score for a comparative assessment.
30. Figure 8: I recommend presenting these results using scatter plots (eight panels), along with the 1:1 line, percent bias, MAE and R².
31. L348: What do you mean by 'effectively'? That near real-time SWE estimates are actually useful for seasonal streamflow forecasting?
32. L350: I do not think the authors have presented any uncertainty or error propagation analysis (please see comment #2).
33. L352: Did you actually assess the accuracy of SWE products using in-situ observations?
34. L420: In my opinion, models adjusted with such a small sample cannot be regarded as “reliable”.
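As a hedged illustration of the skill score suggested in comment 29, one common form compares the error of a forecast with that of a reference forecast, for example SWE plus climate indices against SWE only. The sketch below uses mean absolute error; the function names `mae` and `mae_skill_score`, the choice of MAE as the underlying metric, and the numbers are assumptions rather than anything specified in the manuscript.

```python
def mae(obs, sim):
    """Mean absolute error between observations and forecasts."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)

def mae_skill_score(obs, forecast, reference):
    """Skill score 1 - MAE_forecast / MAE_reference.

    Positive values mean the forecast has a smaller error than the reference,
    zero means no improvement, and 1 is a perfect forecast.
    """
    return 1.0 - mae(obs, forecast) / mae(obs, reference)

# Synthetic, made-up seasonal flows (not data from the manuscript).
obs = [310.0, 260.0, 380.0, 225.0]
swe_only = [290.0, 285.0, 350.0, 250.0]       # hypothetical reference hindcasts
swe_plus_index = [305.0, 270.0, 370.0, 235.0]  # hypothetical candidate hindcasts
print(mae_skill_score(obs, swe_plus_index, swe_only))
```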
Suggested edits
35. L28: “where it sustains” -> “sustaining”.
36. L32: “Accurate water availability forecasts” -> “accurate water supply forecasts”.
37. L36: “current hydrologic conditions” -> “initial hydrologic conditions”.
38. L42: “multiple variables” -> “multiple predictor variables”.
39. L43: delete “the context of”.
40. L61 and L63: replace “from now on” by “hereafter”.
41. L67: delete “from satellite”.
42. L73: “ENSO in its cold phase” -> “the cold phase of ENSO”.
43. L74: delete “ENSO’s”.
44. L95-96: “used to conduct” -> “conducted”.
45. L124: I think that the right word is “predictand”.
46. L130: delete “in near real-time”.
47. L132: “we simulated” -> “we obtained”.
48. L155-156: “precipitation levels” -> “precipitation amounts”.
49. L174: add “SVR” after “support vector regression”.
50. L214-215: I suggest deleting this sentence.
References
Araya D, Mendoza PA, Muñoz-Castro E, McPhee J. 2023. Towards robust seasonal streamflow forecasts in mountainous catchments: impact of calibration metric selection in hydrological modeling. Hydrology and Earth System Sciences 27 (24): 4385–4408 DOI: 10.5194/hess-27-4385-2023
Arnal L, Cloke HL, Stephens E, Wetterhall F, Prudhomme C, Neumann J, Krzeminski B, Pappenberger F. 2018. Skilful seasonal forecasts of streamflow over Europe? Hydrology and Earth System Sciences 22 (4): 2057–2072 DOI: 10.5194/hess-22-2057-2018
Beven K, Young P. 2013. A guide to good practice in modeling semantics for authors and referees. Water Resources Research 49 (8): 5092–5098 DOI: 10.1002/wrcr.20393
Bracken C, Rajagopalan B, Prairie J. 2010. A multisite seasonal ensemble streamflow forecasting technique. Water Resources Research 46: W03532 DOI: 10.1029/2009WR007965
Day GN. 1985. Extended Streamflow Forecasting Using NWSRFS. Journal of Water Resources Planning and Management 111 (2): 157–170 DOI: 10.1061/(ASCE)0733-9496(1985)111:2(157)
Emerton R, Zsoter E, Arnal L, Cloke HL, Muraro D, Prudhomme C, Stephens EM, Salamon P, Pappenberger F. 2018. Developing a global operational seasonal hydro-meteorological forecasting system: GloFAS-Seasonal v1.0. Geoscientific Model Development 11 (8): 3327–3346 DOI: 10.5194/gmd-11-3327-2018
Garen DC. 1992. Improved Techniques in Regression-Based Streamflow Volume Forecasting. Journal of Water Resources Planning and Management 118 (6): 654–670 DOI: 10.1061/(ASCE)0733-9496(1992)118:6(654)
Georgakakos KP, Seo D-J, Gupta H, Schaake J, Butts MB. 2004. Towards the characterization of streamflow simulation uncertainty through multimodel ensembles. Journal of Hydrology 298: 222–241 DOI: 10.1016/j.jhydrol.2004.03.037
Girons Lopez M, Crochemore L, Pechlivanidis IG. 2021. Benchmarking an operational hydrological model for providing seasonal forecasts in Sweden. Hydrology and Earth System Sciences 25 (3): 1189–1209 DOI: 10.5194/hess-25-1189-2021
Grantz K, Rajagopalan B, Clark M, Zagona E. 2005. A technique for incorporating large-scale climate information in basin-scale ensemble streamflow forecasts. Water Resources Research 41: W10410 DOI: 10.1029/2004WR003467
Hamill TM. 2001. Interpretation of Rank Histograms for Verifying Ensemble Forecasts. Monthly Weather Review 129 (3): 550–560 DOI: 10.1175/1520-0493(2001)129<0550:IORHFV>2.0.CO;2
Kumar SV, Zaitchik BF, Peters-Lidard CD, Rodell M, Reichle R, Li B, Jasinski M, Mocko D, Getirana A, De Lannoy G, et al. 2016. Assimilation of Gridded GRACE Terrestrial Water Storage Estimates in the North American Land Data Assimilation System. Journal of Hydrometeorology 17 (7): 1951–1972 DOI: 10.1175/JHM-D-15-0157.1
De Lannoy GJM, Houser PR, Pauwels VRN, Verhoest NEC. 2006. Assessment of model uncertainty for soil moisture through ensemble verification. Journal of Geophysical Research 111 (D10): D10101 DOI: 10.1029/2005JD006367
Liu Y, Gupta HV. 2007. Uncertainty in hydrologic modeling: Toward an integrated data assimilation framework. Water Resources Research 43 (7): W07401 DOI: 10.1029/2006WR005756
Van Loon A. 2015. Hydrological drought explained. Wiley Interdisciplinary Reviews: Water 2 (4): 359–392 DOI: 10.1002/WAT2.1085
Lucatero D, Madsen H, Refsgaard JC, Kidmose J, Jensen KH. 2018. Seasonal streamflow forecasts in the Ahlergaarde catchment, Denmark: The effect of preprocessing and post-processing on skill and statistical consistency. Hydrology and Earth System Sciences 22 (7): 3601–3617 DOI: 10.5194/hess-22-3601-2018
Mendoza PA, Rajagopalan B, Clark MP, Cortés G, McPhee J. 2014. A robust multimodel framework for ensemble seasonal hydroclimatic forecasts. Water Resources Research 50 (7): 6030–6052 DOI: 10.1002/2014WR015426
Mendoza PA, Wood AW, Clark E, Rothwell E, Clark MP, Nijssen B, Brekke LD, Arnold JR. 2017. An intercomparison of approaches for improving operational seasonal streamflow forecasts. Hydrology and Earth System Sciences 21 (7): 3915–3935 DOI: 10.5194/hess-21-3915-2017
Opitz-Stapleton S, Gangopadhyay S, Rajagopalan B. 2007. Generating streamflow forecasts for the Yakima River Basin using large-scale climate predictors. Journal of Hydrology 341 (3–4): 131–143 DOI: 10.1016/j.jhydrol.2007.03.024
Piechota TC, Chiew FHS, Dracup JA, McMahon TA. 1998. Seasonal streamflow forecasting in eastern Australia and the El Niño-Southern Oscillation. Water Resources Research 34 (11): 3035–3044 DOI: 10.1029/98WR02406
Regonda SK, Rajagopalan B, Clark M, Zagona E. 2006. A multimodel ensemble forecast framework: Application to spring seasonal flows in the Gunnison River Basin. Water Resources Research 42: W09404 DOI: 10.1029/2005WR004653
Reichle RH. 2008. Data assimilation methods in the Earth sciences. Advances in Water Resources 31 (11): 1411–1418 DOI: 10.1016/j.advwatres.2008.01.001
Renard B, Kavetski D, Kuczera G, Thyer M, Franks SW. 2010. Understanding predictive uncertainty in hydrologic modeling: The challenge of identifying input and structural errors. Water Resources Research 46 (5): W05521 DOI: 10.1029/2009WR008328
Rosenberg EA, Wood AW, Steinemann AC. 2011. Statistical applications of physically based hydrologic models to seasonal streamflow forecasts. Water Resources Research 47 (3): W00H14 DOI: 10.1029/2010WR010101
Smyth EJ, Raleigh MS, Small EE. 2022. The Challenges of Simulating SWE Beneath Forest Canopies are Reduced by Data Assimilation of Snow Depth. Water Resources Research 58 (3): 1–21 DOI: 10.1029/2021WR030563
Twedt TM, Schaake JCJ, Peck EL. 1977. National Weather Service extended streamflow prediction. In Western Snow Conference, 52–57.
Wang E, Zhang Y, Luo J, Chiew FHS, Wang QJ. 2011. Monthly and seasonal streamflow forecasts using rainfall-runoff modeling and historical weather data. Water Resources Research 47 (5): 1–13 DOI: 10.1029/2010WR009922
Wilks DS. 2011. Statistical Methods in the Atmospheric Sciences, Volume 100 - 3rd Edition. Academic Press Inc.
Citation: https://doi.org/10.5194/hess-2024-174-RC2
- AC2: 'Reply on RC2', Atabek Umirbekov, 18 Oct 2024
  
  Dear Reviewer,
  Thank you very much for your review and valuable comments. Please find our point-by-point responses attached.
  With regards,
  Atabek Umirbekov, on behalf of all authors
  
  Citation: https://doi.org/10.5194/hess-2024-174-AC2

Peer review completion

AR: Author's response | RR: Referee report | ED: Editor decision | EF: Editorial file upload

ED: Reconsider after major revisions (further review by editor and referees) (09 Nov 2024) by Hilary McMillan

AR by Atabek Umirbekov on behalf of the Authors (19 Jan 2025) Author's response Author's tracked changes Manuscript

ED: Referee Nomination & Report Request started (20 Jan 2025) by Hilary McMillan

RR by Anonymous Referee #1 (27 Jan 2025)

Suggestions for revision or reasons for rejection

Review of “The Value of Hydroclimatic Teleconnections for Snow-based Seasonal Streamflow Forecasting in Central Asia” (revised manuscript) by Umirbekov et al.:

This applied research article synthesizes and integrates a variety of methods and data types to create a convincing set of seasonal water supply forecast models for several rivers in Central Asia. The work is innovative, well-thought-out, technically sound, and socially relevant given its potential to support water management in an under-served region that would benefit from this kind of operational forecast system. Free public availability of the authors’ code and all the required input data means the forecast model could be quickly and easily implemented in production systems. The paper additionally presents a number of original research outcomes that should inform other studies around interactions between snowpack data, atmosphere-ocean circulation patterns, water supply forecasting, and practical machine learning systems.

In my opinion, this paper will be a strong contribution to HESS, and I recommend publication as-is (or with some very minor additional revisions). The revised article fully addresses the review comments I provided on the original submission, and I have only a few small additional suggestions to make here:

• First sentence of the abstract: the phrasing is awkward – I suggest improving this to read something more like, “Due to the long memory of snow processes, statistically based seasonal streamflow prediction models in snow-dominated catchments can successfully leverage, but also typically rely on, snowpack estimates.”

• Lines 36-44: This is much-improved over the original manuscript but would still benefit from a little additional work. A few points the authors might bear in mind: both process-simulation and data-driven water supply forecast models can ingest seasonal to subseasonal climate forecasts as input (see for example Lehner et al., Geophysical Research Letters, 2017, https://doi.org/10.1002/2017GL076043); process-based models have the advantage of explicitly capturing process physics, enhancing credibility and interpretability; data-driven models have the advantage of not requiring assumptions about the relevant physics and how to represent it in a pragmatic computational model. Rewriting the passage just a little more to acknowledge these points would lend greater credibility to the article overall.

• Lines 176-186: it might be worth explicitly mentioning here that while several models were used in the ensemble, each of them individually is quite simple and parsimonious, with a single target variable (seasonal discharge volume) and three or fewer input variables (as subsequently noted on lines 215-216). That means there’s only a small number of parameters to estimate from the limited sample size. In other words, the methodology used here is suitable for application to short datasets. There is some precedent in water supply forecast modeling for deliberately fine-tuning statistical and machine learning architectures to maximize parsimony and minimize the number of parameters to be estimated, enabling application to short datasets with good out-of-sample performance, as well as improved regularization and geophysical explainability (some examples are Fleming et al., 2021, 2024, which are already cited in the manuscript).

• I really like Figure 5, but I think there’s a typographical error. The caption reads, “For example, the April 1st forecast models for the Amudarya use as predictors the SWE estimate as of the beginning of March, the state of the PDO index in November, and the SCAN index in February.” I think the figure states, though, that these models use the SCAN index in January, not February, for that forecast date and river.

RR by Anonymous Referee #2 (10 Feb 2025)

Suggestions for revision or reasons for rejection

I want to thank the authors for incorporating most of the suggestions provided in the first round of revisions. Although the manuscript has improved considerably, I have a set of comments that I would like the authors to address before this paper is accepted for publication.

Minor comments

1. L41: I think it would be more appropriate to rewrite this as “…some process-based models have higher computational demand (e.g., Oleson et al., 2010; Niu et al., 2011; Clark et al., 2021)…”.

2. L88, L312 and everywhere else: please define the winter season (DJF?), since most readers will not be familiar with your study domain. The same comment applies to the remaining seasons.

3. L131, L230 and everywhere else: I advise replacing “prediction” with “hindcast”, since you are actually presenting results from retrospective forecasting (i.e., hindcasting) experiments. Please be precise and consistent with the terminology.

4. L233-234: if I understood correctly, you assess, for each model, cross-validated (deterministic?) hindcasts to get 16 evaluation metrics, and then select the k < 16 that fulfill the requirement R2 > 0.2, right? Please clarify.

5. Section 4.1: Did you try correlating seasonal averages (i.e., temporally averaged 2-month, 3-month, etc.) of your climate indices against seasonal precipitation for predictor screening?

6. Section 4.2: it remains unclear what the ensemble size of the final hindcasts produced with the meta-learner (SVR) model is. Additionally, how do you compute the MAE and R2 (L359)? Do you compare the ensemble median or ensemble mean against observations?

7. I strongly advise the authors to move Figure S1 to the main manuscript and move Figure 4 to the supplement, since your paper is about seasonal streamflow (and NOT precipitation) forecasting. Also, the current Figures S1 and 4 are nearly identical.

8. L339-342: the statements concerning the numbers of models are very hard to visualize. I recommend adding those numbers in Figure 6 for each forecast initialization.

9. L351: I presume you are referring to R2 results here, right? I do not think you should refer to “uncertainty”, since you are not quantifying forecast spread or providing confidence intervals. Please be more precise and refer to what you are actually showing.

10. Figure 7: Please clarify whether you are showing the results from the meta-learner model.

11. Figure 8: Are you displaying the ensemble hindcast mean along the y-axis? Please clarify. Also, these results should not be in sub-section 5.4, since you are not illustrating any predictive uncertainty.

12. L398-399: This description should be in the methods section. How many times did you resample the data? Note that this step would be redundant if the SVM meta learner produced ensemble hindcasts.

13. L401: I strongly advise the authors to be more quantitative when judging the “width of uncertainty bounds”. To this end, they can compute the alpha reliability index (Renard et al., 2010), as in previous seasonal hindcasting studies (e.g., Mendoza et al., 2017; Araya et al., 2023).

14. L409: please clarify what you mean by “more consistent”.

15. L139, L391, L406 and everywhere else: please avoid using “significant” or “significance”, unless you refer to a statistically significant result.

16. L458-459: “The resulting forecast models generate credible simulations…”. Your models are producing hindcasts and NOT simulations (please see Beven and Young, 2013). Also, the sentence reads as overselling, since your results are not good for all lead times. I suggest deleting it.

17. L464-465: “In most catchments, the SOI, PDO, or both were utilised, indicating the dominant influence of ENSO”. The Pacific Decadal Oscillation (PDO) can modulate ENSO, but PDO and ENSO are different modes of variability. I suggest re-wording to avoid confusing readers.

18. L490: I think what you should write here is “more accurate seasonal streamflow hindcasts”. Note that the term “reliable” has a very specific connotation in probabilistic forecasting, and is related to the degree to which forecast probabilities match relative observed frequencies (see Wilks, 2019).
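
As an illustration of the selection procedure queried in comment 4, the following is a minimal sketch of cross-validated hindcasting followed by an R2 screen. It is not the authors' implementation: the leave-one-out scheme, the ordinary least-squares model, the definition of R2 as the coefficient of determination, and all names (`loo_hindcasts`, `select_models`, the synthetic `swe` and `noise` predictors) are assumptions made for the example. Only the 0.2 threshold is taken from the comment.

```python
import numpy as np

def loo_hindcasts(X, y):
    """Leave-one-out hindcasts from an ordinary least-squares model (assumed model type)."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])  # intercept column plus predictors
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i  # withhold year i from fitting
        coef, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        preds[i] = A[i] @ coef  # hindcast for the withheld year
    return preds

def r2_score(obs, sim):
    """Coefficient of determination of hindcasts against observations (assumed R2 definition)."""
    ss_res = np.sum((obs - sim) ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def select_models(candidates, y, threshold=0.2):
    """Keep the candidates whose cross-validated R2 exceeds the threshold."""
    scores = {name: r2_score(y, loo_hindcasts(X, y)) for name, X in candidates.items()}
    return {name: s for name, s in scores.items() if s > threshold}

# Synthetic data: one informative predictor and one unrelated predictor.
rng = np.random.default_rng(0)
n_years = 20
swe = rng.normal(size=n_years)
noise = rng.normal(size=n_years)
flow = 2.0 * swe + 0.3 * rng.normal(size=n_years)
candidates = {"swe": swe.reshape(-1, 1), "noise": noise.reshape(-1, 1)}
selected = select_models(candidates, flow)
```

Under these assumptions, `selected` maps each retained candidate to its cross-validated R2, so the number of retained models (the k in comment 4) is simply `len(selected)`.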

Suggested edits

19. L28: delete “other”.
20. L33: Add “Additionally,” before “accurate”.
21. L42: revise “data-drivenapproaches”.
22. L44 and L47: I suggest deleting “primarily”.
23. L63: “multi ensemble” -> “multi-model ensemble”.
24. L75: higher prediction accuracy and better quantify -> “the quantification of”.
25. L94-95: “One approach” -> “The first approach”.
26. L111-112: “for forecasting” -> “to forecast”.
27. L331: I suggest rewriting as “…with varying R2…”, since this is what you are actually showing.
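
For readers unfamiliar with the alpha reliability index mentioned in comment 13, the sketch below shows one common way to compute it from ensemble hindcasts. It is an illustrative reading of the index, not code from the manuscript or from Renard et al. (2010): the function names, the use of the ensemble rank as the probability integral transform (PIT) value, and the plotting position i/(n+1) are assumptions. The index is 1 when the sorted PIT values follow a uniform distribution exactly and falls towards 0 as the hindcast distributions become less reliable.

```python
import numpy as np

def pit_values(ensembles, obs):
    """Fraction of ensemble members at or below each observation (one PIT value per year)."""
    ensembles = np.asarray(ensembles, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return np.mean(ensembles <= obs[:, None], axis=1)

def alpha_reliability(pit):
    """One minus twice the mean absolute gap between sorted PIT values and uniform quantiles."""
    pit = np.sort(np.asarray(pit, dtype=float))
    n = pit.size
    uniform = np.arange(1, n + 1) / (n + 1)  # assumed plotting position
    return 1.0 - 2.0 * np.mean(np.abs(pit - uniform))

# Synthetic check: observations drawn from the same distribution as the ensemble
# members should give an alpha close to 1.
rng = np.random.default_rng(1)
obs = rng.normal(size=200)
ens = rng.normal(size=(200, 50))
alpha = alpha_reliability(pit_values(ens, obs))
```

A single number of this kind would allow the “width of uncertainty bounds” to be compared across catchments and initialization dates rather than judged visually.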

References

Araya D, Mendoza PA, Muñoz-Castro E, McPhee J. 2023. Towards robust seasonal streamflow forecasts in mountainous catchments: impact of calibration metric selection in hydrological modeling. Hydrology and Earth System Sciences 27 (24): 4385–4408 DOI: 10.5194/hess-27-4385-2023
Beven K, Young P. 2013. A guide to good practice in modeling semantics for authors and referees. Water Resources Research 49 (8): 5092–5098 DOI: 10.1002/wrcr.20393
Clark MP, Zolfaghari R, Green KR, Trim S, Knoben WJM, Bennett A, Nijssen B, Ireson A, Spiteri RJ. 2021. The numerical implementation of land models: Problem formulation and laugh tests. Journal of Hydrometeorology 22 (6): 1627–1648 DOI: 10.1175/JHM-D-20-0175.1
Mendoza PA, Wood AW, Clark E, Rothwell E, Clark MP, Nijssen B, Brekke LD, Arnold JR. 2017. An intercomparison of approaches for improving operational seasonal streamflow forecasts. Hydrology and Earth System Sciences 21 (7): 3915–3935 DOI: 10.5194/hess-21-3915-2017
Niu G-Y, Yang Z-L, Mitchell KE, Chen F, Ek MB, Barlage M, Kumar A, Manning K, Niyogi D, Rosero E, et al. 2011. The community Noah land surface model with multiparameterization options (Noah-MP): 1. Model description and evaluation with local-scale measurements. Journal of Geophysical Research 116 (D12): D12109 DOI: 10.1029/2010JD015139
Oleson KW, Lawrence DM, Gordon B, Flanner MG, Kluzek E, Peter J, Levis S, Swenson SC, Thornton E, Dai A, et al. 2010. Technical Description of version 4.0 of the Community Land Model (CLM). Boulder, Colorado, USA.
Renard B, Kavetski D, Kuczera G, Thyer M, Franks SW. 2010. Understanding predictive uncertainty in hydrologic modeling: The challenge of identifying input and structural errors. Water Resources Research 46 (5): W05521 DOI: 10.1029/2009WR008328
Wilks DS. 2019. Statistical Methods in the Atmospheric Sciences. Elsevier.

ED: Publish subject to minor revisions (review by editor) (11 Feb 2025) by Hilary McMillan

AR by Atabek Umirbekov on behalf of the Authors (28 Feb 2025) Author's response Author's tracked changes Manuscript

ED: Publish as is (28 Mar 2025) by Hilary McMillan

AR by Atabek Umirbekov on behalf of the Authors (11 Apr 2025) Manuscript

Executive editor

This socially relevant work is innovative, well-thought-out, and technically sound. The study has the potential to support water management in an under-served region (Central Asia) that would benefit from this kind of operational forecast system. Free public availability of the authors’ code and all the required input data means the forecast model can be easily implemented in production systems.