the Creative Commons Attribution 4.0 License.
Hybrid forecasting: using statistics and machine learning to integrate predictions from dynamical models
Louise Slater
Louise Arnal
Marie-Amélie Boucher
Annie Y.-Y. Chang
Simon Moulds
Conor Murphy
Grey Nearing
Guy Shalev
Chaopeng Shen
Linda Speight
Gabriele Villarini
Robert L. Wilby
Andrew Wood
Massimiliano Zappa
Abstract. Hybrid hydroclimatic forecasting systems employ data-driven (statistical or machine learning) methods to harness and integrate a broad variety of predictions from dynamical, physics-based models – such as numerical weather prediction, climate, land, hydrology and Earth system models – into a final prediction product. They are recognised as a promising way of enhancing prediction skill of meteorological and hydroclimatic variables and events, including rainfall, temperature, streamflow, floods, droughts, tropical cyclones, or atmospheric rivers. Hybrid forecasting methods are now receiving growing attention due to advances in weather and climate prediction systems at sub-seasonal to decadal scales, a better appreciation of the strengths of machine learning, plus expanding access to computational resources and methods. Such systems are attractive because they may avoid the need to run a computationally-expensive offline land model, can minimize the effect of biases that exist within dynamical outputs without explicit bias correction and downscaling, benefit from the strengths of machine learning models, and can learn from large datasets, while combining different sources of predictability with varying time-horizons. Here we review recent developments in hybrid hydroclimatic forecasting and outline key challenges and opportunities. These include obtaining physically-explainable results, assimilating human influences from novel data sources, integrating new ensemble techniques to improve predictive skill, creating seamless prediction schemes that merge short to long lead times, incorporating modelled initial land surface and ocean/ice conditions, acknowledging spatial variability in landscape and atmospheric forcing, and increasing the operational uptake of hybrid prediction schemes.
Louise Slater et al.
Status: final response (author comments only)
RC1: 'Comment on hess-2022-334', Anonymous Referee #1, 20 Oct 2022
Thank you for the opportunity to review Slater et al. “Hybrid forecasting: using statistics and machine learning to integrate predictions from dynamical models”. Overall, I find this to be a timely and informative review. However, I do have a variety of comments, detailed below. I recommend at least a minor revision, if not a major revision.
My biggest concern in reading this paper is the number of different models and approaches etc. that are discussed. The paper is full of acronyms (so Table 2 is certainly helpful) such that I routinely found myself lost in the details and trying to remember the bigger picture or category that the details were supporting. If I’m someone coming to this review trying to figure out where to start with hybrid modeling, I think I would really struggle. How would I begin? Would I choose a model/paper from Table 1? How would I discriminate or know how to choose among the myriad of options? If the authors can provide some answers or guidance to these types of questions, I think it would be very helpful. It would also help if there were any way to emphasize the main points more clearly among all the details.
Terminology is really important in this paper. Can you please provide some definitions of the differences between physics-based vs. conceptual models?
One question I had was whether any hybrid schemes are currently operational. But, this is partially answered in line 93. I also wanted to see what the authors think it would take to make these models operational, which is partially addressed in the conclusion. Any further details that can be provided on this topic would be greatly appreciated (i.e., are there ANY examples of operational hybrid schemes? And if so, can they serve as pilot projects? i.e., what can we learn from their implementation that might help hybrid schemes become more widely used?).
Lines 100 and on list many hybrid models… but not all the references are in Table 1 as well. Any reason? (e.g., Miller et al., 2021)
Section 2.4 seems to have a different focus than what is indicated on line 122.
The grammar of the sentence spanning lines 122-124 isn’t quite correct. Same for the sentence spanning lines 273-274.
Line 243: seems like a concluding statement (summarizing the overall point of the paragraph) is needed here.
Line 249: the reference to Madadgar et al., 2016 – where was this study applied?
Lines 264-266: Is this sentence a description of “mode-matching”? If so, can that be made clear? If not, please provide a brief idea of what mode-matching is.
Line 409: by “national” does that mean the United States?
Line 440: what does “surface water” mean?
Lines 454-461: this paragraph, especially the last sentence, seems to imply there are no limitations to hybrid models.
Lines 491-509: are these paragraphs in the correct place? The information presented within seems to go in Section 2.1 on pre- and post-processing.
Lines 598-599: this is a really important point that I’m glad was made (i.e., the marginal improvement might be not worth the effort). It seems to me that dealing with this issue is critical to making hybrid schemes more widely accepted. Is there any way we can determine a priori the marginal improvement (without having to build both models in parallel and then compare)? For example, the Mai et al. (2022) study in line 616 – would be good to comment if the demonstrated superiority was enough to justify the extra effort.
Table 1: (a) Are any of these operational? (b) Any rationale for inclusion/exclusion of studies in this table? (c) Can you add another column that describes how the statistical and dynamical models are combined? (d) Regarding column headings, in the text, “data-driven” seems to be the most generic term (lines 25-26) but here the column header is “statistical” model (and elsewhere, “empirical” is used). Again, the importance of terminology in this paper. (e) Would this table become slightly easier to digest if it was first sorted by predictand type (i.e., streamflow vs. reservoir, etc) and then horizon? I’m not sure, but I think that predictand is a larger category (and what I would first be interested in), then horizon.
Some acronyms that are not defined anywhere: RCP8.5, FV3GFS (this is just the name of the atmospheric model?), PREVAH (also a model name?)
Table 3: (a) Shouldn’t “coupled” be included here also, since it is discussed in the text. (b) I find it interesting that Lee et al. (2002) is a primary reference for two of the options (serial and parallel) – given that it is now 20 years ago. Is that because it was such a foundational paper? Either way, can a more recent reference also be provided? As a corollary comment: It would be nice to have a discussion in the text of when these approaches were first tried (what was the foundational paper) on hydroclimate variables.
Figure 1: A few comments/questions on this graphic: (a) Please explain if the coloration of the boxes has any meaning. (b) Aren’t large-scale predictors etc. also inputs to the hybrid forecasting scheme (not just dynamical predictors) – in other words, the straightforward left-to-right is not actually quite so straightforward? (c) Bottom middle: shouldn’t it be “hydroclimate model” rather than “hydrological model” to be more general?
Figure 2: So, you obtain one value each for JJA, then take the max? Could be clarified in the caption text.
Citation: https://doi.org/10.5194/hess-2022-334-RC1
AC1: 'Reply to RC1', Louise Slater, 06 Jan 2023
RC2: 'Comment on hess-2022-334', Anonymous Referee #2, 09 Nov 2022
Summary
This paper reviews - indeed it defines - the burgeoning field of hybrid dynamical-statistical hydrometeorological forecasting. The paper is timely and I believe it to be of wide interest to readers of HESS (and very likely beyond). I generally like to balance positive and negative feedback in reviews, but it was very difficult for me to find any suggestions to improve in this paper. It is skillfully organised, placing a very wide range of studies in sensible categories and highlighting specific themes with more detailed discussions of some papers. I didn't think there were really any major gaps in the literature and ideas they presented. The paper is also brilliantly written, with concise, lucid sentences making it an easy read - I believe even for non-experts. In short, in my view this review does everything a review should do: summarises the literature comprehensively, shapes the literature into sensible themes, makes an argument - in this case the paper is essentially arguing for the recognition of hybrid forecasting as a distinct field (or at least a subfield within hydrometeorological forecasting) - and makes clear recommendations on the future direction of hybrid forecasting. I congratulate the authors on a remarkable review paper, one that I believe deserves to be widely cited.
Specific comments
L33 "We do not provide a prescriptive definition of hybrid forecasting as it exists along a continuum, which may include a wide range of modeling and ‘big data’ type Earth Observation (EO) datasets" Fair enough - a sensible choice.
L156 "ML models are also employed during the dynamical climate model simulations to correct model biases" I suspect the use of 'ML' to describe Bayesian techniques like Schepen and bias-correction methods like Meyer may be a bit unusual to many. Suggest the broader term 'statistical models' or 'data driven models' (consistent with the definition given in the introduction) to encompass all these.
L156 "The use of ML..." same issue with this paragraph - I would say that neither Bennett et al. nor McInerney et al. really qualify as ML - they are error models, which I think in general usage don't get lumped in with ML. These distinctions may well be arbitrary, but I'd suggest if the authors want to broaden the common use of ML to include a wide range of statistical models that this be defined up front somewhere (in the way the authors have done with 'data-driven').
L453 "4 Key challenges and opportunities of hybrid forecasting" I guess I would add to the topics covered in this section the effective use of probabilistic forecasts in decision making. One of the major efforts in hybrid forecasting systems has been to achieve reliable predictive distributions; but it's not yet clear that this effort will necessarily result in better decisions. It's likely that automated decision systems/optimisation will be the means to take advantage of reliability in ensemble distributions. In my view this still requires considerable research - existing methods of optimisation do not necessarily take advantage of this property. But I also understand that this may be outside the scope of what the authors wish to address - the paper is really comprehensive in the areas it does choose to address, so they may feel they cannot do this area justice (even if they agree that it is worth discussing). I will leave it to the authors to decide whether this is worth including in their paper.
L456 "ML models include the requirement for large datasets (previously discussed)" This review presents the availability of large datasets for ML as a strength of ML - which it of course is - but it presents few of the difficulties associated with using these datasets for prediction, for example some of the 'curse(s) of dimensionality' described by Altman & Krzywinski (2018). ML models are still subject to some of these issues - though I realise canvassing these is not the main aim of the paper. Whether these matters are best discussed in this paper is a subjective judgment: I am happy to defer to the authors on this point.
L465 "data-driven models were once thought to be unable to accurately predict values outside the range of the training" I'm not sure this is really true (or if it is, I haven't been exposed to it) - would be good to provide a reference in support of this statement. There is a long history of statistical extrapolation - not least in extreme value theory or design engineering - for exactly these purposes.
L487 "Explainability is sometimes useful to help develop trust in model predictions" this is a very interesting point - in my experience forecasting agencies frequently engage in this kind of story-telling, both for internal and external communications, so this is probably an important box to tick for the widespread adoption of hybrid forecasting systems. I'm not suggesting any change here, but I guess I also feel this kind of narrative building can be antithetical to the effective use of (usually carefully constructed) probability distributions that come out of hybrid forecasting systems.
L536 "For low flows skill may currently extend up to 20 days, but this is mostly due to the quality of the information on initial conditions and the memory effect of catchment storage" this statement may be true specifically for the study by Fundel et al. 2013, but it is phrased more generally. It is quite possible to get forecast skill of streamflow well beyond twenty days - even with simple ESP methods - (depending on catchment, time of year, etc.) so I think the authors should avoid a statement that posits a general limit on the prediction of streamflow of 20 days. Please reword this so that it is clear that this finding was specific to Fundel et al.
Fig 4: As you've used 'prediction' generically in the vertical axis label ('Prediction skill') - implying (correctly in my view) that all the models in this plot produce predictions - I suggest changing the label "Subseasonal to seasonal predictions" to the more specific "Subseasonal to seasonal forecasts" and the label "Climate predictions" to "Multi-year climate forecasts".
Typos etc.
L50 "While conceptual hydrological models..." suggest a paragraph break before 'While'
L71 Suggest paragraph break before 'Historically...'
L83 "to understand to which" typo - delete second 'to'
References
Altman N, Krzywinski M. 2018. The curse(s) of dimensionality. Nature Methods 15: 399-400. DOI: 10.1038/s41592-018-0019-x.
Citation: https://doi.org/10.5194/hess-2022-334-RC2
AC2: 'Reply to RC2', Louise Slater, 06 Jan 2023