the Creative Commons Attribution 4.0 License.
Leveraging a Novel Hybrid Ensemble and Optimal Interpolation Approach for Enhanced Streamflow and Flood Prediction
Abstract. In the face of escalating instances of inland and flash flooding spurred by intense rainfall and hurricanes, accurately predicting rapid streamflow variations has become imperative. Traditional data assimilation methods face challenges during extreme rainfall events due to numerous sources of error, including model deficiencies, forcing biases and observational uncertainties. This study introduces a cutting-edge hybrid ensemble and optimal interpolation data assimilation scheme tailored to precisely and efficiently estimate streamflow during such critical events. Our hybrid scheme builds upon the ensemble-based framework of El Gharamti et al., integrating the flow-dependent background streamflow covariance with a climatological error covariance derived from historical model simulations. The dynamic interplay (weight) between the static background covariance and the evolving ensemble is adaptively computed both spatially and temporally. By coupling the National Water Model (NWM) configuration of the WRF-Hydro modeling system with the Data Assimilation Research Testbed (DART), we validate the performance of our hybrid prediction system using two impactful test cases: 1. West Virginia's flash flooding event in June 2016, and 2. Florida's inland flooding during Hurricane Ian in September 2022. Our findings reveal that the hybrid scheme significantly outperforms its ensemble counterpart, delivering enhanced streamflow estimates for both low and high flow scenarios, with an improvement of up to 50 %. This heightened accuracy is attributed to the climatological background covariance, mitigating bias and augmenting ensemble variability. The adaptive nature of the hybrid algorithm ensures reliability even with a very small time-varying ensemble. Moreover, this innovative hybrid data assimilation system propels streamflow forecasts up to 18 hours in advance of flood peaks, marking a substantial advancement in flood prediction capabilities.
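The covariance blending described in the abstract can be illustrated with a small sketch. It assumes the common convex-combination form, in which a weight `alpha` multiplies the ensemble covariance and `1 - alpha` multiplies the static (climatological) covariance; the abstract does not state the exact form, so this is an assumption. The function names and all numbers are hypothetical and do not come from WRF-Hydro or DART. The abstract also describes a weight that adapts in space and time, whereas this sketch uses one fixed scalar for simplicity.

```python
# Minimal sketch of covariance hybridization (illustrative values only).
# All names are hypothetical; none come from WRF-Hydro or DART.


def sample_covariance(ensemble):
    """Sample covariance of an ensemble given as a list of state vectors."""
    n_members = len(ensemble)
    n_state = len(ensemble[0])
    mean = [sum(m[i] for m in ensemble) / n_members for i in range(n_state)]
    cov = [[0.0] * n_state for _ in range(n_state)]
    for member in ensemble:
        for i in range(n_state):
            for j in range(n_state):
                cov[i][j] += (member[i] - mean[i]) * (member[j] - mean[j])
    return [[value / (n_members - 1) for value in row] for row in cov]


def hybrid_covariance(ensemble_cov, static_cov, alpha):
    """Blend flow-dependent and static covariances (assumed convex form)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [
        [alpha * pe + (1.0 - alpha) * b for pe, b in zip(row_e, row_b)]
        for row_e, row_b in zip(ensemble_cov, static_cov)
    ]


# Three ensemble members of streamflow at two reaches (made-up numbers).
ensemble = [[10.0, 20.0], [12.0, 23.0], [14.0, 26.0]]
# Made-up matrix standing in for the climatological covariance.
static_cov = [[9.0, 3.0], [3.0, 16.0]]

ensemble_cov = sample_covariance(ensemble)
blended = hybrid_covariance(ensemble_cov, static_cov, alpha=0.5)
print(blended)
```

With `alpha=1.0` the sketch reduces to a pure ensemble covariance, and with `alpha=0.0` to a purely static one, which is the sense in which the weight controls the interplay between the two components.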
Status: final response (author comments only)
RC1: 'Comment on hess-2023-269', Anonymous Referee #1, 11 Mar 2024
Summary
In this paper, the authors implement and test a novel data assimilation framework that weights the dynamic component (i.e., time-varying sample error covariance matrix) of the ensemble Kalman Filter (EnKF) and a static component (i.e., a climatology-based covariance matrix) when computing the prior ensemble covariance matrix, with the end goal of improving streamflow simulations and flood forecasts. The framework is implemented in the WRF-Hydro modeling system with the Data Assimilation Research Testbed (DART). The authors conduct a suite of experiments to demonstrate their approach, using two case study flood events that occurred in West Virginia (June 2016) and Florida (September 2022). The results presented in the manuscript demonstrate not only the superiority of the proposed method through several verification metrics, but also its computational efficiency, since good evaluation scores can be obtained with only 20 members, as opposed to the 80-member implementation of the EnKF benchmark.
This is a scientifically solid and well-written piece of work, with a neat collection of (beautiful) graphics and well-supported conclusions. I commend the authors for the high presentation quality. I have only one main comment that will require some work (though it should not be hard to address), along with several minor comments and editorial suggestions that may help improve the quality of this manuscript.
Main comment
- I think that the authors should make an effort to better connect their results with the existing literature (which is nicely reviewed in the introduction). This can be done in a separate section named “Discussion” (after section 4 and before the conclusions section), and a good starting point would be moving all the text in L614-635 to this new section.
Specific comments
- L3: “model deficiencies”. Do you mean hydrological model (i.e., structural and parametric) deficiencies?
- L6: to the best of my knowledge, the abstract should not contain citations. Please check the guidelines provided by HESS.
- L9: “validate”. I think what you are actually doing is to evaluate the effectiveness of your framework and, therefore, I recommend replacing the word “validation” with “evaluation” throughout the manuscript.
- L12: to avoid confusion among readers, please use the terms “significant” or “significantly” only when referring to statistically significant results. In this case, it seems that “considerably” or “substantially” are better options.
- L24: In this context, does flooding happen because of streamflow (and hence surface water level) increments? Please clarify.
- L30: you might want to refer to “hydrological data assimilation” here, and cite earlier studies on this topic (e.g., Houser et al. 1998; Margulis et al. 2002; Reichle 2008).
- L33: I think that it would be more appropriate to cite earlier studies introducing and clarifying the EnKF (e.g., Evensen 1994; Burgers et al. 1998).
- L35: what do you mean by “ensemble increments” here? Are you referring to differences between observed and modeled fluxes?
- L38: in my opinion, it is odd to cite McMillan’s work (which is amazing) without referring to the retrospective EnKF (Pauwels and De Lannoy 2009, 2006), which inspired the recursive EnKF.
- L44-45: you might want to cite the work of Caleb DeChant when listing particle filter studies (e.g., DeChant and Moradkhani 2011, 2014), and Steven Margulis’ particle batch smoother (Margulis et al. 2015).
- L64: since the paper should be self-contained, it would be good to add a concise explanation of “covariance hybridization” here or somewhere else.
- L119: I think it would be appropriate to include references for the Noah-MP model (Niu et al. 2011; Yang et al. 2011).
- L120-122: the explanations of the channel, reservoir and conceptual groundwater components are unclear to me. Can you please elaborate and re-word?
- L131 and caption of Figure 1: I suggest replacing the word "forcings" with "input fluxes", since the former is typically used when referring to meteorological forcings in hydrological modeling.
- L139-140: I recommend that the authors include a short description of the input ensemble and the channel parameter ensemble in an Appendix. Also, did you calibrate the model error parameters to achieve good statistical properties (i.e., spread, observation indistinguishable from ensemble members) of the open loop ensemble (e.g., Pauwels and De Lannoy 2009; Alvarez-Garreton et al. 2014)?
- L142-143: please add a few sentences describing the parameter estimation process. Note that there are large parameter sensitivities in the Noah-MP model structure (Mendoza et al. 2015b; Cuntz et al. 2016), and their calibration may affect the outcomes of hydrological applications considerably (e.g., Mendoza et al. 2015a).
- L172: please add the equation for Dy.
- L178: If alpha may vary with time, it would be good to add the subscript k.
- Figures 2 and 3: please note that not all your readers are familiar with US geography. I suggest merging these into a single figure, adding a panel with a map of the globe that shows the CONUS, and a rectangle showing the geographic extent of the subdomains of Figures 2 and 3. Please add a north arrow and a scale bar to each panel.
- L286-307: Are you selecting a temporal window and extracting years randomly? Since you have only 42 years, I presume you can repeat them to complete 1,000 realizations, right? Perhaps a diagram could help to clarify the procedure.
- L312: please refer to your performance measures as probabilistic and deterministic verification metrics. I recommend moving all this information to the Methods section (maybe in a table; see Table 2 in Araya et al. 2023), and add it to your methodology diagram.
- Related to the previous point, I really think you should add at least one probabilistic verification metric to assess reliability (i.e., adequacy of the simulated/forecast ensemble spread to represent the uncertainty in observations) and report this metric in your DA analysis/comparisons. A good choice would be the α index from the predictive quantile-quantile (QQ) plot (Renard et al. 2010).
- Section 4: I suggest renaming this section “Results”.
- L339-344: all this information should be in the Methods section.
- Figure 4: why don’t you include the KGE and NSE of the posterior estimates? You might want to add letters (a) and (b) to each column (this comment applies to all your figures).
- L359-360: How uncertain are the meteorological forcings (in particular, hourly precipitation) used for your model? Perhaps this explains why the hydrological model does not replicate some smaller flood waves.
- Figure 5: in the panels with KGE and NSE results, I suggest replacing the y axis title "KGE, NSE" with "metric value" (or something like that) to avoid confusion among readers, since the KGE and NSE are NOT comparable metrics, even if their ranges of possible values are the same (Knoben et al. 2019). I think it would be good to warn readers about this issue somewhere in the text. If the number of values in the boxplots corresponds to the number of stations, it would be good to provide that number in the axis titles or in the figure caption.
- Figures 5 and 11: I recommend that the authors apply a statistical test to check whether the differences among the empirical probability distributions (i.e., differences between boxplots) are statistically significant. For example, they could apply a two-sample t-test to check whether the sample means are statistically different at a specific significance level.
- L374-380: this description should be in the Methods section.
- Figure 6: in my opinion, there is no need to repeat the entire legend in all panels.
- L388: “ensemble uncertainty”. Please refer to ensemble spread throughout the paper, since the true uncertainty in your systems is unknown.
- L412-413: I suggest moving this text to the methods section and explaining why it is worth analyzing this.
- L415-416: it would be good to add two panels at the top of Figure 9, with precipitation and streamflow time series to visualize this.
- L425-430: this information should be in a section dedicated to verification metrics.
- L438-439: this text should be at the beginning of subsection 4.3.
- Table 1: I don't think you need more than three decimals.
- L441-444: please move this text to the methods section.
- Figure 10: It is really difficult to differentiate among dots in the top panels. If you want to highlight that EnKF is the worst DA strategy, I recommend changing the symbol type (perhaps replace dots by crosses?). Similarly, you could modify the symbols for other configurations that stand out.
- L475, L534, L535 and everywhere else: although this is a matter of style, I recommend deleting bombastic adjectives (e.g., “exceptional”, “remarkable”) and showing your numbers instead. Let the readers judge your proposed approach.
- L498-499: this is really hard to see because the dots in Figure 14 are too small. Please consider increasing their size.
- L503-504: this is very hard to visualize. Why don't you just show a scatter plot between the weights and the distance to the landfall to make the point?
- L518-519: please move this text to the methods section.
- L538-554: please move this text to the methods section.
- L555-558: please move this description to the caption of Figure 16.
- L558-559: “the EnKF effectively improves the ensemble spread”. I think that this statement is unsupported, unless you include QQ plots or rank histograms to assess how adequate the spread is relative to the observations.
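The rank histogram mentioned in the last comment can be sketched as follows. This is a minimal illustration with made-up numbers, not the procedure used in the manuscript; the function name is hypothetical, and ties between members and observations are simply counted as "not below" rather than being randomized.

```python
def rank_histogram(ensembles, observations):
    """Tally the rank of each observation within its ensemble.

    The rank is the number of members strictly below the observation,
    so an ensemble of size n yields n + 1 possible bins.
    """
    n_members = len(ensembles[0])
    counts = [0] * (n_members + 1)
    for members, obs in zip(ensembles, observations):
        rank = sum(1 for value in members if value < obs)
        counts[rank] += 1
    return counts


# Four assimilation times, three members each (made-up streamflow values).
ensembles = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],
    [5.0, 6.0, 7.0],
    [1.0, 1.5, 2.0],
]
observations = [2.5, 1.0, 8.0, 1.2]

counts = rank_histogram(ensembles, observations)
print(counts)
```

A roughly flat histogram suggests that the spread is consistent with the observations, a U shape suggests an under-dispersive ensemble, and a dome shape suggests an over-dispersive one. A real assessment would need many more cases than this toy sample.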
Suggested edits
- L1-2: “accurately predicting” -> “the accurate prediction of”.
- L10: “test cases” -> “case studies”.
- L83: “streamflow flooding problems” -> “streamflow forecasting” or “flood forecasting” problems.
- L95: “September 15 to October 15, 2002”. Move to the end of the sentence, maybe in parentheses.
- L97 and L265: “hourly” -> “at hourly time steps”.
- L118: I think a word is missing between “2.1” and “standard”. Maybe "including"?
- L126: “exceed” -> “exceeds”.
- L363: delete “clearly”.
- L576: “…of large rivers. Large rivers…” -> “...of large rivers, which have an enduring memory...”
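Regarding the Figure 5 comment above that KGE and NSE are not comparable, a short sketch can make the point concrete. It assumes the commonly used three-component form of the KGE (correlation, variability ratio, bias ratio) and treats the correlation of a constant simulation as zero, which is a convention adopted here because the correlation is otherwise undefined. The function names and the observation values are hypothetical.

```python
import math


def nse(sim, obs):
    """Nash-Sutcliffe efficiency."""
    mean_obs = sum(obs) / len(obs)
    num = sum((s - o) ** 2 for s, o in zip(sim, obs))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def kge(sim, obs):
    """Kling-Gupta efficiency (assumed three-component form)."""
    n = len(obs)
    mean_sim = sum(sim) / n
    mean_obs = sum(obs) / n
    sd_sim = math.sqrt(sum((s - mean_sim) ** 2 for s in sim) / n)
    sd_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in obs) / n)
    if sd_sim == 0.0:
        # Correlation is undefined for a constant series; set to 0 by convention.
        r = 0.0
    else:
        cov = sum((s - mean_sim) * (o - mean_obs) for s, o in zip(sim, obs)) / n
        r = cov / (sd_sim * sd_obs)
    alpha = sd_sim / sd_obs      # variability ratio
    beta = mean_sim / mean_obs   # bias ratio
    return 1.0 - math.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)


# Made-up observed flows and the mean-flow benchmark "simulation".
obs = [1.0, 3.0, 2.0, 6.0, 3.0]
mean_flow = [sum(obs) / len(obs)] * len(obs)

nse_mean = nse(mean_flow, obs)
kge_mean = kge(mean_flow, obs)
print(nse_mean, kge_mean)
```

Under these assumptions the mean-flow benchmark scores NSE = 0 but KGE = 1 - sqrt(2), roughly -0.41, so the same numerical value means different things for the two metrics even though both have an upper bound of 1.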
References
Alvarez-Garreton, C., D. Ryu, A. W. Western, W. T. Crow, and D. E. Robertson, 2014: The impacts of assimilating satellite soil moisture into a rainfall-runoff model in a semi-arid catchment. J. Hydrol., 519, 2763–2774, doi:10.1016/j.jhydrol.2014.07.041.
Araya, D., P. A. Mendoza, E. Muñoz-Castro, and J. McPhee, 2023: Towards robust seasonal streamflow forecasts in mountainous catchments: impact of calibration metric selection in hydrological modeling. Hydrol. Earth Syst. Sci., 27, 4385–4408, doi:10.5194/hess-27-4385-2023.
Burgers, G., P. J. Van Leeuwen, and G. Evensen, 1998: Analysis scheme in the ensemble Kalman filter. Mon. Weather Rev., 126, 1719–1724, doi:10.1175/1520-0493(1998)126<1719:ASITEK>2.0.CO;2.
Cuntz, M., J. Mai, L. Samaniego, M. Clark, V. Wulfmeyer, O. Branch, S. Attinger, and S. Thober, 2016: The impact of standard and hard-coded parameters on the hydrologic fluxes in the Noah-MP land surface model. J. Geophys. Res. Atmos., 121, 10,676-10,700, doi:10.1002/2016JD025097.
DeChant, C. M., and H. Moradkhani, 2011: Improving the characterization of initial condition for ensemble streamflow prediction using data assimilation. Hydrol. Earth Syst. Sci., 15, 3399–3410, doi:10.5194/hess-15-3399-2011.
DeChant, C. M., and H. Moradkhani, 2014: Toward a reliable prediction of seasonal forecast uncertainty: Addressing model and initial condition uncertainty with ensemble data assimilation and Sequential Bayesian Combination. J. Hydrol., 519, 2967–2977, doi:10.1016/j.jhydrol.2014.05.045.
Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res., 99, doi:10.1029/94jc00572.
Houser, P. R., W. J. Shuttleworth, J. S. Famiglietti, and D. C. Goodrich, 1998: Integration of soil moisture remote sensing and hydrologic modeling using data assimilation. Water Resour. Res., 34, 3405–3420.
Knoben, W. J. M., J. E. Freer, and R. A. Woods, 2019: Technical note: Inherent benchmark or not? Comparing Nash-Sutcliffe and Kling-Gupta efficiency scores. Hydrol. Earth Syst. Sci., 23, 4323–4331, doi:10.5194/hess-23-4323-2019.
Margulis, S. A., D. McLaughlin, D. Entekhabi, and S. Dunne, 2002: Land data assimilation and estimation of soil moisture using measurements from the Southern Great Plains 1997 Field Experiment. Water Resour. Res., 38, 35-1-35–18, doi:10.1029/2001wr001114.
——, M. Girotto, G. Cortés, and M. Durand, 2015: A Particle Batch Smoother Approach to Snow Water Equivalent Estimation. J. Hydrometeorol., 16, 1752–1772, doi:10.1175/jhm-d-14-0177.1.
Mendoza, P. A., and Coauthors, 2015a: Effects of hydrologic model choice and calibration on the portrayal of climate change impacts. J. Hydrometeorol., 16, 762–780, doi:10.1175/JHM-D-14-0104.1.
——, M. P. Clark, M. Barlage, B. Rajagopalan, L. Samaniego, G. Abramowitz, and H. Gupta, 2015b: Are we unnecessarily constraining the agility of complex process-based models? Water Resour. Res., 51, doi:10.1002/2014WR015820.
Niu, G.-Y., and Coauthors, 2011: The community Noah land surface model with multiparameterization options (Noah-MP): 1. Model description and evaluation with local-scale measurements. J. Geophys. Res., 116, D12109, doi:10.1029/2010JD015139.
Pauwels, V. R. N., and G. J. M. De Lannoy, 2006: Improvement of modeled soil wetness conditions and turbulent fluxes through the assimilation of observed discharge. J. Hydrometeorol., 7, 458–477, doi:10.1175/JHM490.1.
Pauwels, V. R. N., and G. J. M. De Lannoy, 2009: Ensemble-based assimilation of discharge into rainfall-runoff models: A comparison of approaches to mapping observational information to state space. Water Resour. Res., 45, W08428, doi:10.1029/2008WR007590.
Reichle, R. H., 2008: Data assimilation methods in the Earth sciences. Adv. Water Resour., 31, 1411–1418, doi:10.1016/j.advwatres.2008.01.001.
Renard, B., D. Kavetski, G. Kuczera, M. Thyer, and S. W. Franks, 2010: Understanding predictive uncertainty in hydrologic modeling: The challenge of identifying input and structural errors. Water Resour. Res., 46, W05521, doi:10.1029/2009WR008328.
Yang, Z.-L., and Coauthors, 2011: The community Noah land surface model with multiparameterization options (Noah-MP): 2. Evaluation over global river basins. J. Geophys. Res., 116, 1–16, doi:10.1029/2010JD015140.
Citation: https://doi.org/10.5194/hess-2023-269-RC1
AC1: 'Reply on RC1', M.E. Gharamti, 03 Apr 2024
The comment was uploaded in the form of a supplement: https://hess.copernicus.org/preprints/hess-2023-269/hess-2023-269-AC1-supplement.pdf
RC2: 'Comment on hess-2023-269', Anonymous Referee #2, 13 Mar 2024
General comments
This paper presents an innovative application of a hybrid data assimilation algorithm, EnKF-OI (Optimal Interpolation), for streamflow and flood prediction. The hybrid algorithm, developed by El Gharamti et al. (2021), offers novel advancements in hydrologic prediction. It enhances the ensemble spread of EnKF through two key mechanisms: (1) incorporation of time-invariant climatological error covariance into the prior ensemble covariance matrix, and (2) integration of along-the-stream localization. The study conducts a comprehensive evaluation of the hybrid algorithm using two case studies: flash floods in West Virginia and long-term flooding in Florida. Results are analyzed across four main dimensions: (1) weighting between dynamic and static covariances, (2) dynamic ensemble size, (3) adaptive weight adjustment, and (4) short-term streamflow forecasts. The hybrid data assimilation algorithm shows promising performance in two applications.
The paper fits the scope of HESS. The innovation of this research is clear to me. The experiment design, result analysis, and presentation of this paper are of good quality. It was a pleasure to read this research. I suggest a minor revision.
Minor comments to the authors
- Line 6: Consider adding a reference to "El Gharamti et al." in the abstract.
- Line 44: Ensure consistency in abbreviating "United States" (USA or US).
- Line 85: Clarify the difference/relationship/innovation between the existing HydroDART system and the EnKF-OI algorithm employed in this study. Is the EnKF-OI algorithm newly developed or already in the HydroDART system? This clarification would highlight any methodological innovation in the paper.
- Line 118: Provide a brief explanation of "nudging."
- Lines 125-127: Consider rephrasing this sentence for clarity or conduct a grammar check.
- Line 202: Confirm if the sentence "The notation… is equivalent to the trace of matrix A" is used in the preceding equation.
- Line 206: Check the format of the reference "El Gharamti (2021)."
- Lines 397-398: Clarify if this sentence refers to the last subplot of Figure 7.
- Is it possible to shorten this paper by moving some results (e.g., 2nd case study relevant content) to the supplementary? The current manuscript is relatively long.
Citation: https://doi.org/10.5194/hess-2023-269-RC2
AC2: 'Reply on RC2', M.E. Gharamti, 03 Apr 2024
The comment was uploaded in the form of a supplement: https://hess.copernicus.org/preprints/hess-2023-269/hess-2023-269-AC2-supplement.pdf
Viewed
- HTML: 265
- PDF: 87
- XML: 16
- Total: 368
- BibTeX: 12
- EndNote: 13