the Creative Commons Attribution 4.0 License.
State updating in the Xin'anjiang Model: Joint assimilating streamflow and multi-source soil moisture data via Asynchronous Ensemble Kalman Filter with enhanced Error Models
Abstract. Assimilating either soil moisture or streamflow individually has been well demonstrated to enhance the simulation performance of hydrological models. However, the runoff routing process may introduce a lag between soil moisture and outlet discharge, presenting challenges in simultaneously assimilating the two types of observations into a hydrological model. The Asynchronous Ensemble Kalman Filter (AEnKF), an adaptation of the Ensemble Kalman Filter (EnKF), is capable of utilizing observations from both the assimilation moment and preceding periods, thus holding potential to address this challenge. Our study first merges soil moisture data collected from field soil moisture monitoring sites with China Meteorological Administration Land Data Assimilation System (CLDAS) soil moisture data. We then employ the AEnKF, equipped with improved error models, to assimilate both observed outlet discharge and the merged soil moisture data into the Xin'anjiang model. This process updates the state variables of the model, aiming to enhance real-time flood forecasting performance. The testing on both synthetic and real-world cases demonstrates that assimilation of these two types of observations simultaneously substantially reduces the accumulation of past errors in the initial conditions at the start of the forecast, thereby aiding in elevating the accuracy of flood forecasting. Moreover, the AEnKF with the enhanced error model consistently yields greater forecasting accuracy across various lead times compared to the standard EnKF.
Status: final response (author comments only)
-
CC1: 'Comment on hess-2024-211', zongping ren, 22 Jul 2024
The study builds on the Xin'anjiang hydrological model. To this end, the Asynchronous Ensemble Kalman Filter (AEnKF) with enhanced error models is used to jointly assimilate streamflow and multi-source soil moisture data. Furthermore, the paper proposes a novel method to integrate CLDAS soil moisture data with in situ observations, enhancing the accuracy of the dataset. The Wuqiangxi catchment is selected for the application. The results produced by the AEnKF assimilating different types of observations are then evaluated with several performance metrics. The work is extensive and well-structured. The subject is novel, and the study is valuable for hydrological forecasting of flood events in river basins. However, the discussion of the main and latest studies on the subject needs to be strengthened. Some suggestions and comments to the authors are presented below:
- What are main differences between AEnKF and EnKF? Supported and related studies about AEnKF should be strongly presented in Introduction to emphasize highlights of the paper.
- Line 231. The SM is mentioned but not explained. Please insert a definition the first time it is mentioned.
- Section 2.3. I am not clear on how the hydrological model simulates infiltration. What are the infiltration parameters? They don't seem to be shown in Table 1. It is not necessary to explain your hydrological model again in the manuscript, but I need to understand why infiltration parameters are not considered in the model perturbation.
- Fig. 2. I suggest replacing the Yangtze River Basin with China to make it more understandable to the reader.
- Section 5.1.2, Fig. 3. Could the authors explain why sometimes larger time window produces poorer results instead? Is it because observations that go too far back in time compromise the quality of real time?
- Line 505-520, Fig. 7. There are flood events where the performance of the AEnKF remains superior to the OL even after 24 hours. Have authors provided an explanation for why the model correction persists for such an extended period in these cases?
- Section 6. This part needs some improvements. Currently, it is presented in broad terms. Specifically, the conclusions drawn from the calculated performance metrics should be detailed explicitly.
- Are there any limitations or recommendations for the application of this study? Is the proposed methodology applicable to all regions? Can the method be applied in data-scarce regions with limited observations?
- As a crucial step in the study, the statistical characteristics of the data used (e.g., peak discharge) should be presented in detail. The statistical properties, including skewness, coefficient of variation, confidence intervals, boxplots for outlier data, distribution characteristics, minimum, maximum, and median values, etc., should be provided in a table.
- I recommend including the error estimation and evaluation metrics section from the supplement as an appendix in the main document, rather than providing it as a separate file. This will facilitate a more cohesive understanding for interested readers.
Citation: https://doi.org/10.5194/hess-2024-211-CC1
AC1: 'Reply on CC1', Junfu Gong, 26 Jul 2024
We sincerely appreciate your comments on our manuscript. We have considered all the revisions you suggested and will incorporate them in the revised manuscript. Below are our responses to the questions you raised regarding this study:
Reply to Question "What are main differences between AEnKF and EnKF?" We will explain the main differences between AEnKF and EnKF. EnKF is a synchronous assimilation method that assimilates observations at the current time into the hydrological model at the analysis step. This means that EnKF updates the state variables of the Xin’anjiang model based only on observations from the current time step. In contrast, AEnKF is a more advanced asynchronous assimilation method, allowing for the assimilation of both current and past observations during the analysis step. Specifically, in this study, AEnKF assimilates observations from the current time and the previous tw hours (tw is the assimilation time window) into the Xin'anjiang model, updating the model's state variables. This asynchronous assimilation helps to account for the complex nonlinear relationships between observations at multiple times and the hydrological model's state variables.
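As a rough illustration of the difference, the sketch below updates a scalar state ensemble with observations taken at the current time and at earlier times inside the window. It is a minimal sketch, not the authors' implementation: the function name `asynchronous_update`, the scalar state, and the one-at-a-time (serial) processing of observations with independent, equal-variance errors are all assumptions made for brevity. Passing only the current observation reduces it to a standard stochastic EnKF update.

```python
import random
from statistics import mean


def covariance(a, b):
    """Sample covariance of two equally long sequences."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def asynchronous_update(states, predicted, observed, obs_var, rng):
    """Update a scalar state ensemble with observations from several times.

    states    : current state, one value per ensemble member
    predicted : predicted[j][i] is member i's simulated value of observation j
                (j runs over the current time and the previous tw hours)
    observed  : observed[j] is the measured value of observation j
    obs_var   : observation error variance (assumed equal and uncorrelated)
    rng       : random.Random instance used to perturb the observations
    """
    states = list(states)
    predicted = [list(row) for row in predicted]
    for j, y in enumerate(observed):
        hx = list(predicted[j])
        denom = covariance(hx, hx) + obs_var
        # Perturbed observation minus simulated value, one draw per member.
        innov = [y + rng.gauss(0.0, obs_var ** 0.5) - h for h in hx]
        # The gain uses the covariance between the *current* state and the
        # simulated observation, which may belong to an earlier time.
        gain = covariance(states, hx) / denom
        states = [x + gain * d for x, d in zip(states, innov)]
        # Simulated observations not yet processed are updated the same way.
        for k in range(j + 1, len(observed)):
            g = covariance(predicted[k], hx) / denom
            predicted[k] = [p + g * d for p, d in zip(predicted[k], innov)]
    return states
```

For example, `asynchronous_update(prior, [past_sim, current_sim], [past_obs, current_obs], 1.0, random.Random(0))` uses one past and one current observation, whereas supplying only `[current_sim]` and `[current_obs]` corresponds to the synchronous case.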
Reply to Question "How the hydrological model simulates infiltration" We first explain how the Xin'anjiang model calculates runoff. The Xin'anjiang model is a conceptual hydrological model that generalizes the rainfall-runoff process. Its most prominent feature is performing runoff production calculations based on the saturation-excess runoff mechanism, meaning net rainfall is first entirely used to replenish soil water, and once the soil moisture content in the unsaturated zone reaches field capacity, all subsequent net rainfall is used to generate runoff. Therefore, the Xin'anjiang model does not involve infiltration parameters.
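The saturation-excess rule described above can be illustrated with a single-store sketch. This is a deliberate simplification under stated assumptions: the function name `saturation_excess` and its arguments are hypothetical, and the sketch treats the catchment as one uniform store rather than reproducing the full Xin'anjiang runoff generation scheme.

```python
def saturation_excess(storage, capacity, net_rain):
    """Return (new_storage, runoff) for one time step, all values in mm.

    Net rainfall first replenishes the soil store; only the part that
    exceeds the remaining deficit becomes runoff. No infiltration rate
    appears anywhere in the calculation.
    """
    deficit = max(capacity - storage, 0.0)
    replenished = min(net_rain, deficit)
    return storage + replenished, net_rain - replenished
```

With a store of 80 mm, a capacity of 100 mm, and 30 mm of net rainfall, the store fills to 100 mm and the remaining 10 mm becomes runoff.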
Reply to Question "Why sometimes larger time window produces poorer results instead?" This is primarily because a longer time window includes too much historical information, which may have a weak correlation with the current state variables. Including too much historical observational information in the assimilation system may lead to a degradation in assimilation performance. Tao et al. (2016) (https://doi.org/10.1016/j.jhydrol.2016.02.019) tested the performance of the standard AEnKF method with 1-3 hour assimilation time windows and obtained similar results. They found that the 2-hour time window generally yielded better assimilation results than the 3-hour time window, while the 1-hour time window performed the worst.
Reply to Question "Why the model correction persists for such an extended period (more than 24 hours) in some cases?" We believe that the assimilation effect of AEnKF can last for more than 24 hours, mainly because the soil moisture state variables were effectively updated in these events. The initial soil moisture state at the forecast start time reflects the basin's wetness at that moment and significantly impacts forecast accuracy for a considerably long lead time. In Rakovec et al. (2015) (https://doi.org/10.5194/hess-19-2911-2015), the average temporal persistence of the standard AEnKF assimilation effect could reach a 45-hour lead time, likely because they also updated the soil moisture state variables.
Reply to Question "Are there any limitations or recommendations for the application of this study? Can the method be applied in data-scarce regions with limited observations?" The Xin'anjiang model is based on the saturation-excess runoff generation mechanism, where net rainfall is first entirely used to replenish soil water, and once the soil moisture content in the unsaturated zone reaches field capacity, all subsequent net rainfall is used to generate runoff. This runoff generation mechanism is generally applicable to humid and semi-humid regions, making the Xin'anjiang model theoretically suitable only for these areas. Since humid regions in China are most affected by flood disasters and the Xin'anjiang model is currently the most widely used hydrological model in operational flood forecasting in China, this study uses the Xin'anjiang model as an example and tests it only in humid regions in China. However, it is important to emphasize that the AEnKF method with the enhanced error models proposed in this study can be easily coupled with any hydrological model, so its application is not limited to humid and semi-humid regions. In future research, we will focus on coupling and testing the AEnKF with enhanced error models and other hydrological models. The proposed method in this study involves assimilating observational data into the hydrological model, making it inapplicable in data-scarce regions.
Citation: https://doi.org/10.5194/hess-2024-211-AC1
-
RC1: 'Comment on hess-2024-211', Anonymous Referee #1, 28 Aug 2024
This study provides a comprehensive review of hydrological data assimilation for flood simulation (forecasting). It attempts to integrate soil moisture data from various sources and jointly assimilate them with runoff observations into a hydrological model. The uniqueness of this paper lies in its first-time application of the Asynchronous Ensemble Kalman Filter (AEnKF) for such joint assimilation, with a consideration of the temporal correlation of observation errors. The paper is well-structured, rich in content, and the results are presented clearly, which made it an engaging read for me. Overall, it is a well-conducted study. However, there are some areas that could be further improved, such as the insufficient discussion of the AEnKF method in the introduction. Below are some of my comments and suggestions:
1. The Asynchronous Ensemble Kalman Filter (AEnKF) is a simple yet effective data assimilation method, well-suited for state updating in hydrological models. However, the authors have not sufficiently discussed the existing research and applications of the AEnKF method. I recommend that the authors emphasize this discussion more prominently in the Introduction (page 3).
2. The discussion of the advantages of AEnKF should be included in the introduction rather than in the methodology section (page 5, Lines 160-163).
3. It is very interesting that the study considers the temporal correlation of observation errors and rainfall errors in data assimilation, as most studies assume these errors are independent. Could the authors provide more details on how this was specifically implemented? (page 8, Lines 220-222)
4. As far as I know, the Xin'anjiang model is based on the saturation-excess theory, making it suitable only for regions where this runoff generation mechanism dominates, such as humid and semi-humid areas. It is not applicable in regions where infiltration-excess theory is predominant, such as arid and semi-arid areas. Could the authors clarify whether the method proposed in this study is applicable to arid and semi-arid regions? (page 9, Lines 251-260)
5. How are the initial state values for the daily simulation model set? (page 15, Lines 363-364)
6. Why does a longer assimilation time window sometimes lead to poorer results? Could the authors provide an explanation for this? (page 17, Lines 427-433)
7. What does "One-step prediction" refer to? Does it mean a one-hour forecast? Please clarify. (page 18, Line 444)
8. Why was the lead time set to 8 hours? (page 31, Figure 13)
9. Any limitations of the study should be openly discussed, along with suggestions for future research. (page 32)
Citation: https://doi.org/10.5194/hess-2024-211-RC1
AC2: 'Reply on RC1', Junfu Gong, 30 Aug 2024
Thank you very much for your concise paper summary and positive feedback on our research. We are honored that our paper has captured your interest. We have carefully considered all of your comments and will make the appropriate revisions in the upcoming revised manuscript. Below are our responses to each of your specific suggestions. It is important to note that this section only addresses the discussion of ideas; the specific revisions will be provided in the forthcoming revised manuscript and the corresponding revision notes.
- Thank you very much for your suggestion. We recognize this issue and will include a discussion on existing research and applications of the AEnKF method in the Introduction section of the revised manuscript.
- Thank you very much for your suggestion. We will delete this section in the revised manuscript, as the advantages of AEnKF have already been discussed in the Introduction (Page 3).
- Thank you very much for your comment. We addressed the temporal correlation of rainfall and runoff observation errors using a simple first-order autoregressive model. By designing an appropriate first-order autoregressive function, we ensured that the error model, which accounts for temporal correlation, maintains the same mean and standard deviation as the original error model after transformation. The specific calculation method will be detailed in the forthcoming revision notes.
- Thank you for your discussion of the Xin'anjiang model. We completely agree with your view on its runoff generation mechanism. The Xin'anjiang model is indeed only suitable for humid regions where the saturation-excess runoff mechanism is dominant and is not applicable to arid and semi-arid regions. However, it is important to note that the state updating method proposed in this study is not limited to coupling with the Xin'anjiang model. In fact, this method can be easily coupled with any hydrological model that includes state variables related to soil moisture and channel storage. When coupled with hydrological models suitable for semi-arid and arid regions, it can be effectively applied in those areas.
- Thank you for your comment. In this study, the initial values for the daily simulation are set as follows: the soil moisture content is set to half of the saturation value, and the outflow of each sub-reach is set to the observed discharge at the basin outlet on the start date divided by the total number of sub-reaches. In fact, after an extended period of daily simulation, the initial values of the state variables have a negligible impact on the results, so they can be set to any reasonable value.
- Thank you for your comment. We have addressed this issue in our response to Comment CC1. This is primarily because a longer time window includes too much historical information, which may have a weak correlation with the current state variables. Including too much historical observational information in the assimilation system may lead to a degradation in assimilation performance. Tao et al. (2016) (https://doi.org/10.1016/j.jhydrol.2016.02.019) tested the performance of the standard AEnKF method with 1-3 hour assimilation time windows and obtained similar results. They found that the 2-hour time window generally yielded better assimilation results than the 3-hour time window, while the 1-hour time window performed the worst.
- We apologize for any confusion caused by this imprecise description. "One-step prediction" indeed refers to a one-hour forecast, and we will clarify this in the revised manuscript.
- Thank you for your comment. In our real-world cases, we selected an 8-hour lead time primarily due to the limitations of data length. To ensure the consistency of the forecast sequence length and the comparability of results, the forecast start time for different lead times within the same flood event was set to the same point -- specifically, the LT hour after the flood start time (the earliest available hourly data). LT represents the longest lead time in this study. If the longest lead time is set to LT = 8 hours, even for a 1-hour lead time, the forecast begins at the 8th hour after the flood start time. Given the overall short length of available hourly data, in some flood events, the peak occurs as early as the 9th or 10th hour after the forecast begins. If the lead time were set longer than 8 hours, the forecast sequence might not include the flood peak, rendering the results meaningless for flood forecasting. Therefore, in the real-data experiments, we set the maximum lead time to 8 hours. To compensate for the shorter lead time in the real-world cases, we extended the maximum lead time to 24 hours in the synthetic data experiments, which is fully adequate for flood forecasting in medium to small basins covering several thousand square kilometers.
- Thank you for your suggestion. In the revised manuscript, we will include a discussion on the limitations of the methodology used in this study, addressing its applicability, data constraints, and potential directions for future research.
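The first-order autoregressive error model mentioned in the reply on temporal correlation above can be sketched as follows. The specific calculation has not been published in this discussion, so this is only a common textbook construction under stated assumptions: the function name `ar1_errors`, the zero-mean Gaussian errors, and the scaling of the innovation by sqrt(1 - rho^2) are assumptions. That scaling is what keeps the mean and standard deviation of the correlated errors equal to those of the original uncorrelated error model.

```python
import random


def ar1_errors(n, sigma, rho, rng):
    """Generate n temporally correlated errors with mean 0 and std sigma.

    rho is the lag-1 autocorrelation (|rho| < 1). The innovation standard
    deviation is reduced to sigma * sqrt(1 - rho^2) so that the stationary
    standard deviation of the series stays equal to sigma.
    """
    innovation_sd = sigma * (1.0 - rho * rho) ** 0.5
    errors = [rng.gauss(0.0, sigma)]  # start from the stationary distribution
    for _ in range(n - 1):
        errors.append(rho * errors[-1] + rng.gauss(0.0, innovation_sd))
    return errors
```

Setting `rho = 0` recovers independent errors, so the same routine covers both the original and the temporally correlated error model.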
Citation: https://doi.org/10.5194/hess-2024-211-AC2
-
RC2: 'Comment on hess-2024-211', Anonymous Referee #2, 29 Aug 2024
The manuscript jointly assimilates streamflow and soil moisture data via the Asynchronous Ensemble Kalman Filter with enhanced error models. The modelling results are improved compared with conventional methods. The findings are very helpful for real-time flood forecasting. The following points should be further clarified in the revised version.
(1) In the Methods section, I suggest introducing the ‘hydrological model’ first. Readers could then understand the model parameters more easily in the other sections.
(2) Figure 2(b): there are 3 discharge stations, namely Hexi, Gaochetou, and Wuqiangxibashang, but it is hard to see the drainage area controlled by these 3 stations. Although the rainfall stations and soil moisture monitoring sites can be seen, this should be described in the main text.
(3) In the study region, is there any hydraulic infrastructure to affect runoff generation?
(4) Line 389, ‘the maximum lead time is set to 8 hours to avoid missing peak flows’. I cannot understand the linkage between lead time and peak flows.
(5) Discussion is an important part. I suggest it be a separate section. If the proposed method is used in distributed hydrological models (e.g., a distributed Xin’anjiang model), what will the results be?
(6) Section 5.1.3, only 6 flood events are selected for analysis, could you add some flood events in 2024? Could you please provide the simulated hydrographs by the assimilation schemes for the 6 events?
Citation: https://doi.org/10.5194/hess-2024-211-RC2
AC3: 'Reply on RC2', Junfu Gong, 07 Sep 2024
Thank you for your concise summary of the paper and for your positive feedback on our study. We have carefully considered all of your comments and will implement revisions in the upcoming manuscript. Below are our responses to each of your specific suggestions. Please note that this section focuses solely on the discussion of ideas; the specific revisions will be detailed in the forthcoming manuscript and accompanying revision notes.
- Thank you for your suggestion. We will adjust the structure of the manuscript in the revision, beginning with an introduction to the hydrological model.
- We apologize for the misunderstanding caused by the unclear image description. To clarify, of the three hydrological stations, Wuqiangxibashang provides outflow data at the basin outlet, while Hexi and Gaochetou provide inflow data to the basin. However, due to the lack of soil moisture and rainfall data within their control areas, these stations were not included in the study area.
- Thank you for your comment. The study area is a natural watershed, and the only nearby large reservoir is located downstream of the Wuqiangxibashang station. As a result, it does not significantly impact the forecast results for the study area.
- Thank you for your comment. This issue was also addressed in our response to RC1. In our real-world cases, we selected an 8-hour lead time primarily due to the limitations of data length. To ensure the consistency of the forecast sequence length and the comparability of results, the forecast start time for different lead times within the same flood event was set to the same point -- specifically, the LT hour after the flood start time (the earliest available hourly data). LT represents the longest lead time in this study. If the longest lead time is set to LT = 8 hours, even for a 1-hour lead time, the forecast begins at the 8th hour after the flood start time. Given the overall short length of available hourly data, in some flood events, the peak occurs as early as the 9th or 10th hour after the forecast begins. If the lead time were set longer than 8 hours, the forecast sequence might not include the flood peak, rendering the results meaningless for flood forecasting. Therefore, in the real-data experiments, we set the maximum lead time to 8 hours. To compensate for the shorter lead time in the real-world cases, we extended the maximum lead time to 24 hours in the synthetic data experiments, which is fully adequate for flood forecasting in medium to small basins covering several thousand square kilometers.
- Thank you for your suggestion, which has been extremely helpful for improving the paper. In the revised manuscript, we will include a separate discussion section, focusing on topics such as typical flood events and the limitations of the proposed method. This will include the challenges of applying the method to distributed hydrological models. Semi-distributed hydrological models, like the Xin'anjiang model used in this study, have smaller state variable dimensions, allowing for the direct application of the proposed state updating scheme. However, in distributed models where each computational grid (e.g., DEM-based grids) has its own state variables, the state dimension becomes large, making direct application inefficient or prone to spurious correlations from distant observations. To resolve this, we recommend applying covariance localization to AEnKF (Janjić et al., 2011, https://doi.org/10.1175/2011MWR3552.1) or other localization techniques (Khaniya et al., 2022, https://doi.org/10.1016/j.jhydrol.2022.127651). For instance, in covariance localization, a localization radius RL is set, and the forecast error covariance matrix is adjusted by taking its Schur (element-wise) product with a distance-based correlation matrix. This study focuses on jointly assimilating soil moisture and streamflow using AEnKF, and performing localization on AEnKF is beyond the scope of this research. We will explore this further in future work.
- Thank you for your suggestion. We are currently in contact with water management agencies to inquire about the availability of 2024 data. If the data become available, we will include the 2024 flood events. Additionally, we will provide simulated hydrographs for the six flood events in the appendix of the revised manuscript.
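The covariance localization step mentioned in the reply on distributed models above can be sketched as an element-wise (Schur) product between the forecast error covariance matrix and a distance-based taper matrix. This is an illustrative sketch under stated assumptions: the Gaspari-Cohn taper is one commonly used choice and is not named in the reply, the one-dimensional `positions` and the function names are hypothetical, and the taper half-width is set to half the localization radius so that correlations vanish beyond that radius.

```python
def gaspari_cohn(distance, half_width):
    """Compactly supported correlation taper; zero beyond 2 * half_width."""
    r = abs(distance) / half_width
    if r <= 1.0:
        return (-0.25 * r**5 + 0.5 * r**4 + 0.625 * r**3
                - (5.0 / 3.0) * r**2 + 1.0)
    if r < 2.0:
        return (r**5 / 12.0 - 0.5 * r**4 + 0.625 * r**3
                + (5.0 / 3.0) * r**2 - 5.0 * r + 4.0 - 2.0 / (3.0 * r))
    return 0.0


def localize(covariance, positions, radius):
    """Element-wise product of a covariance matrix with a taper matrix.

    covariance : square matrix as a list of lists
    positions  : 1-D coordinate of each state element (assumed geometry)
    radius     : localization radius; covariances between elements
                 farther apart than this are set to zero
    """
    n = len(positions)
    return [
        [covariance[i][j] * gaspari_cohn(positions[i] - positions[j], radius / 2.0)
         for j in range(n)]
        for i in range(n)
    ]
```

Variances on the diagonal are left unchanged, nearby covariances are damped, and covariances between elements separated by more than the radius are removed, which suppresses spurious long-range correlations.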
Citation: https://doi.org/10.5194/hess-2024-211-AC3
- AC4: 'Comment on hess-2024-211', Junfu Gong, 08 Oct 2024
- AC5: 'Changes to figure 2', Junfu Gong, 13 Nov 2024