In recent years, floods have become a serious issue in the Tibetan
Plateau (TP) due to climate change. Many studies have shown that ensemble
flood forecasting based on numerical weather predictions can provide an early
warning with extended lead time. However, the role of hydrological ensemble
prediction in forecasting flood volume and its components over the Yarlung
Zangbo River (YZR) basin, China, has not been investigated. This study adopts the variable infiltration capacity (VIC) model to forecast the annual maximum floods and annual first floods in the YZR based on precipitation and the maximum and minimum temperature from the European Centre for Medium-Range Weather Forecasts (ECMWF).
The Tibetan Plateau (TP), as the source of many major rivers, is known as “the world water tower” (Xu et al., 2008). Due to its special geological, topographic and meteorological conditions, the ecosystem in this area is vulnerable and susceptible to climate change (Zhao et al., 2005). According to previous studies, it is confirmed that the atmospheric and hydrological cycle in the TP has undergone significant changes. Evident climate warming (Guo and Wang, 2012; Wang et al., 2014; Yang et al., 2014), increased precipitation (Kuang and Jiao, 2016, Wang et al., 2017), glacier retreat and permafrost degradation (Cheng and Wu, 2007) can be recognized, and these impacts are expected to be exacerbated by future climate change (Su et al., 2013). As a result, frequent natural disasters, such as flooding and debris flow, take place, with an estimated direct economic loss amounting to RMB 100 million per year (Zhang et al., 2001). Thus, seeking advanced techniques to improve the accuracy of flood forecasts plays a critically important role in enhancing disaster resilience (Kalra et al., 2012; Yucel et al., 2015; Girons Lopez et al., 2017).
It is now routine practice to introduce numerical weather prediction (NWP) products into research and operational flood forecasting systems to generate ensemble streamflow forecasts (Cloke and Pappenberger, 2009). Compared with traditional single-value deterministic flood forecasts, forecasts based on the hydrological ensemble prediction system (HEPS) offer higher accuracy and longer lead time (Bartholmes et al., 2009; Cloke et al., 2013, 2017; Li et al., 2017; Pappenberger et al., 2015; Todini, 2017). Flood forecasting is one of the most important applications of the HEPS (Arheimer et al., 2011; Shi et al., 2015), but most studies focus only on peak flows (Alvarez-Garreton et al., 2015; Valeriano et al., 2010; Dittmann et al., 2009), and few have investigated the suitability of HEPS forecasts for typical accumulated flood volumes and the respective components contributing to the flood volumes, especially the snowmelt-induced component. Snow water availability and snow dynamics have been shown to be of fundamental importance in high-mountain hydrology (Bavera et al., 2012). Investigating the components constituting the total runoff facilitates the understanding of the runoff-generation mechanism and further improves flood forecasting in high mountains, where our study area is located.
Investigating the skill of the HEPS in streamflow component simulation requires effective methods to separate total runoff into different components of interest. Numerous researchers have studied the methods for achieving hydrograph separation. Some researchers are interested in separating the base-flow or groundwater component from total runoff. For example, Partington et al. (2011) developed a hydraulic mixing-cell method to determine the groundwater component, and Luo et al. (2012) utilized the digital filter program to separate base flow from streamflow. However, many hydrological models are themselves able to separate streamflow into base flow and surface runoff, such as SWAT (Soil and Water Assessment Tool) (Luo et al., 2012) and variable infiltration capacity (VIC; Liang et al., 1994); thus the separation of the snow- and/or glacier-induced component from the rainfall-induced component is gaining increasing interest. The most common and long-established practice for separating snowmelt and glacier-melt components is to conduct stable isotope analysis (isotopic hydrograph separation; IHS; Laudon et al., 2002). Sun et al. (2016) applied IHS to the Aksu River and successfully calculated the relative contribution of the glacier and snow meltwater to total runoff. Besides the experimental approaches, many studies obtain the snowmelt component via a simple ratio of rainfall and snowmelt from hydrological model simulation (Cuo et al., 2013a; Siderius et al., 2013), but these methods are often primitive and neglect the physical processes that affect the transformation from snow to runoff, such as evapotranspiration, sublimation and infiltration. Li et al. (2017) developed a new snowmelt-tracking algorithm in the VIC model to compute the ratio of the snow-derived runoff to the total runoff in consideration of systematic analyses, demonstrating promising performance in applications over the western United States.
Generally, evaluating model performance should be performed based on in situ observations. However, observed streamflow components are usually unavailable, making the evaluation of streamflow-component simulations and/or forecasts intractable. Meanwhile, with limited or unavailable observations, it is impossible to achieve rigorous calibration, and thus accounting for hydrological parameter uncertainty is necessary (Pappenberger et al., 2005). Yapo et al. (1998) showed that there is no single objective function that can represent all the features of runoff hydrographs, such as the time to the peak, peak flow and runoff volume. An increasing number of researchers have realized that multi-objective optimization can bring out better results than single-objective optimization, and currently the majority of the hydrological models are calibrated based on multi-objective optimization algorithms (Kamali et al., 2013; Troy et al., 2008; Voisin et al., 2011; Yuan et al., 2013). Multi-objective formulation will result in a set of Pareto-optimal solutions that represent trade-offs among different objectives (Wöhling et al., 2013). Thus, compromise is necessary (Gong et al., 2015). Most of the studies eventually select only one value from the Pareto front to represent the model parameter set for their simulation (Troy et al., 2008; Voisin et al., 2011; Yuan et al., 2013; Liu et al., 2017). This value is usually the compromised point that balances the diverse and sometimes conflicting requirements. However, these solutions provided by multi-objective optimization algorithms have the feature in which moving from one objective to another along the trade-off surface results in the improvement of one objective while causing deterioration in at least one other objective. Additionally, as mentioned by Kollat et al. (2012), it is difficult, in some cases, to cause the two-objective trade-off to collapse into one single point. 
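The notion of a Pareto-optimal (non-dominated) set described above can be illustrated with a minimal Python sketch. It assumes that every objective is expressed so that smaller is better (for example, 1 − NSE and absolute bias); the function names and the sample values are hypothetical and are not taken from the calibration in this study.

```python
def dominates(a, b):
    """Return True if solution a dominates b (all objectives minimized).

    a dominates b when it is no worse in every objective and strictly
    better in at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]


# Hypothetical (1 - NSE, |bias|) pairs for five candidate parameter sets.
candidates = [(0.20, 0.10), (0.30, 0.05), (0.25, 0.12), (0.40, 0.04), (0.40, 0.20)]
front = pareto_front(candidates)
print(front)
```

Moving along the returned front from the first to the last point improves the second objective while the first one deteriorates, which is the trade-off that makes collapsing the front into a single parameter set difficult.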
Due to this limitation, utilizing an ensemble of parameter sets to represent uncertainty from a hydrological model is necessary. Pappenberger et al. (2005) used six different parameter sets to identify uncertainty from the hydrological model. Teutschbein and Seibert (2012) employed 100 different optimized parameter sets in HBV to simulate streamflow in order to consider parameter uncertainty. The basic principle in ensemble forecasts is using ensemble spread to quantify forecast uncertainty and thus to provide essential information to users (Bauer et al., 2015). Analogous to this concept, the benefit of adopting an ensemble of parameter sets from the Pareto-optimal front by a multi-objective optimization algorithm for flood forecasting in consideration of hydrological parameter uncertainty remains unresolved and is worthy of being investigated.
The two purposes of this study are therefore to investigate the suitability of the HEPS in forecasting flood volume and its components (rainfall-induced and meltwater-induced streamflow) over a cold and mountainous area and the impact of an ensemble of selected Pareto-optimal solutions on model simulation and forecasting compared to a single parameter set. To this end, the paper is structured as follows: Sect. 2 describes the study area and the data used. Section 3 describes the methodology. Section 4 provides the result analysis, Sect. 5 discusses the main findings and directions for future research, and the conclusion is presented in Sect. 6.
We focus our analysis on the Yarlung Zangbo River (YZR) basin, located in
the upper reaches of the Brahmaputra River basin, which stretches across the
southern part of the TP from west to east, with a drainage area of
2.1 km
The gauged meteorological data, including daily precipitation, minimum and maximum temperature, wind speed, and relative humidity, from 1998 to 2015 are collected from 27 National Meteorological Observatory stations of the China Meteorological Administration (CMA) located in and around the YZR basin, as shown in Fig. 1. Daily streamflow from three hydrological stations is utilized in this study, i.e., from the Nugesha station, Yangcun station and Nuxia station, from the most upstream to the most downstream region. Except for data missing in 2009, the record period of observed streamflow at Nugesha and Nuxia is consistent with that of the meteorological data. The period of observed streamflow at Yangcun is shorter, spanning from 1998 to 2012. The first year is used as a warm-up period. Periods from 1999 to 2005, 2006 to 2008, and 2010 to 2012 (Yangcun) or 2015 (Nugesha and Nuxia) are adopted for calibration, validation and evaluation purposes, respectively.
Location of the study area, and distribution of hydrological and meteorological stations used in this study.
The daily quantitative precipitation forecasts (QPFs) and maximum and minimum temperature (MXT and MNT) from 2007 to 2015 are obtained from the European Centre for Medium-Range Weather Forecasts (ECMWF), with lead times from 24 h to 360 h. To be consistent with the observations, the data issued at 00:00 UTC (coordinated universal time) are downloaded. ECMWF is selected in this study because its forecasts have been found to be more skillful than those of other ensemble prediction systems (Aminyavari et al., 2018; Louvet et al., 2016; Hamill and Scheuerer, 2018).
Snow depth data, provided by Cold and Arid Regions Science Data Center at
Lanzhou, China (
The VIC (Liang et al., 1994, 1996) model is employed in this study to investigate the suitability of ensemble flood forecasting in YZR. VIC is a well-established and extensively used rainfall–runoff model, especially in areas where snowmelt and frozen soil exist (Tang and Lettenmaier, 2010; Cuo et al., 2013a; Su et al., 2016). A two-layer snow model is embodied in VIC which considers snow accumulation and ablation in a ground pack and an overlying forest canopy based on energy balance (Andreadis et al., 2009). The frozen soil algorithm makes it possible to represent the effects of seasonally frozen ground on surface water and energy fluxes (Cherkauer and Lettenmaier, 1999, 2003). These are two of the critical elements in VIC that are particularly relevant to our research.
In this study, VIC is operated at a 6-hourly time step in both the water and
energy balance model with a spatial resolution of
0.125
Model calibration is conducted by a parallel-programmed epsilon-dominance
non-dominated sorted genetic algorithm II (
As flood peaks and volumes are our focuses in this study, more weight is
given to high flows during calibration. Four objective functions are used
for model calibration at three hydrological stations: the Nash–Sutcliffe
efficiency and relative bias for all flows and for the top 10 % flows.
Detailed formulas are defined as
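As an illustration only, the standard textbook definitions of the Nash–Sutcliffe efficiency (NSE) and relative bias can be sketched in Python as follows. The helper `top_fraction` and the choice to pick the top 10 % of flows by ranking the observed series are assumptions made for this sketch, and the sample data are hypothetical.

```python
def nse(sim, obs):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, values <= 0 are no better than the observed mean."""
    mean_obs = sum(obs) / len(obs)
    squared_error = sum((s - o) ** 2 for s, o in zip(sim, obs))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - squared_error / variance


def relative_bias(sim, obs):
    """Relative volume bias: positive values mean overestimation."""
    return (sum(sim) - sum(obs)) / sum(obs)


def top_fraction(sim, obs, fraction=0.10):
    """Return the (sim, obs) pairs belonging to the highest `fraction` of observed flows.

    Assumption: high flows are identified from the observed series.
    """
    n_top = max(2, int(round(fraction * len(obs))))  # at least 2 values so NSE is defined
    idx = sorted(range(len(obs)), key=lambda i: obs[i], reverse=True)[:n_top]
    return [sim[i] for i in idx], [obs[i] for i in idx]


# Hypothetical daily flows: the simulation overestimates every value by 10 %.
obs = [float(q) for q in range(1, 21)]
sim = [1.1 * q for q in obs]

sim_hi, obs_hi = top_fraction(sim, obs)
objectives = (nse(sim, obs), relative_bias(sim, obs),
              nse(sim_hi, obs_hi), relative_bias(sim_hi, obs_hi))
print(objectives)
```

In this example the all-flow NSE remains high while the high-flow NSE is strongly negative, which shows why scoring the top flows separately gives more weight to flood peaks.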
After calibration, a series of feasible solutions are produced by
There are two key attributes for this method. The first is the efficiency of
the
In this study, the POR is performed throughout all possible subspaces, and
the parameter which is not dominated by any of the subspaces is retained.
Additionally, some other points on the Pareto front are also retained: the
extreme value for each objective function (indicated by filled circles in
Fig. 2) and the compromised value in the two-objective trade-off (indicated
by star in Fig. 2). In this way, a limited number of parameter sets are
picked out to represent different scenarios of the model state. For convenience,
the simulations driven by the
Two-dimensional Pareto plots for bias and NSE at Nugesha. The crosses indicate all the non-dominated solutions, and the circle ones are selected
In this study, we are more interested in meltwater- and rainfall-induced
components. Thus, the glacier is modeled together with snow, as glaciers only
contribute about 2 % to the area of the YZR basin (Zhang et al., 2013)
and less than 10 % to the total runoff (Chen et al., 2017). The snowmelt
tracking algorithm (STA), proposed by Li et al. (2017), is thus an
appropriate method for achieving the needed hydrograph separation. In order to
obtain the streamflow derived from meltwater,
The fraction of meltwater-induced base flow (
A similar equation to Eq. (7) can be written for rain (
Unlike Li et al. (2017), all the aforementioned variables are integrated
values over the entire basin in units of millimeters. When performing
hydrograph separation, a 1-year warm-up is used to achieve fully
explained soil moisture sources. Total runoff is separated into four
components, namely the surface runoff derived from meltwater (
In order to improve the raw forecasts from ECMWF, we propose a post-processing method by coupling parameterized quantile mapping (QM) with the Schaake shuffle (hereafter referred to as QM–SS). QM is adopted in this study, as it is a simple yet effective statistical bias-correction method in hydrological applications (Li et al., 2010; Xu et al., 2014; Salathé Jr. et al., 2014). In most cases, the empirical cumulative distribution function is used to represent the data distribution in QM. However, many studies (Viste et al., 2013; Stauffer et al., 2017; Tao et al., 2014) have demonstrated that it is more appropriate to use fitted parametric distributions, as no frequent interpolation or extrapolation is required (Li et al., 2010). For QPFs, due to the strongly positively skewed distribution of rainfall (Stauffer et al., 2017), QM based on a single-gamma distribution is utilized for bias correction in this study, although some studies found that a double-gamma distribution (Yang et al., 2010) or a gamma-GEV combination (generalized extreme value distribution; Smith et al., 2014) can be more effective. There are two reasons for our choice here. Firstly, we compared the single-gamma distribution with double-gamma and gamma-GEV distributions and obtained similar performance scores according to the mean squared error. Secondly, the bias correction in this study is performed for each grid, each lead time and each variable. Given the heavy computational load, the single-gamma distribution is selected here for efficiency. For MXT and MNT, a four-parameter beta distribution is utilized as suggested by Li et al. (2010). Owing to the limited record of ECMWF forecasts, the data excluding the forecast year are used as training data to determine the parameters of QM.
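In generic terms, quantile mapping passes a forecast value through the fitted forecast distribution and then through the inverse of the fitted observed distribution. The symbols below are assumed for illustration and are not notation from this study: $x$ is the raw forecast, $\tilde{x}$ the corrected value, $F_{\mathrm{f}}$ the distribution fitted to the training forecasts and $F_{\mathrm{o}}$ the distribution fitted to the training observations (gamma for QPFs, four-parameter beta for MXT and MNT).

```latex
\tilde{x} = F_{\mathrm{o}}^{-1}\left( F_{\mathrm{f}}(x) \right)
```

Because both distributions are parametric, a forecast outside the range of the training data still maps to a defined quantile, which is the practical advantage over the empirical variant noted above.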
Since forecasts are post-processed for individual lead times, grids and variables, the forecast ensembles tend to lose their appropriate space–time correlation. To generate ensemble members with appropriate space–time correlations, the Schaake shuffle (Clark et al., 2004) is applied to link historical data to ensemble members and to create sequences with realistic spatio-temporal patterns; 38 years of historical data from 1978 onward are used to apply the Schaake shuffle procedure. Details for conducting the Schaake shuffle can be found in Clark et al. (2004) and Schepen et al. (2018).
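The core reordering step of the Schaake shuffle can be sketched as below for one grid cell and one lead time. This is a simplified illustration under stated assumptions: the number of historical values equals the number of ensemble members, the same historical dates are used for every grid cell and lead time, and the function name and sample values are hypothetical.

```python
def schaake_shuffle(ensemble, historical):
    """Reorder `ensemble` so that its rank order matches that of `historical`.

    ensemble   : bias-corrected member values for one grid cell and lead time
    historical : observations on the selected historical dates (same length)
    """
    if len(ensemble) != len(historical):
        raise ValueError("ensemble and historical must have the same length")
    sorted_members = sorted(ensemble)
    # Indices of the historical dates ordered from smallest to largest value.
    order = sorted(range(len(historical)), key=lambda i: historical[i])
    shuffled = [0.0] * len(ensemble)
    for rank, date_index in enumerate(order):
        # The date with the rank-th smallest observation receives
        # the rank-th smallest ensemble value.
        shuffled[date_index] = sorted_members[rank]
    return shuffled


# Two hypothetical grid cells sharing the same three historical dates.
cell_a = schaake_shuffle([5.0, 1.0, 3.0], [10.0, 30.0, 20.0])
cell_b = schaake_shuffle([0.2, 0.9, 0.4], [2.0, 8.0, 3.0])
print(cell_a, cell_b)
```

After shuffling, member 2 is the wettest in both cells because the second historical date was the wettest in both, so the spatial rank structure of the observations is carried over to the ensemble while the member values themselves are unchanged.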
The annual maximum flood is picked out of typical flood events. Meanwhile,
the first flood event in each year is also selected. The maximum flood is
determined by the maximum daily streamflow in a year. For the first flood, the
definition seems to be slightly subjective. Nevertheless, the first flood is
just introduced as an example to verify the skill of the VIC–ECMWF system in
forecasting the meltwater components. There are three criteria for us to
define the first flood: (1) the peak flow should be more than twice the
average daily streamflow during the dry period (November to March), (2) the
duration of the flood event should be longer than 7 d, and (3) the observed
snowpack should be present. Forecasts are issued for each chosen event. Considering
that the maximum flood events in YZR usually last for several months, it is impossible to cover flood
volume over the entire flood event by
medium-range weather forecasts. Four typical flood volumes are therefore
chosen to represent the volume performance, i.e., the peak flow (
The continuous ranked probability skill score (CRPSS; Hersbach, 2000) is
adopted to indicate the overall performance of the forecasts as a
comprehensive evaluation metric, which is calculated via normalizing the
continuous ranked probability score (CRPS) by a reference forecast. The
reference forecast in this study is an ensemble of hydrological forecasts
simulated by the VIC model using sampled historical meteorological
observations on the same calendar day as input to the model (Bennett et al.,
2014). For deterministic forecasts, the CRPS reduces to mean absolute
error (MAE) and can be directly compared. The CRPS and MAE are negatively
oriented and tend to increase with forecast bias or poor reliability
(Shrestha et al., 2015). The value of the CRPSS ranges from
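A minimal sketch of these scores is given below. It uses a standard estimator of the CRPS for the empirical distribution of a finite ensemble; the function names and sample values are hypothetical, and in practice the CRPS would be averaged over all forecast events before the skill score is formed.

```python
def crps_ensemble(members, obs):
    """CRPS of one ensemble forecast against a single observation.

    Computed as E|X - y| - 0.5 * E|X - X'|, with X and X' drawn
    from the ensemble members.
    """
    m = len(members)
    mean_abs_error = sum(abs(x - obs) for x in members) / m
    mean_abs_spread = sum(abs(xi - xj) for xi in members for xj in members) / (m * m)
    return mean_abs_error - 0.5 * mean_abs_spread


def crpss(crps_forecast, crps_reference):
    """Skill score: 1 is perfect, 0 equals the reference, negative is worse than the reference."""
    return 1.0 - crps_forecast / crps_reference


# A single-member (deterministic) forecast: the CRPS equals the absolute error.
deterministic = crps_ensemble([3.0], 5.0)

# A two-member ensemble bracketing the observation, scored against a
# hypothetical reference CRPS of 2.0.
ensemble_score = crps_ensemble([1.0, 3.0], 2.0)
skill = crpss(ensemble_score, 2.0)
print(deterministic, ensemble_score, skill)
```

The single-member case reproduces the statement that the CRPS reduces to the absolute error for deterministic forecasts, which is why the CRPS and MAE can be compared directly.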
Two specialized indicators for flood events are utilized according to works
by Smith et al. (2004), i.e., the percent absolute flood volume error
In this study, the performance of
Figure 2 shows an example of two-dimensional Pareto plots for the bias and NSE at
the Nugesha station. The performance of the selected
Information of
The observed and simulated hydrographs during the evaluation period at Nuxia
are presented in Fig. 3. An obvious underestimation can be observed in low-flow periods, which is similar to previous studies by Tong et al. (2014) and
Zhang et al. (2013). The absence of the glacier module in VIC is believed to have
limited influence on this underestimation, and similarly underestimated low
flow was found when glacier modeling was embedded in VIC (Zhang et al.,
2013). For our study, the underestimation is, meanwhile, caused by
the fact that the objective functions used for calibration have the tendency
to pay more attention to high flows, as the flood is the focus of our
investigation. As revealed in Fig. 3, the flood peaks are captured well by
Daily time series of simulated and observed streamflow at Nuxia station. The upper bar is the areal precipitation.
The indicators for typical flood volumes simulated by VIC for the first floods
and maximum floods during the whole study period are listed in Table 2. Two
statistical indicators are adopted here, i.e., the CRPS for
Typical simulated flood volumes versus observed ones. The crosses
in the figures are results by
CRPS and MAE for
VIC-simulated snow cover is compared with snow depth derived from passive
microwave remote-sensing data. Figure 5 shows the spatial distribution of
observed and simulated daily average snow depths during evaluation. For
simplicity, only the results at Nuxia are displayed. An acceptable agreement
(correlation coefficient of 0.63) can be found over the entire domain,
especially for the middle reaches. Some overestimation exists in the
upstream and downstream regions. Explanation for these errors in snow depth
will be further described in Sect. 5. We also compare the fraction of
meltwater-induced components to total runoff with previous studies (Liu,
1999; Cuo et al., 2014), as shown in Table 3. It is noticeable that the
results by
Spatial distribution of daily average snow depths derived from
remote sensing
Fractions of meltwater-induced streamflow to total runoff during the evaluation period for three stations.
Streamflow forecasts are driven by QM–SS post-processed QPF and temperature
data. A preliminary analysis of raw and post-processed ECMWF forecasts
reveals that QM–SS is effective in reducing errors, and the post-processed
forecasts are skillful enough for streamflow forecasting (see Fig. S1 in
the Supplement). Figure 6 displays the CRPSS values of different flood volumes at three hydrological stations. Lead times of day 3, 5, 7, 10, 12 and 14 are chosen as representatives to trace the forecast quality.
Generally, flood volumes tend to be captured better with the increase in
duration. One reason is that there are often larger errors in the simulated
flood peak, making the single-day flood volume more prone to bias. Another
reason is that when the duration increases, positive and negative errors in streamflow over this
relatively long period can offset each other. Performance of the
VIC–ECMWF system deteriorates with increasing lead time, as expected. The
lead time of skillful forecasts for the first floods is shorter than for the maximum
floods. This can be explained by the generation mechanism of the first floods.
The first floods are usually dominated by base flow and meltwater. Compared with
the maximum floods, the first floods normally occur in the same period each year, so historical meteorological observations on the same calendar day can
provide skillful input. This fact results in a reference forecast which is
hard to beat. As for the maximum floods, streamflow can be predicted at least 10 d ahead. Similarly to Table 2, forecasts driven by the
CRPSS for different typical accumulated flood volumes against lead
time. The upper panels are results for first floods, and the lower panels are
for maximum floods. Scores derived from
Another statistical indicator computed from forecasted flood volumes driven
by
As demonstrated in Fig. 8, the
The errors in peak time prediction are displayed in Fig. 9. Figure 9a, c and e
are subplots for the first floods, and the results for the maximum floods are shown
in Fig. 9b, d and f. Similar to
This subsection presents results of
CRPSS of four different streamflow components against lead time at Nugesha. Meltwater-induced components for first floods
From Fig. 10, it is noticeable that for the first floods at Nugesha, errors in
forecasting surface runoff components are the main source contributing to
errors in forecasting total runoff. Forecast skill for base-flow components
seems to be insensitive to lead time (Fig. 10a and b). On one hand, these
components are mainly generated by available water storage in the catchment.
On the other hand, the base-flow process often evolves slowly, possibly
making the forecast lead time unable to cover the base-flow variability.
As for the maximum floods, the errors derived from surface runoff forecasts are
similarly the main contributor to errors in total runoff forecasts, but the
base flow exhibits a similar tendency with surface runoff and total runoff,
deteriorating with lead times, as shown in Fig. 10c and d. This means that during
the period of maximum floods the infiltration is substantial in VIC and
makes the moisture in the bottom soil layer vary with the rainfall and meltwater
inputs. The information in Fig. 10c and d is in good agreement with results
displayed in Fig. 6. A fluctuating CRPSS in
Similar performance can be found at Yangcun, as shown in Fig. 11. Base flow
components for the first floods are consistently reproduced well by the system,
with a CRPSS greater than 0.8 for all the lead times. The variation in total
runoff is fairly consistent with surface runoff. However, a higher CRPSS in
both
CRPSS of four different streamflow components against lead time at Yangcun. Meltwater-induced components for first floods
The most noticeable phenomenon at Nuxia is that base-flow components for
the first floods at this station exhibit an obvious deterioration with lead
times (Fig. 12a and b). Nuxia is located in the most downstream reaches and
concentrates water from hundreds of tributaries. Some tributaries are fairly
small, with rapid response of base flow and surface runoff, and some
tributaries may have intensive interactions between the entire soil layer,
causing the base flow in the outlet to vary with lead time. The CRPSS of all the
flood components has similar changes to scores of total runoffs in Fig. 6.
Generally, the
CRPSS of four different streamflow components against lead time at Nuxia. Meltwater-induced components for first floods
In this study,
Spatial distribution of VIC snow depth and glacier in YZR basin.
As there is no glacier module in the current VIC model, similarly to previous
studies (Li et al., 2014; Liu et al., 2014; Sun et al., 2013), the
glacier-related process was considered together with the snow in this study.
In other words, the precipitation input into VIC is separated into only two
components, the liquid (rainfall) and solid parts (snow), and the portion of
precipitation which is supposed to turn into a glacier or ice is treated as snow
instead. That is why the snow depth simulated by VIC is somewhat higher than
that of the remote-sensing data shown in Fig. 5, while the meltwater proportion
is close to the records (Table 3). Additionally, comparing with the
distribution of used meteorological stations shown in Fig. 1, we can infer
that these positive biases were also induced by the interpolation using data
from stations at which there are more snow and glaciers present. To verify our
conclusion, we plot the VIC-simulated snow depth together with the
distribution of glaciers in the YZR basin. The glacier data are downloaded
from the “Second Glacier Inventory Dataset of China” (
For a streamflow component forecast, the biggest challenge is the absence of
data series of in situ streamflow components. Therefore, in this study the
simulation driven by observed forcing serves as a proxy for the observations,
and thus the error stemming from the hydrological model is excluded. This is a
common practice when observations are absent (Arnal et al., 2018; Harrigan et
al., 2018). Without calibration of specific streamflow components,
a conclusion based simply on the simulation of a single parameter set may be risky,
and an ensemble from multiple parameter sets is believed to be more reliable because
it accounts for hydrological uncertainty. From our results, different
parameter sets behave similarly in streamflow component forecast, i.e., deteriorating with increasing lead time. However, when it comes to a specific skill score, slight differences can be viewed from Figs. 10–12. Sometimes,
The meltwater-induced components in streamflow are found to be difficult for
the system to forecast, and those in surface runoff are the toughest
part. This is reasonable, since the surface runoff is the most susceptible
variable to various hydrometeorological factors. Specifically,
In this study, a hydrological ensemble prediction system composed of VIC and
ECMWF medium-range precipitation and temperature forecasts was developed and
applied to the YZR basin to investigate the forecasting performance of flood
volumes and streamflow components. Two different simulation modes were
adopted. One is Flood forecast skill deteriorates with lead time. The forecast skill of flood volume increases with duration. At the Nugesha and Yangcun stations, base-flow components tend to be
insensitive to an increase in lead time due to the slowly evolved base-flow
process. At the Nuxia station, base flow exhibits similar patterns to total
runoff. The meltwater-induced component in surface runoff is the most difficult part for the proposed system to forecast, compared with reference forecasts, which can only be captured in 4–7 d. Well-forecasted rainfall-induced streamflow is the main contributor to successful flood forecasting.
Data sets are available upon request by contacting the corresponding author.
The supplement related to this article is available online at:
SP provided the methodology used to bias correct the raw ECMWF forecasts. ZB helped to develop the model code. YPX guided and supervised the study. LL performed the simulation and prepared the paper, with contributions from all the co-authors.
The authors declare that they have no conflict of interest.
The National Climate Center of the China Meteorological Administration and the Hydrology and Water Resource Bureau of Tibet are greatly acknowledged for providing meteorological and hydrological data used in the study area. QPFs and temperature forecasts were obtained from ECMWF's TIGGE data portal. Thanks are also given to ECMWF for the development of this portal software and for the archives of this immense dataset. We would like to acknowledge the editors and reviewers for their reviews and very constructive feedback.
This research has been supported by the National Natural Science Foundation of China (grant no. 91547106) and the National Key Research and Development Plan “Inter-governmental Cooperation in International Scientific and Technological Innovation” (grant no. 2016YFE0122100).
This paper was edited by Erwin Zehe and reviewed by Renata Romanowicz and two anonymous referees.