Uncertainty analysis of hydrological ensemble forecasts in a distributed model utilising short-range rainfall prediction

Abstract. Advances in mesoscale numerical weather predication make it possible to provide rainfall forecasts along with many other data fields at increasingly higher spatial resolutions. It is currently possible to incorporate high-resolution NWPs directly into flood forecasting systems in order to obtain an extended lead time. It is recognised, however, that direct application of rainfall outputs from the NWP model can contribute considerable uncertainty to the final river flow forecasts as the uncertainties inherent in the NWP are propagated into hydrological domains and can also be magnified by the scaling process. As the ensemble weather forecast has become operationally available, it is of particular interest to the hydrologist to investigate both the potential and implication of ensemble rainfall inputs to the hydrological modelling systems in terms of uncertainty propagation. In this paper, we employ a distributed hydrological model to analyse the performance of the ensemble flow forecasts based on the ensemble rainfall inputs from a short-range high-resolution mesoscale weather model. The results show that: (1) The hydrological model driven by QPF can produce forecasts comparable with those from a raingauge-driven one; (2) The ensemble hydrological forecast is able to disseminate abundant information with regard to the nature of the weather system and the confidence of the forecast itself; and (3) the uncertainties as well as systematic biases are sometimes significant and, as such, extra effort needs to be made to improve the quality of such a system.


Introduction
With advances in numerical weather prediction (NWP) in recent years as well as an increase in computing power, it is now possible to generate very high resolution rainfall forecasts at the catchment scale and therefore, more flood forecasting systems are tending to utilise quantitative precipitation prediction (QPF) from high-resolution NWPs Introduction EGU area where the model performance is highly dependent on the rapid availability of knowledge of rainfall distribution in advance (Ferraris et al., 2003). Many efforts have been made to utilise the QPF in the context of real-time flood forecasting in which one or more "state-of-the-art" QPFs stemming from different methods are to be integrated into the whole system (e.g. Bartholmes and Todini, 2005;De Roo et al., 2003;Kobold 5 and Sušelj , 2005;Verbunt et al., 2006;Xuan et al., 2005;Xuan and Cluckie, 2006). However, the effects of QPF uncertainty on the whole system can be easily appreciated either intuitively or by case studies (Xuan et al., 2005). Indeed, recent research on integrating QPF directly into the real-time flood forecasting domain reveals that direct coupling of the QPF with the hydrological model can result in large bias and uncertainty 10 which can result in not only severe underestimation (Bartholmes and Todini, 2005;Kobold and Sušelj , 2005) but over predicting as well, especially for mountainous areas (Verbunt et al., 2006) . It is logical to divide such a coupling system into two components, the weather modelling system (NWP) and hydrological modelling system, through which the uncertain- 15 ties from weather domain can be propagated into final system output space.

The weather model component
The weather models, which are routinely run in national weather centres, have such a coarse spatial resolution that hydrological models have difficulty applying the result from them as an effective input. The so-called downscaling procedure is needed 20 to bridge the scale gap between the large-scale weather forecast domains and catchment-sized flood forecasting domains. In this study, a dynamical-downscaling approach is applied to resolve the dynamics over 2 km grids. The forecasts/analyses from global weather models are used to settle the initial and lateral boundary conditions (IC/LBC) of a mesoscale model that is able to benefit from the resolvable terrain fea- 25 tures and physics at higher resolutions. This sort of mesoscale model is often referred to as a local area model (LAM). 3213 the nested mesoscale models. Given the large number of degrees of freedom of the atmospheric phase state, it is impractical to directly generate solutions of the probabilistic equations of initial states (Kalnay, 2002). The ensemble forecast method, runs the model separately over a probabilistically generated ensemble of initial states, each of which represents a plausible and equally likely state of the atmosphere, and projects 10 them into future phase space. As such the future state-space can be represented by the statistics of the ensemble.

The hydrological model component
The hydrological model, like weather models, is subject to the same factors regarding the uncertainty sources in weather modelling systems. For a stand-alone hydrologi-15 cal model, although uncertainties in data inputs, e.g., measurements or observations, have been regarded as one of the main sources of uncertainty, the model structure and parameterisations have drawn more attention from hydrologists (Wagener et al., 2001). Simply speaking, the uncertainty related to hydrological modelling is categorised as: that from the inability to obtain the state of the system and that due to inefficient mod-20 elling of reality.
As for the coupling system discussed here, it is important to recognise that the rainfall input uncertainty may always outweigh the impact of the model structure, owing to the fact that the inputs are not directly measured; rather, they are the direct outputs and already include a certain amount of uncertainty which can be magnified by a particular 25 coupling process.  Smith and Austin, 2000). The objective of this paper, however, is to address uncertainties of the ensemble hydrological forecasting driven by the high resolution ensemble rainfall forecast from the NWP, and to under-5 stand the implication of the spatio-and temporal-variability of rainfall forecast applied in the flood forecasting environment. To achieve this, we employ a simple distributed hydrological model to investigate the distribution effect of rainfall forecasts and a high resolution rainfall ensemble prediction system to provide rainfall forecasts at the catchment scale. A densely-gauged catchment with sufficient data records was chosen to 10 run the simulations, and this will be discussed in the following sections.

The Brue catchment
The Brue catchment, located in South West of England, UK, is an ideal experimental site for research on weather radar, quantitative precipitation forecasting and rainfall-run 15 off modelling, as it has been facilitated with a dense rain gauge network as well as the coverage by three weather radars. Numerous studies (Bell and Moore, 2000;Moore et al., 2000;Pedder et al., 2000;Cluckie et al., 2000) have been conducted regarding the catchment, notably during the period of the Hydrological Radar EXperiment (HYREX) which was a UK Natural Environment Research Council (NERC) Special

20
Topic Program. Figure 1 shows the locations of the Brue catchment and the gauging stations. The River Brue rises in clay uplands in the east of the catchment. The major land use is pasture land on clay soil and there are some patches of woodland in the higher eastern catchment, which form part of the unique landscape of the Somerset Levels and Moors. Introduction

Tables Figures
Back Close

Full Screen / Esc
Printer-friendly Version Interactive Discussion

EGU
The catchment has a drainage area of 135 km 2 with the average annual rainfall of 867 mm and an average mean river flow of 1.92 m 3 /s, for the period from 1961 to 1990. Besides weather radar, there is a dense raingauge network which comprises 49 Cassella 0.2 mm tipping-bucket raingauges, having recording time resolution of 10 s (Moore et al., 2000). The network provides at least one raingauge in each of the 2 km 5 grid squares that lie entirely within the catchment. Datasets from HYREX are available through the British Atmospheric Data Centre (BADC) for the period from 1993 to 2000. With the abundant data in this region, the rainfall-runoff model used in this study was able to reasonably model the hydrological behaviour of the catchment.

10
A simplified grid-based distributed rainfall-runoff model (GBDM), which is based on the kinematics wave approach, is chosen in this study. A brief introduction of the model is given in Appendix A. This model has been successfully applied in different catchments of Taiwan (e.g. Yu and Jeng, 1997;Yu et al., 2001). In this study, this model was calibrated and verified using 17 historical storm events over the Brue catchment. The 15 shuffled complex evolution (SCE) method (Duan et al., 1992(Duan et al., , 1993(Duan et al., , 1994, which is a general-purpose global optimisation strategy designed to solve the various response surface problems in calibrating a non-linear simulation model, was utilised to calibrate the model parameters. 2.3 The short-range ensemble rainfall prediction system 20 A short-range ensemble QPF system was used to produce ensemble rainfall forecasts that can be ingested into the hydrological model to generate the hydrological river flow forecasts. The ensemble QPF system was initially implemented in the FLOODRELIEF project and has been used in several case studies (Xuan et al., 2005 (Persson, 2003), the model physics selector and a post-processing system. Using the ensemble approach, the system is capable of representing uncertainties due to perturbations in initial and boundary conditions and the efficiency of the model structure.
In this study, the system was configured to produce ensembles over a short time 5 range, i.e., 24 h. The model structure was not perturbed in terms of changing physics parameterisations, as previous research revealed that uncertainties of rainfall forecasts due to parameterisations are not always significant compared with those from inaccuracy of initial and/or boundary conditions (Xuan et al., 2005). The ensembles comprise only the direct downscaling of perturbed background fields from ECMWF, i.e., the fifty members of the ECMWF ensemble prediction system, and one operational forecast member which represents the best estimate of the atmospheric state.
In order to reach the spatial scale comparable to the rainfall-runoff modelling system, four nests have been used with the inner-most one having resolution of about 2 km and covering a region around 100 km by 100 km (Fig. 2). The domains were deliberately set 15 so that the target catchment was well centred inside all the nests in order to largely reduce the effects of spatial distortion owing to different map projections used in weather models and rainfall-runoff models.

The simulations
The simulation of this study contains two parts, i.e., the conventional calibration and 20 verification of GBDM in respect of the raingauge data; and the simulation with rainfall ensemble inputs to the calibrated hydrological model. We hereby give a brief description of the conventional calibration process and draw more attention to the simulation with ensemble rainfall inputs. The distributed hydrological model used in this study, GBDM, has been calibrated with 13 historical storm events and verified using 4 (see Table 1) that occurred over the Brue catchment from year 1993 to 2000 covering different seasons. The parameters of GBDM can be categorised as two groups, i.e., those physically-5 based parameters which can be generated directly from topographic, soil, and vegetation maps, and those that need to be calibrated from historical rainfall and flow data. Three calibration parameters (C S , C C and C h , see Appendix A) have been calibrated by applying both an optimisation technique and an objective function. In this study, the SCE method was adopted for model calibration. The SCE method is a general-purpose 10 global optimisation strategy designed to solve the various response surface problems encountered in calibrating a non-linear simulation model. Details can be found in Duan et al. (1992Duan et al. ( , 1993Duan et al. ( , 1994 .

The processing of rainfall inputs
Like most hydrological models, the GBDM uses the raingauge data for its calibration 15 and verification. Although the model is grid based and has a grid size of 500 m in this case, the rainfall value for each grid is actually obtained using the Thiessen polygon method. When the rainfall forecasts are used to drive the hydrological model, the rainfall values from the weather model were first subject to a projection transformation (from Lambert Conformal projection to National Grid Reference of the UK), and then 20 linear interpolation is adopted to transfer the rainfall from 2 km weather model grids to the hydrological grids which have a 500 m grid size. It has been long recognised that there are always considerable uncertainties regarding the NWP forecast of rainfall locations and timings. In order to account for the fact that the rainfall distribution was not correctly positioned, a "best match" approach was 25 introduced to find out the location of the forecast that best resembled the rainfall pattern in the catchment. A similar method (Ebert and McBride, 1993)  EGU the weather forecast of precipitation and other spatially distributed variables where it is found that the model performance was not sufficiently evaluated by the grid-by-grid criteria, e.g., critical success index (CSI), false alarm ratio (FAR), etc.. This approach involves three steps, which are: (1) Extract grid mask of the catchment which represents the shape of the catchment and the relative position of each grid within catchment. (2) 5 Find out the distribution of rainfall accumulation over catchment grid for a given period of time. A 24 h rainfall accumulation from rain gauges was adopted as a reference distribution.
(3) Within the weather model domain, move the catchment grid mask in both x (East-West) and y (South-North) direction step by step. For each step, calculate the correlation coefficient of the rainfall forecast extracted by the mask and the reference 10 distribution obtained in the previous step. Therefore, a series of correlation coefficients value are obtained in respect of different offsets of x and y. After this searching procedure, a best matched area has been identified and the data values within this area are available for future processing. Another common problem relates to the application of the QPF to a hydrological 15 model, where the final forecast can be severely underestimated if applied with a hydrological model calibrated using point rainfall from the raingauges. As the grid value for the weather model output represents a box-average of the grid where the sub-grid dynamics always play an important role in contributing great amount of spatial variability even within a small grid (2 km in this case), there is a high chance of a gauge-calibrated 20 hydrological model producing a simulation much lower in magnitude than if the rainfall forecast was applied directly. Again, we use a 24 h rainfall accusation to adjust the rainfall forecast bias, which is performed through the following procedure: (1) Obtain the total catchment rainfall over the specified 24 h period, which will be used as a reference value.
(2) Get the ratio of the amount obtained in step (1) to the total catchment rainfall 25 value for weather model in the same period.
(3) Adjust hourly rainfall value from the weather model by the factor obtained in step (2). It should be noted this simple bias correction is a sort of "posterior" method; it is used here for evaluation only and is not suitable for forecasting. The original verification results from these events are shown in Fig. 3. It can be seen that the model performed very well for event 16 but under-predicted the main flow peak of event 15. One reason for this is that the GBDM has difficulty identifying the transition state of the catchment in the interval in-between two consecutive heavy rainfall inputs.
In order to evaluate the uncertainty of the rainfall forecasts regarding the location, 10 the results of the best match in terms of maximum correlation coefficient are plotted in Fig. 4. Under the current configuration, the catchment mask can move within the MM5 innermost domain with a maximum distance in x and y direction of about 80 km. The origins in Fig. 4 are the initial settings, i.e., the original position of catchment. The highest value R max and the lowest one R min from all the "best-matched" ensemble 15 members are also depicted in the figure for both events. Figures 5 and 6 give the 24 h ensemble rainfall forecasts over the catchment in forms of box plots (Wilks, 1995). Note that in terms of areal rainfall, the median values of the ensembles still resemble each other after undergoing the location correction as mentioned above.

20
The ensemble rainfall forecasts are then ingested into a time window of 24 h for both events. For event 15, the ensemble rainfall forecast was initialised on 18 December 1999 12:00 UTC and the hourly rainfall forecasts were produced up to 19 December 1999 12:00 UTC, corresponding to the time window from the 34th step to the 57th step. The original rainfall values within this window have then been substituted by the 25 ensemble rainfall forecasts. The same procedure was also applied to event 16 but with a time start of 1 February 2000 12:00 UTC and the corresponding window from step 14 to step 37. The ensemble forecasts are displayed in Fig. 7 and Fig. 8  EGU shaded areas represent the spread of all ensemble members between the quantiles q 0 .1 and q 0 .9. Also shown are the control forecasts, based on the "best estimate" rainfall forecast results. Note that the rainfall values here are referred to the average of all ensemble members. As shown in Figs. 5 and 6, the difference of hourly catchment average rainfall between rainfall forecasts and the gauge network can be easily appreciated and again it represents the weather model's limitation in predicting spatial variability. The temporal distribution of catchment average rainfall, on the other hand, resembles that from raingauges for event 15 and therefore the hydrographs generated by the ensemble mean 15 are similar to the original, although they underestimate the latter.

The dispersion of location and its implications
The introduction of the best match rainfall field pattern in Sect. 3.2 provides a new prospective on the uncertainty of the QPF regarding location. Although this method is defined by the maximum correlation between the rainfall forecast and the reference 20 one, it also reveals the spatial relationship of all ensemble members. We can safely argue that members that agree with each other must have locations close to each other in the "best match" map as in Fig. 4; therefore the dispersion of those locations to some extent can be seen as a indicator of the agreements among ensemble members EGU and as such, imply the spread of the trajectories of members in the model's phase space. However, it is worth noting that due to the high nonlinearity of the precipitation process, the correlation coefficient in a linear sense can reveal only small fraction of the behaviour of ensemble forecasts. Interestingly, the rainfall ensemble provides a contrasting dispersion result for event 15 and event 16. There are several cluster-like 5 areas for event 16 where several members are supposed to produce similar results, while a quite dispersive set was obtained for event 15 where no clear pattern could be found.

Location correction and bias adjustment
Two distinct results are produced after applying location correction and bias adjustment 10 to both events. For event 16, the corrected version has clearly improved the quality of flow forecasts -the flow ensemble has successfully covered the observation, as shown in A2. As to event 15, the correction actually has not made things better as can be seen from Fig. 6 where the median forecast has under-predicted the river flow even with the average catchment rainfall being corrected to the level of the raingauge. 15 One important effect regarding the implication of the rainfall pattern can be found if the flow ensembles are compared with the results of the catchment average rainfall ensembles which are shown in Fig. 5 and Fig. 6. The temporal evolution of rainfall forecast in terms of the spatial average follows closer to the gauge value for event 15 than event 16. This is of particular interest because it reveals that the ensemble 20 members probably have not captured or approached the rainfall pattern that really happened even if the catchment average rainfall looks similar to the real one. The maximum correlation coefficient for this event, as seen in Fig. 4, is much lower than for event 16, which agrees with this speculation.

Summary and conclusions
A distributed hydrological model has been employed, together with a short-range rainfall ensemble, to produce hydrological ensemble forecasts which are then supposed to represent uncertainties both inherited in the system and cascaded through the model chain. The study draws attention to the meteorological model and hydrological model 5 boundary through which the transport of uncertainties can be investigated with the help of (1) the mesoscale ensemble short-range rainfall forecasts which can expose the uncertain nature of weather systems; and (2) the distributed hydrological modelling system which is capable of reflecting effects of spatial/temporal variability of rainfall inputs. The study catchment benefits from a dense raingauge network and provides a 10 reasonable reference for testing the performance as well as the uncertainty effects. It can be concluded from the study that: (1) ensemble hydrological forecasting driven by ensemble rainfall forecasts, can produce comparable results in respect of gaugedriven forecasts, together with the ability to reveal the uncertain nature of the modelling system. The ensemble rainfall inputs, however, need to be well adjusted for bias 15 in rainfall amount and properly shifted to match the rainfall pattern; (2) considerable uncertainty can be propagated from weather domains where a more chaotic weather system is concerned. In this case, even the short-range rainfall ensemble can end up with a substantially wide-spread that may completely fail the hydrological ensemble requirements as the uncertainties become unacceptable after the coupling process;

20
(3) the proposed "best matching" procedure is supposed to be a practical indicator for not only the quality of the ensemble rainfall forecast, but reflecting the uncertain nature of the ensemble itself; and (4) the inability of the weather models to resolve the sub-grid scale precipitation features still needs to be addressed. The bias, specifically the common underestimates of rainfall at fine scale, can result in unrealistic low river 25 flow forecasts; this has to be properly dealt with separately, not simply attributed to the system's uncertainties. Despite these difficulties the development of coupled models which allow the propagation of uncertainty are an important development in flow Infiltration is assumed to dominate the abstraction losses during storm and it is estimated by Horton equation.
Where f p (t) denotes the infiltration capacity, f 0 represents the initial infiltration capacity, f C is the final infiltration capacity, k denotes the decay constant, and t r is the time.

10
We set the actual cumulative infiltration equal to the integrated Horton's equation, to adjust the deficiency of the Eq. (A1), which always decreases with time even if the rainfall stops. Each storm event has its own antecedent condition and optimal initial infiltration capacity, so a calibrated parameter C h is used here to obtain the optimal initial infiltration parameters (Yu and Jeng, 1997).

A2 Flow governing equation
Non-linear conceptual approaches were used for calculating overland flow and channel flow routing. The continuity and storage equations are written as: EGU I, Q and S are input, output and storage at a grid. K and m are parameters. To consider the simplest case of overland flow in a grid with length L, width w and water depth y, the volume of water stored in the grid is: The discharge, Q, given by Manning's equation is Hence the storage coefficient K in Eq. (A8) is a function of Manning's roughness coefficient N and the slope. As the actual value of N and S b may involve some uncertainty, a lumped parameter, C, is introduced adjust the storage coefficient, and is calibrated 15 from the historical data.
The spatial variability of the storage coefficient can be obtained by using the Manning's coefficient N and the slope at each grid element as shown in Eq.  2006. 3213 Wagener, T., Boyle, D. P., Lees, M. J., Wheater, H. S., Gupta, H. V., and Sorooshian, S.: A framework for development and application of hydrological models, Hydrol. Earth Syst. Sci., 5, 13-26, 2001