Hydrological appraisal of operational weather radar rainfall estimates in the context of different modelling structures

Radar rainfall estimates have become increasingly available to hydrological modellers over recent years, especially for flood forecasting and warning over poorly gauged catchments. However, the impact of using radar rainfall, as compared with conventional raingauge inputs, with respect to various hydrological model structures remains unclear and is yet to be addressed. In the study presented in this paper, we analysed flow simulations of the upper Medway catchment of southeast England using the UK NIMROD radar rainfall estimates and three hydrological models based upon three very different structures (i.e. a physically based distributed model, MIKE SHE; a lumped conceptual model, PDM; and an event-based unit hydrograph model, PRTF). We focused on the sensitivity of the simulations to storm type and rainfall intensity. The uncertainty in radar rainfall estimates, scale effects and extreme rainfall were examined in order to quantify the performance of the radar. We found that radar rainfall estimates were lower than raingauge measurements at high rainfall rates; the resolution of the radar rainfall data had an insignificant impact at this catchment scale for evenly distributed rainfall events, but its impact was obvious for high-intensity, localised rainfall events with great spatial heterogeneity. As to hydrological model performance, the distributed model performed consistently and reliably well on peak simulation with all the rainfall types tested in this study.


Introduction
The capability of providing instantaneous rainfall estimation at high spatial and temporal resolution renders radar rainfall an important alternative to raingauge data for river flow forecasting. It is even more so for real-time flood forecasting over ungauged or data-sparse areas. The applications of radar rainfall in hydrological modelling have been highlighted in many studies (e.g. Collier and Knowles, 1986; Cluckie and Owens, 1987; Bell and Moore, 1998a, b; Carpenter et al., 2001; Borga, 2002; Tachikawa et al., 2003; Hossain et al., 2004; Reichel et al., 2009). However, the potential of rainfall estimation using weather radar has often been limited by a variety of error sources, for instance those due to hardware calibration, attenuation, ground clutter, anomalous propagation, the vertical reflectivity profile, the Z-R relationship and sampling effects. Corrections for these issues have been investigated and discussed in many studies, including Harrold et al. (1974), Browning (1978), Wilson and Brandes (1979), Fabry et al. (1992, 1994), Kitchen (1997), Krajewski and Smith (2002) and Rico-Ramirez et al. (2007). Moreover, the results of flow simulation with radar rainfall are further complicated by the hydrological models employed, which, depending on their structures, may produce drastically different outcomes. This is also intertwined with the type of storm and its distribution over the catchment of concern.
Many studies have been carried out to identify and help develop hydrological modelling systems that can better utilise radar rainfall estimates in order to improve streamflow simulations. For example, one of the major goals of the Distributed Model Intercomparison Project (DMIP, Smith et al., 2004) was to understand how to utilise the NEXRAD (Next-Generation Radar, Smith et al., 1996) rainfall data to improve the river forecasts of the National Weather Service (NWS) of the US using its existing hydrological models applied in a lumped and semi-distributed fashion. Some key findings of DMIP are reported by Ajami et al. (2004), Bandaragoda et al. (2004), Carpenter and Georgakakos (2004) and Liang et al. (2004). They suggest that simulation accuracy is related more to the model formulation, parameterisation and the skill of the modeller than to how the spatial structure is described (lumped or distributed). The runoff and evapotranspiration driven by the NEXRAD precipitation data generally showed more spatial heterogeneity than those forced by raingauge precipitation data (Guo et al., 2004; Reed et al., 2004). Additionally, Cole and Moore (2008, 2009) used three types of gridded rainfall estimates based on raingauge and radar measurements with two hydrological models: the lumped conceptual model PDM and a grid-to-grid, conceptual distributed model. They found little difference between the performance of the PDM and that of the grid-to-grid model, whereas frequent and spatially varying gauge adjustment was the key to improving the accuracy of radar rainfall estimates. Liguori and Rico-Ramirez (2013) implemented the PDM model for the assessment of probabilistic flow predictions, and Rico-Ramirez et al. (2012) used it for testing different radar rainfall algorithms.
However, there is an important question yet to be explicitly addressed: given existing radar rainfall estimates that have already undergone sophisticated post-processing by meteorological services, what are the implications of choosing hydrological models with different structures when radar rainfall is used as an alternative to raingauge input? The question can be extended one step further to consider the role of storm types in the context of catchment characteristics (i.e. a localised convective storm or a more uniform stratiform one). In response to this, we chose and studied a catchment in southeast England that is well equipped with a dense raingauge network and radar coverage. In contrast to previous studies that focused either on model structures or on rainfall sources alone, we analysed the impact of model structure on flow simulations with the operational UK NIMROD radar data sets (Golding, 1998; Harrison et al., 2000), taking into account the variation of storm types, and then tried to address the following questions:
1. How do the NIMROD rainfall products perform at different rainfall intensities, compared with raingauge measurements, in terms of rainfall rate and rainfall detection reliability?
2. How do different rainfall estimators perform in hydrological models with respect to their mathematical structures?
3. How do different types of rainfall events impact on hydrological models with different levels of spatial complexity?
4. What recommendations can be made for applying current radar rainfall products to hydrological simulation and flood forecasting at catchment scale?
In order to answer these questions, we built and tested three hydrological models representing different structures to carry out flow simulations with three types of rainfall estimators derived from raingauges and radar at different spatial and temporal resolutions. This paper is organised as follows: Sect. 2 describes the catchment of the case study and the available hydrological data from raingauges and radar. Section 3 covers the model description, calibration and validation. Section 4 details the comparison of rainfall between the raingauges and the weather radar. The hydrological model assessment of the different rainfall estimators is presented in Sect. 5 and, finally, discussion and some concluding comments are given in Sects. 6 and 7.

Study catchment and available data
The upper Medway catchment is located to the south of London and covers an area of around 220 km². The average annual rainfall and potential evapotranspiration are 729 and 663 mm, respectively (MacDonald, 2003). The elevation of the catchment terrain varies between 30 m and 220 m above mean sea level (see Fig. 1). The landscape of the catchment is dominated by permanent grassland, while the geology is a mixture of permeable (chalk) and impermeable (clay) strata, and the dominant aquifers consist of the Ashdown Formation and the Tunbridge Wells Formation of the Hastings Group.
The catchment is equipped with nine real-time, tipping-bucket raingauges (TBRs) operated by the Environment Agency (EA). Figure 1 shows the locations of the raingauges (circles) and the flow gauges (triangles) in the catchment. All the flow comparisons in this study were carried out at the Chafford flow gauge, close to the catchment outlet.
The precipitation data used in this study originates from two sources: (1) the rainfall data from TBR measurements and (2) rainfall data from the NIMROD product which is produced from the weather radar network of the UK operated by the Met Office.The radar rainfall data has already been subject to a quality-control process and was calibrated using raingauges within the radar coverage area (Zhu and Cluckie, 2011).
The radar rainfall data used in this study was from an operational product, namely the UK NIMROD system. The NIMROD system collects and processes radar rainfall estimates from a network of 15 C-band rainfall radars, using four or five radar scans at different elevations at each site in order to give the best possible estimate of rainfall at the ground.
The radar rainfall composite is then adjusted and evaluated against raingauge measurements using a mean-bias adjustment factor, and undergoes extensive processing to account for various sources of radar error. Operationally speaking, the NIMROD radar rainfall data is one of the best available sources of rainfall information, although it certainly is not free from errors. In order to address the impact of radar data at different resolutions, we made use of two radar data sets: one available every 15 min with a spatial resolution of 5 km and the other every 5 min with a spatial resolution of 1 km. Both data sets are converted from the same observed polar radar rainfall data and are given on a Cartesian grid based upon the UK National Grid Reference projection.
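As an illustration of what a mean-bias adjustment involves, the following minimal sketch computes a single multiplicative factor from collocated raingauge and radar accumulations and applies it to the radar values. It is a generic textbook form, not the operational NIMROD procedure; the function names `mean_field_bias` and `apply_bias` and the sample numbers are hypothetical.

```python
def mean_field_bias(gauge_mm, radar_mm):
    """Ratio of total gauge rainfall to total radar rainfall.

    Assumption: only pairs where the radar reports rain (> 0) are used,
    and a factor of 1.0 (no adjustment) is returned when no such pair exists.
    """
    pairs = [(g, r) for g, r in zip(gauge_mm, radar_mm) if r > 0.0]
    if not pairs:
        return 1.0
    total_gauge = sum(g for g, _ in pairs)
    total_radar = sum(r for _, r in pairs)
    return total_gauge / total_radar


def apply_bias(radar_mm, factor):
    """Scale every radar value by the same adjustment factor."""
    return [factor * r for r in radar_mm]


# Hypothetical collocated accumulations (mm) at three gauge sites.
gauge = [2.0, 4.0, 6.0]
radar = [1.0, 2.0, 3.0]
factor = mean_field_bias(gauge, radar)
adjusted = apply_bias(radar, factor)
```

A single factor of this kind corrects the overall volume but cannot correct errors that vary with rainfall intensity.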

Hydrological modelling methodology and verification
To this end, we chose and built three hydrological models with different mathematical structures: the physically based, fully distributed MIKE SHE model; the lumped conceptual probability distributed model (PDM); and the event-based unit hydrograph model, the physically realisable transfer function (PRTF). The purpose of this choice was not to compare a specific set of models but rather to consider the impact of rainfall estimation processes on a set of mathematical model structures with marked differences, which span from complex/sophisticated to simple/empirical and reflect a decreasing ability to represent explicitly the spatially distributed nature of the rainfall-runoff process.
The PRTF model is a black box, data-driven system using mathematical and statistical concepts (transfer function technique) to link the rainfall (model input) to the runoff (model output), which is also known as a stochastic hydrology model.
In contrast, the PDM and MIKE SHE models are process-based hydrological models, which contain representations of surface runoff, subsurface flow, evapotranspiration and channel flow, and are known as deterministic hydrology models. The difference is that the PDM is a lumped conceptual model that considers the whole catchment as a unit, whereas MIKE SHE is a distributed model that takes the spatial variation of the inputs and outputs into account by discretising the entire catchment into a large number of small grids or elements.
It is worth noting that all three models have been widely used across the world and are representative of a set of mathematical structures. More details of the model structures can be found in Zhu and Cluckie (2011) and Zhu et al. (2013).

The MIKE SHE/MIKE 11 modelling system
The MIKE SHE/MIKE 11 modelling system is a result of further development based on the SHE concept (Abbott et al., 1986a, b).
The two-dimensional Saint-Venant equation is employed to describe water movement on the surface in MIKE SHE and is solved by a finite difference method. Water movement through the soil profile, along with evapotranspiration, is modelled by a simplified two-layer evapotranspiration/unsaturated model, which suits catchments that have a shallow groundwater table. It can be used in unsaturated zones to calculate the actual evapotranspiration and the amount of water that recharges the saturated zone. The dynamics of groundwater are accounted for by a linear reservoir in this study. Finally, all the water generated by the MIKE SHE model is routed to the river channel and propagated to the catchment outlet by the one-dimensional hydrodynamic MIKE 11 model.

The Probability Distributed Moisture (PDM) model
The PDM model is a fairly general lumped rainfall-runoff model, but internally it uses a probability distribution function to describe the spatial distribution of soil moisture deficit across the catchment. The saturation excess runoff mechanism is employed to generate surface flow at any point in the catchment, and the integrated flow is propagated to the catchment outlet by fast response pathways. The net rainfall not only fills up the soil stores and produces overland flow, but also infiltrates and forms the groundwater recharge, which is subsequently routed to the catchment outlet by slow response pathways. The total streamflow at the catchment outlet is therefore the sum of the flows yielded by the fast and slow response pathways (Moore, 1985, 1986, 1999; Moore and Bell, 2002).

The Physically Realisable Transfer Function (PRTF) model
The PRTF model is an improved form of the rainfall-runoff transfer function (TF) model (Yang and Han, 2006; Young, 2006; Pollard and Han, 2012), whose process is equivalent to the combination of parameterisation and calibration for physically based hydrological models. Mathematically speaking, the PRTF model represents the simplest structure chosen to transfer precipitation information to streamflow, by replicating the non-linear and time-variant nature of the rainfall-runoff process and matching the model response as closely as possible to the catchment response in terms of three real-time adjustment factors (shape, volume and timing). This is similar to the mathematical procedures adopted in the field of control engineering in terms of minimal realisation of model form, and provides a powerful alternative to conventional linear systems theory as applied within hydrology.

Set-up and verification of three hydrological models
The three hydrological models were all calibrated and validated using the TBR data only; the radar rainfall data was fed to the models afterwards to evaluate the impact of model structure on the response to radar rainfall input. The hydrological data were divided into two sets: the first (1 September 2003-28 February 2004) was used for model parameterisation and the second (1 September 2006-28 February 2007) for model validation. Each set covers a 6-month period, with the first two months used for warming up and the remaining four months for evaluating model outputs. A trial-and-error method was employed to calibrate the MIKE SHE model, focusing on the limited number of sensitive parameters that affect the peak flow and base flow in the model, whilst the PDM model was calibrated in simulation mode using a mix of manual and automatic parameter adjustment, driven by a simplex direct search procedure (Nelder and Mead, 1965). An automatic calibration function was also employed to identify the PRTF model parameters for the upper Medway catchment. Both the MIKE SHE and PDM models were set to start from a completely dry condition, which is why the two-month warm-up period was needed. The result of model calibration was assessed by four indices, namely the mean relative error (MAE), the root mean square error (RMSE), the correlation coefficient (CC) and the Nash-Sutcliffe coefficient (NS), where n is the data length, o_i is the observed discharge, m_i is the simulated discharge and ō is the mean value of the observed discharges.
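The four indices can be illustrated with the short sketch below, written in terms of an observed series (o_i) and a simulated series (m_i). The defining equations are not reproduced in the text above, so the commonly used forms are assumed here; in particular, MAE is computed as the mean absolute difference, which is an assumption based on the abbreviation. All function names are illustrative.

```python
import math


def mae(obs, sim):
    """Mean absolute difference between observed and simulated values (assumed form)."""
    return sum(abs(o - m) for o, m in zip(obs, sim)) / len(obs)


def rmse(obs, sim):
    """Root mean square error."""
    return math.sqrt(sum((o - m) ** 2 for o, m in zip(obs, sim)) / len(obs))


def correlation(obs, sim):
    """Pearson correlation coefficient between observed and simulated values."""
    n = len(obs)
    o_bar = sum(obs) / n
    m_bar = sum(sim) / n
    cov = sum((o - o_bar) * (m - m_bar) for o, m in zip(obs, sim))
    var_o = sum((o - o_bar) ** 2 for o in obs)
    var_m = sum((m - m_bar) ** 2 for m in sim)
    return cov / math.sqrt(var_o * var_m)


def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe coefficient: 1 minus squared error over observed variance."""
    o_bar = sum(obs) / len(obs)
    squared_error = sum((o - m) ** 2 for o, m in zip(obs, sim))
    observed_spread = sum((o - o_bar) ** 2 for o in obs)
    return 1.0 - squared_error / observed_spread


# Hypothetical discharges (m3/s): the simulation is offset by a constant 1.0.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [2.0, 3.0, 4.0, 5.0]
scores = (mae(observed, simulated), rmse(observed, simulated),
          correlation(observed, simulated), nash_sutcliffe(observed, simulated))
```

In this example the constant offset leaves CC at its maximum while NS drops well below 1, which shows why NS is the more demanding of the two for calibration.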
Table 1 shows the corresponding statistics of model performance for calibration and validation, which indicate a relatively good calibration for the three hydrological models. It is worth noting that reducing the errors indicated by NS was the priority in model calibration; the other three indicators (MAE, RMSE and CC) were used to check and confirm the improvement in model performance.
Additionally, Figs. 2 and 3 show fairly good performance in model calibration and validation. Details of the model calibration process and the model parameters can be found in Zhu and Cluckie (2011) and Zhu et al. (2013). In order to minimise the interference from model structure when evaluating the impact of different rainfall sources, all model structures and parameters were intentionally kept unchanged after calibration and validation. This reflects our main objective, which was to use the three principal model structures available in hydrology to evaluate their sensitivity to the different radar sources of rainfall data.
Analysis of weather radar rainfall data

Comparison of radar and raingauge measurement
Although we trust that the NIMROD radar rainfall data is one of the best data sets operationally available, it is still desirable to ensure that its quality is adequate for feeding the hydrological models. Limited by data availability, a period from July 2006 to December 2007 (18 months in total) was selected for radar rainfall analysis. The areal rainfall over the catchment was taken as the measure for evaluating the radar rainfall estimates against that calculated from the raingauges. The areal rainfall from raingauge measurements was computed using the conventional Thiessen polygon method, while the radar areal rainfall was computed over the area of overlap between the catchment and the radar grid cells at each spatial resolution (i.e. 1 and 5 km).
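Both areal estimates described above reduce to an area-weighted mean, as the following minimal sketch shows. For raingauges the weights would be the Thiessen polygon areas inside the catchment; for radar they would be the areas of overlap between each grid cell and the catchment. The function name `areal_rainfall` and the sample values are hypothetical, and computing the polygon or overlap areas themselves is outside the scope of the sketch.

```python
def areal_rainfall(values_mm, areas_km2):
    """Area-weighted mean rainfall over a catchment.

    values_mm: rainfall at each gauge (or in each radar cell).
    areas_km2: Thiessen polygon area (or cell-catchment overlap area)
               associated with each value; cells outside the catchment
               simply carry an area of zero.
    """
    total_area = sum(areas_km2)
    if total_area <= 0.0:
        raise ValueError("total area must be positive")
    weighted_sum = sum(v * a for v, a in zip(values_mm, areas_km2))
    return weighted_sum / total_area


# Hypothetical example: two gauges whose polygons cover 1 and 3 km2.
catchment_mean = areal_rainfall([2.0, 4.0], [1.0, 3.0])
```

Using the same weighting function for both sources keeps the gauge and radar areal totals directly comparable.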
Figure 4 shows that the cumulative catchment rainfall from the 5 km/15 min radar data agreed better with the raingauge measurements than that from the 1 km/5 min radar data, in terms of the overall amount of precipitation. Figure 5 also suggests that the 1 h cumulative radar rainfall estimates at 5 km/15 min had a slightly better overall performance than the 1 km/5 min data, according to the MAE and RMSE. Additionally, it clearly shows that the radar considerably underestimated rainfall during high rainfall rate events.
Figure 6 provides further comparisons for different ranges of rainfall intensity, based on the same data set as in Fig. 5. It indicates that the agreement between radar rainfall and raingauge measurements varies with rainfall intensity. There is a considerable number of radar overestimates when the 1 h cumulative catchment raingauge rainfall is less than 1 mm, with some large radar rainfall values recorded while the raingauge measurement is fairly small. For hourly cumulative rainfall between 1 and 3 mm, the radar tends to underestimate marginally and the distribution of radar rainfall estimates versus raingauge measurements is rather dispersed. However, the tendency of the radar to underestimate becomes decisive when the rainfall intensity is above 3 mm h⁻¹, and in particular above 5 mm h⁻¹, which implies that the higher the rainfall intensity, the greater the degree of radar underestimation.

Radar rainfall detection reliability analysis
The skill of the radar rainfall estimates was further evaluated with another set of indicators, namely the critical success index (CSI) (Donaldson et al., 1975), the probability of detection (POD) (Panofsky and Brier, 1965) and the false alarm rate (FAR) (Schaefer, 1990). The three indicators can be readily understood with reference to the contingency table (Table 2), where X is the number of hits by both raingauge and radar, Y is the number of hits that occurred only in radar, and Z is the number of raingauge hits missed by radar.
With the help of Table 2, the three indices can be defined in a straightforward fashion: $\mathrm{CSI} = X/(X + Y + Z)$, which is used here to measure how well the rainfall events are hit by radar according to the raingauge observations; $\mathrm{POD} = X/(X + Z)$, which shows the proportion of the observed rainfall events that has been matched by radar; and finally $\mathrm{FAR} = Y/(X + Y)$, which gives the fraction of the rainfall events detected by radar that were not observed by the raingauges.
All three skill scores range from 0 to 1. The perfect score for CSI and POD is 1, while the perfect score for FAR is 0. For simplicity, the raingauge rainfall was used as ground truth, as our focus was on the impact of radar rainfall utilisation with regard to various hydrological model structures. Moreover, a threshold was introduced in this analysis to identify the reliability of radar rainfall detection at various rainfall rates. For instance, if the threshold is set to P mm h⁻¹, X is incremented only when both the raingauge measurement and the radar estimate exceed the threshold, Y is incremented when only the radar rainfall exceeds the threshold, and Z is incremented when only the raingauge measurement exceeds the threshold. This process was iterated through the whole rainfall series until the skill scores had been obtained for all thresholds.
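The thresholded counting described above can be sketched as follows. The code takes paired raingauge and radar rainfall rates, counts X, Y and Z for a given threshold and derives CSI, POD and FAR from them. The function names and sample series are hypothetical, and returning NaN when a denominator is zero is an assumption of this sketch rather than something stated in the text.

```python
def contingency_counts(gauge, radar, threshold):
    """Count hits (X), radar-only detections (Y) and radar misses (Z)."""
    x = y = z = 0
    for g, r in zip(gauge, radar):
        gauge_hit = g > threshold   # raingauge exceeds the threshold
        radar_hit = r > threshold   # radar exceeds the threshold
        if gauge_hit and radar_hit:
            x += 1
        elif radar_hit:
            y += 1
        elif gauge_hit:
            z += 1
    return x, y, z


def skill_scores(x, y, z):
    """Return (CSI, POD, FAR); NaN where the denominator is zero."""
    nan = float("nan")
    csi = x / (x + y + z) if (x + y + z) else nan
    pod = x / (x + z) if (x + z) else nan
    far = y / (x + y) if (x + y) else nan
    return csi, pod, far


# Hypothetical hourly rainfall rates (mm/h) with a 0.2 mm/h threshold.
gauge_series = [0.0, 1.0, 2.0, 0.1, 3.0]
radar_series = [0.5, 1.2, 0.0, 0.0, 2.5]
counts = contingency_counts(gauge_series, radar_series, 0.2)
csi, pod, far = skill_scores(*counts)
```

Repeating the two calls over a list of thresholds gives the score-versus-intensity curves of the kind used in the detection reliability analysis.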
Figure 7 shows the skill of the radar rainfall estimates measured by the three indices with regard to different rainfall intensities (with thresholds starting at 0.2 mm h⁻¹). The POD, which is quite sensitive to the number of correct hits, tends to decrease as the rainfall rate changes from 0 to 8 mm h⁻¹ for both resolutions of radar rainfall data, which echoes the finding indicated by the scatter plots shown in Fig. 6. Another interesting finding is that the POD rises again when the rainfall rate goes above 8 mm h⁻¹, so the radar performs better in detecting high-intensity rainfall than moderate rainfall. The CSI, which measures the overall reliability of detection, shows a similar tendency to the POD, except for rainfall rates ranging from 0 to 0.2 mm h⁻¹, where the CSI increases. Since false alarms also affect the CSI, it is reasonable to infer that for low-intensity rainfall events (i.e. 0-0.2 mm h⁻¹) the radar rainfall is consistent with the raingauges, with a lower chance of issuing false alarms. This is also evident in the plot of FAR in Fig. 7, which follows the expected trend.
The differences in these scores between the two resolutions of radar data vary with the index of concern and, more interestingly, with the rainfall rate. For CSI, the 5 km/15 min data considerably outperformed the 1 km/5 min data when the rainfall rate was under 1 mm h⁻¹, but the latter became dominant when the rainfall rate was over 7 mm h⁻¹. Apart from that, the two data sets had very similar performance on CSI. For POD, the coarser-resolution data generally outperformed the other, especially when the rainfall rate was in the range of 4-7 mm h⁻¹. Furthermore, as with CSI, the finer-resolution data set performed better when the rainfall rate was over 7 mm h⁻¹.
Regarding the FAR, it is interesting to note that the finer-resolution data set performed significantly better when the rainfall rate was in the range of 3-8 mm h⁻¹. However, the FAR of the coarser-resolution data dropped quickly when the rainfall rate was above 8.6 mm h⁻¹, becoming much better than that of the other data set. This was due to the edge effect of the algorithm (Harrison et al., 2009) employed to convert the polar cells into Cartesian cells: the larger the Cartesian grid size, the greater the edge effect, especially when the rainfall rate is highly heterogeneous across the polar cells. Therefore, the coarser-resolution radar data was less likely to trigger a false alarm at high rainfall rates when the raingauge data did not exceed the threshold.
The aim of employing these forecast indicators (CSI, POD and FAR) in this study is to evaluate the reliability of radar detection at various rainfall intensities (the thresholds in this case). The results are closely related to, and consistent with, the analysis in Sect. 4.1, especially once the threshold analysis is introduced. When the rainfall rate remains in the low-to-medium range (less than 7 mm h⁻¹), the radar rainfall estimates at 5 km resolution generally achieved marginally higher CSI and POD scores than those at 1 km resolution. In contrast, at high rainfall rates the 1 km resolution data set was considerably better on CSI and POD, but significantly worse on FAR. In terms of precipitation detection success rate, radar performs better when the rainfall rate is either relatively low (0.2-2.2 mm h⁻¹) or extremely high (8-10 mm h⁻¹). For high rainfall rate events, the radar data at finer resolution tends to achieve better detection skill scores.
As to the impact of the resolution of the NIMROD data, Figs. 8 and 9 show that the simulated streamflow in all three models differed only slightly in overall performance between the 1 km/5 min and 5 km/15 min radar rainfall inputs. However, the simulation with 1 km/5 min data is considerably better when the peak flows are over 20 m³ s⁻¹ during the first evaluation period (see Fig. 8), in all three hydrological models. This suggests that the advantage of applying higher-resolution radar rainfall data in hydrological models tends to be enhanced when a high rainfall rate has occurred, or when the triggered flows are over 20 m³ s⁻¹ in this study. Comparing the simulations driven by raingauge and radar rainfall, we found that they were generally comparable for the low-flow parts, but the radar-driven simulations consistently underestimated the high flows in both evaluation periods A and B.
The first several low peaks in evaluation period A and the recession flow of evaluation period B driven by radar rainfall were higher than those driven by raingauge rainfall. This behaviour is more pronounced in the MIKE SHE model. However, for the subsequent higher peaks (over 20 m³ s⁻¹), the radar rainfall could not drive the models close to the observed record, and a considerable amount of peak flow was underestimated compared with the raingauge-driven simulations.

Three evaluation periods (
This agrees with the analysis of the radar rainfall discussed previously, where the radar rainfall usually failed to match the raingauge values for high rainfall rate events. This finding also implies that, in addition to the processing already applied by the NIMROD system, a further correction to the radar rainfall is necessary before it is fed to a hydrological model. It can be further inferred that such a correction method needs to be nonlinear and should preferably account for different precipitation types. Table 3 indicates that there is only a small difference between the simulation results driven by the 5 km/15 min and 1 km/5 min radar rainfall data, owing to the smoothing effect of the hydrological models, especially in normal periods with low rainfall rates (like evaluation periods A and B in this study). The raingauge measurements produce better performance on the peak flow in all three models than the radar rainfall estimates. With respect to the three different mathematical structures, although Figs. 8 and 9 show that the distributed MIKE SHE model outperformed the other two models in terms of the agreement of peak flow simulation, Table 3 suggests that the PDM model had slightly better overall performance than MIKE SHE and PRTF, owing to its better simulation of low flows. Interestingly, an implication of this finding is that the lumped hydrological model structure might be a better choice for simulation with low-rate rainfall, considering the level of model complexity and the computational cost. When the flow peak is of primary interest, the distributed model is more desirable.
While both evaluation periods A and B represent a normal flooding situation, it is also desirable to look into a rainfall event with much higher intensity. Evaluation period C was selected to serve this purpose. The unusual rainfall magnitude of this period was triggered by a convective storm, which produced 80 mm of precipitation over the catchment as recorded by the raingauges and caused a peak discharge of over 40 m³ s⁻¹ at the catchment outlet. Note that the peak of the rainfall took place on 20 July 2007, with 30 mm falling within 3 h from 08:00 to 11:00 UTC according to the raingauge measurements. Figure 10 shows the spatial distribution and movement of this rainfall peak in MIKE SHE at 1 km resolution using the local model grid reference, indicating a very narrow band of very high intensity over the catchment. The rainfall rate at the centre of the storm reached as high as 112 mm h⁻¹. This period highlights two important issues: the limitations of radar rainfall estimates, and the inability of a lumped conceptual model to account for the heterogeneity of rainfall distribution. The impact of attenuation of the C-band radar beam during high-intensity rainfall events is evident in this period, where all three models with NIMROD inputs produced severely underestimated results (Fig. 11) because the radar rainfall was underestimated as a result of attenuation. The situation becomes even worse with radar rainfall at coarser resolution (e.g. the 5 km data set in this study). This again suggests that the advantage of using finer-resolution radar rainfall data is most apparent in high-intensity events with uneven spatial distribution.
By contrast, the simulations from the MIKE SHE and PRTF models with raingauge input were able to get close to the observed peak, with slight overestimates and a sharper peak. This indicates that even the raingauge network had difficulties in representing such highly unevenly distributed rainfall. The PDM model, which treats the catchment rainfall in a lumped way, produced the worst result even with the raingauge input: as the heterogeneity of the rainfall distribution becomes more evident, the inability to represent that distribution is inevitably more obvious than in events with much smoother and more uniform rainfall distribution, like those in periods A and B.
Interestingly, and contrary to common belief, the PRTF model, with the simplest mathematical structure, clearly outperforms its two counterparts, as indicated in both Fig. 11 and Table 4. The model simulated the event reasonably well with raingauge data. Even with the radar data, the results from the PRTF are much better than those from both MIKE SHE and the PDM. The reason for such behaviour may lie in the fact that the PRTF is an event-based model, in the sense that it is suited to simulating a single, independent event rather than a continuous series of events. Moreover, the mechanism of the PRTF model means that the agreement of peak flow in model simulation depends on the characteristics of the peak flow in calibration, in terms of shape, volume and timing. This offers it a certain advantage over the complex distributed model and the lumped conceptual model, both of which suffer from the errors in the radar data.

Discussion
The context at which this study is targeted is flood forecasting with available modelling tools and the best available operational radar data, which in this case is the NIMROD data from the Met Office of the UK. The experiments in this setting, although limited by the availability of observations and showing a tendency to underestimate the peak flows for higher precipitation rate events, are nevertheless able to provide valuable insight into the effect of different rainfall measurements and the impact of the spatial variability of rainfall at the scale of a medium-sized catchment, and some of the findings are revealed for the first time. These findings are deemed to be very important for practitioners choosing a suitable model for radar rainfall input. The major findings are summarised as follows:
- The radar rainfall estimates (in our case, NIMROD), although already subject to calibration and correction, show mixed performance compared with the raingauge measurements in terms of simulated streamflow in the three hydrological models. The radar rainfall products showed a tendency to overestimate low-to-medium rainfall rate events. However, for flow-peak-generating events (with high rainfall rates), the radar data had difficulty reproducing the same magnitude as the raingauge rainfall and hence underestimated the flood peaks. This mixed performance is consistent with the radar data analysis in Figs. 5 and 6. We hypothesise that this could be caused by the use of a uniform distribution to describe the variation of the drop size distribution (drizzle and showers) in radar rainfall processing. Considering the topography of the catchment and the performance of the raingauge measurements in peak simulation, orographic enhancement is also suspected to contribute to the underestimation of the radar rainfall, as described by Kunz and Kottmeier (2006). Similar radar performance against raingauges was found by Schellart et al. (2012). However, the difficulty in estimating the rain drop size distribution, hydrometeor drift, evaporation and moisture loss prevented further investigation of these hypotheses.
-Furthermore, the radar performance at different rainfall rates influences the detection reliability analysis. Because of the general underestimation of rainfall at high rainfall rates and overestimation at low-to-medium rainfall rates, the detection reliability analysis shows a tendency towards decreasing skill scores for CSI and POD but an increasing score for FAR. Finer-resolution radar data performs better on detection reliability but also carries a risk of causing false alarms.
-As to the timing of flow peaks, the radar rainfall estimates perform similarly to the raingauge data: both were able to drive all three models to match the observed data well. This is also important in an operational context, where such timing directly determines the action time for flood warning purposes.
-The model structure does affect the simulations of the three models with radar rainfall inputs. The distributed model MIKE SHE proved to be reliable and consistent for simulating flow peaks when used with grid-based radar data input. However, all three models produce similar results when dealing with normal storm events of medium intensity and more uniform distribution, and in this case the lumped conceptual model PDM even achieved better scores for the overall simulation. This echoes the finding of Cole and Moore (2008) that lumped conceptual models often provide reliable and robust flow simulation at gauged catchments, which distributed models may find difficult to match. However, the ability of distributed models to represent, at times, the effect of storm position on catchment flood response makes the distributed approach an important area for future research.
-The PRTF model performed relatively poorly in most of the simulations compared with the MIKE SHE and PDM models. This is partially attributed to the chalk catchment, which has a strong baseflow influence, combined with the lack of a sufficient multi-year calibration period and warm-up process. Moreover, PRTF is essentially an event-based unit hydrograph model, which is expected to perform better for simulations of a single flood peak event (period C in Fig. 11) or of successive flood peak events with real-time adjustment. Nevertheless, the inclusion of the PRTF model in this study is essential, not only because it represents the unit-hydrograph type of hydrological modelling and thus provides a powerful alternative to conventional linear systems theory as applied within hydrology, but also because it has been used operationally for flood forecasting by the Environment Agency in the South West of England.
-The difference due to using radar data at different resolutions for these events was found to be insignificant (i.e. the simulations with both low- and high-resolution radar data produced very similar results). This suggests that the additional information content of the high-resolution radar rainfall estimates may be filtered out by low-pass filtering effects, such as the conversion of the radar data from polar to Cartesian format and the hydrological processes themselves.
-However, a significant advantage of using high-resolution radar data was shown for a localised, convective storm event in which the rainfall distribution over the catchment is highly heterogeneous. In such cases it is vital to use rainfall data with both high spatial and high temporal resolution to ensure the best possible accuracy of peak flow predictions.
-The use of more than one measurement technique, for example through ensemble QPE and/or QPF schemes such as STEPS (Bowler et al., 2006), may be necessary to account for the uncertainty inherent in all rainfall measurement methods used for radar rainfall applications. In addition, to improve the accuracy of rainfall measurements, more sophisticated interpolation methods such as Kriging (Goudenhoofdt and Delobbe, 2009; Velasco-Forero, 2009) can be used to average the raingauge rainfall over catchments. However, such complex techniques carry a heavy computational cost, which will affect the efficiency of the model during flood forecasting, so the cost-benefit trade-off has to be evaluated before the method is applied.
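The detection reliability finding above rests on three categorical scores. As a minimal sketch, assuming the standard contingency-table definitions (POD = hits / (hits + misses), FAR = false alarms / (hits + false alarms), CSI = hits / (hits + misses + false alarms)) and treating the raingauge as the reference, they could be computed as follows. The function name, the thresholds and the rainfall values are hypothetical and chosen only for illustration; they are not the data or thresholds used in this study.

```python
import numpy as np


def detection_scores(radar, gauge, threshold):
    """Return (POD, FAR, CSI) for rainfall at or above `threshold`.

    The raingauge series is treated as the reference ("observed") series.
    """
    radar_yes = np.asarray(radar, dtype=float) >= threshold
    gauge_yes = np.asarray(gauge, dtype=float) >= threshold

    hits = int(np.sum(radar_yes & gauge_yes))
    misses = int(np.sum(~radar_yes & gauge_yes))
    false_alarms = int(np.sum(radar_yes & ~gauge_yes))

    nan = float("nan")
    pod = hits / (hits + misses) if (hits + misses) else nan
    far = false_alarms / (hits + false_alarms) if (hits + false_alarms) else nan
    total = hits + misses + false_alarms
    csi = hits / total if total else nan
    return pod, far, csi


# Invented hourly rainfall totals (mm): the radar series is slightly too
# high for light rain and too low for heavy rain.
gauge = [0.0, 0.4, 1.2, 3.0, 6.5, 9.0, 2.0, 0.2]
radar = [0.1, 0.8, 1.5, 2.2, 4.0, 5.5, 2.4, 0.6]

for threshold in (0.5, 5.0):  # hypothetical thresholds in mm
    pod, far, csi = detection_scores(radar, gauge, threshold)
    print(f"threshold={threshold} mm  POD={pod:.2f}  FAR={far:.2f}  CSI={csi:.2f}")
```

With these made-up numbers, radar overestimation of light rain produces two false alarms at the low threshold, while underestimation of heavy rain produces a miss at the high threshold, which lowers both POD and CSI.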

Conclusions
In this study, we analysed the impact of model structure and storm type on flow simulations using radar rainfall estimates. Three hydrological models with different mathematical structures and complexity were set up for a medium-sized catchment, the upper Medway catchment in the south-east of the UK. The three models, namely the distributed model MIKE SHE, the lumped conceptual model PDM and the transfer-function-based model PRTF, were first calibrated using raingauge data and then driven by NIMROD radar rainfall estimates at two different temporal/spatial resolutions. The quality of the radar data was evaluated against raingauge data before being used as the input for flow simulations. Three periods of data were then selected for the analysis, two with stratiform precipitation and one caused by a strong, localised convective storm.
A few concluding remarks can be drawn with respect to the objectives of this study:
1. The operationally available radar data has been shown to be able to drive hydrological simulations with reasonable results from models with different structures. In principle, the radar-driven models are able to produce results comparable to those of their raingauge-driven counterparts for low flows under evenly distributed storms. Substantial underestimation of peaks is nevertheless common in radar-driven simulations: although the radar data has been subjected to complicated calibration and correction, it still fails to represent high-intensity precipitation because of inherent problems in the technology that are yet to be addressed, such as mixed raindrop size distributions, orographic enhancement and attenuation. A very encouraging outcome, however, is that the timing of the peaks can be reproduced with precision, which implies that radar data is useful provided the underestimation is properly acknowledged, especially for ungauged basins where radar may be the only available source of rainfall data.
2. The impacts of differences in model structure and in the resolution of the radar data, however, are less pronounced for stratiform rainfall events with moderate rainfall intensity. Unfortunately, this means that the spatial information contained in the radar rainfall data is often averaged out, diminishing the impact of the measurement resolution. In such cases, much simpler structures based upon lumped or black-box models are generally sufficient for operational hydrology.
3. However, high-intensity, localised, convective storms require a better representation of the rainfall distribution, in which case radar rainfall estimates play a more important role than raingauges. The resolution of the radar data matters more here, as a higher resolution gives a better description of the rainfall field, which results in better flow simulation in the distributed model.
4. Provided that the models are properly calibrated, the choice of hydrological model is not as critical as expected for normal cases with uniform rainfall distribution, as the models produce similar results. However, in the case of highly localised, strong storms, lumped conceptual models, which are unable to account for rainfall inhomogeneity, may fail first; it is therefore desirable to use distributed models or even simple transfer-function-based models.
5. Improved attenuation correction of the reflectivity signal in extremely intense rainfall events has to be considered before applying radar rainfall estimates to hydrological models, particularly at the C-band frequency. Operational radars in the UK national network are all C-band radars, and the real-time attenuation correction capability of dual-polarisation radars was found to be of assistance in the case of a severe storm, as suggested by Zhu and Cluckie (2011).
6. More sophisticated, frequent and spatially varying localised gauge-adjustment techniques should be incorporated in the NIMROD radar rainfall processing in order to obtain the best rainfall estimates at high temporal and spatial resolution, which will play a key part in improving the accuracy of radar rainfall estimates at catchment and urban scales in future developments.
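The second and third remarks can be illustrated with a small numerical sketch. Assuming that coarsening a radar grid can be approximated by averaging non-overlapping blocks of pixels (an assumption made here for illustration, not a description of the NIMROD processing chain), the catchment-mean rainfall seen by a lumped model is unchanged by the coarsening, whereas the peak intensity of a localised cell is strongly damped. The function `block_average`, the grid size, the coarsening factor and all rainfall values are hypothetical.

```python
import numpy as np


def block_average(field, factor):
    """Coarsen a 2-D rainfall grid by averaging factor x factor pixel blocks."""
    rows, cols = field.shape
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be divisible by factor")
    blocks = field.reshape(rows // factor, factor, cols // factor, factor)
    return blocks.mean(axis=(1, 3))


# Synthetic 10 x 10 grids of rainfall rate (mm/h); all values are invented.
uniform = np.full((10, 10), 2.0)   # evenly distributed, stratiform-like rain
localised = np.zeros((10, 10))
localised[3:5, 6:8] = 50.0         # small intense cell, convective-like rain

for name, fine in (("uniform", uniform), ("localised", localised)):
    coarse = block_average(fine, 5)  # hypothetical coarsening factor
    print(
        f"{name:9s} mean fine={fine.mean():.1f} coarse={coarse.mean():.1f}  "
        f"max fine={fine.max():.1f} coarse={coarse.max():.1f}"
    )
```

In this synthetic case both fields have the same catchment mean (2 mm/h) at either resolution, but the maximum intensity of the localised cell falls from 50 to 8 mm/h after coarsening. A lumped model driven by the catchment mean cannot distinguish the two resolutions, while a distributed model can make use of the finer grid.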
It is worth noting that these conclusions are drawn only from our case study; a more comprehensive picture would require more representative storms, different models and even radar data processed with different techniques to be taken into account. Also, the consistent differences in performance between raingauge and radar rainfall estimates in hydrological simulations may imply that hydrological models should be calibrated and driven with the same data source. However, raingauge measurements are point measurements and may not represent the "true" catchment-averaged rainfall, which could introduce error into the comparison. Furthermore, although this study was designed to minimise the interference from model structure when evaluating the impact of different rainfall sources, the results of the experiments are inevitably affected by the choice of catchment, the errors arising from model structures and calibration methods, etc., and more comprehensive investigation is required before conclusive comments can be generalised to other situations. Nevertheless, the experiments and analysis presented in this paper may provide valuable insight for other researchers and, more importantly, practitioners as to the measures that need to be taken when using operational radar rainfall estimates with their existing hydrological models. It would certainly be interesting to include a discussion of techniques for improving radar data quality, but that deserves a separate study, which the authors would like to undertake in future.

Fig. 4. Comparisons of cumulative catchment rainfall from raingauges and radar at different resolutions.

Fig. 5. Comparisons of 1 h cumulative catchment rainfall from raingauges and radar at different resolutions.
Three evaluation periods (A: 15 November 2006-14 December 2006; B: 27 December 2006-14 January 2007; and C: 15 July 2007-25 July 2007) were selected to examine the performance of NIMROD radar rainfall estimates in hydrological models compared with raingauge measurements. The rainfall in the first two evaluation periods (A and B) was mainly stratiform, while the last event was triggered by a convective storm in summer 2007.

Fig. 10. Rainfall rate distribution observed by radar at four time points from 09:30 to 09:45 GMT on 20 July 2007.

Table 1. Statistics of performance for model calibration and validation.
Fig. 2. Results of model calibration with raingauge rainfall.

Table 3. Statistics of performance for different model output for frontal events.

Table 4. Statistics of performance for different model output for convective events.