Robust estimation of hydrological model parameters

The estimation of hydrological model parameters is a challenging task. With increasing capacity of computational power several complex optimization algorithms have emerged, but none of the algorithms gives a unique and very bestparameter vector. The parameters of fitted hydrological models depend upon the input data. The quality of input data cannot be assured as there may be measurement errors for both input and state variables. In this study a methodology has been developed to find a set of robust parameter vectors for a hydrological model. To see the effect of observational error on parameters, stochastically generated synthetic measurement errors were applied to observed discharge and temperature data. With this modified data, the model was calibrated and the effect of measurement errors on parameters was analysed. It was found that the measurement errors have a significant effect on the best performing parameter vector. The erroneous data led to very different optimal parameter vectors. To overcome this problem and to find a set of robust parameter vectors, a geometrical approach based on Tukey’s half space depth was used. The depth of the set of N randomly generated parameters was calculated with respect to the set with the best model performance (Nash-Sutclife efficiency was used for this study) for each parameter vector. Based on the depth of parameter vectors, one can find a set of robust parameter vectors. The results show that the parameters chosen according to the above criteria have low sensitivity and perform well when transfered to a different time period. The method is demonstrated on the upper Neckar catchment in Germany. The conceptual HBV model was used for this study. Correspondence to: A. Bardossy (bardossy@iws.uni-stuttgart.de)


Introduction
Hydrological models are used for different purposes such as water management or flood forecasting.The estimation of hydrological model parameters is a difficult task.Reasons for this are the highly non-linear nature of hydrological processes and the fact that different parameter vectors driving models describing the physical processes might have the same effect on the discharge.This means that changes of some parameters might be compensated by others.Unfortunately traditional manual calibration of models with reasonable parameter values often leads to weak results.Hence, nowadays automatic procedures based on numerical methods are used.
Many different optimization routines have been developed to find optimum parameter vectors.A variety of objective functions measuring model performance including multiobjective approaches, have been tried to define optimality in this context.Non-linearity of the hydrological models and of the objective functions lead to very complex optimization problems.Beven and Freer (2001) argue that there are no optimum parameters, in fact there is a large set of parameter vectors which all perform reasonably and one cannot easily distinguish between them.They call this an equifinality problem which leads to high uncertainties in the model predictions.Frequently shown dotty plots give the impression that the set of good parameter vectors can be found anywhere in the space.But no clear convergence to a best single value can be observed.However, in a previous paper (Bárdossy, 2007), the geometrical properties of a parameter vector with good performance (from now on the set of good parameters) were investigated for a two-dimensional case.It was shown that the set of good parameters is well structured.Unfortunately in higher dimensional spaces one can not see these sets, thus it is not clear whether they are scattered or have some clear structure.The high scatter observed in the good individual parameters is very disturbing since, it does not A. Bárdossy and S. K. Singh: Robust parameter estimation enable a classical identification of a single vector within corresponding confidence bounds.
The GLUE procedure (Beven and Binley, 1992) has widely been applied for uncertainty assessment and discussed in the scientific literature, although alternative procedures using parametric approaches to obtain best solutions have also been suggested.These approaches are optimal under certain assumptions, however, they are often selected purely for mathematical convenience and not necessarily based on experience with data.
In Kavetski et al. (2006a,b) it was noted that the performance metric of hydrological models is a bumpy function of the model parameters.They suggest different numerical procedures to smoothen parameter surfaces and to obtain optimal parameter vectors.
The purpose of this paper is to investigate the reasons leading to very different near optimum parameter vectors and to investigate the properties of the set of good parameters in high dimensional spaces.Our goal is not to find the parameter vectors which perform best for the calibration period but to find parameter vectors which: 1. lead to good model performance over the selected time period 2. lead to a hydrologically reasonable representation of the corresponding processes 3. are not sensitive: small changes of the parameters should not lead to very different results 4. are transferable: they perform well for other time periods and might also perform well on other catchments (i.e. they can be regionalized) Concepts of computer geometry and multivariate statistics are used to identify the set of good parameters.Specifically, convex sets and the depth function defined in Tukey (1975) are used.
The concept of data depth has recently received much attention by Donoho and Gasko (1992), Rousseeuw and Struyf (1998), Rousseeuw and Ruts (1998), Liu et al. (1999), Zuo andSerfling (2000), Miller et al. (2003) and Lin and Chen (2006).It has been used for the investigation of large data sets.The application of depth function has been seen in several fields.Serfling (2002) used the depth function for nonparametric multivariate analysis.Cheng et al. (2000) had used data depth function for monitoring multivariate aviation safety data for control chart.They were also applied in quality control by Liu (1995), Hamurkaroǵlu et al. (2004).The only hydrological application found so far is in Chebana and Ouarda (2008), where data depth was used to define weights for the regional estimation of hydrological extremes.
This paper is organized as follows: After the introduction, the case study area is introduced.In Sect.3, the effect of observation errors on the identification of hydrological model parameters is discussed.In Sect.4, the notion of statistical depths is introduced.Geometrical properties of the set of good parameters is investigated with the help of the depth function and robust parameter vectors are identified.In the final section, results are discussed and conclusions are drawn.

Case study area and the hydrological model
The concept of this paper will be illustrated with examples from the Neckar catchment.The hydrological model chosen is a modified version of the HBV model.A short description of the catchment and the model is provided in this section.

Study area
This study was carried out on the upper Neckar basin in South-West Germany in the state of Baden-Württemberg using data from the period 1961-1990.The region is flat, undulating in the east and north.The Black Forest and Swabian Alps are in the west and south.The 4000 km 2 large Upper Neckar basin was subdivided into 13 subcatchments (Fig. 1).Three of which were used for this study.
The study area elevations range from 238 m a.s.l. to 1010 m a.s.l.The dataset used in this study includes measurements of daily precipitation from 151 gauges and daily air temperature at 74 climatic stations.The meteorological input required for the hydrological model was interpolated from the observations with External Drift Kriging (Ahmed and de Marsily, 1987) using topographical elevation as external drift.The mean annual precipitation is 908 mm/year.Land use is mainly agricultural in the lowlands and forested in the medium elevation ranges.Hydrological characteristics of the three selected subcatchments are given in Table 1.For further details please refer to Samaniego (2003) and Bárdossy et al. (2005).

Hydrological model
The HBV model concept was developed by the Swedish Meteorological and Hydrological Institute (SMHI) in the early 1970's.It has been modified at the Institute of Hydraulic Engineering, University Stuttgart and used for this study.It includes conceptual routines for calculating snow accumulation and melts, soil moisture and runoff generation, runoff concentration within the subcatchment, and flood routing of the discharge in the river network.The snow routine uses the degree-day approach as set out in Eqs.(1) and (2).Soil moisture is calculated by balancing precipitation and evapotranspiration using field capacity and permanent wilting point as parameters, Eqs.(3) to (5).
(1)  Where: P eff is the effective precipitation, SM is the actual soil-moisture, F C is the maximum soil storage capacity, β is a model parameter (shape coefficient), P is the depth of daily precipitation, MELT is the amount of snow melt, DD is degree day factor, T is the mean daily air temperature, T crit is threshold temperature, DD 0 is degree day factor when there is no rainfall and k is a positive number.
Where: P E a is the adjusted potential evapotranspiration, C is a model parameter, T is the mean daily air temperature, T m is the long term mean monthly air temperature and P E m is the long term mean monthly potential evapotranspiration.
Where: E a is the actual evapotranspiration, SM is the actual soil-moisture and P W P is limiting soil-moisture at which potential evapotranspiration take place.Runoff generation is simulated by a nonlinear function of the actual soil moisture and precipitation.The runoff concentration is modeled by two parallel nonlinear reservoirs representing the direct discharge and the groundwater response.Flood routing between the river network nodes uses the Muskingum method.
Additional information about the HBV model in general can be found in Hundecha andBárdossy (2004), andBergström (1995).Direct runoff and percolation from each subcatchment are calculated using Eqs.( 6) to (10). Where: baseflow, k 0 is the near surface flow storage constant, k 1 is the interflow storage constant, k perc is the percolation storage constant, k 2 is the baseflow storage constant, S 1 is upper reservoir water level, S 2 is lower reservoir water level, L is threshold water level for near surface flow.The total runoff is computed as the sum of the outflows from the upper and lower reservoirs.The total flow is then smoothed using a transformation function, consisting of a triangular weighing function with one free parameter, MAXBAS.
Where: Q is current overall discharge and MAXBAS is the duration of the triangular weighting function (Unit Hydrograph).There are 15 parameters to describe the model,out of which 9 parameters are used for this study (Table 2).

The effect of observation errors
In rainfall runoff modeling, input errors play a crucial role but the problem of input errors is generally neglected by hydrologists (Paturel et al., 1995).Hydrological models use observation data for the identification of model parameters.
Unfortunately many of the hydrological observations contain partly systematic and partly random errors.Precipitation is measured at a few selected locations and typically interpolated for the catchment area.Thus, precipitation values used in the model can be wrong due to measurement (for example caused by evaporation or wind) and to interpolation errors.The impact due to error in precipitation has been investigated by Ibbitt (1972), Troutman (1985), Paturel et al. (1995), Andréassian et al. (2001) and Oudin et al. (2006), who found that error in precipitation has significant influence on model performance.
Observed discharge is used as the main calibration quantity, thus their errors may have significant influence on model performance.In most cases water levels are observed and rating curves are used to transform them to discharges.This is an important source of partly systematic error and, when combined with other errors, compromise the identification of the model parameters.
The parameter vectors obtained by model parameter optimization algorithms are optimal with respect to an erratic objective function.Measurement errors and errors due to model structure are mixed (Todini, 2007) and cannot be separated directly.The following examples illustrate the effect of observation uncertainty on parameter estimation.
Firstly, consider the observed meteorological variables and discharge.We assume that due to measurement errors the accuracy of the measured discharge Q M (t) is q%.Thus, the real but unknown discharge Q E (t) can be written as: with ε Q (t) being a random error.This random error is due to uncertainties of the rating curve, non-uniqueness of the stage discharge relationship, changes of the cross section etc.
Here we assume that the error follows a normal distribution N (0, q 100 ).This means we assume a constant relative random error and further, that the errors are independent (error dependence would increase the effect of observation uncertainty).
To quantify the effect of the flow error on model performance a set of M=100 realizations of Q E (t) was generated with q=5.Note that the rating curve related errors are usually higher than this, especially in the case of extreme flows.Consequently the parameters of the hydrological model were estimated using simulated annealing by maximizing the Nash-Sutcliffe coefficient as if each parameter vector Q E was the observed series.The model parameters obtained show a considerable scatter.For example in two dimensions, Fig. 2 shows the scatter plot for the selected model parameters L and k 1 , where M=20.
The uncertainty of the estimated model parameters with respect to input error structure can also be investigated.With respect to temperature observations one can assume that the real but unknown temperature T E (t) can be written as: In this case an additive error of the catchment mean temperature was assumed.As in the previous case, M realizations were generated and model parameters were optimized for each of the series separately.Tables 3 and 4 show the effect of observation error on the Nash-Sutcliffe coefficient for both discharge and temperature measurement errors, respectively.The uncertainty of model parameters with respect to precipitation uncertainty can be considerable, depending on the density of the observation network.This problem was investigated by Das (2006).
These examples show that model parameters and model performance are highly influenced by measurement errors.Two parameter vectors, with model performances differing in the range of the measurement error caused fluctuations of the Nash-Sutcliffe value, cannot be distinguished from each other.Either of them might lead to a better description of the hydrological system.The parameters obtained by sophisticated optimization procedures might thus be suboptimal in reality.Thus, it is reasonable to investigate the set of parameters which gives similar performance as the numerical optimum.These parameters will be called good parameters in the subsequent sections.

Geometrical structure of the parameter set
One of the major problems is that there is a large number of parameter vectors which perform nearly equally well.It is difficult, then to decide which of these should be taken for prediction.Scatter plots showing model performances as a function of individual parameters indicate that a wide range of parameter values can lead to good model performance.At present it seems impossible to know a priori if a fitted given parameter vector leads to good or bad performance when ap-plied to a model.In Bárdossy (2007), the geometrical structure of the best performing parameters of the unit hydrograph (impulse response function) of the Nash cascade were investigated.It was shown that the set has a very clear geometrical structure.In this paper models with many more than two parameters are considered.It is difficult to visualise the subset of best parameters in higher dimensions; instead methods of computational geometry are used herein.
In order to investigate the properties of the set of good parameter vectors, the concept of data depth was used.Depth functions were first introduced by Tukey (1975) to identify the center (a kind of generalized median) of a multivariate dataset.Several generalizations of this concept have been defined in Rousseeuw and Struyf (1998), Liu et al. (1999) and Zuo and Serfling (2000).
Definition: The halfspace depth of a point p with respect to the finite set X in the d dimensional space d is defined as the minimum number of points of the set X lying on one side of a hyperplane through the point p.The minimum is calculated over all possible hyperplanes.
Formally the halfspace depth of the point p with respect to set X is: Here x, y is the scalar product of the d dimensional vectors, and n h is an arbitrary unit vector in the d dimensional space representing the normal vector of a selected hyperplane.If the point p is outside the convex hull of X then its depth is 0. Points on and near the boundary have low depth while points deeply inside have high depth.
One advantage of this depth function is that it is invariant to affine transformations of the space.This means that the different ranges of the parameters have no influence on their depth.
The calculation of the halfspace depth is computationally very expensive if the number of points in X is large or the dimension is high.Efficient algorithms are available for d=2 from Miller et al. (2003).In this study the approximate calculation suggested in Rousseeuw and Struyf (1998) was used.
To illustrate the concept of data depth let us consider a two dimensional data set.The parameter vectors with good performance are shown on Fig. 3.We have chosen L and k 1 in this example.The first parameter corresponds to the x axis and the second to the y axis.Points with low depth ≤4 are marked with circles.The convex polygon separates the points with low depth (≤4) from those with high depth (>5).Note that points near the boundary of the set have a low depth while the interior points in the middle of the set have the greatest depth.

Data depth of the good parameter set
In order to explore the set of reasonably performing parameter vectors using the above introduced concept, nine parameters of the HBV model were considered.The parameter ranges used for the initial Monte Carlo simulation for the subcathment Rottweil (Neckar) is given in Table 2. N random parameter vectors where generated in a rectangle bounded by reasonable limits in the d=9 dimensional space.
For each of these parameter vectors the hydrological model was applied and the performance was calculated.This set of parameter vectors is denoted as X N .A subset X * N ⊂X N of the best performing parameter vectors (in our case we chose the upper 10%) were identified.The depth of each point in X N with respect to X * N was calculated.Figure 4 shows the histogram of the performance of the hydrological model for the points θ∈X N with depth D(θ )>L.One can see that all points with high depth (being in the geometrical interior of the set X N ) lead to good model performance.The reason for this is that one assumes that the low depth points can be regarded as an iso-hypersurface corresponding to the selected level.If one assumes continuity of the objective function then higher values of the function are expected in the interior of the set.In order to check this statement an independent second set Y N of N random parameter vectors were generated.The depth of the points of Y N with respect to X * N was calculated.For all parameters θ∈Y N , the hydrological model was run and the performances calculated.The results are evaluated for parameters such that D(θ)≥L, exemplified in Table 5 with the statistics of the performances.One can see that the randomly generated parameter vectors which posses high depth have good model performance.The standard deviation of the performance decreases with increasing depth, showing that in the deep interior of the set all parameter vectors perform similarly.These results show that for this case one can geometrically identify parameter vectors which are good.Note that even if the best performance is related to the deepest subset, this is not necessarily always the case, since the global optimum might itself correspond to a low depth.

Transferability
In order to investigate the transferability of the parameters with respect to their depth, two experiments were carried out.
As a first test the total observation period of 30 years was divided into three 10 year periods.The hydrologic characteristics of the three time periods are listed in Table 6.The model performance was calculated for each time period.The set of good parameter vectors was identified for each time period separately and the depth of each parameter with respect to this set was calculated.In this way, three depth values were assigned to each parameter vector.
The sets with 50 and 150 deepest parameter vectors were identified for each time period.The intersection of the convex sets corresponding to the 50 deepest points consisted of 36 for the 150 point set 84 points indicating that depth is stable over all time periods.Note that a parameter vector was considered to be in the intersection if it had positive depth with respect the sets considered.As a set of 10 000 points were considered; an independence selection of two sets with 150 points would have led with high probability no points in the intersection.This means that parameters with large depth are robust with respect to the selected time period.As a second test the parameters with greater depth for one time period were used for another time period and their performance was calculated.In Table 7 the results of the transferred model quality with respect to depth corresponding to the time period 1961-1970 are shown.Note that the subset of the boundary points was selected by choosing only points for which the performance exceeds a given threshold.This way we obtained two sets with the same mean performance.Note that for the interior points, the performance in the other time periods is significantly better than those of the boundary points.The standard deviations of the performance for the validation time periods are smaller for the interior points which indicates that the transfer of these parameters is more reasonable for the parameter vectors from the interior.

Sensitivity
The sensitivity is not investigated in the usual way to see how the model reacts to changes of individual parameters.Instead the parameter vectors are considered as sensitive if a small change of the whole vector might lead to a big change (usually drop) in the performance of the model.The sensitivity of boundary points and inside points was compared.For this purpose parameter vectors were altered from the boundary (D(θ 1 =1) and from the inside (D(θ 2 )>1 of the set.A specific vector η was added and subtracted from the selected parameter vectors.This way the vectors θ 1 and θ 2 were altered to the same extent.Four new parameter vectors Due to the definition of the depth D(C 1 )≥1 while D(C 2 )≤1.For C 3 one cannot make any statements on the depth.
Figure 5 explains the construction of the three points in one dimension.
The above construction of parameter vectors C 1 , C 2 and C 3 was carried out for a large number of randomly selected pairs θ 1 and θ 2 .The θ 1 and θ 2 were selected in such a manner that their mean performance was the same.Table 8 shows the statistics of the Nash-Sutcliffe coefficients for the sets corresponding to C 1 , C 2 and C 3 .One can see that the inside points all have good performance and the standard deviation is small.Points at C 2 (outside points) have the worst performance while C 3 is better than C 2 but worse than C 1 .The skewness the performance is nearly zero for the inside set C 3 , while in other cases the strong negative skew indicates that in some cases the performance loss due to the shift outside the set is extremely high.The same alteration of the parameters leads to less performance loss for deep points than for shallow points.Further, there is no loss if the parameter vector remains in the convex set of deep parameters.This again highlights the advantage of deep parameter vectors.

Robust parameter estimation (ROPE)
In Sect. 4 it was shown that parameters in the interior (expressed through data depth) of the set of good points are themselves good and transferable and not very sensitive.A possible explanation for this is that these parameters can be regarded as a kind of compromise solution -where none of the processes represented by the parameters is overemphasized.
For modelling purposes one might be interested in finding the set of good parameters and also the identification of the deep parameter vectors for robust modelling.For this purpose the following procedure is suggested: 1. the limits for the d selected parameters are identified 2. N random parameter vectors forming the set X N are generated in the d dimensional rectangle bounded by the limits defined in 1.
3. the hydrological model is run for each parameter vector in X N and the corresponding model performances are calculated 4. the subset X * N of the best performing parameters is identified.This might be for example the best 10% of X N .
5. M random parameter sets forming the set Y M are generated, such that for each parameter vector θ∈ M , D(θ )≥L (with L≥1) where the depth is calculated with respect to the set X * N .
6. the set Y M is relabeled as X N and steps 3-6 are repeated until the performance corresponding to X N and Y M does not differ more than what one would expect from the observation errors.
Note that the ROPE algorithm can be easily modified to a general multivariate optimization procedures.
The algorithm was used for three selected subcatchments (Rottweil, Tübingen, Süssen).Four iterations were enough to find a good set of parameters for all the catchments.The performance of the model using different calibration and validation periods is summarized in Fig. 6   As one can see the subsequent iterations of the algorithm deliver better sets.The mean NS for the iterations increased form 0.767 (iteration 2) through 0.773 (iteration 3) to 0.775 (iteration 4).The improvement in the last iteration is very small and less than what one would expect to be caused by measurement errors.Therefore the algorithm stopped after this iteration.
Figure 7 shows the dotty plots of the selected model parameters for catchment Süssen.The performance corresponding to the parameters of iteration 4 are better than those corresponding to iteration 2 but the parameter range remains the same.Figure 8 shows the two dimensional scatter plot for the two model parameters for iterations 2 and 4 obtained for catchment Tübingen.One has the impression that these parameters can take a wide range of values, and that there is no difference between the the two sets.The ranges of the parameters for the catchment Rottweil for iteration 1 and 4 are listed in Table 2.Even if the ranges are very similar for many parameters one has to bear in mind that these are two dimensional projections of 9 dimensional sets.The sets themselves are very different, the ratio of their 9 dimensional volume is approximately 0.01 (calculated as Monte Carlo integral).
Figure 9 shows the sensitivity of the calculated discharge with respect to two sets of parameter vectors with different depth.1000 Parameter vectors from the boundary (depth=1) and from the interior (depth>1) were taken and the corresponding hydrographs were calculated.The 95% and the 5% lines show that the interior parameter vectors lead to smaller differences in calculated discharge.The differences between the 95% and the 5% values are plotted separately on Fig. 10 showing that taking interior parameter vectors leads to an approximately 20% reduction.
In the case study presented here all parameters in the final set Y M performed well.In the case of other models or other performance measures this may not necessarily be the case.However the set Y M always contains a large portion of good parameters and possible transformations (for example taking the logarithm of some parameters) might fix this problem.

Conclusions
Discussion and conclusions -In this paper the effect of observation uncertainty on the parameter estimation was investigated.It could be shown that observation errors can lead to very different optimal model parameters if the uniqueness of the parameters is assumed and the parameters corresponding to the optimum of the performance function are identified.
-Observational uncertainty of the input and the discharge leads to variability of the model performance.This variability has to be considered in model parameter estimation.All model parameters which do not differ more in their performance than what can be caused by measurement errors could themselves be the best parameters.
-Data depth is a useful tool to identify robust parameter vectors.Parameters with low data depth are near the boundary and are sensitive to small changes and do transfer to other time periods less well as high depth ones.
-From the examples discussed in this paper, one could see that equally performing parameters are not necessarily equally transferable or equally sensitive.Data depth can help to find domains with robust and transferable parameters.
-A iterative algorithm to find a convex set containing good model parameters was developed.
-In this paper, model performance was measured by the traditional Nash-Sutcliffe coefficient.Other measures can be treated similarly -but might lead to different parameter sets.Further research is needed to use the concepts developed in this paper for other purposes and models.Robust estimation of the model parameters might be very useful for regionalization and could contribute to a better prediction in ungaged basins.The suggested methodology can be extended for uncertainty analysis by relating the likelihood of the parameters to their depth, however further research is required to complete this task.

Fig. 3 .
Fig. 3. Points with low depth ≤4 (circles) of a two dimensional set of model parameters (crosses).

Fig. 4 .
Fig. 4. The performance of the model using different depth.

Fig. 5 .
Fig. 5. Construction of the points C 1 , C 2 and C 3 in one dimension for sensitivity analysis of parameters.

Fig. 6 .
Fig. 6.Histograms of the model performances for the different iterations of the algorithm for catchment Süssen.

Fig. 9 .
Fig. 9. Hydrograph with confidence interval for boundary points and inner points.

Fig. 10 .
Fig. 10.Confidence band width of high depth vs confidence band width of low depth.

Table 1 .
Summary of the size of the different subcatchment in the study area.

Table 3 .
Model performance for the observed series using optimal parameters obtained using 100 randomly perturbed discharge data sequences.

Table 4 .
Model performance for the observed series using optimal parameters obtained using 100 randomly perturbed temperature data sequences.

Table 5 .
Model performance for the N=10000 random parameter sets with respect to the data depth calculated on the basis of the points selected corresponding to the upper 10% performance.

Table 6 .
Runoff characteristics for different time periods.

Table 7 .
Model performance for parameter vectors according to their depth corresponding to the time period1961-1970.

Table 8 .
Model performance for the inner and the shifted boundary and deep points.