Sensitivity analysis and parameter estimation for the distributed modeling of infiltration excess overland flow

The variational methods widely used for other environmental systems are applied to a spatially distributed flash flood model coupling kinematic wave overland flow and Green-Ampt infiltration. Using an idealized configuration where only parametric uncertainty is addressed, the potential of this approach is illustrated for sensitivity analysis and parameter estimation. Adjoint sensitivity analysis provides an extensive insight into the relation between model parameters and the hydrological response and enables the use of efficient gradient-based optimization techniques.


Introduction
Critical and interrelated issues like state and parameter estimation, sensitivity and uncertainty analysis have received growing attention from the hydrological community. Since effective parameters are not directly measurable and potentially compensate for various sources of uncertainty, the focus has been primarily on parametric uncertainty. Experience with the calibration of lumped conceptual models using a single objective function revealed that the response surface contains several regions of attraction, discontinuous derivatives and other geometrical properties compromising the use of local methods, especially those using derivative information (Duan et al., 1992). Therefore, most applications and methodological developments in model calibration entail a stochastic exploration of the parameter space (global optimization) using computationally intensive Monte Carlo methods and/or evolutionary algorithms. Beyond calibration, sensitivity analysis can help to detect potential deficiencies in model structure and formulation, explain and correct the lack of fit of hydrological models, provide guidance for model reduction and parametrization, analyse the information content of available observations, and lastly describe the subspace (i.e. of the original control space) driving predictive uncertainty.
The Regional Sensitivity Analysis (RSA) of Hornberger and Spear (1981) inspired numerous applications and developments for the analysis of hydrological systems, including the contribution of Beven and Binley (1992). Some authors address the extension to multiple objectives (Bastidas et al., 1999) or the combination with the powerful variance-based Global Sensitivity Analysis methods (Ratto et al., 2001), a combination which is particularly suited for the identification of the input factors driving behavioral simulations (Monte Carlo Filtering). By combining RSA with recursive estimation techniques, Vrugt et al. (2002) and Wagener et al. (2003) investigate the dynamics of model behavior. Apart from RSA-based methods, relatively sophisticated SA techniques received the attention of practitioners using complex model structures (Christiaens and Feyen, 2002; Yatheendradas et al., 2005; Sieber and Uhlenbrook, 2005), including non-point source pollution modeling (Francos et al., 2003; Muleta and Nicklow, 2004; van Griensven et al., 2006). They were applied only recently to lumped hydrological models (Ratto et al., 2006; Tang et al., 2006). In accordance with the paradigm currently adopted for the calibration of hydrological models, Global Sensitivity Analysis methods are characterized by a multi-dimensional averaging (Saltelli et al., 2000) of the sensitivity measures over the feasible parameter space. From the forcing to the initial condition and the model parameters, the dimensionality of the input space is greatly increased. Since the commonly employed global sensitivity and non-smooth optimization techniques are sampling based, dimension reduction is employed in order to make Global Sensitivity Analysis methods computationally tractable (Yatheendradas et al., 2005; Hall et al., 2005).
The distributed modeling of catchment hydrology offers great potential for understanding and predicting the rainfall-runoff transformation. However, the curse of dimensionality, combined with scarce observations of the physical system, makes some of the previously mentioned issues very challenging. Since the commonly used methods can be limited in handling computationally intensive spatially distributed systems, some of the lessons learned from other geophysical applications (such as meteorology and oceanography) using high-dimensional and computationally intensive models should be exploited. For instance, variational methods provide a unified framework to investigate both sensitivity analysis and parameter estimation. The adjoint state method, which yields an efficient calculation of the derivatives of an objective function with respect to all control variables, is particularly suited when the dimension of the response function to be analysed (or cost function to be optimized) is small compared to the number of inputs to be prescribed (dimension of the control space). Variational methods have contributed to numerous applications related to the analysis and forecasting of meteorological and oceanographic systems (data assimilation, sensitivity analysis, observation targeting).
With the growing complexity of hydrological models, the theoretical and methodological developments carried out in the variational framework (Le Dimet and Talagrand, 1986; Hall and Cacuci, 1983; Navon, 1998; Ghil and Malanotte-Rizzoli, 1991; Bennett, 1992, to cite a few) are of great interest for various problems related to hydrological modelling. Early applications of variational methods to hydrological systems were carried out for parameter estimation and sensitivity analysis in groundwater hydrology (Chavent, 1974; Sun and Yeh, 1990). The state estimation problem was also addressed in this deterministic framework in land surface hydrology (Callies et al., 1998; Margulis and Entekhabi, 2001; Reichle, 2000) and more recently in river hydraulics (Piasecki and Katopodes, 1997; Mazauric, 2003; Belanger and Vincent, 2005; Honnorat et al., 2006). Concerning the transformation of rainfall into runoff, very few attempts have been made, but they provide interesting contributions to data fitting (state and parameter estimation) in catchment-scale hydrology (White et al., 2003; Seo et al., 2003a,b,c). In this prospective study, using a very simple and very common model structure, the potential of variational methods for sensitivity analysis and parameter estimation in rainfall-runoff modelling is illustrated. The paper is structured as follows: a very brief overview of the deterministic framework is followed by the presentation of the model and case study adopted. Then, representative examples are provided for the sensitivity analysis of scalar and vectorial responses. Lastly, adjoint sensitivities are used for the resolution and regularization of the inverse problem to be solved for model calibration.

Brief overview of variational methods
Variational methods provide a deterministic framework for the theoretical formulation and numerical approximation of numerous problems related to the analysis and control of physical systems, especially those governed by partial differential equations (Lions, 1968). The mathematical formalism, based on functional analysis and differential calculus, was largely expanded by related disciplines such as optimal control theory and optimization. Sensitivity analysis and nonlinear least squares estimation (state and parameters) can be addressed using a unified framework. For a very succinct presentation of the approach, let us consider a generic model of the form

$$\frac{dx}{dt} = M(x, \alpha) \quad (1)$$

where x is the state variable of dimension N_s, M a nonlinear operator (after space discretization) and α a variable of dimension N_p denoting the model parameters. Assuming that we are interested in analysing a given aspect of the system behavior or in fitting data with the model diagnostic variables, we define a scalar functional

$$J(\alpha) = \int_0^T \phi(x, \alpha)\, dt \quad (2)$$

where φ is a nonlinear function of the state variables and model parameters. Depending on the targeted objective, φ will represent a quantity of interest (model response) or a cost function measuring the misfit between the simulated variables and the observations. The gradient of the scalar functional J with respect to α at the point ᾱ,

$$\nabla_\alpha J(\bar\alpha) = \left(\frac{\partial J}{\partial \alpha_1}, \ldots, \frac{\partial J}{\partial \alpha_{N_p}}\right)^T \quad (3)$$

provides a quantitative (local) measure of the relative influence of the various model parameters on the response of interest. When φ is a performance measure to be optimized, the derivatives can drive very efficient algorithms for the estimation of the optimal α* minimizing the misfit with observations. A very common and straightforward technique for the evaluation of the gradient components consists in running the model twice with different values for the parameters.
For the i-th component, this first-order approximation is given by

$$\frac{\partial J}{\partial \alpha_i}(\bar\alpha) \approx \frac{J(\bar\alpha + \varepsilon e_i) - J(\bar\alpha)}{\varepsilon} \quad (4)$$

so that N_p + 1 model integrations are required, a computational cost which is greatly increased for more accurate approximations (central difference or second-order approximation). Using the formalism employed by Cacuci (1981), let us consider the Gâteaux derivative, a generalization of the concept of directional derivative in differential calculus, of the objective function at the point ᾱ in the direction α̂:

$$\hat J(\bar\alpha, \hat\alpha) = \int_0^T \left( \frac{\partial \phi}{\partial x}\,\hat x + \frac{\partial \phi}{\partial \alpha}\,\hat\alpha \right) dt \quad (5)$$

where x̂ refers to the variations of the state variable x resulting from perturbations of the parameters α in the direction α̂. Given that x is governed by the generic model given by Eq.
(1), x̂ is the solution of the following system:

$$\frac{d\hat x}{dt} = \frac{\partial M}{\partial x}\,\hat x + \frac{\partial M}{\partial \alpha}\,\hat\alpha, \qquad \hat x(0) = 0 \quad (6)$$

where ∂M/∂x represents the Jacobian of the model with respect to the state variables. The system given by Eq. (6) is the so-called tangent linear model. In order to obtain Ĵ(ᾱ, α̂), this system has to be solved and the composition with Eq. (5) leads to the quantity of interest. The problem with this approach is that only the precision issue is addressed, not efficiency: since Ĵ(ᾱ, α̂) = ⟨∇_α J, α̂⟩, the operation should be repeated for all the directions in the parameter space in order to obtain all the components of the gradient. In order to circumvent this problem, the linearity of Ĵ(ᾱ, α̂) with respect to α̂ is exploited through the introduction of an auxiliary variable p (of dimension N_s). It can be shown (Lions, 1968) that if p is governed by the following system

$$\frac{dp}{dt} + \left(\frac{\partial M}{\partial x}\right)^T p = -\frac{\partial \phi}{\partial x}, \qquad p(T) = 0 \quad (7)$$

the gradient is given by

$$\nabla_\alpha J = \int_0^T \left( \left(\frac{\partial M}{\partial \alpha}\right)^T p + \frac{\partial \phi}{\partial \alpha} \right) dt \quad (8)$$

where (∂M/∂x)^T denotes the transpose of the Jacobian of the model with respect to the state variables. It is important to note that x̂ and α̂ do not appear in Eqs. (7) and (8). Therefore, once p is known by integration (backward in time) of the system described by Eq. (7), all the components of the gradient ∇_α J needed for sensitivity analysis and nonlinear least squares estimation can be calculated.
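The mechanics of Eqs. (7) and (8) can be made concrete on a discrete toy problem: a single linear reservoir with one parameter and a least-squares functional. One backward sweep of the adjoint variable p returns the gradient, which a finite-difference run (two model integrations) then verifies. This sketch is purely illustrative; the model, names and values are not part of MARINE.

```python
import numpy as np

# Toy model: x_{t+1} = x_t + dt*(-alpha*x_t + r_t), J = 0.5*sum (x_t - obs_t)^2.
# The adjoint sweep costs the same for any number of parameters, whereas
# finite differences need one extra model run per parameter.

def forward(alpha, r, x0, dt):
    x = np.empty(len(r) + 1)
    x[0] = x0
    for t in range(len(r)):
        x[t + 1] = x[t] + dt * (-alpha * x[t] + r[t])
    return x

def J(alpha, r, x0, dt, obs):
    return 0.5 * np.sum((forward(alpha, r, x0, dt) - obs) ** 2)

def gradient_adjoint(alpha, r, x0, dt, obs):
    x = forward(alpha, r, x0, dt)
    p, grad = 0.0, 0.0                 # adjoint state, gradient accumulator
    for t in reversed(range(len(r))):  # backward in time, as in Eq. (7)
        p += x[t + 1] - obs[t + 1]     # forcing by the residual (dphi/dx)
        grad += p * dt * (-x[t])       # (dM/dalpha)^T p contribution, Eq. (8)
        p *= 1.0 - dt * alpha          # propagation by (dM/dx)^T
    return grad

rng = np.random.default_rng(0)
r = rng.uniform(0.0, 2.0, size=50)
dt, x0 = 0.1, 1.0
obs = forward(0.7, r, x0, dt)          # synthetic observations ("truth")

alpha = 0.5
g_adj = gradient_adjoint(alpha, r, x0, dt, obs)
eps = 1e-6                             # one-sided finite difference check
g_fd = (J(alpha + eps, r, x0, dt, obs) - J(alpha, r, x0, dt, obs)) / eps
# g_adj and g_fd agree closely for this smooth toy problem.
```

The backward recurrence is the discrete counterpart of integrating Eq. (7) from t=T to t=0 while accumulating Eq. (8).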

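The tangent linear model of Eq. (6) has a direct code-level counterpart: the forward mode of automatic differentiation, in which every intermediate value carries its directional derivative and the chain rule is applied operation by operation. A minimal dual-number implementation, unrelated to the internals of any production AD engine, illustrates the principle:

```python
import math

# Minimal forward-mode AD with dual numbers: each value is a pair (v, dv)
# and every arithmetic operation applies the chain rule "line by line".
class Dual:
    def __init__(self, v, dv=0.0):
        self.v, self.dv = v, dv
    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.v + o.v, self.dv + o.dv)
    __radd__ = __add__
    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.v * o.v, self.dv * o.v + self.v * o.dv)
    __rmul__ = __mul__

def sin(x):
    # chain rule for an elementary function
    return Dual(math.sin(x.v), math.cos(x.v) * x.dv)

# d/dx [x*sin(x) + 2x] at x = 1.3, seeded with the direction dx = 1
x = Dual(1.3, 1.0)
y = x * sin(x) + 2 * x
analytic = math.sin(1.3) + 1.3 * math.cos(1.3) + 2.0
```

One program evaluation propagates one input direction; the reverse mode instead propagates from the outputs back to the inputs, which is why it is preferred when many inputs feed few outputs.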
Especially because of the terms (∂M/∂x)^T and (∂M/∂α)^T in Eqs. (7) and (8), for a given model the practical implementation of the adjoint state method can require substantial efforts. Different paths can be pursued depending on whether the operations are carried out on the continuous form of the direct model, on its discretized form, or directly on the computer code implementing the composition of Eq. (1) and Eq. (2). From a numerical point of view, the best representation of the functional to be differentiated is the associated computer code. Tremendous advances have been made in algorithmic differentiation (Griewank, 2000) and consequently the code-based approach is facilitated by the advent of powerful automatic differentiation (AD) engines (see http://www.autodiff.org). The derivatives computed by means of algorithmic differentiation are accurate to machine precision. Considering the computer code implementing the direct model (model and objective functional) as a concatenated sequence of instructions, algorithmic differentiation is based on a rigorous application of the chain rule, line by line. The application of the chain rule from the inputs to the outputs of the function is denoted as the forward mode of AD (such as in Eqs. 6 and 5) whereas the reverse mode operates from the outputs to the inputs. For vector-valued response functions, it can be shown that when the ratio between the dimension of the input space and the dimension of the output space is greater than one, the reverse mode is more efficient in computing the Jacobian. The reverse mode of AD is the discrete equivalent of the adjoint state method from optimal control theory and it is perfectly suited when the response is a scalar.

Flash flood model

In the version used for this study, rainfall abstractions are evaluated using the Green-Ampt infiltration model and the resulting surface runoff (hillslope flow) is transferred using the kinematic wave approximation (KWA).
The complex geometry of the watershed is described by a uniform grid in which each cell receives water from its upslope neighbors and discharges to a single downslope neighbor (steepest direction). For a one-dimensional downslope flow of average velocity u and average depth h, the continuity equation can be expressed as:

$$\frac{\partial h}{\partial t} + \frac{\partial (uh)}{\partial x} = r - i \quad (9)$$

where r is the rainfall intensity and i the infiltration rate. Using the KWA, which has shown the ability to represent channelized and sheet overland flow (Singh, 2001), the momentum conservation equation reduces to an equilibrium between the bed slope S_0 and the friction slope S_f. The Manning equation (uniform flow on each grid cell) is used to relate the flow velocity and the flow depth:

$$u = \frac{1}{n}\, R^{2/3} S_0^{1/2}, \qquad R = \frac{wh}{w + 2h} \quad (10)$$

where R is the hydraulic radius, n the Manning roughness coefficient and w the elemental flow width. In this simplified version of the model, the flow width is constant (rectangular section) and, given the ratio between the width (grid resolution) and the flow depth, the hydraulic radius is approximated by the water depth (i.e. R=h). The resulting equation governing the overland flow is given by:

$$\frac{\partial h}{\partial t} + \frac{\partial}{\partial x}\left( \frac{S_0^{1/2}}{n}\, h^{5/3} \right) = r - i \quad (11)$$

In the right-hand side of Eq. (11), which represents the excess rainfall, the infiltration rate i(t) is estimated using the Green-Ampt equation, a very classic simplified representation of the infiltration process. For a homogeneous soil column characterised by its hydraulic conductivity K and the soil suction ψ at the downward-moving wetting front, the potential infiltration rate is given by

$$i(t) = K \left( 1 + \frac{(\eta - \theta)\,\psi}{I(t)} \right) \quad (12)$$

where θ is the initial soil moisture content, η the soil porosity and I(t) the cumulative infiltration at time t. After ponding (Mein and Larson, 1973), the cumulative infiltration at time t+∆t can be calculated by the following equation, which is solved by Newton's method:

$$I(t+\Delta t) = I(t) + K\,\Delta t + (\eta-\theta)\,\psi\, \ln\!\left( \frac{I(t+\Delta t) + (\eta-\theta)\,\psi}{I(t) + (\eta-\theta)\,\psi} \right) \quad (13)$$
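The implicit ponded-infiltration update can be sketched as a small Newton iteration. The parameter values below are illustrative (same orders of magnitude as Table 1) and the function names are our own, not taken from the MARINE code.

```python
import math

# Green-Ampt cumulative infiltration after ponding (Eq. 13 form):
#   I2 = I1 + K*dt + psi*dtheta * ln((I2 + psi*dtheta)/(I1 + psi*dtheta))
# solved for I2 by Newton's method.

def green_ampt_step(I1, dt, K, psi, dtheta, tol=1e-12, itmax=100):
    """Advance the cumulative infiltration I by one time step dt (units: m, h)."""
    c = psi * dtheta
    I2 = I1 + K * dt                       # initial guess
    for _ in range(itmax):
        f = I2 - I1 - K * dt - c * math.log((I2 + c) / (I1 + c))
        df = 1.0 - c / (I2 + c)            # f is increasing and convex in I2
        step = f / df
        I2 -= step
        if abs(step) < tol:
            break
    return I2

# March the infiltration over one hour with 1-min steps
K, psi, dtheta = 4e-3, 0.2, 0.25           # conductivity, suction, (eta - theta)
I, dt, rates = 1e-4, 1.0 / 60.0, []
for _ in range(60):
    I_new = green_ampt_step(I, dt, K, psi, dtheta)
    rates.append((I_new - I) / dt)         # infiltration rate, decays toward K
    I = I_new
```

As expected from Eq. (12), the computed rate is largest just after ponding (small I) and decays toward the hydraulic conductivity K as the cumulative infiltration grows.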

Case study
The previously described model is applied to a very small catchment (25 km²) (see Fig. 1) from the upper part of the Thoré basin, which was affected by a catastrophic flood event in November 1999. Unfortunately the event was not gauged since all measuring devices were washed away by the flood. Therefore, a priori values (derived from published tables) for the parameters are used in the generation of a reference virtual hydrological reality. Although different parametrizations will be used in this paper, for the information of the reader typical uniform values are shown in Table 1. When rainfall forcing is estimated from real radar data (from Météo France), the nominal values specified produce specific discharges typical of Mediterranean flash flood events.

Sensitivity analysis
Most contributions to sensitivity and uncertainty analysis in hydrology have been carried out in a statistical framework. Since it is necessary to sample the control space in such analyses, the computational cost always depends on the number of variables considered (curse of dimensionality). However, depending on the purpose of the sensitivity analysis, it may not be necessary to average information over the entire bounded parameter space, and local approaches around behavioral nominal values may prove very informative. In order to corroborate and improve our understanding of the way the different model parameters control the hydrological response, the tangent linear and adjoint models of MARINE were developed using the direct and reverse modes of the TAPENADE automatic differentiation engine (Hascoët and Pascual, 2004). The nominal values for the parameters, the initial and boundary conditions define a trajectory in the model phase space. For a given trajectory we compute local sensitivity indices by means of derivative information. The outcome to be analysed is a gradient for a scalar response and the entire Jacobian matrix of the transformation for a vectorial response.

Numerical experiments with parametrization of reduced dimensionality
For the distributed modelling of catchment hydrology, parameters are discretized according to the spatial discretization of model state variables. However, even if different values can be assigned to every single element of the discretization using a priori information, some simplification of the structure of this high-dimensional space is mandatory to make the inverse problem tractable. While very sophisticated techniques have been developed in groundwater modelling for the identification of optimal parametrizations (Sun and Yeh, 1985; Ben Ameur et al., 2002; Tsai et al., 2003), zonation based on a priori information is the most commonly used strategy in distributed rainfall-runoff modelling. With a fixed pattern for the whole watershed (or a very limited number of zones), scalar multipliers are used for both sensitivity analysis and parameter estimation. This classical reduction of the control space is adopted here: a correction factor for the drainage network and another one for the hillslopes are specified. The relative importance of those input factors on two aspects of the hydrological response (flood volume and flood peak) is provided by Figs. 2 and 3. Due to the mathematical formulation of the infiltration model (Eq. 12), θ, ψ and η have the same first-order effect. It is important to note that all sensitivities are negative because increasing the nominal value of any of the parameters reduces the magnitude of the response. The analysis of Fig. 2 confirms that the flood volume is mainly determined by the infiltration parameters on the hillslopes (hydraulic conductivity K and initial soil moisture θ). Additional experiments, which are not reported here, show that the wetter the soil at the beginning of the event, the faster the decay of the infiltration rate to the hydraulic conductivity and therefore the greater the relative influence of K compared to the initial soil moisture θ.
The previously mentioned parameters have a significant influence on the flood peak but they are largely overtaken by the correction factor affecting the roughness in the drainage network (see Fig. 3). Due to the flow concentration, the roughness coefficient plays a more important role in partitioning infiltration and runoff in the drainage network.
If the quantification of the effect of parameter variations on the complete hydrograph is of interest, a vectorial response containing the temporal evolution (80 time steps) of the simulated discharge can be considered. Due to the ratio between input and output space dimensions (i.e. 6/80), the Jacobian matrix is computed using the multi-directional tangent mode of TAPENADE. Each column of the Jacobian matrix is the result, for all the time steps composing the response, of an infinitesimal perturbation of one of the input parameters. While one can propose a physical interpretation for the lines and/or columns of the Jacobian matrix, a very interesting viewing angle is provided by its singular value decomposition (SVD). The singular value decomposition of an m×n matrix A is a factorization of the form

$$A = U S V^T \quad (14)$$

where S is a diagonal matrix containing the singular values of A in decreasing order while U and V are orthogonal matrices (respectively of dimension m×m and n×n). The resulting σ_1, σ_2, ..., σ_min(m,n) is referred to as the singular spectrum of A. The columns of U = {u_1, u_2, ..., u_m} and V = {v_1, v_2, ..., v_n} are the left and right singular vectors, lying respectively in the output and input spaces of the transformation represented by A. The magnitude of the singular values in S represents the importance of the corresponding singular vectors in the columns of U and V. This factorization is widely used for the analysis of linear ill-posed problems (Hansen, 1998) and its potential extrapolation to nonlinear systems using the Jacobian, the adjoint or the Hessian of the model operator is spreading (Buizza and Palmer, 1995; Clément et al., 2004; Le Dimet et al., 2002; Li et al., 2005).
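The SVD view of such a Jacobian can be sketched numerically. The matrix below is synthetic, built with a rapidly decaying spectrum merely to mimic the situation of an 80-step hydrograph response to 6 scalar multipliers; none of the numbers come from MARINE.

```python
import numpy as np

# Synthetic "Jacobian": rows = 80 output time steps, columns = 6 parameters,
# constructed as A = U S V^T with a fast-decaying singular spectrum.
rng = np.random.default_rng(1)
m, n = 80, 6
U0, _ = np.linalg.qr(rng.standard_normal((m, n)))   # orthonormal output basis
V0, _ = np.linalg.qr(rng.standard_normal((n, n)))   # orthonormal input basis
s_true = np.array([10.0, 4.0, 0.5, 0.2, 0.05, 0.01])
A = U0 @ np.diag(s_true) @ V0.T

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Fraction of the variability captured by the first two singular triplets
captured = np.sum(s[:2] ** 2) / np.sum(s ** 2)

# Vt[0] is the dominant direction in the 6-dim parameter space;
# U[:, 0] is its signature in the 80-step hydrograph space.
```

With such a spectrum, two orthogonal directions in the parameter space capture nearly all of the variability of the simulated hydrograph, which is the situation reported in Table 2.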
With the adopted parametrization of reduced dimensionality, the singular spectrum is given in Table 2 and the components of the first two singular vectors in the parameter space (right singular vectors) are shown in Fig. 4. The subscript "r" corresponds to the drainage network and the subscript "v" corresponds to the hillslopes. From Table 2 it can be seen that the decay of the singular values is very rapid. Most of the variability (more than 96%) is contained in the first two vectors and, because they represent orthogonal directions in the parameter space, their components exhibit a clear distinction between the production and transfer of runoff. Given the components of the first singular vector and the magnitude of the associated singular value, the domination of the runoff production parameters is accompanied by a compensation with the friction parameters. This is probably due to the fact that an increase in friction causes spreading of the flood hydrograph: discharge values diminish up to the flood peak and increase after. Furthermore, using the adjoint state method (reverse mode of AD), the computational cost related to the evaluation of local sensitivities does not depend on the dimension of the input space; therefore a similar analysis is proposed without any a priori reduction of the control space.

Numerical experiments with fully distributed parameters
The use of sampling-based sensitivity analysis methods is not tractable for distributed parameter systems without dimension reduction. Since this limitation does not apply to the local sensitivity analysis method adopted in this paper, we use in this section the full potential of the adjoint method in order to analyse the influence on the hydrological response of the values specified for all the elements of the discretization. Considering the scalar responses analysed in the previous section, a single integration of the adjoint model yields sensitivity indices for all parameters. For the sensitivity of the flood volume to the hydraulic conductivity and the sensitivity of the flood peak to the roughness coefficient, the spatial variability of sensitivity indices is provided by Fig. 5. The darker the pixel, the greater the sensitivity of the response to the parameter. As expected, sensitivities are larger around the drainage network for both parameters. There is no contradiction with the results obtained previously: since the hillslopes cover a larger surface, a scaling factor affecting this area is more influential on the overall runoff coefficient than one modifying the values in the drainage network.
For the sensitivity of the maximum discharge to the roughness coefficient, two color scales were used due to the positive and negative sensitivities encountered over the surface; local effects counterbalance the overall effect in some concomitant sub-basins. Therefore, when applying scalar multipliers, compensation effects usually occur which are very difficult to identify without such an analysis. For example, increasing the nominal value by 10% for all roughness coefficients leads to a -4.5% change in the peak discharge. This variation is -5.9% when only the cells showing negative sensitivities are modified and +1.5% when the same operation is carried out on the cells featuring positive sensitivities. If one is interested in ranking the different parameters, a (carefully chosen) norm of the normalized gradient can be derived and provides rigorous sensitivity indices. When considering the vectorial response, due to the absence of parametrization for the 3 parameters, the ratio between the dimensions of the input and output spaces is now very close to 100 (i.e. 3×2582/80). The Jacobian matrix is computed line by line using multiple integrations of the adjoint model. The multi-directional reverse mode is not yet implemented in TAPENADE but it should lead to a significant reduction of the computational effort in the near future. In order to ensure the comprehensibility of the SVD results, the factorization is performed on sub-Jacobians. Each sub-matrix accounts for a single parameter but for all spatial locations. Even if the relative influence of the different parameters is not taken into account, at the parameter level the analysis contributes to an extensive understanding of the influence of the values specified over the entire watershed.
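The compensation effect under a scalar multiplier can be illustrated with a first-order computation. The per-cell sensitivities below are random numbers invented for the illustration (they are not the MARINE adjoint sensitivities); only the mechanism of mixed-sign contributions aggregating under a uniform perturbation is the point.

```python
import numpy as np

# If per-cell sensitivities dQpeak/dn_j have mixed signs, a uniform +10%
# roughness change mixes opposing local contributions (first-order effect).
rng = np.random.default_rng(2)
sens = rng.normal(-1.0, 2.0, size=2582)      # illustrative sensitivities
n0 = 0.05 * np.ones_like(sens)               # nominal roughness everywhere
dn = 0.10 * n0                               # uniform +10% perturbation

effect_all = np.sum(sens * dn)               # all cells perturbed together
neg, pos = sens < 0, sens >= 0
effect_neg = np.sum(sens[neg] * dn[neg])     # only negative-sensitivity cells
effect_pos = np.sum(sens[pos] * dn[pos])     # only positive-sensitivity cells

# The scalar multiplier aggregates two opposing contributions:
# effect_all = effect_neg + effect_pos, with effect_neg < 0 < effect_pos.
```

This is the same pattern as the reported -4.5% overall change splitting into -5.9% and +1.5% when the negative- and positive-sensitivity cells are perturbed separately.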
The singular spectrum for all the parameters and different forcing conditions (lumped and spatially variable) is given by Fig. 6. The analysis of this figure reveals that the decay of singular values is faster for the roughness coefficient than for the infiltration parameters. It is also important to note that this gap is reduced when the spatial variability of rainfall is taken into account. In other words, we corroborate the natural reasoning stating that spatially variable precipitation emphasizes the influence of the spatial variability of friction parameters. It seems that more complex parametrizations are needed to capture the influence of heterogeneity for the infiltration parameters K and θ. In order to quantify the number of identifiable degrees of freedom in the parametrization more precisely, the relative importance of the parameters and the level of noise in the observations should be taken into account. However, using this approach it is possible to measure the effect of increasing the information content of observations (such as internal gauging stations or several flood events) on the identifiability of the parameters. The visualization of singular vectors in the parameter space for n and K (Figs. 7 and 8) also provides an extensive insight into the model behavior. The presence of two concomitant sub-basins driving the variability of the simulated discharges at the outlet of the watershed is clearly characterized. For the roughness coefficient n and the hydraulic conductivity K, the first singular vector is dominated by the main stream and its reception basin (Figs. 7a and 8a). For the hydraulic conductivity, the sensitivity magnitude decreases with the distance from the principal convergence zone in the river network. In an analogous manner, positive components are encountered mainly on the other sub-basin for the second singular vectors (see Figs. 7b and 8b).
For the roughness coefficient, only some elements situated very close to the outlet are part of the main sub-basin, whereas for the hydraulic conductivity some elements situated very far from the outlet also are. When analyzed in the observation space, the potential interactions between the two sub-basins are outlined. For the roughness coefficient, the components of u_1 and u_2 (singular vectors in the observation space) are plotted together with the outlet discharge (see Fig. 9). Analyzing this figure, it is evident that the slight disruptions of the hydrograph during both the rising limb and the recession are not due to the temporal variability of the rainfall but to the interaction of the flood waves traveling in the sub-basins. As expected, there is a perfect correspondence between the singular vectors of the input and output spaces. Because the smaller sub-basin (holding v_2) is closer to the outlet of the catchment, the resulting smaller concentration time leads to a quicker response at the outlet, perfectly characterized by u_2. It was shown in this section that the derivatives obtained with algorithmic differentiation provide valuable insight into the relation between the model parameters and the simulated hydrological response. Furthermore, the availability of accurate adjoint-based sensitivities enables the use of efficient gradient-based optimization techniques.

Parameter estimation
Whatever the complexity of the mathematical representation of hydrological systems, parameters are not directly measurable entities. Although the development of strategies for the a priori estimation of model parameters has received growing attention, calibration is still an important step in the modelling process. Mostly because of the numerical artefacts produced by the mathematical, numerical and algorithmic representation of hydrological processes in lumped conceptual models, the classical and very efficient gradient-based parameter estimation methods have been abandoned. With the evolution of the underlying philosophy and perceived objectives of model calibration, together with the rapid increase of computational power, computationally intensive non-smooth global search strategies have come to the forefront in hydrology. The calibration of spatially distributed models using very scarce observations of the hydrological response leads to ill-posed inverse problems if no regularization strategy is adopted. In meteorological or oceanographic data assimilation (state updating), an additional term (Tikhonov regularization) is added to the cost function to penalize control variable iterates which are too far from the a priori value (result of a model integration) for the state of the atmosphere. Because it is acknowledged that behavioral parameters may lie in very different regions of the parameter space, global optimization methods are used for distributed models with very parsimonious parametrizations, in terms of reduced flexibility in the spatial variability (scalar multipliers). However, when a priori values for the parameters are relevant, it was shown that local search methods may offer advantages over global techniques (Kuzmin et al., 2004).

Concerning the widely discredited continuity of the objective function derivatives, it has been shown that in many cases problems can be avoided using an appropriate and robust model implementation (Kavetski et al., 2006b,c). The formulation of hydrological processes and the computational approach usually adopted for distributed models tend to produce smoother response surfaces. It can be very difficult to avoid the presence of internal model thresholds, especially when they are really due to physical features of the system. In the algorithmic representation of computer models, internal thresholds translate into conditional statements determining the control flow of the computer program. Therefore, the derivatives computed with algorithmic differentiation are valid only in a certain domain around the nominal values of the input variables (Araya-Polo, 2006). The author even implemented the evaluation of this safe interval, and the facility is now provided by the forward mode of TAPENADE. However, it is also important to note that the objective functions used for the calibration of model parameters involve an integration of the residuals over time for integrated hydrological responses (spatio-temporal smoothing). Thereby, internal thresholds, especially when occurring at the grid element level, do not necessarily produce discontinuous derivatives of the objective function. Even if they do, in the context of variational data assimilation Zhang et al. (2000) have shown that differentiable minimization algorithms such as quasi-Newton (BFGS) may still work well for minimizing non-smooth cost functions. The use of non-differentiable optimization algorithms (such as bundle methods) employing subgradients can be considered if difficulties are encountered. In this paper, a bound-constrained (inequality constraints) quasi-Newton (BFGS) optimization algorithm (Lemaréchal and Panier, 2000) from the MODULOPT library was used.
Using the adjoint sensitivities, the algorithm estimates the active set by performing a Wolfe line search along the gradient projection path. For the evaluation of the approach, synthetic observations are generated with the reduced parametrization used in the previous section, with k_r=4 mm h⁻¹, k_v=2 mm h⁻¹, n_r=0.05, n_v=0.08 and θ=0.5 (uniform over the watershed). The Nash criterion is used to measure the misfit between the model simulations and the synthetic observations. As shown in Fig. 10, all control variables are retrieved independently of the initialization point. The relative importance of the parameters inferred from the SA results is confirmed: the more sensitive the response to a parameter, the greater the identifiability of this parameter and therefore the faster the iterates converge to the reference value. Furthermore, sensitivity analysis results are usually used to assess but also enhance the identifiability of model parameters. Since the analysis carried out in the previous section exhibits the spatial variability of the sensitivity of the hydrological response to the model parameters, rather than simply guiding the choice of calibration parameters (factor fixing), SA might provide guidance for a reasonable increase of the parametrization complexity. In fact, the subspace of the original parameter space driving the simulated discharges is spanned by the first right singular vectors from the SVD of the Jacobian matrix. Given that a small number of singular values are dominant, as illustrated in Fig. 6, most of the variability can be captured with very few orthogonal directions in the parameter space. Even if linearization is a local concept, the right singular vectors are mainly determined by the topography of the watershed and we do not expect important modifications when the Jacobian is evaluated for different trajectories in the model phase space.
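The twin-experiment logic of such a calibration can be sketched in miniature: synthetic observations are generated with reference parameters, then retrieved with a bound-constrained quasi-Newton search. The 2-parameter model below is a toy stand-in, not MARINE, and the SciPy L-BFGS-B routine is used in place of the MODULOPT algorithm; all values are invented for illustration.

```python
import numpy as np
from scipy.optimize import minimize

# Toy rainfall-runoff stand-in: runoff coefficient c and residence time n
# feed a linear reservoir routing of the net rainfall.
def simulate(theta, rain, dt=0.1):
    c, n = theta
    x, out = 0.0, []
    for rt in rain:
        x = x + dt * (c * rt - x / n)    # explicit linear-reservoir step
        out.append(x)
    return np.array(out)

rng = np.random.default_rng(3)
rain = rng.uniform(0.0, 5.0, size=200)
theta_ref = np.array([0.6, 2.0])
obs = simulate(theta_ref, rain)          # noise-free "virtual reality"

def cost(theta):                         # least-squares misfit
    res_t = simulate(theta, rain) - obs
    return 0.5 * float(np.sum(res_t ** 2))

result = minimize(cost, np.array([0.3, 0.5]), method="L-BFGS-B",
                  bounds=[(0.01, 1.0), (0.1, 10.0)])
```

With noise-free observations and a smooth response surface, the bounded quasi-Newton search recovers the reference parameters from a distant starting point, mirroring the retrieval behaviour reported in Fig. 10.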
In order to compute the vectors describing the relevant subspace for data fitting, the factorization was performed on the Jacobian calculated with spatially uniform rainfall forcing and model parameters (reference values of Table 1). Compared to the parameter estimation experiments carried out in the previous section, a more complex virtual hydrological reality was used for the generation of the synthetic discharge series. Deterministic spatial distributions were imposed for the different model parameters: the hydraulic conductivity K decreases linearly with the ground elevation; the roughness coefficient n is derived from a land-use classification based on SPOT satellite data for the hillslopes and is uniform in the drainage network; and lastly the initial soil moisture θ is constant over the watershed. Then, the calibration problem is tackled using parametrizations of increasing dimensionality. The number of degrees of freedom for each parameter, the Nash performance of the identified parameter set and the inverse of the condition number are given in Table 3.

Starting with the very simple parametrizations P 1 and P 2 , the number of degrees of freedom is increased gradually using the singular vectors driving X% of the variability for parameters K and n (labelled P SV X in the table). The condition number was calculated with H the quasi-Hessian after the last BFGS update (i.e. at the optimum). We recall that the larger the ratio 1/κ(H), the better the conditioning of the optimization problem. From the results given in Table 3, it seems that with this description of the parameter space the number of control variables can be increased without degrading the conditioning of the optimization problem. The previous statement is valid as long as the vectors describing the kernel in the parameter space (the specified degrees of freedom which do not significantly alter the hydrological response) are not introduced in the parametrization. The results obtained with P SV 90 show that, even with noise-free observations, the use of those directions for the description of the affordable subspace leads to instability in the inverse problem.
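The diagnostic 1/κ(H) can be computed directly from the singular values of the quasi-Hessian; the two diagonal matrices below are illustrative stand-ins, not values from Table 3:

```python
import numpy as np

def inverse_condition_number(H):
    # 1/kappa(H) in the 2-norm: ratio of the smallest to the largest
    # singular value. Close to 1: well-conditioned optimization problem;
    # close to 0: nearly singular (unstable) problem.
    s = np.linalg.svd(H, compute_uv=False)
    return s[-1] / s[0]

H_good = np.diag([2.0, 1.5, 1.0])   # illustrative quasi-Hessians only
H_bad = np.diag([2.0, 1.5, 1e-8])   # one nearly flat direction in the cost
```

A nearly flat direction in the cost function, like the kernel vectors mentioned above, drives 1/κ(H) toward zero even when every other direction is well constrained.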
However, it is interesting to note that with respectively 7 and 10 degrees of freedom (i.e. P SV 70 and P SV 80 ), the conditioning is even better than the one obtained with parametrization P 2 . As emphasized by Tonkin and Doherty (2005), the subspace determined from the truncated singular value decomposition (TSVD) of the Jacobian is driven by the information content of the observations, whereas the subspace constructed from a prior parsimony strategy is not. The previously cited authors used the TSVD of a finite difference Jacobian matrix appearing in the linearized equations of the Levenberg-Marquardt method for the regularization of the inverse problem. In order to prevent over-fitting and combine the advantages of the TSVD and Tikhonov regularizations, a hybrid regularization methodology is proposed and embedded in the latest version of the PEST package (Doherty, 2004).
While the truncation level of the SVD is the only mechanism for preventing overfitting, the quadratic penalization term of the Tikhonov approach is also a way to insert a priori information on the parameters. Even if physically acceptable bounds were assigned for each parameter (a single value across the watershed surface), the improvement of the performances in terms of Nash coefficient does not necessarily come with behavioral parameters (see Table 4). For the ideal situation where the estimated parameters can be compared to the virtual hydrological reality, the Nash criterion N X and the coefficient of determination R 2 X (estimated versus reference values) were computed for the different parametrizations. Except for the parameter θ (poorly constrained by the observations), which was used to compensate for the lack of degrees of freedom of overly simple parametrizations, P 2 seems superior to the other strategies. However, the compensation effects are reduced with increasing complexity for the other parameters and this leads to a better identification of θ.
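The stabilizing effect of the quadratic penalty can be sketched on a single Gauss-Newton increment; the synthetic Jacobian, the residual vector and the value of the regularization parameter alpha are all hypothetical:

```python
import numpy as np

def tikhonov_step(J, residual, alpha):
    # One Gauss-Newton increment with a quadratic (Tikhonov) penalty:
    # solves (J^T J + alpha * I) dp = J^T residual. The penalty bounds the
    # increment along directions the observations barely constrain.
    n = J.shape[1]
    return np.linalg.solve(J.T @ J + alpha * np.eye(n), J.T @ residual)

rng = np.random.default_rng(1)
J = rng.standard_normal((30, 10))
J[:, -1] *= 1e-8                                # nearly unidentifiable direction
r = rng.standard_normal(30)

dp_reg = tikhonov_step(J, r, alpha=1e-2)
dp_raw = np.linalg.lstsq(J, r, rcond=None)[0]   # unregularized: blows up
```

The unregularized solution takes an enormous step along the nearly unidentifiable direction, which is precisely the overfitting behaviour that either truncation or the penalty term is meant to suppress.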
Using the combination with Tikhonov regularization, one could introduce additional a priori information with a penalization term enforcing the correlation with a reasonable spatial distribution. The investigation of a hybrid strategy similar to the one proposed by Tonkin and Doherty (2005) is beyond the scope of this paper, but special care should be taken in the specification of the regularization parameter. The choice of the mathematical formulation for this regularization term should maintain a good compromise between flexibility and stability.

Conclusions and discussion
Recent advances in computing power and observation capabilities enable the representation of the spatial variability characterizing the atmospheric forcing and the catchment attributes. Although the choice of the necessary and affordable model complexity providing accurate and reliable predictions of the hydrological response of a watershed to rainfall forcing is still an open problem, models are gradually evolving from simplified bucket models to more complex structures. Even if their use can be limited for real-time flood forecasting, distributed models are valuable tools for understanding hydrological processes. However, the analysis and control of those complex systems is very challenging and all aspects cannot be addressed using the paradigm inherited from lumped hydrological models (low dimensionality, low computational cost, no sizable constraints on parameter values). For the moment, little regard has been paid to the potential contribution of deterministic methods that have proven successful for other disciplines facing the same challenges. As Margulis and Entekhabi (2001) emphasized for land surface hydrology applications, many of the issues that led meteorologists and oceanographers to use adjoint techniques are now at the forefront in hydrology.

Using a very simple and common model structure and an idealized test case configuration, it has been shown that variational methods hold considerable potential for catchment-scale hydrology. Although for this particular application most of the outcomes reduce to evidence retrieval, the adopted techniques should be further exploited. The approach is not model-free, but its practical implementation is largely facilitated by the advent of very efficient automatic differentiation tools such as the one used for this study. It is important to emphasize that a single integration of the adjoint code, encompassing the forward integration of the direct model and the backward integration of the adjoint model, yields all spatial and even temporal sensitivities (Hall and Cacuci, 1983) for the nominal values prescribed for the control variables. The key advantage of this technique is that the computational cost is independent of the dimension of the control space. Given the abundance of extracted information, adjoint sensitivity analysis is a profitable step to be carried out before the assimilation of observations for parameter and state estimation. The results obtained in this paper show that the influence of each input factor in the high-dimensional parameter space can be investigated (spatial SA). In comparison, similar calculations using sampling-based approaches would imply a prohibitive computational cost. When the entire flood hydrograph is analyzed, the interpretation of the singular vectors of the Jacobian in the parameter space and in the observation space brings out very relevant information: the parameters really influencing the hydrological response (spatial location) and the measurements really constraining the parameters (temporal location) are identified. Although differential methods are intrinsically local, they are perfectly suited when the analysis of the system in behavioral regions of the parameter space is of interest.
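The single-sweep property can be illustrated with a hand-coded adjoint of a toy forward model; the cumulative sum below is a schematic stand-in for the time integration of the direct model, not the paper's kinematic wave code:

```python
import numpy as np

# Forward model: S = cumsum(p), a stand-in for a time integration of the
# direct model; the cost integrates squared residuals over time.
def forward(p):
    return np.cumsum(p)

def cost(p, obs):
    return np.sum((forward(p) - obs) ** 2)

def adjoint_gradient(p, obs):
    # Backward sweep: the adjoint of a cumulative sum is a reversed
    # cumulative sum, so the complete gradient with respect to every
    # parameter costs about one extra model integration, independently
    # of the number of parameters.
    lam = 2.0 * (forward(p) - obs)     # adjoint forcing from the misfit
    return np.cumsum(lam[::-1])[::-1]

p = np.zeros(5)
obs = np.ones(5)
g = adjoint_gradient(p, obs)
```

One forward and one backward pass deliver the full sensitivity vector, whereas a finite-difference approach would require one perturbed model run per parameter.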
The scope of the analysis can be extended by varying event-based input factors such as antecedent soil moisture conditions and rainfall forcing (multi-local sensitivity analysis). The availability of distributed physically based models does not, and will not in the near future, overcome the need to calibrate at least part of the model parameters. When applying parsimonious parametrizations, subgradients computed with the reverse mode of algorithmic differentiation (adjoint method) were found exceedingly efficient in driving bound-constrained quasi-Newton optimization algorithms to the reference values used to generate synthetic observations. If no prior strategy is adopted for the parametrization, the calibration problem is ill-posed. Exploiting the results obtained during the sensitivity analysis, prospects are formulated and illustrated for the regularization of the inverse problem using the truncated singular value decomposition of the Jacobian matrix. By using a subspace of the original control space for the calibration of model parameters, the approach is very similar to the reduced-order strategies proposed in data assimilation (Blayo et al., 1998). In the variational data assimilation framework, but also in the hybrid approach proposed by Tonkin and Doherty (2005), this strategy is combined with the classical Tikhonov regularization to prevent overfitting and enforce the convexity of the cost function. In fact, in order to limit the natural increase of parameter uncertainty resulting from an increasing parametrization complexity, a priori information on the spatial organization and the nominal values of the parameters could be used.
The authors acknowledge the fact that nominal values for the parameters are inferred from indirect and uncertain observations of the hydrological response, with imperfect models driven by corrupted rainfall forcing and using performance measures based on the integrated catchment or sub-catchment response. The separation of the different sources of uncertainty is very challenging because of complex compensation effects. Following the development of mathematical formulations for meaningful error models, variational methods can provide an invaluable insight into the analysis and control of the uncertain system dynamics (Vidard et al., 2000; Vidard et al., 2004). Alternatively, for the integration of gradient-based methods into a Bayesian probabilistic framework, the reader is referred to the very promising contributions of Kavetski et al. (2006a).