Improvement, calibration and validation of a distributed hydrological model over France

Abstract. The hydrometeorological model SAFRAN-ISBA-MODCOU (SIM) computes water and energy budgets on the land surface and riverflows and the level of several aquifers at the scale of France. SIM is composed of a meteorological analysis system (SAFRAN), a land surface model (ISBA), and a hydrogeological model (MODCOU). In this study, an exponential profile of hydraulic conductivity at saturation is introduced to the model and its impact analysed. It is also studied how calibration modifies the performance of the model. A very simple method of calibration is implemented and applied to the parameters of hydraulic conductivity and subgrid runoff. The study shows that a better description of the hydraulic conductivity of the soil is important to simulate more realistic discharges. It also shows that the calibrated model is more robust than the original SIM. In fact, the calibration mainly affects the processes related to the dynamics of the flow (drainage and runoff), and the rest of relevant processes (like evaporation) remain stable. It is also proven that it is only worth introducing the new empirical parameterization of hydraulic conductivity if it is accompanied by a calibration of its parameters, otherwise the simulations can be degraded. In conclusion, it is shown that the new parameterization is necessary to obtain good simulations. Calibration is a tool that must be used to improve the performance of distributed models like SIM that have some empirical parameters.

1 Introduction 20 Few distributed models are able to simulate the main land surface processes at the scale of a country like France (Henriksen et al., 2003;Mitchell et al., 2004). At this scale, many difficulties arise, which are mainly related to scale and parameterization. The SIM, model, which is used operationally at Météo-France, for example, to monitor simulate the exchanges in heat, mass and moment between the continental surface (including vegetation and snow) and the atmosphere. There are several versions of ISBA, ranging from a two layer force-restore method (Deardorff, 1977), to a more detailed diffusion version (Boone, 2000). SIM is implemented using the three layered force-restore version (Boone et al., 1999) with the 3-layer snow scheme of Boone and 5 Etchevers (2001).
In the three layered version of ISBA, the evolution of the soil water content for each layer (omitting phase changes) follows these equations: where w i are the volumetric soil water contents for each layer, d i are the soil depths, ρ w is the water density, I is the infiltration (defined as the difference between precipitation and surface runoff), E g is the evaporation over bare ground, E tr is the transpiration of Introduction where τ is a time constant, w fc is the soil water content at field capacity and w eq is the soil water content at the equilibrium between capillarity and gravity. C 1 , C 2 , C 3 , C 4 are the force-restore coefficients. These, and the hydrological parameters of the soil (soil water contents at the wilting point (w wilt ), field capacity (w fc ) and 10 saturation (w sat )), are obtained a priori from the textural properties of the soil using empirical relationships (Clapp and Hornberger, 1978;Noilhan and Mahfouf, 1996;Boone et al., 1999). Hydraulic conductivity k (m s −1 ) and pressure head Ψ (m), depend on the soil water content (w i ), but also on the textural properties of the soil: where Ψ sat is the air entry tension, k satc is the hydraulic conductivity at saturation (also obtained a priori from textural properties) and, finally, β is the slope of the water retention curve. Introduction The hydrogeological model MODCOU calculates the temporal and spatial evolution of the aquifer at several layers, using the diffusivity equation (Ledoux et al., 1989). Then it calculates the interaction between the aquifer and the river and finally it routes the surface water to the rivers and within the river using an isochronistic algorithm.

5
It calculates river discharge with a time step of three hours. The time step used to calculate the evolution within the aquifer is 1 hour. In the version of SIM used in this study, the aquifers are only calculated in two basins: The Seine (3 layers) and the Rhône (1 layer) basins.
3 The present state of the parameterizations of ISBA related to hydrology 10 ISBA was originally designed as a simple physical model to represent the continental surface in atmospheric models. The need to validate the model over large surfaces and long periods of time, led to the coupling of the surface scheme with the hydrogeological model MODCOU (Habets, 1998). However, the first applications of the coupled system showed that it was necessary to modify ISBA to better represent processes relevant to 15 hydrology. In the next sections, three parameterizations introduced to ISBA in the past are described and their impact to model calibration is commented.
3.1 Deep soil layer to take into account the slow hydrological component The initial version of ISBA divided the soil in two layers: a thin superficial layer, which acted as a reservoir for evaporation from the soil surface, and a single subsurface layer 20 to model the mean water content for the root and the subroot zones. The coupling of ISBA with MODCOU showed that the partitioning of the precipitation between runoff and evapotranspiration should be improved. Boone et al. (1999)  Interactive Discussion the amplitude of drainage pulses and increase the time lag between infiltration and drainage, making the base flow time series more realistic. The modification was done using the same force-restore philosophy already used in ISBA and led to the equations described in Sect. 2.2. A new parameter was introduced: the root depth (d 2 ), which was added in addition to the total soil depth (d 3 ). Both of these parameters are poorly 5 known. But root depth is clearly related to the vegetation type, which makes it easier to assign a consistent value to it. The values of d 2 and d 3 were set in function of the vegetation type and tested in one-dimensional and two-dimensional setups. In general, d 2 was set to be 2 3 d 3 (Habets et al., 1999b). These values, were introduced into the global ECOCLIMAP database (Masson et al., 2003) and became the reference values 10 used in all subsequent studies.
3.2 Subgrid runoff scheme to simulate fast riverflow ISBA simulates surface runoff through the saturation excess mechanism (also known as Dune mechanism), therefore, runoff is only produced when precipitation occurs over a saturated soil. This is a problem at the scale considered in SIM, because, in reality, 15 the scale of variability of runoff production is smaller than the typical size of the grid cell (64 km 2 in our case). The consequence is that, when ISBA is run at these low resolutions, the soil almost never saturates and, therefore, there is no runoff production, even though, in reality, a fraction of the cell is saturated and does produce surface runoff. To solve this problem, a subgrid variability of runoff was introduced by Habets 20 et al. (1999b) following the approach of the Variable Infiltration Capacity (VIC) scheme, described in Wood et al. (1992) and Dümenil and Todini (1992) and inspired from the Nanjing model (Zhao, 1992). In this scheme it is considered that the infiltration capacity, the maximum depth of water that can be stored in the soil column, varies non-linearly within the grid cell. 25 The fraction of the grid cell that is saturated is a function of some soil parameters (the soil water content at saturation, the wilting point and the root depth), the soil water content of the root zone (w 2 ) and a new parameter, called b, which represents the 1326 Introduction shape of the heterogeneity distribution of effective soil moisture capacity. Furthermore, after preliminary testing of this parameterization on the Adour watershed (Habets et al., 1999b) found that the parameterization generated too much runoff in summer for dry soil conditions. To avoid this problem, a threshold was introduced in the soil wetness, under which runoff was not produced, this threshold was set to be the wilting point (w wilt ). In this empirical approach, the main difficulty is to set the value of the shape parameter, as it cannot be obtained a priori. This parameter could be related to subgrid topography, soil texture and vegetation type (Dümenil and Todini, 1992;Warrach et al., 2002;Decharme and Douville, 2007) but in fact, this dependency, if it exists, is not 10 well understood and, therefore, b remains a parameter to be calibrated (Xie and Yuan, 2006). In SIM, this parameter was set to a fixed value (b=0.5) for almost all the cells. For sandy soils, it was set to be very small.
To avoid a calibration parameter, subgrid runoff could be simulated using the TOP-MODEL approach (Beven and Kirby, 1979). In this case, the runoff scheme would not 15 need to be calibrated, as the topographic index (λ) only depends on topography. Habets and Saulnier (2001) tested this approach on the Ardeche river basin using the version of ISBA with two layers and concluded that, at the conditions under the SIM model is run, its introduction didn't give substantial benefits in terms of daily river discharge, but modifies the partition between runoff and drainage. Decharme and Douville (2006) 20 found similar results over the Rhône river basin, while, at the global scale, they found that, in comparison to the VIC approach, it improved the simulation of river discharges in many regions of the globe (Decharme and Douville, 2007).
The downside of the TOPMODEL approach is related to the difficulty of calculating the topographic index, as it is dependant on the resolution of the digital elevation model 25 (DEM) (Quinn et al., 1995). This is a source of uncertainty. As the DEM used in SIM to derive the hydrographic network has a coarse resolution (1 km), it was not reasonable to use it to compute the topographic index. Introduction In the initial force-restore framework, drainage is produced when soil wetness is restored to field capacity (w fc ). Therefore, when the soil is under w fc , the model does not produce any drainage. In those places where it is known to be an aquifer which is not simulated by MODCOU, SIM underestimates the stream flows in summer, because, 5 during this period of time, the contributions from the aquifer are the main source of water for the stream. To solve this problem, Habets et al. (1999a) introduced a parameterization which allows the existence of a residual drainage under field capacity. This residual drainage compensates the lack of contribution from the aquifer. The parameterization, modifies the equations of ISBA for the drainage (K 3 ). For 10 example, for the third and deepest layer of the soil, the drainage is represented as follows: 15 where w drain is a parameter (to be calibrated) and w g min is a numerical threshold, which is rarely reached. The subgrid drainage between the root layer and the deep layer (K 2 ) is introduced similarly. As described in Caballero et al. (2007) and ? the w drain parameter is calibrated in order to sustain a predefined discharge, for example, the driest observed decile (Q 10 ), 20 using the following equation: where i represents the grid cells that belong to the basin corresponding to the river gage under study, d i is the total soil depth for the grid cell and S i is the surface of the grid cell i that belongs to the upstream area of the river gage under study and τ is a constant of one day. This parameterization is able to improve the simulation of river discharge but, at the 5 same time, presents two problems. First, water that should be taken from an underground aquifer, is artificially taken from the soil reservoir. Second, when the parameterization is active, the simulation of low flow is influenced by w drain , which is calibrated, as a consequence, the model's ability to detect the impact of climate change on low flows is slightly reduced in those places where the parameterization is active.

10
There are two ways of improving the previous problems. The first one would be to extend the number of simulated aquifers with MODCOU. This could be done in 2-D or even 3-D for multi layer aquifers. The other possible solution would consist on introducing a 1-D representation of the aquifer, with, for example, a new reservoir, which would play the role of the non resolved aquifer (Fenicia et al., 2006). Both solutions are out 15 of the scope of this work, nevertheless, this problem should be tackled in the future.

Introducing a parameterization for hydraulic conductivity in ISBA
Even though some parameterizations were introduced into ISBA to improve its performance in the context of hydrology, at the present state, the discharges simulated by SIM present some problems, which might be due to the poor description of the dynam-20 ics of water in the soil. Figure 1 shows how SIM tends to produce a second peak of discharge just after the main events: drainage attains the river network too slow. This shows that a better description of the processes in the soil, might help to produce a more realistic drainage and runoff, which, in turn, would help to produce more realistic discharges. 25 Hydraulic conductivity at saturation (k sat ) plays an important role in the dynamics of water in the soil. In ISBA, hydraulic conductivity depends only on soil texture (through Introduction  Clapp and Hornberger, 1978). As texture is considered constant, it is homogeneous in the whole column. As a consequence, ISBA does not consider the changes in hydraulic conductivity produced by structural causes, for example, the presence of macropores. Macropores, which are generally present in natural undisturbed soils (Young et al., 1998), are produced by such agents as plant roots, soil cracks, or soil fauna.  introduced an exponential profile of hydraulic conductivity to a version of ISBA which used the TOPMODEL approach for runoff (following the work of Montaldo and Albertson, 2001;Chen and Kumar, 2001) and applied it to the Rhône basin at different resolutions. In this study, this same parameterization was introduced to the SIM suite and, therefore, extended to the whole of France. As the details of the parameterizations can be found on , here only its main characteristics will be shown. Figure 2 explains the modified hydraulic conductivity in an schematic way. In this formulation, hydraulic conductivity at saturation (k sat ) depends on depth (z): where k satc is the compacted value of saturated hydraulic conductivity, which corresponds to the value used in Eq. (9), f is a shape factor and d c is the compacted depth (k sat (d c )=k satc ). The equation for hydraulic conductivity (Eq. 9) is replaced by 20 The introduction of this parameterization involves a recalculation of the force-restore parameters found in Eqs. (4-7), which can be analytically calculated from the old values and the parameters f and d c . With the introduction of the exponential profile, the C 3 parameter, which characterizes the rate at which the water profile is restored to the field capacity, is different for the root zone layer and the deep layer (C 3 becomes C 32 25 and C 33 respectively). An important consequence of using this parameterization, is that it introduces two new parameters, which cannot be obtained from primary ones. Unfortunately, it is difficult to define the physically meaningful range of f and d c . For example, Chen and Kumar (2001) used a homogeneous value of f of 1.8 m −1 all over the USA. Niu and Yang (2003) and  used a default of 2 m −1 , but during sensitivity 5 tests, they led the parameter to be in the ranges 1-8 m −1 and 1-3 m −1 respectively. For d c it is easier. The hypothesis is that the changes in soil structure are due to the presence of organic mater, as roots, which create preferential paths and macropores. Therefore, the compacted depth could be somewhere not far from the root depth. After sensitivity tests,  found that the best values of the parameters 10 for the Saône basin (a sub-basin situated on the north part of the Rhône basin) were f =2 and d c =d 2 , being d 2 the root depth. With these values, hydraulic conductivity at the surface of a typical soil can change by one or two orders of magnitude, which strongly changes the behavior of the modeled hydrological response. Again, as it was discussed in Sect. 3.2, one could imagine a relationship between these parameters 15 and the properties of the soil or the vegetation, but this relation, if it exists, remains difficult to stablish, therefore, these parameters need to be calibrated.
To avoid complexity, it seems attractive to reduce the number of parameters, for example, there are other implementations of the exponential profile of hydraulic conductivity, which use only one parameter (f ), instead of two. For example, Stieglitz et al. 20 (1997) used k sat =k sat (z=0)·e −f z and Chen and Kumar (2001) used k sat =k satc ·e −f (z−1) . In this case d c is 1 m, which is an arbitrary assumption. In fact, preliminary tests showed that both parameters were needed to accurately represent the dynamics of water in the soil.
From now on ISBA-KSAT or SIM-KSAT will refer to the modified versions of the In this study, subgrid drainage was implemented as in ? but adapted to the presence of an exponential profile of hydraulic conductivity. The adaptation was necessary, because the original parameterization was created for a soil with a constant C 3 , and now the values of this constant are different for each layer. The only difference between 5 the formulation used in this study and the one described in Sect. 3.3 is that the w drain parameter had to be different for each layer (l ) due to the fact that C 3 was also different for each layer. The values of w drain for each layer (w drainl ) could be calculated from the old values (w drain ), so there was no need to calibrate the parameters again. The new values of the parameter for a given cell are, for each layer: where w drainl and C 3l are the new values of w drain and C 3 for each layer (l ).

One-dimensional sensitivity tests
The results shown in this section were obtained using a one dimensional version of ISBA, which included all the commented parameterizations. The data used for pa-

Hydraulic conductivity
The experiment, consisted in modifying the values of one parameter, leaving the other unmodified at its default value. The ranges of the parameters were selected according to the values used in previous studies (Sect. 4). From the parameterizations described in previous sections, the one that affected the 5 most the behavior of the model when it was introduced was the exponential profile of hydraulic conductivity. Changes in the force-restore coefficients, affected the soil water dynamics, soil water content, evaporation and the partition between surface runoff and drainage. For example, using the default values of the parameters, which might not be realistic, ISBA-KSAT annually produced 19% more evaporation than ISBA.

10
The model outputs were very sensitive to changes in the values of f and d c . Figure 3 shows the sensitivity of total evaporation and the partition between surface runoff and drainage to these parameters. The main changes on evaporation were due to the changes in evaporation over bare ground, which, in the selected point, was important. The impact of both parameters on evaporation was comparable, in terms of 15 amplitude and annual cycle, as shows Fig. 4. In fact, in both cases, the increase of the parameter increased the total evaporation, but diminished the evapotranspiration of the vegetation. The mean annual cycle of the water content of the root zone was strongly affected by d c . The greater the value of d c /d 2 , the dryer the soil was. As d c /d 2 increased, the annual runoff diminished, and drainage ( Fig. 5) increased, attain-20 ning a maximum at d c /d 2 =0.7, then it diminished. On the other hand, an increase of f , increased drainage but left runoff and soil wetness almost constant. The model's evaporation was only sensitive to the changes of hydraulic conductivity in spring and summer, and drainage and runoff were more sensitive in autumm and winter. 25 In the same fashion the values of the shape parameter of subgrid runoff were modified.

Subgrid runoff
The outputs of the model were less sensitive to the shape parameter of subgrid runoff than were to the parameters related to the exponential profile of hydraulic conductivity. As expected, evaporation was not very sensitive to this parameter ( Fig. 6) and it was the partition between the fast and the low components of the runoff which was mostly affected by it. As expected, the accumulated annual surface runoff was close to zero when b was close to this value, as subgrid runoff is determinant for model 5 runoff production. This is reflected in the yearly cycle of both, drainage and runoff. This last variable, changed considerably from almost zero in the whole period (except december), for b=10 −3 to having runoff during the whole period (in exception of July) for b=5.

10
The sensitivity of ISBA to three empirical parameters, which control processes related to hydrology, was studied using a one-dimensional setup. It was found that these parameters completely control the amount of runoff and drainage produced. Evaporation was also affected considerably. Runoff was not only affected by b, but also by the compacted depth. This parameter, together with f , also strongly affected evapora-15 tion. These three parameters affect the same processes, therefore the values of one of these parameters affect the values of the others, as a result, different sets of the parameters might lead to similar results, which might lead to equifinality (Beven, 2006). This is a consequence of the empirical basis of the parameterizations. At this point, an important question arises: Which is the best way to find the appropriate values of this 20 parameters? The next section will try to answer to this question.

Calibration of the distributed model
When using a model, like ISBA, which is intended to be as physical as possible, it is always desirable to determine the values of the parameters of the model using observed data. However, in the previous section it was seen that this is not always possible. First, because it is rare to have data for every parameter. Second, because, even though the necessary data might be accessible, it might not be directly usable due to the difference of scales between measurements and the grid of the model. Third, because, as it was seen before, some of the parameters of the model might be empirical, not physical. Therefore, in the case of facing any of the previously mentioned situations, 5 calibration must be used. But calibration is a difficult exercise, because it can lead to obtaining good results for the wrong reasons (Kirchner, 2006). For instance, the structure of a model is always limited, as only the main processes are taken into account. Therefore, calibration can lead to situations where an existing parameterization, indirectly, takes into account processes that initially were not included in the structure of 10 the model. This is not desirable, because it prevents the modeler to understand the behaviour of the system. In some cases, this kind of situation can be detected, for example, when the values of the calibrated parameters are out of their physical range and the model simulates the right discharges. However, sometimes there is not much information available about the physical range of the parameters, which is the case 15 of the two parameters of the exponential profile of hydraulic conductivity at saturation (see Sect. 4). But, in the context of SIM, calibration is necessary, because the physical base of the values of the parameters found in previous studies is not strong (it is the case, for example, of b, f and d c ). Therefore, using the values from the literature is not safer than using parameters obtained from a careful calibration. 20 No previous study tackled the problem of model calibration using the SIM model. Instead, a default value for each parameter was found at the time of the introduction a new parameterization. Being the model so dependent to empirical parameters, as the sensitivity tests of Sect. 5 showed, it was expected that calibration would strongly improve its performance. 25

Strategy of calibration
Daily river discharge on 152 gauging stations distributed all over mainland France (Fig. 7)  were not seriously affected by anthropization (for example, hydropower generation facilities). Globally, the total area covered by the catchments defined by these stations is of 432 384 km 2 , the mean basin area is of 8227 km 2 , the smallest surface is of 245 km 2 (Huveaune at Aubagne) and the biggest covers 110 356 km 2 (Loire at Montjean-sur-5 Loire). In general, SIM performs reasonably well on the selected stations, however the model is known to perform poorly in some basins due to structural problems. For example, on the Huveaune and Argens in the South-East, which are small. These two basins were kept to test to what extend the calibration could compensate the structural problems. Furthermore, there are stations, like the Rhône at Beaucaire, which 10 integrate the discharge of highly anthropizased tributaries, for example, the Durance which is Alpine. Ten years of data were selected (from August 1995 to July 2005). The first five years where used to calibrate the model, the following five were used to validate it using the split-sample test technique (Klemes, 1986). Validation will be detailed in further 15 sections.
For calibration purposes, the quality of simulation was evaluated using a function built using the Nash-Sultcliff (NS) efficiency (Nash and Sutcliffe, 1970) and the overall water balance at the daily time step, which are independent from each other (Weglarczyk, 1998). The overall water balance represents the error on the total volume which 20 is calculated as the difference between observed and simulated runoff volumes normalised by the observed runoff. The function to minimize was: and Q o i and Q s i are the observed and simulated river discharges at the instant i . Following this strategy, the shape parameter of subgrid runoff (b) was calibrated after the exponential profile of hydraulic conductivity (f and d c ). This order was chosen because, as seen in one-dimensional tests of Sect. 5, the evaporation simulated by the model is more sensitive to changes in f and d c than on b. The first two parameters affect evaporation, runoff and drainage, while b only affects the partition between drainage and runoff. Therefore, the calibration of b was used as a fine tunning of the previous calibration. Furthermore, this allowed to test to which extent the calibration of b was necessary after the introduction of the exponential profile. The FDc-BASIN and FDcB-BASIN simulations were calibrated at the basin scale, as opposed to FDC-FRANCE, in which each cell had the same values of the parameters. In this case, the values of each cell were identical if they belonged to the same subbasin. This calibration method is semi-distributed. As the basins defined by the selected stations are nested, a procedure was defined to decide which values would have a cell that belongs to more than one basin: 1. Each grid cell of ISBA was assigned to one single basin. If it belonged to more than one basin, because these were nested, it would assigned to the smallest.
2. Simulations were done using the same values of the parameters on the whole of 15 France.
3. The values of the parameters assigned to each cell, were those that performed better at the station that defined the basin to which the cell belonged. Before doing any calibration, it is crucial to find a physically meaningful range of possible values for the calibrated parameters. Otherwise, as explained in Sect. 6, the model could produce good results for the wrong reasons. It must be stressed that finding a good range of values is not enough to guarantee that, after calibration, the results will 5 be physically sound, but it remains a necessary step.
In the case of f and d c , as said in Sect. 4, it is difficult to define this meaningful range. In Sect. 5.1, the values found on the literature were used to do one-dimensional tests. This procedure was adequate to study the sensitivity of the model, but more care must be taken in order to calibrate it. For example, the hypothesis that the ranges of change 10 of f and d c are independent from each other might not hold. Therefore, a new strategy was defined to determine the ranges of the parameters.
First, d c was related to the root depth, as the structure of the soil and the presence of biomass are related (Sect. 4). The calibrated parameter was d c /d 2 , instead of d c alone. 15 Second, it was determined how f and d c /d 2 should be related. The effect of the exponential profile was introduced in the model through the force-restore parameters (Sect. 4). Instead of looking for a range of possible values of f and d c , it was the physically meaningful region of the phase space formed by the possible values of C 32 and C 33 that was determined. 20 The C 3 parameter characterizes the rate at which the water profile is restored to the field capacity, the greater it is, the faster drainage will be produced. To be coherent with the hypothesis that macropores and preferential paths are located near the surface, it was determined that C 32 >C 3 and C 33 <C 3 .
The discretization of the phase space was determined by sensitivity tests using the 25 one dimensional setup described in Sect. 5. These tests were also useful to specify the other limits of parameter space. It was decided that the pairs of C 32 and C 33 should not cause an unrealistic evaporation. The evaporation was considered unrealistic when the difference with the standard model was of approximately a 30%. This choice is arbitrary, but reasonable. As a result, 29 pairs of f and d c /d 2 were selected ( Table 1). The quantity of pairs was chosen to optimize the computational cost of the simulations and a good representation of the parameter space.

5
The model is said to be valid if its accuracy and predictive capability in the validation period have been proven to lay within predefined acceptable limits (Henriksen et al., 2003). As SIM is a physically based distributed model, it is is desirable to check, not only the model outputs or the variables used to calibrate it (discharge), but also as many intermediate variables as possible. To do this, it is necessary to use as many 10 sources of data as possible, but, unfortunately, for distributed models applied to large regions, like SIM, it is difficult to collect the necessary data to do the internal validation (Refsgaard, 1997). In our case, the only available sources of data are river discharge, all over France, and piezometry, only for the Seine basin, as it is not yet possible to have access, for 15 example, to distributed observations of soil wetness or evaporation. Piezometry is useful in those basins where underground water is simulated (in this study: the Seine and Rhône river basins) and discharge is the variable that is better observed. In this study, data of the 152 stations used in the calibration was used to validate it, but also, the performance on the remaining stations was analysed. Furthermore, other useful 20 comparisons were also done. For example, the resulting evaporation of the model was compared to the evaporation of another version of SIM to detect changes in the patterns, which allowed us to better understand the behavior of the new model.
As it is common in the literature (Perrin et al., 2001;Moussa et al., 2007), a splitsample method was used (Klemes, 1986), to validate the simulated discharge. The

Printer-friendly Version
Interactive Discussion each of the steps of the calibration. Discharge was validated according to two objective criteria, taking into account that a best criterion to evaluate the simulated river discharge does not exist (Weglarczyk, 1998). The choice of a criterion depends mainly on the applications of the model (Perrin et al., 2001) and the objectives of the modeler. The objective of the SIM model is not only to simulate the right dynamics of the discharge, but also to simulate the right water budget. According to these objectives, two widely used criteria were selected: the Nash-Sultcliff (NS) efficiency coefficient (Eq. 17) and the overall water balance (WB) (Eq. 18). To facilitate the analysis of the results, the numerical values of these two criteria, were related to their qualitative counterparts, as presented in Table 2. The 10 analysis of the results of the validation was done according to this table.

The reference simulation
The reference simulation (REF) corresponds to the standard version of SIM, without the exponential profile of saturated hydraulic conductivity. This model was not subject 15 to calibration. According to efficiency, it performed better during the second half of the 1995-2005 period (Fig. 8). This difference is an indication of limitations of the model, whose structure has difficulties to cope with the variability of the conditions in both periods. The quality of the water balance was constant, according to the criterion used. Table 3 shows that, during the validation period in terms of efficiency, the performance 20 of the model on more than half of the selected stations was poor, and it only was good at 10% of the stations. In terms of water balance, the quality of the results were good or very good in more than half of the stations. Therefore, in general, the model produced the right volume of discharge, but had more problems to reproduce the right dynamics. In terms of efficiency, the default simulation was better than the reference one in 10 some already well performing stations. The performance was good or very good on 17% of the stations, an improvement of 7%. But the improvement was not generalised, on 51% of the stations the scores decreased. The number of stations where the model performed poorly was increased (+5%). In terms of water balance, the performance didn't change significantly. In view of this results, it is not possible to say that the 15 introduction of the exponential profile generally improved the model. The opposite isn't true neither. Therefore, when using the exponential profile with default values of the parameters, it is difficult to say which variant is the best. shows the number of stations where each simulation was the best, in terms of efficiency. These numbers indicate that the selected range is reasonable (see Sect. 6.3), as the simulations that perform better were situated in the middle of the range of f and around d c =d 2 .
The question to answer is if there is a gain in changing the values of f and d c /d 2 5 when they are homogeneous in the whole of France. Figure 8 and Table 3 show that, taking the simulation that had better scores in the maximum number of stations (simulation 22, f =3 m −1 and d c =d 2 ), in terms of efficiency, there was a strong gain in calibrating the model. For the validation period, almost all stations improved their scores. Comparing to DEFAULT, the number of stations with performance qualified as good or 10 very good, in terms of efficiency, improved from 17% to 27% and the number of stations with poor results was also strongly reduced (from 56% to 39%). Therefore, according to river discharge, the answer to the question is positive.
Comparing simulations DEFAULT and FDc-FRANCE it is deduced that the introduction of the exponential profile of hydraulic conductivity can improve the performance of 15 the model if some adapted calibration is done, otherwise it is not guaranteed to have better results. However, to use homogeneous values of the parameters on the whole domain does not look realistic, therefore a more distributed approach could lead to better results. The objective of the next section is to test this hypothesis. 20 Once the 29 simulations of Table 1 were done, a set of parameters was found for each basin and subbasin, following the procedure described in Sect. 6.2. The resulting distribution of the parameters can be seen on Fig. 9, which reflects the values of Table 1. The figures show that, with some exceptions, the values of the parameters of neighboring basins or nested basins are similar, making large regional blocks. This coherence 25 was expected, as in general, spatial proximity may be a good similarity measure for transposing catchment model parameters in space (Merz and Bloschl, 2004;Parajka et al., 2007) −1 and d c =d 2 ). Another important consequence of the introduction of the exponential profile and its calibration is that the new hydraulic conductivity at saturation changed considerably in comparison to the REFERENCE model: the average k sat at the surface of the soil increased in two orders of magnitude, as did the maximum ( Table 5).

The spatially heterogeneous simulation
The results of the new simulation (FDc-BASIN) show that such an approach strongly improves the performance, both according to efficiency and total water budget (Fig. 8, Table 3 and Table 4). In this case, according to Table 3, the number of stations with results qualified as good or very good, in terms of efficiency, was 50%, which was a 15 high increase comparing to the 17% of DEFAULT and the 27% of FDc-FRANCE and the number of stations with poor results also diminished considerably, being 18%. In terms of water balance, there was also an improvement, the results on 59% of the stations were good or very good. This results show that there is an important gain in calibrating the model at the basin scale. Another interesting effect of the calibration 20 is the gain in model stability. The behavior of the simulation REF in both periods, calibration and validation, was quite different (Fig. 8), being the performance better for the validation period. The calibrated model (FDc-BASIN) also did perform better during the validation period, but, interestingly, the difference in performance between both periods was lower, as opposed to the REFERENCE simulation, which was less stable 25 across periods. Therefore, FDc-BASIN could deal with a broader range of conditions than the REFERENCE: it was more robust.
The strong change in hydraulic conductivity and the subsequent improvement in the scores of river discharge, was not accompanied by a strong change in evaporation (Ta-

Back Close
Full Screen / Esc

Printer-friendly Version
Interactive Discussion ble 6). In comparison to the reference model the average evaporation on France only was increased in 2%. Figure 10 shows that the change on evaporation was mainly in the range [−5%, −5%]. Even though, in some points, the changes were more important, attaining sometimes 20%. Nevertheless, as there is not distributed data available for evaporation, it is not known if this change is an improvement or it is not. On the 5 other hand, the annual surface runoff and drainage were changed strongly. Runoff increased considerably (+46%), which explains the more reactive discharge. Opposed to this, drainage diminished remarkably (−21%) and the soil water content of the deepest layer increased (+15%) as a consequence to the slower conductivity of the deep soil.

Calibration of subgrid runoff
In this last step, the parameter b of the subgrid runoff scheme was calibrated. The previous calibration adjusted the dynamics of water in the soil, this calibration was done to find a better partition between surface runoff and drainage. In this case, 6 simulations were performed with the following values of the parameter: 15 b=10 −3 , 10 −2 , 10 −1 , 5·10 −1 , 1, 1.5. This is based on the range of parameters found by Habets (1998) in the literature. The method to assign a value to each grid cell was exactly the same as in FDc-BASIN.
The obtained values of b are around 0.5, which is the default value used in SIM, and are, with few exceptions in the range 0.1≤b≤1. The resulting geographical distribution 20 of the calibrated parameter presented geographical coherences. The default value of b=0.5 was kept in a region of the north and the center of France. For example, the Seine and the Loire basins had, mainly, a value of 0.5. In the south western part of France, in the Adour and Garonne basins, the calibrated value was b=1. These geographical patterns give some confidence on the validity of the calibration procedure, 25 but unfortunately, the values themselves are not yet understood. Figure 8 and Table 3 show that, in terms of efficiency, this calibration helps to improve a little bit more the results. After calibrating b, 56% of the stations were in the range of 1345 Introduction Good and Very Good efficiencies, which represents an improvement of 6%. In terms of water balance, there were few changes, as expected.
Concerning the annual water balance, the results of this simulation are quite similar to the previous one, the difference on the global mean is only of 1% on terms of drainage and runoff and the differences are even lower for the soil water content. Nev-5 ertheless, these results do not explain the whole change, in fact, the small difference in the mean is due to the fact that the most part of the cells kept the default value of the parameter (b=0.5) and those that were different compensated each other. In fact, in the basins were the b was increased to 1, runoff increased (in comparison to the previous simulation) by around a 40% and in the basins were it was diminished to 0.1 the 10 decrease of runoff attained values by around −60%. These strong changes in runoff didn't cause changes in evaporation, as it remain very similar to that of the previous simulation (the highest differences were by around 5%).
In the part of the Seine basin that was calibrated, the calibration did not strongly modify the drainage, and thus, the recharge flux to the aquifers, and the piezometry. 15 On the three layers (oligocene, eocene and chalk aquifers) of this basin, the root mean square error (RMSE) and the bias of FDcB-BASIN remained comparable to those of the REFERENCE : the number of gages with an absolute bias lower than 2m increased from 12, in the REFERENCE simulation, to 16, in FDcB-BASIN, out of a total of 44 gages. On the contrary, in the southern half of the Rhône basin, the calibration lead 20 to a large decrease of the drainage flux, and thus the piezometric levels showed a net decrease (−9 m on average). This is due to the fact that this part of the basin is severely affected by the anthropization of alpine tributaries. Therefore, the results of the calibration in this area must be taken with care.

25
The objective of this study was to improve the overall performance of the model by introducing a parameterization of hydraulic conductivity and performing a calibration of Interactive Discussion of its parameters. It was shown that the simple method used highly improves the scores of river discharge and leaves other variables, like evaporation or the piezometric levels (at the Seine basin), almost untouched. The maps on Fig. 11 and the new discharges of Fig. 1 make explicit the overall improvement of the description of the processes related to the simulation of river discharge. Table 4 indicates that this improvement is not an 5 artifact of the calibration, as the results on stations not used in the calibration were also improved. This general improvement (Fig. 1) shows that both the introduction of the parameterization and the calibration were necessary to improve the SIM model. Nevertheless, in some basins, the parameters found by the calibration must be taken with caution. First, the method used to assign a set of parameters to each cell was very 10 simple. As can be seen in Fig. 9, the algorithm used to decide to which basin belongs each cell produced some artifacts at the borders of some basins. Furthermore, the model cannot perform well in places were processes not simulated by the model are important, as in karstic areas or basins where anthropization is intense, for example, the Alpine part of the Rhône basin, where hydro-power production has completely 15 changed the behaviour of a number of basins. However, it is worth noting that the calibration did not mask these effects, for example, the three basins depicted in black on Fig. 11  Interactive Discussion this remains valid. In addition, the use of d c /d 2 for the calibration, takes into account the variability of root depth, which depends on the variability of vegetation.
Finally, another important question is how these calibrated parameters reflect the real properties of the basins. As the model is constructed, the values of some parameters are related to the values of other parameters. For example, f and d c are not 5 independent, and b would be different if it was calibrated before the former parameters. Furthermore, it was chosen not to calibrate other parameters of the model, like the soil depth or some properties of the vegetation. Nevertheless, it was seen that the calibration affected the processes it intended to modify (runoff and drainage) and the scores of discharges were improved considerably. Therefore, even though it is known that this set of parameters is not the only one that would give similar results, the resulting model is realistic enough to simulate, in an appropriate manner, the relevant processes of the basin. Therefore, within the ranges defined by Table 2, the model can be defined as realistic.

15
This study describes the modifications that were implemented on the SIM model to improve its performance in the context of hydrology. Emphasis was placed on the role of the new parameters introduced. The study showed that a better description of the hydraulic conductivity of the soil was important to produce more realistic discharges. An exponential profile of hydraulic conductivity at saturation was introduced in the model, 20 including new parameters, which have an empirical nature. Therefore, as it was not possible to set the values of these parameters from direct observations, a calibration procedure was set up. It was shown that the calibration improved considerably the results and that the final model was more robust than the original SIM. The calibration mainly affected the processes it was intended to modify (drainage and runoff), and the 25 rest of the parameters remained stable. It was also demonstrated that it is worth introducing this new empirical parameterization, only if it is accompanied by a calibration of Introduction Interactive Discussion the parameters. In conclusion, in this case, calibration is a tool that can considerably improves the performance of distributed models like SIM.
Some key issues must be further investigated. For instance, three parameters were calibrated in this study, in the future it would be desirable to study if it is worth introducing new parameters to the calibration and to better understand the role of the interactions between the parameters, being conscious that each new degree of freedom can put in danger the reliability of the model. Another important issue is the improvement of the method used to calibrate the model. The method used was very simple, even simplistic, and it could be improved in the future. Nevertheless, it showed to be robust enough and considerably improved the performance of the model. Finally, the study of the basins where the calibration set surprising values of the parameters, will be very useful to learn more about these basins and the behaviour of the model. This new version of the model, will be used to follow the evolution of soil wetness, the forecast of river discharge and, finally, to study the impact of climate change on the continental water cycle. Dordrecht, 1989, 1325 , 16, 1261-1282. 1326Merz, R. and Bloschl, G.: Regionalisation of catchment model parameters, J. Hydrol., 287, 95-123, 2004. 1343 J. Hydrol., 198, 69-97, 1997, 1340 A Comprehensive Radiation Scheme for Numerical Weather Prediction Models with Potential Applications in Climate Simulations, Mon. Weather Rev., 120, 303-325, 1992. 1322 Stieglitz, M., Rind, D., Famiglietti, J., andRosenzweig, C.: An Efficient Approach to Modeling 5 the Topographic Control of Surface Hydrology for Regional and Global Climate Modeling, J. Climate, 10, 118-137, 1997       Resulting geographical distribution of the values of the two parameters related to the exponential profile of hydraulic conductivity (f (m −1 ) and the ratio between the compacted depth (d c (m)) and the root depth (d 2 (m)). Colored grid cells were calibrated, the gray ones weren't, because they do not correspond to any of the basins of the selected stations, which correspond to the black dots. The simulated river network is shown in darker gray. . The second map (FDcB-BASIN) corresponds to SIM-KSAT after calibrating the parameters related to the exponential profile of hydraulic conductivity and the runoff subgrid scheme. The simulated river network is shown in darker gray.