Multi-model comparison of a major flood in the groundwater-fed basin of the Somme River (France)

. The Somme River Basin is located above a chalk aquifer and the discharge of the somme River is highly in-ﬂuenced by groundwater inﬂow (90% of river discharge is baseﬂow). In 2001, the Somme River Basin suffered from a major ﬂood causing damages estimated to 100 million euro (Deneux and Martin, 2001). The purpose of the present research is to evaluate the ability of four hydrologic models to reproduce ﬂood events in the Somme River Basin over an 18-year period, by comparison with observed river discharge and piezometric level as well as satellite-derived extents of ﬂooded area. The models used differ in their computation of surface water budget and in their representation of saturated and unsaturated zones. One model needed structural mod-iﬁcation to be able to accurately simulate the riverﬂows of the Somme river. The models obtained fair to good simulations of the observed piezometric levels, but they all overestimate the piezometric level after ﬂooding, possibly because of a simplistic representation of deep unsaturated ﬂow. Models differ in their annual partition of the inﬁltration of water within the root zone (mostly driven by simulated evapotran-spiration), but these differences are attenuated by water transfers within the saturated and unsaturated zone. As a consequence, the inter-model dispersion of the computed annual baseﬂow is reduced. The aquifer overﬂow areas simulated during ﬂooding compare well with local data and satellite images. The models showed that this overﬂow occurs almost every year in the same areas (in ﬂoodplain), and that the ﬂooding of 2001 was characterized by an increase in the quantity of the overﬂow and not much by a spreading of the overﬂow areas. Inconsistencies between river discharge and piezometric levels suggest that further investigation are needed to estimate the relative inﬂuence of unsaturated and saturated zones on the hydrodynamics of the Somme River Basin.


Introduction
In 2001, the flooding of Abbeville, a town located by the Somme River in Northern France struck public attention. The flood lasted for several months, required more than 1100 people to be evacuated and caused 100 million Euro of estimated damage (Deneux and Martin, 2001). The flood occurred in a basin located above a widespread chalk aquifer which usually smoothly reacts to rainfall events. In 2001, however, a rapid increase of 10 m in the piezometric level was locally observed in a few weeks only. As pointed out by 100 F. Habets et al.: Multi model comparison of a major flood more frequently due to climate change (Pinault et al., 2005). A long-term experiment FLOOD1 (Amraoui et al., 2009;Thiéry et al., 2008) was also set up to observe and understand how the unsaturated and saturated zones of the chalk aquifer of the Somme basin vary, and to compare this site to other experimental sites in the UK (Price et al., 2000;Lee et al., 2006;Ireson et al., 2006;Goody et al., 2006).
The present article aims at answering the following questions: 1. Are the unsaturated and saturated zones well represented by the models, even particularly during the floods?
2. Is there an advantage to use complex land surface schemes instead of simple ones?
3. Can remote sensing observation from space be used to assess the flooding simulations in hydrologic models?
We addressed these questions using a long-term model intercomparison. The article is organized as follows: first, a description of the characteristics of the Somme basin is presented. Then, the four models used in this study and their calibration are briefly described. Next, the results of the models are compared, with a special focus on the 2001 flooding. Finally the impact of the spatial resolution and the role of the unsaturated zone are discussed.

Description of the Somme basin
The Somme River catchment has an area of 6433 km 2 (Fig. 1). The river flows 245 km before reaching the English Channel. It has a very gradual slope, slow waters and regular flow, and is continuously sustained by the Chalk aquifer. The river runs in wide valleys, with many ponds and marshes, and steep hillsides to dry plateaus, as water infiltrates easily into the chalk and the covering silt deposits. In the upstream basin, there are several dry valleys, but some non-perennial sources can appear depending on the aquifer level. The chalk is characterized by the presence of fissures due to chalk dissolution. These fissures are more developed in the valleys than in the plateaus (Crampon et al., 1993) and lead to a dual porosity of the chalk, resulting from interstitial porosity and fracture porosity (Lee et al., 2006). Consequently, the basin has a highly non-linear response to rainfall, depending on its initial state, and the unsaturated zone plays an important role in the hydrodynamic of the river (Pinault et al., 2005). The thickness of the aquifer varies from 20 to 200 m, while the thickness of the unsaturated zone varies from 1 m in the wet valleys to more than 50 m under the plateaus. From a meteorological point of view, the mean annual rainfall (about 800 mm/year) is very close to the mean annual potential evapotranspiration (PET). There is a low precipitation gradient between the upstream and downstream areas, with more rain closer to the sea.
The Somme hydrograph is typical of rivers closely connected to a chalk aquifer for a pluvial oceanic regime: there is not that much difference between low and high flows and the peaks are quite smooth. Thus, the fast component of the flow (that closely follows rainfall) is quite reduced, owing to the relative absence of surface runoff (Headworth, 1972).
Detailed analyses of the observed data were made to understand the 2001 flooding (Hubert, 2001;Pointet et al., 2003;Négrel and Pételet-Giraud, 2005). This flooding was the consequence of several years with larger precipitation than average, aggravated by a winter with strong precipitation. The lower part of the basin has been affected by floods with return periods over 100 years, while the upper basin has been less affected, with return periods of about 20 years.

Short description of the models
To simulate the long-term water budget of the Somme Basin as well as the flooding period, four models are used. Two of them are finite differences hydrogeological models: MARTHE (Thiéry, 1990) and MODCOU (Ledoux et al., 1989). Other two models are the combination of a land surface model (LSM) with a hydrogeological model: the Catchement LSM or CLSM Ducharne et al., 2000), and SIM .
The CLSM is a semi-distributed model, which couples the energy fluxes parameterizations of the Mosaic LSM (Koster and Suarez, 1992) to the concepts of TOPMODEL (Beven and Kirkby, 1979) to generate runoff and soil moisture patterns. SIM is the association of the ISBA LSM (Noilhan and Planton, 1989;Boone et al., 1999) with the MODCOU hydrogeological model. Thus, MODCOU and SIM differ only by the way the surface water budget is computed: MODCOU uses a simple reservoirs scheme that solves daily water budget, while SIM uses a LSM developed for weather forecast model that solves sub-daily water and energy budgets.
MARTHE is the hydrogeological model from the BRGM, the French public institution involved in the Earth Science field. It was the first model applied on the Somme basin shortly after the flooding (Amraoui et al., 2002).
MODCOU and the CLSM were previously used in the neighboring Seine basin, which shares some characteristics with the Somme basin, particularly with regard to climate, land use and the presence of the chalk aquifer Ducharne et al., 2007). SIM is used on the whole of France by the French meteorological service to monitor water resources and for ensemble riverflow forecasting Thirel and al., 2008). However, SIM has an explicit representation of the aquifers only on the Rhone and Seine basins (Habets et al., 1999;Rousset et al., 2004). Thus the SIM-France application was not efficient on the Somme basin where it does not take into account the chalk aquifer. These four models present several important differences: -MARTHE and MODCOU solve a daily water budget, with the input data of daily precipitation and potential evapotranspiration (PET), while the CLSM and SIM compute the diurnal cycle of both water and energy budgets, using hourly precipitations, incoming solar and atmospheric radiations, 2 m air temperature and humidity, and 10 m wind speed; -the deep unsaturated zone is not taken into account by the CLSM and it is represented with a simple percolation function in MARTHE, and with a simple conceptual model in MODCOU and SIM; -the saturated flows are computed by a 2-D finite difference model to solve the hydrodynamic equation based on the Darcy's law and mass conservation in MARTHE, MODCOU and SIM, while it is represented by a conceptual linear model in the CLSM (Gascoin et al., 2009).

Model implementation and calibration
Two datasets were made available for each one of the modeling groups: the 18-year period (from 1 August 1985 to 31 July 2003) atmospheric forcing from the SAFRAN analysis (Durand et al., 1993;Quintana-Seguì et al., 2008) and the ECOCLIMAP database (Masson et al., 2003) that gives soil and vegetation maps and associated parameters. For the implementation and calibration of the models, the choice was made not to impose any conditions on the modelers, in order to allow them to express their skills without any constraint. The drawback of this position is that the results can be difficult to analyze.
The period used for the calibration as well as the atmospheric data used for the calibration, varies across the models. Indeed, MARTHE was applied to the basin before this intercomparison project, using a coarse network of observed atmospheric stations to derive the atmospheric forcing for its calibration. After this first step, it used the SAFRAN dataset that was available at a finer resolution and over a longer duration. The periods of calibration and validation cannot be really distinguished for each model. However, the calibration of the models was realized over long periods, without concentrating on a given event.

Implementation of MARTHE
A part of the calibration of the MARTHE model relies on the application of the lumped model GARDENIA (Thiéry, 2003), which was applied in the Somme basin during the flood (Pointet et al., 2003). For the distributed model, as the groundwater covers a greater area (7336 km 2 ), greater than the surface topographic basin (5560 km 2 ), MARTHE extends its modeling domain to the Bresle and Authie rivers as natural boundary conditions in the South and North (Fig. 1), but in the East, instead of reaching the Oise river, in order to limit the simulated aquifer domain, MARTHE uses the crest of the piezometric level.
The calibration of the distributed model was based on the trial/errors method on the period 1995-2002, and consisted in two main steps: first, a calibration of the surface water budget, using a distributed application of the GARDENIA model, and then, a calibration of the storage coefficient and the transmissivity. To account for the time delay of the flow in the unsaturated zone, MARTHE uses a simple percolation function, with a spatial average time constant of about 3 months. The distributed model was first calibrated on the period 1995-2002 by Amraoui et al. (2002) and then the grid was refined in 2006 and 2007 (Amraoui et al., 2007). Now, MARTHE uses a 500 m resolution in plateaus and a finer 100 m one in the valleys.
To account for different quality of the atmospheric forcing data used in the calibration phase and in the present study, a correction factor calibrated on the local PET estimation was applied to the SAFRAN PET. This correction factor is 0.8, which means that the PET was reduced by 20%. Such correction may seem important, but the impact of such reduction for the estimation of the actual evapotranspiration (AET) is sensitive mainly in winter, when the PET is low. Moreover, the difference between the SAFRAN PET and the observation is about 10% on an annual basis (Benatya, 2004).

Implementation of MODCOU
The extension of the MODCOU model differs from the one of MARTHE, since in the East, the natural limit was set to the Oise River. Thus, the simulated domain is larger (8205 km 2 , Fig. 1). The hydrographic network was derived using the HYDRODEM software (Leblois, 1993;Leblois and Sauquet, 2000), and the final spatial resolution varied from 125 m to 1 km, with higher resolution being associated with the river and sub-basin limits.
The calibration of the MODCOU model is also based on the trial/errors approach on the period 1995-2003, and is explained in detail in korkmaz et al. (2009). First guess parameters were derived from existing simulations: i) from the Seine application for the surface water budget and unsaturated flow transfer parameters, and ii) from the MARTHE application for the groundwater parameters. Then, the parameters related with the simulation of the surface water budget were adjusted, as well as the groundwater parameters in a steady state. The steady state piezometric level was used to derive the depth of the unsaturated zone, which is supposed to be constant in time. The unsaturated zone transfer model used in MODCOU is based on a conceptual Nash cascade model (Nash, 1960). The other parameters of the Nash cascade were derived from the Seine basin application: the depth of each reservoir was set to 5 m, and the drainage coefficients vary according to the geological map. Then, there were successive iteration to calibrate those parameters (korkmaz et al., 2009).

Implementation of SIM
The SIM model is used by Météo-France in an operational mode over France to monitor the water budget on the national scale using a 8 km grid . In this study, the unsaturated and saturated flows parameters calibrated by MODCOU were used in SIM. Although SIM results from the coupling between a LSM and a groundwater model, it is not fully coupled, and especially, there is not yet a feedback between the depth of the unconfined aquifer and the soil moisture simulated by the LSM, as in Yeh and Elathir (2005) or Miguez-Macho et al. (2007). Thus, the surface water budget was not affected by the introduction of the aquifer. However, in this application, the 8 km resolution used in SIM-France seemed too coarse to discriminate the valleys from the plateaus, hence, a 1 km resolution was used. Therefore, the surface water budget is different in the SIM-France application and in this application. The impact of such refinement is discussed in Sect. 6. As stated before, MODCOU and SIM differ only by the way their surface water budgets are computed. In SIM, it is computed by the ISBA LSM. Most of the ISBA parameters are derived according to soil and vegetation types (Masson et al., 2003). Only a few parameters are subject to calibration. For this application, only one ISBA parameter was modified: the subgrid runoff coefficient which allows some surface runoff to be generated before the complete saturation of the cell. A default value is used in the SIM-France application in order to generate some surface runoff. However, since the fast component of the Somme riverflows is very weak, it was decided to set this coefficient to a low value (b = 0.01), in order to limit surface runoff to a minimum.

Implementation of the CLSM
The CLSM does not use a regular grid but it partitions the simulated domain into unit hydrological catchments. Each of them includes a shallow water table which follows TOP-MODEL's assumptions (Beven and Kirkby, 1979). As a result, the depth of this water table varies laterally following the topographic index, and the mean depth varies over time following the meteorological forcing, over a range of a few meters, which typically corresponds to the soil horizon. This shallow saturated zone is connected to the overlying unsaturated zone using the Richards equation, allowing to sustain soil moisture and evapotranspiration in summer by means of capillary rise. It also contributes to the river baseflow and controls the extension of the saturated areas, thus saturation-excess overland flow. Yet, this saturated zone is by no means comparable to the thick aquifer system underlying the Somme basin, and it generated excessively high seasonal Hydrol. Earth Syst. Sci., 14, 99-117, 2010 www.hydrol-earth-syst-sci.net/14/99/2010/ variations of runoff, with low flows close to zero and high flows strongly overestimated. Calibration proved insufficient to achieve a satisfactory simulation of riverflow (Carli, 2005). Therefore, an additional reservoir was developed and added to CLSM, which allows the model to store more water and release it as slow flow (Gascoin et al., 2009). Water is diverted from the soil moisture reservoir during the wet season (when soil moisture is greater than a calibrated threshold) to recharge a linear reservoir designed to approximate the groundwater flow from the thick Chalk aquifer. Note that this reservoir provides a lumped representation of the transfer and storage in both the deep unsaturated and saturated zones and its water content is not comparable to a piezometric level.
In this study, the CLSM model was used as a lumped model over the entire Somme basin upstream from Abbeville (5566 km 2 ). There was no other spatial heterogeneity than the soil moisture patterns derived from the topographic index distribution, and riverflow can only be simulated at Abbeville. A detailed presentation of the model and its calibration on the period 1985-2003 in the Somme basin can be found in Gascoin et al. (2009). In particular, the calibrated transfer time through the additional linear reservoir is 700 days, in the range of the values found by Milly and Wetherald (2002) for a similar parametrization in many major river basins.

Analysis of the long term simulation of the Somme basin
In this section, the comparison of the 18-year simulations of the Somme basin by the four models is presented. In a first step, the mean annual water budgets are compared, then, the simulation of the riverflows and piezometric levels are compared with the observations.

Evaporation
The water budget computed by each model is quite similar (Table 1): the actual evapotranspiration (AET) represents approximately 70% of the precipitation for the CLSM, MOD-COU and SIM, and 65% for MARTHE. The lower evaporation rate in MARTHE is partly explained by the fact that MARTHE used a correcting factor on the provided PET. The soil infiltration corresponding to the flux at the bottom of the soil reservoir (or root zone) represents about 30% of the precipitation, while the remaining part of the precipitation generates surface runoff. For all models, all the soil infiltration flux reaches the water table and then the river. Thus, it can be said that about 90% of the riverflows are provided by the aquifer outflows, which is coherent with the isotopic tracer study by Négrel and Pételet-Giraud (2005), and with the absence of surface runoff in the chalk catchment (Headworth, 1972). However, there are some important differences regarding the annual distribution of these fluxes (Figs. 2 and 3). The annual cycle of the simulated AET shows large differences between the hydrogeological and LSM models (Fig. 2). In winter and spring, the AET simulated by the hydrogeological models closely follows the PET (with the correcting factor for MARTHE), and is higher than that computed by the LSMs. On the contrary, in summer, the AET computed by the two LSMs is higher than that simulated by the hydrogeological models, representing in July almost 70% of the PET for SIM and the CLSM, and only 55% for MODCOU and MARTHE. Such differences are due to the way the evapotranspiration flux is computed by these two types of models. The two hydrogeological models use PET and precipitation as input data. They manage a soil reservoir and compute a unique evapotranspiration flux that is equal to the PET, provided there is enough available water in the reservoir. As the reservoir is filled by winter rainfall, the AET can reach the potential value in winter. On the other hand, the two LSMs SIM and the CLSM compute the diurnal variation of the coupled water and energy budgets, and in particular, the soil moisture and soil temperature at several depths in the soil. Thus, they better take into account the various parameters influencing the evaporation flux: the atmospheric demand, the available energy, the soil water stress, and the state of the surface, including the actual development of the vegetation. For these land surface models, the computed AET is the combination of several fluxes: i) the bare soil evaporation, ii) the plant transpiration, iii) the interception of precipitation by the vegetation, and iv) the evaporation from the snow cover (negligible in the Somme basin). To do so, they use as input the hourly values of seven atmospheric variables (Sect. 3), as well as a description of the annual cycle of the vegetation. In the Somme basin, the vegetation presents a clear annual cycle: on average on the domain, the vegetation fraction ranges from 33% of the surface in winter to 90% in summer, and the leaf area index ranges from 0.6 m 2 m −2 to 3.8 m 2 m −2 in summer. Thus, in SIM and CLSM, both plant transpiration and interception loss become negligible in winter and only bare soil evaporation occurs, but at a rate close to only 50% of the PET.
Although the AET simulated by the CLSM and SIM are closer to each other than those simulated by the hydrogeological models, they present some differences. In spring, the AET from SIM rises at a higher rate than that from the CLSM, while in autumn, the AET decreases faster for SIM than for the CLSM. This is due to the fact that at the beginning of the dry period, the soil moisture in the CLSM can be fed by the shallow water table (see below and Fig. 3), thus limiting the soil water stress. Figure 3 shows the annual cycle of four soil water fluxes: the soil infiltration (SI) corresponds to the flux at the bottom of the soil reservoir or root zone, the flux from the unsaturated zone (UF), the baseflow from the aquifer (BF), and the surface runoff.

Soil water fluxes
The surface runoff is rather weak (less than 0.1 mm day −1 ), but is sensitively larger for MARTHE than for the other models.
The SI flux is the more scattered flux. In summer time, MODCOU and MARTHE have a low positive flux (0.1 mm day −1 ), while it is close to zero for SIM, and negative for the CLSM. A negative flux means that some water from the saturated zone feeds the root zone. Such process is not surprising in the valleys where the aquifer is close to the surface. The 10-day evolution of the SI flux is noisier in MARTHE than in the other models, which shows that MARTHE reacts faster to the precipitation signal than the other models. As expected from the comparison of the annual cycles of the AET, the SI fluxes simulated by SIM and MODCOU are quite different, SIM having a larger annual variation of this flux than MODCOU.
MARTHE, MODCOU and SIM take into account an unsaturated zone depth ranging from 1 to 104 m, with an average value of 32 m (the deep unsaturated zone is not represented in CLSM). For these three models, the transfer throughout the unsaturated zone smoothes and add a delay to the SI flux. The unsaturated flow (UF) has an average percolation rate ranging from 0.5 to 1 m day −1 . The impact of the unsaturated zone (Fig. 3) is greater in MARTHE (reduction of the amplitude by 70% and peak delayed by 60-day) than in MODCOU and SIM (approximately 27% reduction of the amplitude and peak delayed by about 30-day).
However, these three representations of transfers in an unsaturated chalk are quite different from the description of the Pang-Lambourn chalky basin in the UK. Jackson et al. (2006) and Mathias et al. (2006) argue that the transfer in the unsaturated flow has a unique impact: to add a time delay. This time delay is associated with an average apparent percolation rate of 3.1 m day −1 (Ireson et al., 2006), which is in the range of the values derived from the observation of 12 boreholes by Headworth (1972) (from 1.5 to 6.7 m day −1 ). Thus, in the Somme basin, MARTHE, MODCOU and SIM use a lower percolation rates than in similar basins, and assume a smoothing effect to the chalk unsaturated zone.
The UF simulated by MARTHE, MODCOU and SIM recharges the water table, from which it flows out as baseflow. In the CLSM, the baseflow has two components: one is provided by the shallow aquifer described according to TOP-MODEL, and the other one is the outgoing flow from the linear reservoir designed to approximate the groundwater flow from the chalk aquifer. The impact of the aquifer to smooth the flow is striking (Fig. 3 than 70% for MODCOU and SIM and by 62% for MARTHE. Thus, it can be said that the impact of the groundwater models to smooth the recharge flow is comparable between these three models. However, for MODCOU and SIM, most of the attenuation of the amplitude of the flow is due to the transfer in the groundwater, while in MARTHE, both the saturated and unsaturated zones have a comparable effect. For the CLSM, the baseflow is dominated by the outflow from the groundwater linear reservoir, the original baseflow from the shallow aquifer being only 27% of the total (not shown). The time constant of 700 days in the linear reservoir allows obtaining an amplitude of the baseflow very similar to the other models. The mean-annual baseflows simulated by MARTHE and the CLSM are similar. This is surprising since the soil infiltration and evaporation flux simulated by these models are fairly different. Indeed, only the SIM model presents a different behavior, with an annual amplitude almost twice larger than in the other models. SIM is also the only model for which the unsaturated and saturated flows parameters were not calibrated using its simulated surface water budget. Thus, for the Somme basin, it can be said that the calibration of the hydrological transfers in the saturated and unsaturated zones erases the temporal differences in the estimation of the surface water budget on a mean annual basis. From this result, it appears that the sole comparison of groundwater recharge models as presented by Bradford (2002) may not be sufficient to estimate the best modeling approach. It therefore appears that hydrogeological modeling should be considered as a whole, from the surface water budget to the groundwater flow.

Riverflows
The riverflows are observed at fives gages in the Somme basin. The four models are able to simulate the daily riverflows, but, for the CLSM, the flow is not routed, and thus, the comparison with the observed riverflow is only possible at a lower frequency (7-day averaged). Moreover, the CLSM computes the riverflow only at the outlet of the basin. Table 2 gives for each gage and each model the coefficient of efficiency E (Nash and Sutcliffe, 1970) and the water balance ratio (Q sim /Q obs ). Figure 4 presents the comparison of the 7-day riverflow of the Somme at Abbeville. The CLSM is able to reproduce the main characteristic of the hydrograph, but shows some important differences, especially during the recession period in 1991-1992 and during some high flow periods. The summer flows simulated by SIM are often underestimated compared to the observation (about 1 year out of 3). MARTHE and MODCOU obtained better results (E > 0.8, cf. Table 2). The Table 2. Statistical results obtained for the comparison of the observed and simulated riverflows over the 18-year period: discharge error ratio (Q sim /Q obs ) and 7-day efficiency (E). Daily efficiency is given in bold for the models that compute daily riverflows.  12-1985 12-1987 12-1989 12-1991 12-1993 12-1995 12-1997 12-1999 12-2001   comparison of the simulated and observed average annual cycles for the 5 gages is presented in Fig. 5. For the Somme at Abbeville, MODCOU achieves the best agreement with the observed annual cycle, while MARTHE overestimates the recession flow, SIM underestimates the low flows, and the CLSM underestimates the maximum flow on average.
For the tributaries (not simulated by the CLSM), it appears that none of the models is able to accurately represent the observed annual cycles. Table 2 shows that the quality of the simulations is still statistically fair for the Hallue and Avre rivers (E > 0.7), but decreases on the Nièvre and Selle rivers (0.5 < E < 0.7). In general, SIM and MODCOU obtained better results in terms of riverflows.
For the Somme at Abbeville, the differences between the models are coherent with the analysis of the baseflows and surface runoff presented above. However, this is not the case for the Nièvre basin, where MARTHE has the largest amplitude, nor for the Avre basin, where the simulated peaks are time lagged. Thus the spatial variabilities of the Somme basin is not well captured by all models, which leads to an uneven distribution of the quality of the simulated riverflows.

Piezometric levels
More than fifty piezometers are available at least occasionally during the 18 years of interest. Three out of the four models simulate the distributed piezometric level: MARTHE, MODCOU and SIM. Figure 6 presents the daily evolution of the piezometric level averaged over 15 observation wells that are available during the full period. There are two periods of high level in 1994-1995 and 2001-2002 and the recessions in between are well captured by the three models. The simulated annual bias is presented in the bottom panel. MARTHE tends to evenly overestimate the hydraulic level, while SIM and MODCOU have a negative bias with an important variation in time. Thus, it seems that MOD-COU and SIM have some issues for simulating properly the water table dynamic at these 15 observation wells. This is also true during the flood period (2001). The three models however show a positive bias during the recession periods of 1996 and 2002-2003. Thus, it may be that the models are not able to simulate a correct recession after a high flow period, or that they have some difficulties to simulate the peak flows.
Such result is confirmed looking at the comparison between the full observations (all data available for all observation wells) and the simulations of the three models presented Fig. 7. The results of SIM and MODCOU are more scattered than those of MARTHE. Two included panels present the results for two contrasted years: a dry year 1998 and a wet year 2001. There is a weak evolution of the piezometric level in 1998, but some observation wells are badly represented by SIM and MODCOU (the level is underestimated). In 2001, the piezometric levels display more variations and the errors in SIM and MODCOU tend to be reduced.
The dispersion is more pronounced for the high values of the piezometric level located in the plateaus than for the lower ones. This is due to the fact that in the valleys, the water table is closer to the surface and thus its evolution is limited. But this is also certainly due to some weaknesses in the simulation of the flow in the unsaturated zone.
The statistical results were computed for 45 observation wells having sufficient data (Table 3). MARTHE obtained the best results, followed by MODCOU and SIM, while opposite results were found for the simulation of the riverflows.
The results obtained by SIM and MODCOU are quite different although they share the same UZ and saturated schemes, which means that the differences in the temporal evolution of the surface water fluxes analysed Sect. 4.1.2 have a significant impacts. Table 3. Average piezometric level simulated by the models over the same domain and the entire 18-year period, and statistical comparison of the observed and simulated piezometric level at the gaging stations. The normalized RMSE (RMS n ) is equal to the RMSE divided by the amplitude of the observed piezometric level (RMS n =RMS/(Hobs max −Hobs min )).  1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 10 20 30 40 50 gages 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002

Analysis of the 2001 flood simulations
As stated before, detailed analysis of the observed data during the flood have been done by Hubert (2001) and Pinault et al. (2005). Therefore, in this section, the emphasis is put on the comparison of the simulation of the four models with the observations. The models were calibrated over a long time period, and not particularly for the period of the flood. Therefore, a focus on this targeted period can be considered as being a partially independent assessment.
Three kinds of observations are used in this section: the observed riverflows and piezometric levels, and the flooded areas derived from satellite.

Comparison between observed and simulated riverflows
To focus on the flooding, we have selected the period when the observed riverflows of the Somme at Abbeville were above the 10-year return flood, i.e. above 71 m 3 s −1 . This period lasted from 30 January 2001 to 28 July 2001 and covered 180 days. For the Somme at Abbeville (Fig. 8), the observed riverflows vary above 90 m 3 s −1 for about 3 months. These variations correspond to a high frequency signal with a period of about 2 weeks that are due to a tidal effect (not shown). Indeed, the connection of the Somme River to the English Channel is controlled by gates in order to prevent the sea water from entering the Somme River during high tides, thus reducing the daily streamflow. As no model takes into account the tide in the routing of the riverflows, it is not surprising that these variations are not reproduced by the models. SIM better reproduces the observed river high flows during the first 50 days, while MARTHE, MODCOU and the CLSM underestimate the flows by 10 to 30 m 3 s −1 . The results of MARTHE and MODCOU are improved by the large increase of the simulated flow at the end of March 2001. Such an increase is also simulated by SIM (which then tends to overestimate the riverflow) as well as by the CLSM, at around the same period of time. This increase corresponds to a period of high daily precipitation (Fig. 8). However, there is no such pronounce variation of the observed riverflows at the Somme at Abbeville, although these large precipitations increased the riverflows of the tributaries (Fig. 9). The maximum of the observed flows occurs in mid April and also corresponds to the maximum flooding, as shown by the evolution of the surface computed by using satellite data (cf. Sect. 5.3) plotted in grey in Figs. 8 and 9. This maximum is in phase with the maximum riverflows simulated by SIM and MODCOU, and about 15 days earlier than the maximum simulated by MARTHE (except for the Hallue river). This indicates that MARTHE reacts too slowly, which is consistent with the overestimation of the recession flow by this model.
MARTHE is the only model that is able to simulate the dynamic of the flows of the Hallue river during this period, although on the 18-year period, the three models achieved fair results. This probably means that during the high flows, some processes are either not represented or badly represented by SIM and MODCOU.
The performances of the four models during the flood are summarized in Table 4. The efficiencies are rather low and even often negative. There are some errors in the simulation of the discharge, but the maximum riverflows are well captured by at least one model for each gage. The CLSM obtained the worst results for the Somme at Abbeville. Over the five gages, MARTHE obtained slightly better results than MODCOU and SIM.
In general, from this section, it can be said that the four models that were calibrated on a long term period encompassing the flood are not able to reproduce the dynamic of the flood.

Evolution of the piezometric level
To analyze the evolution of the piezometric level around the flooding period, another time period is used to cover the rise of the water table and the beginning of its recession, from July 2000 to November 2001. Figure 10 shows an average of the piezometric level over the 42 observation wells that gave continuous data over this period (these observation wells are located on Fig. 11). The maximum piezometric level was reached in April 2001, i.e. well in phase with the maximum of the flooded area. The piezometric level increased by up to 6 m from July 2000 and decreased by about 5 m in November 2001. None of the models is able to reproduce the observed evolution of the piezometric level: they all exhibit a time lag of about 45 days in the maximum piezometric level, and they all overestimate the piezometric level after the maximum by 2 to 3 m. At the beginning of the period, MARTHE overestimates the piezometric level, while MODCOU and SIM underestimate it. Table 4. Comparison of the observed and simulated riverflows during the flood period: mean discharge ratio Q sim /Q obs , daily Efficiency (E), daily Index of Agreement (IA, Willmot ,1981), Ratio of the simulated over observed maximum riverflow (R qmax ). The return period of the flood (RPF) is given for each gage. The 7-day statisical results are given in bold for the Somme at Abbeville.  to July 2001, while they simulate a riverflow slightly lower than the observed one (Fig. 8). As the major part of the riverflow during the flood is due to exfiltration from the water table (Négrel and Pételet-Giraud, 2005) one would expect that both the simulated riverflows and piezometric levels would share similar qualities and flaws with respect to the observations. However, this is not the case. This means that the average of the 42 piezometric wells does not represent the evolution of the piezometric level where the groundwater contributes to the riverflow. Thus, it is highly likely that the models reproduce the piezometric level better in the valleys than in the plateaus. This is partly shown in Fig. 11 which presents the maps of the piezometric level in July 2001, as well as the normalized RMSE in the 42 observation wells during the whole period. The larger RMSE are generally found in the border of the basin, where the unsaturated zone is deep. The best agreements between the estimated and observed piezometric level are most often located closer to a river.

Flooded areas
The 2001 flooding is considered to be due to the rise in the aquifer level toward the surface, thereby creating several extended sources. The models may be used to check this assumption. Moreover, it is also interesting to try to rebuild the evolution of the flood, in order to estimate which parts of the basin were flooded first, and to check whether the models are able to reproduce these phenomena well.
To do so, data from the European Remote Sensing satellite (ERS) were used. The work was carried out with 5 radar ERS/SAR images taken on five different dates: 1 March 2001, 17 March 2001, 5 April 2001, 21 April 2001, 10 May 2001, optical SPOT/HRVI (dates 2 April 2001, 25 May 2001, 27 July 2001 and Digital Elevation Model (DEM) data. Absolute calibration of the ASAR images is done in order to transform radar signals (digital numbers) into backscattering coefficients. Both radar and optical images are geo-referenced and superimposed, with a very slight error (the RMSE control point error is about 20 m). The method developed to detect flooded pixels is based on the large decrease in radar signal when the surface is covered with water. In fact, when soil surfaces evolve from saturation to flooded, the radar signal, which is highly scattered for humid soils, is more clearly reflected when the surface is covered with water and is consequently less backscattered. The detection algorithm is first based on four steps: i) Selection of a reference image just before flooding; ii) Application of frost filter to radar images in order to eliminate speckle noise effects; iii) Identification of image pixels presenting a large difference with the reference image. A threshold of 5 dB is empirically chosen; iv) Segmentation of identified pixels in order to retrieve flooded areas.
Secondly, in order to avoid detection errors due to the concurrent growth of vegetation (also corresponding to a decrease in the baskscattered signal as shown by many authors such as Le Hégarat-Mascle et al. (2002), the algorithm is constrained by a land cover map estimated with the SPOT/HRV data classification and with a Digital Elevation Model (DEM) allowing low level vulnerable areas to be estimated. Partial validation of the detection of flooded areas based on the ERS/SAR images was made using SPOT/HRVI optical data taken at three dates (2 April 2001, 25 May 2001, 27 July 2001, using the NDWI index (Gao, 1996), which is very sensitive to the presence of surface water (Guichaoua, 2005). Only the lower part of the Somme basin is treated here (Fig. 12), which represents about 25% of the whole basin.
Four dates are processed in 2001: 1 March, 5 and 21 April, and 10 May. At these dates in the sub domain under study, the surfaces detected as fully covered by water extended over 0.89, 2.71, 9.51 and 2.53 km 2 . The evolution of the inundated areas compares well with the history of the flooding reported by Deneux and Martin (2001). However, the absolute values may be underestimated, since the inundated area for the whole basin (i.e. an area about 4 times larger) was estimated to reach 70 km 2 (http://www.nord-pas-de-calais. ecologie.gouv.fr/article.php3?id%20article=700top). Figure 12 shows the domain covered by water detected using satellite images acquired on 21 April (red areas). It also shows the municipalities having declared flood damage (grey areas), as well as the areas where aquifer overflow is simulated by MARTHE (green) and MODCOU (blue). As the semi distributed model CLSM is able to simulate a fully distributed estimation of the soil wetness, due to the Topmodel algorithm, it is also able to estimate the flooded area. Its results are also presented Fig. 12.
There is a good consistency between the areas detected as flooded by using the ERS data and the municipalities that had declared flood damage. At the scale of the entire Somme basin, there is also good coherence between municipalities with flood damage and the areas of aquifer overflow simulated by the models, although the CLSM simulates many saturated spots over the plateaus. The impacted municipalities are located either where some overflow is simulated or downstream from these areas.
It can also be seen that the areas of aquifer overflow mostly follow the riverbed. This is consistent with the observations of the Somme basin and with the spatial distribution of the groundwater flooding of the Pang basin in 2001 (Finch et al., 2004). This is an additional assessment of the simulations in the valley. An important result from the analysis of the models is that the simulated aquifer overflow occurs almost every year in these areas. This might seem to be surprising but it is due to the fact that the plateaus are cut by some large valleys, with steep hillsides delimiting large flood plains. Although the altitude of the riverbed is slightly lower than the altitude of the valley, it is obvious that the level of the aquifer is rather close to the bottom of the valley and that it can provide some aquifer overflow during the high water season, as pointed out for instance by Eltahir and Yeh (1999) who then noticed that this generates an unlinear variation of the baseflow. Thus, in the simulations, it is the quantity of aquifer overflow that varies every year more than the localization of this flow. However, such a result is highly dependent on the exact measurement of the ground altitude and the aquifer level.
A sensitivity analysis to the spatial resolution was performed by MARTHE by using a constant 500 m grid by comparison with the reference simulation that uses a 100 m resolution along the rivers. As expected, the fluxes between the river and the aquifer are reduced at the finest resolution (due to the fact that the area in contact is reduced), but this is balanced by an increase of the aquifer overflow outside the riverbed. This extended aquifer overflow remained mostly located in the main river stream, in the bottom of the valley. Thus, the resolution used by hydrologeological models have a significant impact on the localization and quantity of aquifer overflow (within and outside the riverbed).
The models show that most of the flood occurred in the natural flood plain of the Somme. As a consequence and as already underlined by Hubert (2001); Deneux and Martin (2001) and Pointet et al. (2003), the flood damage would have been reduced if there had been fewer constructions in the main river stream, and the duration of the flood could have been reduced if the drainage in this stream had not been disrupted by human development.

Discussion and Conclusion
The above findings raise several issues. The inability of the models to accurately reproduce both the riverflows and the piezometric levels during and after the period of high flow is certainly due to the overall simplicity of the models, so the first question is to know which processes are missing in the models. It was shown that the unsaturated zone plays an important role in the Somme basin (cf. Sects. 2 and 4.2). But the processes in this zone are either not taken into account (CLSM) or represented with simple schemes: a percolation function for MARTHE, and a Nash cascade for MODCOU and SIM. Thus, no model is able to take into account a dynamic change of the unsaturated zone, which could be at least an evolution of its depth according to the piezometric level. Additionally, as mentioned above, the chalk is characterized by a dual porosity: matrix porosity and fissure porosity, which may lead to a non-linear response of the unsaturated zone according to its soil water content (Pinault et al., 2005;Price et al., 2000;Lee et al., 2006;Mathias et al., 2006). As the soil water content in the unsaturated zone increases to near saturation, some thresholds may be reached, with most of the water transfer occurring in the fissure at a faster speed, which might explain the very fast increase of the piezometric level in some observation wells (Pinault et al., 2005). This assumption tends to be confirmed by the results obtained in 2007 at the Flood1 experimental site that was set up to understand the processes occurring in the unsaturated zone during floods (http://www.flood1.info/FloodwebFr/ Web/documents/MLuce Amiens 26 09 2007.pdf).
However, the simple UZ schemes used in the hydrological models does not take into account such phenomena. Thus, the problem encountered during the flood is not only due to a poor calibration of the parameters, but to the use of an unsaturated model not adapted to the chalk matrix.
Another question that can be addressed by this multimodel comparison is the interest to use complex land surface schemes. It was shown (Fig. 3) that the water budget computed by the two LSMs lead to surface fluxes significantly different from those computed by the simpler PET, P schemes, especially in terms of temporal evolution.
But, as the temporal evolution of the water fluxes is deeply modified by the transfer in the unsaturated and saturated zone, the impact of the surface schemes is mostly hidden by the calibration of the UZ and groundwater parameters. The only exeception is provided by SIM, for which there was no specific calibration, and which shows a larger amplitude of the simulated baseflow, leading to significant differences in the simulated riverflows and piezometric levels, as summarized in Tables 2 to 4 and shown in Figs. 4 to 10.
The CLSM is the only model to take into account an interactive coupling between the saturated zones and the surface. As the water table rises, the surface soil moisture becomes saturated, and the areas where the precipitation cannot infiltrate and thus generate surface runoff increase. However, the areas where the water table is close to the surface (lower than 2 m depth) represent a small fraction of the overall basin, so the impact on the water budget is not very important. This is partly proven by the smooth observed riverflows, which do not react immediately to precipitation.
Thus in the Somme basin, there is no clear benefit in using a more complex surface scheme.
Four models with different physical representations and parameters are used in this study. All the models are able to simulate the 18-year riverflows of the Somme River (7-day efficiency above 0.76), while not always being able to accurately represent either flooding or the piezometric level. The multi-model comparison has shown that the CLSM obtains worst results in term of simulation of the Somme riverflows although the inclusion of the linear reservoir improves its results. MARTHE obtains fairly better results than MODCOU and SIM in terms of riverflows and piezometric levels. Such result is probably due to a better calibration of the groundwater parameters and not specifically to a better formulation of the processes. One argument for such conclusion is that all the models share the same flaws: inability to accurately reproduce all the tributaries of the Somme river, and inability to reproduce the evolution of the water table during the flood.
Hydrol. Earth Syst. Sci., 14, 99-117, 2010 www.hydrol-earth-syst-sci.net/14/99/2010/ The present study has shown that: 1. The saturated zone is not well represented during flooding, partially due to an inaccurate simulation of the fast transfers in the unsaturated zone. The four models use too simplistic unsaturated zone schemes, with no evolution of the UZ depth, and no representation of the fissure flow.
2. The annual cycles of the surface water fluxes simulated by simple and complex land surface schemes vary. However, these differences are dampened by the transfers in the unsaturated and saturated zones. Therefore, the use of complex land surface scheme is not a requirement to represent the hydrology of the Somme river basin. However, to simulate the Somme basin, LSMs should either be coupled to hydrogeological models or include the representation of the transfers in the unsaturated and saturated zones. This reinforce the need to include deep hydrology in LSMs which are currently increasingly developed (Yeh and Elathir, 2005;Miguez-Macho et al., 2007;Liang et al., 2003;Maxwell and Kollet, 2008).
3. The remote sensing observations of the flooding areas is both useful and complementary to classical in situ hydrological measurements.
According to these conclusions, studies aiming at the improvement in MARTHE and MODCOU of the simulation of the water transfer in the Chalk unsaturated zone are in progress by taking into account the fissure flow (Thiéry et al., 2008) and by integrating a dynamical unsaturated zone depth (Philippe et al., 2009). The application of these developments in the distributed modelling of the Somme basin should help to improve the modelling of the riverflows and piezometric head during the 2001 flood.