An ensemble approach to assess hydrological models' contribution to uncertainties in the analysis of climate change impact on water resources



Introduction
The study of climate change impact on water resources has improved our understanding of the interactions between climate and hydrological processes. Water availability will be affected at various levels by the anticipated changes in temperature, precipitation, atmospheric and oceanic circulations and other climate variables, depending on the scenarios and the investigated regions. The climate change impact on evapotranspiration, rainfall, runoff and water availability has been shown to be affected by the uncertainty associated with climate scenarios (Xu, 1999). The advent of regional climate models (RCMs) as a physically based and dynamical way of downscaling global climate model (GCM) outputs makes the combined GCM-RCM uncertainty more challenging to assess (Déqué et al., 2007). The uncertainty is not only due to imperfections in the models and geophysical datasets required to describe the land surface components, but also arises because anthropogenic greenhouse gas emissions as well as some climate change effects and feedbacks cannot be predicted in a deterministic way (Foley, 2010). Nevertheless, hydrologists have to work with these uncertain projections, taking into account the underlying assumptions on climate scenarios in their investigation of how and why runoff and hydrological responses are changing (Blöschl and Montanari, 2010). Teutschbein and Seibert (2010) review applications of RCM output for hydrological climate change impact studies. Graham et al. (2007) and Horton et al. (2006) both used a large set of RCM projections based on different GCMs and greenhouse gas emissions scenarios provided by the PRUDENCE project (Christensen and Christensen, 2007) to quantify the uncertainties in hydrological model output when forced by climate model projections. In the analysis of the impacts on future simulated runoff, Graham et al. (2007) found that the most important source of uncertainty comes from GCM forcing, which has a larger impact on projected hydrological change than the selected emission scenario or the RCM used for downscaling. Horton et al. (2006) stress that using different RCMs forced with the same global dataset induces a similar variability in projected runoff as using different GCMs, and also that the ranges of hydrological regimes associated with the two considered emission scenarios overlap.
Regarding the uncertainty related to the emission scenario, the study of Hawkins and Sutton (2009) for decadal air surface temperature reveals that, in regional climate predictions, this kind of uncertainty makes a small contribution to the total uncertainty for the next few decades.
The studies found in the literature vary regarding the construction of an ensemble of hydrological models. Prudhomme and Davies (2008) used two different versions of the same lumped model. Wilby and Harris (2006) used two hydrological model structures (CATCHMOD, a water balance model, and a statistical model). Kay et al. (2009) investigated the uncertainty in the impact of climate change on flood frequency using two hydrological models: the Probability Distributed Model (PDM) and the grid-based runoff and routing model G2G. Crosbie et al. (2011) quantified the uncertainty in projections of future groundwater recharge contributed by multiple GCM simulations, downscaling methods and hydrological models. The hydrological models were two versions of WAVES (a physically based model), HELP (a bucket model) and SIMHYD (a lumped conceptual model). Dibike and Coulibaly (2005) used two conceptual runoff models (HBV and CEQUEAU) to project future runoff regimes based on one GCM scenario and two different statistical downscaling techniques. Most of these studies conclude that the uncertainty related to different hydrological models or their parameterisation is significantly less important than the uncertainty from multiple GCMs.
Few studies have focused solely on the effect of the choice of hydrological model on hydrological changes or on the model structural uncertainty (i.e., the uncertainty related to the internal computation of hydrological processes). For instance, Jiang et al. (2007) used six monthly water-balance models (models based on the water balance equation at the monthly time step) for one Chinese catchment. Results show that all models have similar capabilities to reproduce historical water balance components. However, larger differences between model results occur when comparing the simulated hydrological impact of climate change. Ludwig et al. (2009) investigated the response of three hydrological models to change in climate forcing: the distributed model PROMET, the semi-distributed model Hydrotel and the lumped model HSAMI over one alpine catchment in Bavaria in southern Germany. Climate data were generated by one RCM run. The hydrological model performance was evaluated with the following flow indicators: flood frequency, annual low flow and maximum seasonal flow. Results showed significant differences in the response of the hydrological models (e.g., estimation of the evapotranspiration or flood intensity) to changes in the climate forcing. The authors mentioned that the level of complexity of the hydrological models plays a considerable role when evaluating climate change impact; hence, they recommend the use of hydrological model ensembles. Gosling et al. (2011) presented a comparative analysis of projected impacts of climate change on river runoff from two types of distributed hydrological models (a global hydrological model and different catchment-scale hydrological models) applied on six catchments featuring important contrasts in spatial variability as well as in climatic conditions. The authors conclude that differences in changes of mean annual runoff between the two types of hydrological models can be substantial when forced by a given GCM. Poulin et al. (2011) investigated the effects of hydrological model structure uncertainty using two models: the semi-distributed model Hydrotel and the lumped model HSAMI over one catchment located in the province of Québec, Canada. The delta change approach was used to build two climate scenarios. Model structure uncertainty was analysed for streamflow, groundwater content and snow water equivalent. The authors suggested that hydrological models with different levels of complexity should be considered as contributors to the total uncertainty related to hydrological impact assessment studies.
Our ability to predict the future hydrological effects of changes in climate is necessarily limited, even if we had perfect hydrological models (Beven, 2001). Jones et al. (2006) suggest that conceptual and physically based models have different roles in impact assessment, where the former can be used to rapidly assess the impact of different climate scenarios, while the latter can assess the joint impacts of land-use and climate change. Nowadays, the most used approach is to calibrate a hydrological model on current day data and then use the calibrated model to predict the response under changed conditions (e.g., Ludwig et al., 2009; Poulin et al., 2011). However, Mauser and Bach (2009), for example, have pointed out that any calibration of a model on present conditions may become invalid for the evaluation of climate change impacts. On the other hand, Blöschl and Montanari (2010) argue that we cannot hope to reduce uncertainty by including more detail in the models (as in the case of physical, process-based models).
As mentioned before, most studies on climate change impact have found that the largest source of uncertainty comes from GCM forcing (e.g., Kay et al., 2009). However, hydrological modelling is an important part of the evaluation of the impact of change because it allows us to understand how hydrological processes would react to climate change. The aim of the present study is to assess the contribution of hydrological models to uncertainty in the climate change signal for water resources management. To achieve this, four hydrological models with different structure and complexity are fed with regional climate model outputs for a reference and a future (2041-2070) period. The impact on the hydrological regime is estimated through hydrological indicators selected by water managers. In our analysis, the uncertainty from the hydrological model is compared to the uncertainty originating from the internal variability of the climate system. This internal variability induces an uncertainty that is inherent to the climate system and that is the lowest level of uncertainty achievable in climate change studies (Braun et al., 2012). It is, therefore, used as a threshold to define the significance of the uncertainty induced by hydrological modelling. The evaluation of the uncertainty associated with the calibration method or model parameters is, however, outside the scope of this study and is covered in many articles (e.g., Poulin et al., 2011; Teutschbein et al., 2011; Kay et al., 2009).

Description of the investigated catchments
The present study looks at two contrasting catchments: the au Saumon catchment (738 km²) located in Southern Québec (Canada) and the Loisach catchment (640 km²) located in Southern Bavaria (Germany). Both are head catchments of larger river basins: the Haut-Saint-François (Québec) and the Upper Isar (Southern Bavaria). The catchments' locations and topography are presented in Fig. 1. Since they are neither regulated by dam operations nor significantly influenced by anthropogenic activities, flow regimes from both catchments can be considered as natural. Downstream of the investigated sub-basins, the tributary rivers join managed river systems where complex water transfers and reservoirs affect the river flow. These anthropogenic influences on the flows are not considered in the present study, but they are covered in other activities within the QBic³ project (Ludwig et al., 2012).
The au Saumon catchment presents a moderately steep topography in a northern temperate region dominated by deciduous forest. Slopes range from 0.171 upstream to 0.034 at the outlet; the highest point (1100 m) in the catchment is Mont Mégantic. The annual overall mean flow at the outlet is 18 m³ s⁻¹ (ranging from 10 m³ s⁻¹ in August to 54 m³ s⁻¹ in April). High flows mostly occur in spring (driven by snowmelt) and fall (driven by rain).
The Loisach River is an important tributary of the Upper Isar River. The catchment upstream of the Schlehdorf gauge (elevation 600 m) is located in the Bavarian Limestone Alps with a smaller portion in the northwest in a region composed of marshland. The dominant soils are limestone in the mountains and loam with some gravel in the plain sections. Coniferous forests with small areas of marshland, pasture and rocky outcrops dominate the land use. The highest point within the catchment is the Zugspitze (2962 m). The runoff regime of the Loisach is controlled by snowmelt in late spring and rain events in summer. Mean annual runoff is 22 m³ s⁻¹ with a minimum in January (12 m³ s⁻¹) and a maximum in June (34 m³ s⁻¹).
The meteorological observation datasets used for calibration and validation of the hydrological models and to correct the climate simulations are gridded datasets already available for both regions. For Southern Bavaria, the dataset was generated from sub-daily data of 277 climate stations on a 1 km grid with the PROMET model (Mauser and Bach, 2009), while for Southern Québec the project partner CEHQ provided its reference dataset of daily precipitation and minimum and maximum air temperatures with a resolution of 0.1°.

The hydro-climatic model chain
Figure 2 illustrates the chain of models used to generate the flow simulations. This chain consists of an ensemble of climate simulations feeding an ensemble of hydrological models of various structural complexities. The upper half of the diagram in Fig. 2 depicts the two climate data ensembles used in the study, while the lower part represents the hydrological ensemble and the associated scaling and bias correction tools required to adjust the climate model data to the hydrological models. These tools connect the top and bottom parts. The combination of climate and hydrological models generates the hydro-climatic ensemble that is analysed to quantify the contribution to uncertainty induced by the hydrological models with respect to the natural climate variability estimated from the climate models.

The climate simulation ensemble
Five members of the Canadian Global Climate Model (CGCM3) under the SRES A2 emission scenario are dynamically downscaled by the Canadian Regional Climate Model CRCM version 4.2.3 (de Elía and Côté, 2010) to generate the required climate data for the province of Québec, while three members of the German global model ECHAM5 under the SRES A1B emission scenario are downscaled by the KNMI's regional model RACMO2 (van Meijgaard, 2008) to supply the climate data over Bavaria. These two climate-simulation ensembles allow the exploration of the natural variability (the unforced variability) in the climate system. This natural variability can be estimated by repeating a climate change experiment with a given GCM several times, changing only the initial conditions by small perturbations (Murphy et al., 2009; Braun et al., 2012). Although the natural variability is just a fraction of the total uncertainty of the climate simulations, it would remain irreducible even if perfect models were available. Therefore, natural variability is used in this study as the irreducible baseline against which the significance of the uncertainty induced by the hydrological models is assessed.
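As an illustration of how an initial-condition ensemble yields an estimate of natural variability, the following minimal sketch computes the ensemble-mean change and the inter-member spread of that change. The function name and the five temperature values are hypothetical and are not taken from the CRCM or RACMO2 simulations; the use of the sample standard deviation as the spread measure is likewise an assumption made for the example.

```python
from statistics import mean, stdev

def natural_variability(reference_means, future_means):
    """Return (ensemble-mean change, inter-member standard deviation).

    Each list holds one 30-yr climatological mean per ensemble member;
    members differ only by their (perturbed) initial conditions.
    """
    changes = [fut - ref for ref, fut in zip(reference_means, future_means)]
    return mean(changes), stdev(changes)

# Hypothetical 30-yr mean temperatures (deg C) for a five-member ensemble.
ref = [4.0, 4.2, 3.8, 4.1, 3.9]
fut = [6.4, 6.8, 6.3, 6.5, 6.5]
signal, spread = natural_variability(ref, fut)
print(f"signal = {signal:.2f} degC, inter-member spread = {spread:.2f} degC")
```

The spread returned here plays the role of the irreducible baseline: a difference between hydrological models that is smaller than this spread would be hard to distinguish from natural variability.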
Driving hydrological models of different structural complexity over small, heterogeneous catchments with an ensemble of climate scenarios requires further (statistical) adjustment of the forcing variables in order to suit the hydrological modelling scale (e.g., 1 × 1 km²). Post-processing is applied to correct biases in RCM temperature and precipitation before downscaling the fields to the hydrological model scale. Monthly correction factors are computed based on the difference between the ensemble mean of the 30-yr mean monthly minimum and maximum air temperature for the reference period and the 30-yr monthly means of daily observed minimum and maximum air temperature. The correction is then applied to each member of the ensemble to conserve the inter-member variance used to estimate the natural variability. Similarly, precipitation is corrected with the local intensity scaling method (LOCI, Schmidli et al., 2006), which adjusts the 30-yr average monthly wet-day frequency and intensity, with a wet-day precipitation threshold of 1 mm (e.g., Chen et al., 2011). Since the LOCI method was developed for daily data, the resulting daily precipitation is redistributed to the sub-daily timescale proportionally to the original RCM precipitation for each day in order to accommodate the finer temporal resolution of the model data (Muerth et al., 2012). The SCALMET (Marke, 2008) model output statistics (MOS) algorithm then scales all meteorological variables (including the following uncorrected variables: humidity, wind speed, radiation and cloud cover) from the RCM grid scale to the hydrological models' grid scale using topography as the main predictor for small-scale patterns. SCALMET conserves energy and mass within each RCM grid cell once downscaled to the fine-scale grid of the hydrological models (further details on the post-processing of climate simulations can be found in Muerth et al., 2012).
The resulting seasonal climate change signals from the climate simulation ensemble (after bias correction and downscaling) are presented in Fig. 3 for both catchments. The mean annual projected change in air temperature for the Haut-Saint-François area between the reference and future period is about 3.0 °C. However, the winter months (December to February, DJF) show a stronger warming and a stronger inter-member variability. The average change in precipitation is positive for all seasons but summer (JJA). In the Upper Isar region, annual warming is estimated to be 2.2 °C. Precipitation is projected to remain roughly the same as in the past in autumn (SON) and winter, but to increase in spring (MAM) and decrease in summer (JJA).
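The monthly temperature correction described in this section can be sketched as follows. This is a minimal illustration, assuming an additive correction (observed monthly mean minus ensemble-mean monthly mean) that is shared by all members; the function names and the input layout are hypothetical, and the sketch does not reproduce the LOCI or SCALMET steps.

```python
def monthly_correction(member_monthly_means, observed_monthly_means):
    """Return 12 additive correction factors (observed minus ensemble mean).

    member_monthly_means: one 12-value list per ensemble member, holding
    the 30-yr mean monthly temperature of the reference period.
    observed_monthly_means: 12 observed 30-yr monthly means.
    """
    n_members = len(member_monthly_means)
    factors = []
    for month in range(12):
        ensemble_mean = sum(m[month] for m in member_monthly_means) / n_members
        factors.append(observed_monthly_means[month] - ensemble_mean)
    return factors

def apply_correction(daily_temps, months, factors):
    """Add the factor of the matching calendar month (1-12) to each value."""
    return [t + factors[m - 1] for t, m in zip(daily_temps, months)]

# Hypothetical two-member ensemble with a constant 0.5 degC cold bias.
members = [[1.0] * 12, [3.0] * 12]
observed = [2.5] * 12
factors = monthly_correction(members, observed)
print(apply_correction([0.0, 10.0], [1, 7], factors))
```

Because the same factors are added to every member, the differences between members are left untouched, which is the property the text relies on to preserve the estimate of natural variability.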

The hydrological model ensemble
An ensemble of four hydrological models displaying a range of structural complexity has been constructed. The models range from lumped and conceptual to fully distributed and physically based. Both spatial and temporal resolutions differ within the hydrological model ensemble. The model HSAMI (HSA; Bisson and Roberge, 1983; Fortin, 2000) is a conceptual and lumped model that uses a single set of parameters to describe the entire catchment. The conceptual and process-based semi-distributed model HYDROTEL (HYD; Fortin et al., 2001; Turcotte et al., 2003) defines a drainage structure based on unitary catchment units and derives behavioural information for each relatively homogeneous hydrological unit (RHHU). The conceptual and process-based fully distributed model WASIM-ETH (WAS; Schulla and Jasper, 2007) and the process-based and fully distributed model PROMET (PRO; Mauser and Schädlich, 1998) are distributed on a grid with a mesh of 1 km. The temporal resolution for all hydrological models is daily, with the exception of PROMET, which requires hourly forcing. PROMET simulation results are, thus, aggregated to daily means after the simulation is completed. Table 1 presents the characteristics of each of the hydrological models.
Meteorological inputs were processed to fit each model's potential evapotranspiration formulation requirements. For the au Saumon catchment, HSAMI and HYDROTEL use the empirical formulation developed by Hydro-Québec (Fortin, 2000). For Bavaria, HSAMI uses the Hydro-Québec formulation while the Thornthwaite formulation (Thornthwaite, 1948) is used in HYDROTEL. Both formulations use daily minimum and maximum temperatures. WASIM and PROMET use the Penman-Monteith equation, which requires additional meteorological inputs for relative humidity, wind speed and net radiation. The soil hydrodynamic formulation also differs within the ensemble. In HSAMI, vertical flows in the soil column are represented by two conceptual and linear reservoirs that represent the unsaturated and saturated zones, while HYDROTEL, WASIM and PROMET compute soil water fluxes and storage with parameters adjusted to different soil layers. HYDROTEL provides a lumped characterisation of soils at the subcatchment scale and considers the soil column properties as being vertically homogeneous.
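To make the notion of conceptual linear reservoirs concrete, the sketch below routes water through two linear reservoirs in series, each releasing a fixed fraction of its storage per time step. It is a generic textbook illustration, not the HSAMI formulation: the function name, the release coefficients and the series arrangement are assumptions made for the example.

```python
def two_linear_reservoirs(infiltration, k_unsat=0.3, k_sat=0.05):
    """Route daily infiltration (mm) through two linear reservoirs.

    A linear reservoir releases outflow = k * storage at each time step.
    Returns the baseflow series and the two final storages (mm).
    """
    s_unsat, s_sat = 0.0, 0.0
    baseflow_series = []
    for water in infiltration:
        s_unsat += water
        percolation = k_unsat * s_unsat   # drainage to the saturated zone
        s_unsat -= percolation
        s_sat += percolation
        baseflow = k_sat * s_sat          # release to the river
        s_sat -= baseflow
        baseflow_series.append(baseflow)
    return baseflow_series, s_unsat, s_sat

# Hypothetical input: a single 10 mm pulse followed by dry days.
flows, s1, s2 = two_linear_reservoirs([10.0] + [0.0] * 9)
print([round(q, 3) for q in flows])
```

The slow, smoothly receding response of the second reservoir is what such a structure offers for low-flow simulation; models that resolve several soil layers replace the two coefficients with layer-specific parameters.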
The computation of snow accumulation and melting is also treated differently in each model; the snow pack evolution in PROMET respects the energy balance in the snow pack, while the other models use simpler temperature-index approaches.
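For readers unfamiliar with temperature-index approaches, a minimal degree-day sketch is given below. It is a generic illustration rather than the snow routine of any of the four models; the degree-day factor, the threshold temperatures and the function name are hypothetical.

```python
def degree_day_snow(temps, precips, ddf=3.0, t_melt=0.0, t_snow=0.0):
    """Simulate snow water equivalent (SWE, mm) with a degree-day method.

    temps: daily mean air temperature (deg C); precips: daily totals (mm).
    ddf: degree-day factor (mm per deg C per day), a calibrated parameter.
    Returns (swe_series, melt_series).
    """
    swe = 0.0
    swe_series, melt_series = [], []
    for temp, precip in zip(temps, precips):
        if temp <= t_snow:
            swe += precip                      # precipitation falls as snow
        potential_melt = ddf * max(temp - t_melt, 0.0)
        melt = min(potential_melt, swe)        # cannot melt more than stored
        swe -= melt
        swe_series.append(swe)
        melt_series.append(melt)
    return swe_series, melt_series

# Hypothetical four-day sequence: two cold days with snowfall, then a thaw.
swe, melt = degree_day_snow([-5.0, -2.0, 2.0, 5.0], [10.0, 5.0, 0.0, 0.0])
print(swe, melt)
```

An energy-balance scheme replaces the single factor with explicit radiation, turbulent and ground heat fluxes, which is why it needs the additional meteorological inputs mentioned above.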
All four hydrological models were calibrated on the 1990-1999 period. In order to evaluate the predictive capacity of each hydrological model, a simple split-sample test was applied using the 1975-1989 period for validation. Automatic calibration is applied for HSAMI and HYDROTEL by using the Shuffled Complex Evolution optimisation method (Duan, 2003) with the sum of squared errors between observed and simulated runoff as objective function. WASIM is manually calibrated by adjusting land-use-specific minimal resistance parameters for evapotranspiration and four recession parameters for runoff. PROMET is calibrated by changing the soil parameters.
The Nash-Sutcliffe (1970) efficiency coefficient (NS) is computed in order to evaluate the performance of the hydrological models (Table 2). For the validation period in the au Saumon catchment, the daily NS has values of about 0.6 for all models, with the exception of PRO, which achieves a value of 0.2. In the Schlehdorf catchment, the daily NS has values of 0.75 for HSA and HYD, but for PRO it is only 0.12. Despite the low performance of PRO for daily NS, it has a comparable performance in the evaluation of hydrological indicators on the reference period (see Sect. 3.1). Calibration and validation processes are described in more detail in Ludwig et al. (2012).
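The NS criterion compares the squared simulation errors with the variance of the observations around their mean. A minimal sketch of its standard definition is given below; the function name and the four-value series are hypothetical.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - SSE / sum((obs - mean(obs))^2).

    1 is a perfect fit; 0 means the simulation is no better than using
    the mean of the observations; negative values are worse than that.
    """
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / spread

# Hypothetical daily flows (m3/s).
obs = [1.0, 2.0, 3.0, 4.0]
print(nash_sutcliffe(obs, obs))          # perfect simulation
print(nash_sutcliffe(obs, [2.5] * 4))    # simulation equal to the mean
```

Since the numerator is the sum of squared errors used as objective function in the automatic calibration, maximising NS and minimising that sum are equivalent for a fixed observation series.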

Hydrological indicators
The impact of climate change on hydrology is evaluated with the following four hydrological indicators:
1. The overall mean flow (OMF), defined as the mean daily runoff over the entire period of the investigated time series.
2. The 2-yr return period 7-day low flow (7LF2), calculated from a 7-day moving average applied on daily runoff data. The lowest value over a year is kept as the yearly low flow. A statistical distribution is fitted to the series of yearly low flows to compute the low flow that occurs statistically every 2 yr (DVWK, 1983).
3. The 2-yr return period high flow (HF2) is the flow that is statistically exceeded every two years or, in other terms, that has a 50 % chance of being exceeded in any given year. It is evaluated from the time series of each year's maximum daily runoff (DVWK, 1979). To calculate 7LF2 and HF2, it is assumed that the time series follow the log Pearson III probability density function, following the German Association of Water (DVWK, 1979, 1983).
4. The Julian day of spring-flood half volume (JDSF) identifies the date over the hydrological year at which half of the total volume of water has been discharged at the gauging station (Bourdillon et al., 2011).This indicator targets the spring flood peak, from February to June in Québec and from March to July in Bavaria.
Both catchments show an important annual cycle in the hydrological regime. Two distinct periods representing summer and winter are, therefore, defined for the analysis. For the Québec catchment, the summer covers the period from June to November and the winter covers December to May, while in Bavaria the summer goes from March to August and the winter from September to February.
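The four indicators can be sketched in a few lines. The example below is illustrative only: the function names are hypothetical, and the 2-yr return value is approximated by the empirical median of the yearly values (the value exceeded in half of the years) as a stand-in for the fitted log Pearson III quantile used in the study.

```python
from statistics import mean, median

def overall_mean_flow(daily_flows):
    """OMF: mean daily runoff over the whole series."""
    return mean(daily_flows)

def seven_day_low_flow(year_flows):
    """Lowest 7-day moving average within one year (or one season)."""
    windows = [mean(year_flows[i:i + 7]) for i in range(len(year_flows) - 6)]
    return min(windows)

def annual_high_flow(year_flows):
    """Maximum daily runoff within one year (or one season)."""
    return max(year_flows)

def two_year_value(yearly_values):
    """Stand-in for the 2-yr return value: the empirical median.

    The study fits a log Pearson III distribution instead.
    """
    return median(yearly_values)

def half_volume_day(flows, first_day=1):
    """Day at which half of the total volume of the window has passed."""
    half = sum(flows) / 2.0
    cumulative = 0.0
    for offset, flow in enumerate(flows):
        cumulative += flow
        if cumulative >= half:
            return first_day + offset
    return None

# Hypothetical short series, with day 32 taken as the start of February.
print(half_volume_day([1, 1, 2, 4, 2], first_day=32))
```

In practice `seven_day_low_flow` and `annual_high_flow` would be evaluated once per year and per season, and the resulting yearly series passed to the distribution fit.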

Permutations and statistical test
At the very end of the modelling chain (Fig. 2), the present and future climatological values of the hydrological indicators are permuted across members to increase the sample size of our climate change signal dataset (e.g., Bourdillon et al., 2011). This operation is based on the assumption that each member is an independent realisation of climate, both in the reference and the future periods. With permutation, the future of a given member is not only compared with the present of the same member, but also with the present of all other members. For instance, five GCM members used in a single branch of the modelling chain (i.e., used to drive only one RCM and one hydrological model) produce five present and five future hydrological outputs, from which 25 future versus present differences are obtained for each hydrological indicator, as shown in Fig. 4. This is the case at the au Saumon catchment; for Schlehdorf, nine values are obtained with the three-member ECHAM5 ensemble. The median of the change values gives the climate change signal, while the variability gives an estimate of the uncertainty associated with that signal. The Wilcoxon rank sum test (Wilcoxon, 1945) is used to test the null hypothesis that two samples come from identical distributions with equal medians, against the alternative that they do not have equal medians (Wilks, 2006). For instance, for a given hydrological indicator (e.g., OMF), we have four climate change signal samples, which have been obtained with the four different hydrological models. The Wilcoxon rank-sum test indicates whether two samples obtained from two distinct models (e.g., HSAMI and HYDROTEL) can be considered as drawn from the same distribution (see Sect. 3.3). It should be noted that the climate change signals from the same model are considered as independent, as they come from independent climate simulations.
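The permutation step amounts to pairing every future value with every reference value. A minimal sketch follows; the function name and the indicator values are hypothetical, and the relative change is expressed in percent of the reference value.

```python
from statistics import median

def permuted_changes(reference, future, relative=True):
    """Compare each future member with each reference member.

    Returns len(future) * len(reference) change values, in percent of
    the reference value if relative is True, else as plain differences
    (as used for an indicator expressed in days).
    """
    changes = []
    for fut in future:            # future value of member i
        for ref in reference:     # reference value of member j
            diff = fut - ref
            changes.append(100.0 * diff / ref if relative else diff)
    return changes

# Hypothetical overall mean flows (m3/s) for a five-member ensemble.
ref = [18.0, 17.5, 18.4, 17.9, 18.2]
fut = [19.0, 19.6, 18.8, 19.3, 19.9]
changes = permuted_changes(ref, fut)
print(len(changes), round(median(changes), 1))
```

Five members give 25 change values and three members give nine, matching the sample sizes quoted for au Saumon and Schlehdorf.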

Results and discussion
The aim of the present study is to assess the contribution of hydrological models to uncertainty in the climate change signal for water resources management. First, the performance of the hydrological models is evaluated over the reference period by validating the simulated indicators, obtained when the models are forced with station data, against the observed flow at the gauging station. The differences from observations are used to assess the performance of the hydrological model ensemble (Sect. 3.1). Second, the impact of forcing the hydrological models with the climate model projections is assessed through the hydro-climatic simulations using the ensemble of calibrated hydrological models forced by the ensemble of climate simulations (Sect. 3.2). Finally, the relative difference in the hydrological indicators between the reference and future (2041-2070) periods is calculated to evaluate the climate change signals. A statistical test is used for all given indicators in order to compare the series of relative change of hydrological indicators obtained with the different hydrological models.

Performance of the hydrological models
In order to evaluate the hydrological models when forced by observed station data, the simulated hydrological indicators are compared to the hydrological indicators computed from the gauging station data for both catchments. Figure 5 (left) shows the relative errors $E_i$ between indicators computed from simulations and from observed flows, computed following Eq. (1):

$$E_i = \frac{I_{(\mathrm{sim})i} - I_{(\mathrm{obs})}}{I_{(\mathrm{obs})}} \times 100 \qquad (1)$$

where $I_{(\mathrm{obs})}$ is the value of the indicator as computed from observed flows and $I_{(\mathrm{sim})i}$ is the indicator calculated from the flows simulated with the hydrological model $i$ forced by station data over the validation period. The right panels in Fig. 5 show the absolute error (in m³ s⁻¹, or in days for JDSF).
Errors related to the OMF over the whole period are relatively small for both catchments (less than 10 %). The hydrological models underestimate the OMF for the au Saumon catchment while they overestimate it for Schlehdorf. This highlights the fact that biases are site specific and cannot be generalised. However, in both catchments the OMF is well captured by the various hydrological models. Larger relative errors affect the low flows, with a wider dispersion between models than for the OMF. These errors show that low flows are challenging for surface hydrological models. One of the major problems with low flow simulations is related to surface-groundwater interactions, which are poorly represented by the hydrological models. During low flow periods, water exchange occurs through the riverbed and the river may be fed by groundwater or may leak to feed the aquifer (Pushpalatha et al., 2011). However, the absolute error in low flow is small. For instance, for au Saumon, HYD, PRO and WAS have a mean error of 23 % in 7LF2-SUMMER, which represents only 0.3 m³ s⁻¹. HSA presents a large relative error for this indicator (about 260 %), which reaches 3.4 m³ s⁻¹. Over Schlehdorf, the more complex and physically based model PRO, which could be expected to handle low flows better, shows a performance similar to that of the other models in 7LF2-WINTER.
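As a consistency check on these figures, and assuming the relative error is expressed as a percentage of the observed indicator, the observed summer 7LF2 at au Saumon implied by each pair of relative and absolute errors can be recovered by dividing one by the other. The value obtained is an inference from the rounded numbers quoted above, not a reported observation.

```latex
\begin{align*}
I_{(\mathrm{obs})} &\approx \frac{0.3\ \mathrm{m^3\,s^{-1}}}{0.23}
  \approx 1.3\ \mathrm{m^3\,s^{-1}} && \text{(HYD, PRO and WAS)} \\
I_{(\mathrm{obs})} &\approx \frac{3.4\ \mathrm{m^3\,s^{-1}}}{2.60}
  \approx 1.3\ \mathrm{m^3\,s^{-1}} && \text{(HSA)}
\end{align*}
```

Both pairs point to an observed low flow of roughly 1.3 m³ s⁻¹, which illustrates why relative errors of several tens of percent on this indicator translate into small absolute volumes.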
For high flows, WAS and PRO have small relative errors for au Saumon, but these small relative errors can represent a large amount of water, as can be seen in the right panel of Fig. 5. For Schlehdorf, the best performance in HF2-SUMMER is obtained with WAS while PRO has the largest deviation.
Figure 6 shows the observed and simulated (with the hydrological models forced by meteorological station data) mean hydrographs. Au Saumon presents two high-flow events. The first one, in spring (driven by snowmelt), is well simulated by HYD and PRO, but underestimated by HSA and WAS. A second but smaller high-flow event occurs in summer (driven by rain), which is not captured by HSA. The au Saumon summer low flows are overestimated by HSA and WAS. Schlehdorf is characterised by one summer peak flow, which results from both snowmelt and precipitation. The peak is overestimated by PRO and is simulated earlier by most hydrological models. Schlehdorf winter low flows are overestimated by HYD.

Climate change impact on water resources
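Figures 7 and 8 show, for au Saumon and Schlehdorf, respectively, the relative change of each hydrological indicator between the reference and future periods, computed following Eq. (2):

$$\Delta I_{ij} = \frac{I_{(\mathrm{fut})i} - I_{(\mathrm{ref})j}}{I_{(\mathrm{ref})j}} \times 100 \qquad (2)$$

where $i$ and $j$ represent the members of the climate simulation ensemble from which the future and reference hydrological indicators were taken. For each hydrological model, the boxplots present the change values obtained by the permutations (25 values for each boxplot at au Saumon and 9 values at Schlehdorf, as seen in Fig. 4). The two extreme indicators 7LF2 and HF2 are calculated for the two seasons (summer and winter).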
The change in JDSF is only expressed as the absolute difference between the present and future values in days. In Fig. 7, the hydro-climatic ensemble suggests a general increase in the overall mean flow for au Saumon. The change of the OMF median values varies between 3 % and 11 % for the different hydrological models. The extremes of the expected changes range between −6 % and 22 %. The whole hydro-climatic ensemble predicts an earlier spring flood. The median change value of the JDSF varies from −11 to −13 days, while the overall range goes from −3 to −19 days. The increase in temperature projected by the climate models (Fig. 3) leads to an earlier melt of the simulated snow cover in the future. The change in the low flow indicators depicts a larger variability between the hydrological models. For the 7LF2-SUMMER, the median change values vary from −5 % to −40 %. The reduction in precipitation and the increase of the potential evapotranspiration (PET, not shown) explain this overall decrease in 7LF2-SUMMER. For 7LF2-WINTER, HSA has a significantly larger median change value (+70 %), while the other three models show values of about +40 %. The change in the summer high flow indicator (HF2-SUMMER) ranges from −3 % to 18 %. PRO is more sensitive to the range in climate forcing and shows the largest spread in the indicator, from −10 % to +80 %. The median change values of HF2-WINTER are around +5 % with a range from −18 % to +23 %. The overall trend shows an increase in high flows.
Schlehdorf (Fig. 8) shows a general, but smaller, diminution of the OMF; the median change value varies between −1 % and −6 %. The spring flood discharge happens sooner in the simulations, with the median difference ranging between −4 and −6 days. The median of summer low flow (7LF2-SUMMER) ranges between −5 % and −8 %. In winter, the relative uncertainty about the potential changes is much larger, so the relative change of 7LF2-WINTER varies from −20 % to +20 %. The signal for this indicator seems to be very model specific. The models HSA and HYD present a negative change signal (median of −15 % and −5 %, respectively), while the more complex models WAS and PRO present a positive change signal (+4 % and +12 %, respectively). The summer 2-yr return period high flow (HF2-SUMMER) has median values ranging between +1 % and −8 %, and the overall relative uncertainty ranges between −18 % and +25 %. In HF2-WINTER, HSA has a negative relative difference (median of −5 %), while the other models show a median value of about +3 %. The total change in HF2-WINTER ranges between −8 % and +30 %, and a general increase in high flows is expected for all hydrological models but HSA. Table 3 shows the mean and standard deviation (std) of the relative change series presented in Figs. 7 and 8.

Hydrological models contribution to uncertainty
In the present section, we explore the uncertainty induced by an ensemble of hydrological models in the impact assessment of climate change on water resources. Complex models are usually more demanding to configure over a given catchment and they also demand more computing power. Hence, it is of interest to know if they provide more information in a climate change analysis compared to what is obtained from simpler models. If all models within the ensemble provide different signals for some indicators, then an ensemble could be considered necessary to fully assess the impact of climate change on water resources. The Wilcoxon rank-sum test is used to compare pairs of climate change signal ensembles obtained from two distinct hydrological models. For each hydrological indicator, we evaluated whether two samples (one sample from each hydrological model) have been drawn from the same distribution (the null hypothesis) with a significance level of 5 %. If the null hypothesis is not rejected, it could be an indication that the climate change signals from two hydrological models provide similar information. Note that this does not verify the null hypothesis, but only says that it cannot be rejected from the available information. This test was applied to the relative differences (except for JDSF, where it was applied to absolute differences in days), as specified in Figs. 7 and 8.
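The pairwise comparison can be sketched as follows. The sketch implements the rank-sum statistic with the large-sample normal approximation, without tie or continuity correction; this is an assumption made for brevity, and an exact test may be preferable for samples as small as nine values. The function names and the model labels in the usage example are hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import NormalDist

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns (z, p). Tied values receive the average of their ranks.
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1      # average 1-based rank of the tie
        i = j + 1
    n1, n2 = len(x), len(y)
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    expected = n1 * (n1 + n2 + 1) / 2
    sd = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - expected) / sd
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

def compare_models(signals, alpha=0.05):
    """Map each pair of models to True if the null hypothesis is rejected."""
    return {(a, b): rank_sum_test(signals[a], signals[b])[1] < alpha
            for a, b in combinations(signals, 2)}

# Hypothetical change samples (25 values each) for three models.
signals = {"A": list(range(25)), "B": list(range(100, 125)), "C": list(range(25))}
print(compare_models(signals))
```

A `True` entry means that the two models yield significantly different climate change signals for the indicator at the 5 % level; a `False` entry only means that the available samples do not allow the null hypothesis to be rejected.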
The Wilcoxon test results are shown in Table 4 for au Saumon and Schlehdorf, where the series of climate change impacts on hydrological indicators are compared for all pairs of models. For the OMF at au Saumon, the null hypothesis is not rejected when comparing the pairs HSA-HYD and WAS-PRO. For the OMF at Schlehdorf, the only pairs of models that lead to rejection are WAS-HSA and WAS-HYD. The large difference in the Wilcoxon test results over the two catchments might originate from the formulation of potential evapotranspiration (PET): PRO and WAS use the complex Penman-Monteith approach, while HYD and HSA use temperature-based empirical approaches. However, the model pairs HSA-PRO and HYD-PRO do not reject the null hypothesis for Schlehdorf. Bormann (2011) reported that PET formulations following different approaches show significantly different sensitivities to climate change.
The change in the JDSF is predicted similarly by all hydrological models over Schlehdorf. Over au Saumon, only WAS behaves differently from the less complex HSA and HYD. In this case the signal is more robust, because this indicator depends mostly on temperature.
The low flows show greater differences between models. The season when low flows are most severe differs between the catchments: it is summer for au Saumon and winter for Schlehdorf. In au Saumon, the null hypothesis is rejected for all model pairs for the 7LF2-SUMMER, but it is the conceptual model HSA that presents the largest difference from all other models (see Fig. 7). In Schlehdorf, the null hypothesis for 7LF2-WINTER is rejected for all model pairs except HSA-HYD. However, lumped and distributed models show very different behaviour for low flows. The lumped and semi-distributed models predict a negative change, while the fully distributed models predict a positive change (Fig. 8). The Schlehdorf catchment is very steep, and this could affect the baseflow simulation, which is better represented in the semi-distributed and fully distributed models. In the less severe low-flow periods (winter for au Saumon and summer for Schlehdorf), groundwater recharge is larger, which leads to a more stable baseflow and smaller differences in the simulated low-flow quantities between hydrological models. These differences may also be influenced by the PET formulation.
The highest flows are seen in winter for au Saumon and in summer for Schlehdorf. The null hypothesis is not rejected when comparing all pairs of hydrological models for the HF2 in these periods. A large uncertainty is present in this indicator, but it is related more to the natural variability simulated by the climate models than to the choice of the hydrological model (Figs. 7 and 8). Nevertheless, the choice of the hydrological model affects the HF2-SUMMER in au Saumon.
It is important to note that the results of the Wilcoxon rank-sum test differ between the two sites and also from one indicator to another. The analysis for au Saumon indicates that the hydrological models generate a significantly different signal for most indicators (except HF2-WINTER). The use of a hydrological model ensemble would thus be recommended in order to fully assess the uncertainty on hydrological indicators due to climate change. For Schlehdorf, only OMF and 7LF2 seem to be sensitive to the selection of the hydrological model. To analyse the high-flow indicator or the spring-flood timing indicator, the use of a simple conceptual model can be recommended with a certain level of confidence. Another important aspect is that the analysis of the uncertainty from the hydrological models cannot be transferred from site to site and apparently has to be repeated for every catchment. A regional analysis would be required to see whether the conclusions show a regional behaviour.

Discussion and conclusions
The present study looked at the uncertainty in projecting future changes in runoff characteristics induced by the choice of hydrological models for two distinct natural flow catchments. A hydro-climatic ensemble is constructed by combining an ensemble of climate scenarios with an ensemble of hydrological models. The major strength of the hydro-climatic ensemble approach is that the ability of the hydrological models to reproduce hydrological characteristics can be compared and the uncertainty of future changes in runoff behaviour can be assessed. Although the selected models in our study cover a wide range of complexity, a limitation of this approach is that the selection of hydrological models will never cover the full space of plausible models and conceptualisations. Not including some plausible models that are substantially different from the selected models can result in underestimated model uncertainty. A complete evaluation of this component of the uncertainty in hydrological projections represents a research challenge (Refsgaard et al., 2012). In this study, four hydrological models have been chosen from those used in scientific or administrative assessments of climate change impacts on river runoff in Québec and Bavaria. The complexity of these models ranges from conceptual and lumped to process-based and fully distributed.
The principal objective of the paper is to assess the contribution of hydrological models to the uncertainty in the climate change signal for water resources management. The results of our study suggest that the added value of a hydrological model ensemble depends on the hydrological indicator considered and on the region of interest.
Regarding hydrological indicators, Blöschl and Montanari (2010) suggest that we can have reasonable confidence in predicting hydrological changes that are mainly driven by air temperature (e.g., snowmelt and low flows through evapotranspiration) as opposed to rainfall-driven events like floods. Similarly, Boé et al. (2009) have more confidence in projected changes of low and mean flows. Our results suggest that not only the forcing climate variables but also the hydrological model plays a key role in the uncertainty of the projected climate change signal of hydrological indicators.
In the case of high flows, most of the hydrological models lead to comparable results; therefore, both lumped and distributed models can be used.
The evaluation of the overall mean flow is more sensitive to the type of model in the Québec catchment than in Bavaria. Therefore, an ensemble of hydrological models should be employed in order to evaluate the range of climate change impacts due to the differences in the process description in different hydrological models. However, the differences in catchment properties (e.g., soil type and topography) can also influence the uncertainty arising from the hydrological model structure (e.g., Kay et al., 2009). As suggested by Blöschl and Montanari (2010), the dependence on local conditions is a distinguishing feature of hydrology that can make the effect of climate change less predictable and more diversified.

Fig. 5. Performance of the hydrological models over the reference period. The left panels show the relative error as computed with Eq. (1), while the right panels show the absolute error in m³ s⁻¹ or days.

Figures 7 and 8 show the impact of climate change on hydrological indicators for the au Saumon and Schlehdorf catchments, respectively. The change is expressed as differences of simulated hydrological indicators ($I_{ij}$) from the reference period ($I^{\mathrm{ref}}_{j}$) to the future period ($I^{\mathrm{fut}}_{i}$).
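As an illustration of how such change series can be built, the sketch below pairs every future-period value (index i) with every reference-period value (index j). Eq. (2) is not reproduced in this section, so the relative-change formula used here, 100 × (I_fut_i − I_ref_j) / I_ref_j, is an assumption; the function names and the numbers are hypothetical. Absolute differences are included because the JDSF changes are reported in days.

```python
def relative_changes(ref, fut):
    """Relative change (%) for every pairing of future member i and reference member j.

    Assumed formula: 100 * (I_fut_i - I_ref_j) / I_ref_j.
    """
    return [100.0 * (f - r) / r for f in fut for r in ref]


def absolute_changes(ref, fut):
    """Absolute change I_fut_i - I_ref_j for every pairing (e.g., in days)."""
    return [f - r for f in fut for r in ref]


# Hypothetical indicator values for two reference and two future members.
ref_values = [10.0, 12.0]
fut_values = [9.0, 11.0]

rel = relative_changes(ref_values, fut_values)
abs_diff = absolute_changes(ref_values, fut_values)
print(len(rel), "relative changes:", [round(v, 1) for v in rel])
print(len(abs_diff), "absolute changes:", abs_diff)
```

Pairing all members in this way yields one change value per (i, j) combination, so two reference and two future members give four values; the resulting series is what a box plot or a rank-sum test would then summarise.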

Fig. 6. Observed and simulated (forced by station data) hydrographs for au Saumon and Schlehdorf over the reference period.

Fig. 7. Changes of hydrological indicators from reference to future period at au Saumon (Haut St-François, Québec): overall mean flow (OMF), the Julian day of spring-flood half volume (JDSF), the 2-yr return period 7-day low flow (7LF2) in summer and winter, and the 2-yr return period high flow (HF2) in summer and winter. For each hydrological indicator, the relative change (as calculated with Eq. 2) is presented. On each box, the central mark is the median, the edges of the box are the 25th and 75th percentiles, and the whiskers extend to the most extreme values.

Table 1. Characteristics of the hydrological model ensemble.

... is used to compare the climate change signals obtained with two different hydrological models. It performs a two-sided rank sum test of the null hypothesis that two series of data are independent samples from identical continuous distributions.

J. A. Velázquez et al.: An ensemble approach to assess hydrological models. Hydrol. Earth Syst. Sci., 17, 565-578, 2013, www.hydrol-earth-syst-sci.net/17/565/2013/

Table 3. Mean and standard deviation (std) from the change series presented in Figs. 7 and 8.

Table 4. Results of the Wilcoxon test comparing pairs of hydrological models for (a) au Saumon and (b) Schlehdorf. The p-value is shown, and the shaded area indicates a rejection of the null hypothesis at a significance level of 5 %.