Simulation of high mountainous discharge : how much information do we need ?

Simulation of high mountainous discharge: how much information do we need? B. Schaefli and M. Huss Laboratory of Ecohydrology, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland Water Resources Section, Delft University of Technology, Delft, The Netherlands Department of Geosciences, University of Fribourg, Fribourg, Switzerland Laboratory of Hydraulics, Hydrology and Glaciology (VAW), ETH Zurich, Zurich, Switzerland


Introduction
Continuous simulation of discharge has become a standard tool for water resources management at the catchment-scale, e.g. to estimate extreme events from long simulated time series (e.g., Haberlandt et al., 2008;Hingray et al., 2010) or to predict climate and land use change impacts on hydrologic regimes (e.g., Horton et al., 2006).For this type of application, conceptual models play an important role: they represent the dominant rainfall-runoff processes directly at the catchment-scale instead of resolving the small scale physical processes (see Sivapalan et al., 2003).In high mountainous catchments, the focus of the present paper, such conceptual models represent the snow accumulation and ablation behavior of a very complex environment with a set of lumped equations accounting only for altitudinal temperature and precipitation gradients (e.g., Schaefli et al., 2005;Klok et al., 2001;Hock, 2003;Stahl et al., 2008).
These models are generally calibrated so that the model reproduces as closely as possible a series of observed discharge measurements.The question inevitably arises whether hydrologic predictions of such a calibrated model are actually reliable.Knowing that a model performs well for historic situations does not necessarily imply that it will perform well for future catchment conditions, i.e. that it can be used for climate change or land use change impact prediction.
This well-known problem of calibrated models becomes particularly relevant for hydrologic systems that are undergoing such fast evolution as glacierized catchments (see, e.g.Moore et al., 2009).A first, although not sufficient step to answer this question is investigating whether with a calibrated model, we are "getting the right answers for the right reasons" (Kirchner, 2006).Adopting the perspective that catchments are systems that partition incoming water into different storages (soil, snow, groundwater etc.), from which water is released at some later moment in time (see Wagener et al., 2007), this essentially translates into getting all these three functions right (partitioning, storage and release).In glacierized catchments, this would for example mean that a precipitation-runoff model should not just mimic observed discharge but also Figures reproduce snow accumulation and melt dynamics or the glacier mass change (e.g., Schaefli et al., 2005;Koboltschnig et al., 2008).This is essential to be able to correctly simulate runoff dynamics under the strongly changing climatic conditions expected for the next decades.
One of the challenges is hereby the integration of very different types of data containing spatially integrated information such as discharge or glacier mass balance or point information such as snow height or ice ablation observations.We, therefore, developed a methodology that can make use of sparse point observations to calibrate and evaluate a lumped precipitation-discharge simulation model for high mountainous catchments.
This work was triggered by the analysis of a case study in the Swiss Alps, the Rhonegletscher catchment, for which Schaefli et al. (2005) have presented a conceptual hydrological model (GSM-SOCONT) and Huss et al. (2008) a more complex glaciological model with higher spatial resolution.Both models simulate monthly, seasonal and annual glacier mass balances.However, the hydrological model GSM-SOCONT yields on average much less winter accumulation than the glaciological model and slightly more ablation during summer months.This striking difference, which would namely result in rather different predictions of glacier surface evolution, led us to the question whether a conceptual precipitation-runoff model calibrated (i) exclusively on runoff or (ii) on runoff and glacier-wide annual mass balance can reproduce the seasonal glacier mass balances or whether we need to incorporate some in-situ measurements of snow accumulation and ablation processes; and, if yes, how to incorporate such point observations.
Hereafter, we first present the case study (Sect.2) before discussing in detail the method developed to merge different types of glacio-hydrological data into a precipitation-discharge model (Sect.3).As illustrated and discussed in Sects.4 and 5, this method provides valuable new insights into the use of multi-signal calibration for model development in general and into the question of how much information we need for discharge prediction in high Alpine environments.Introduction

Conclusions References
Tables Figures

Back Close
Full The Rhonegletscher catchment at Gletsch has a size of 38.9 km 2 with a mean elevation of 2719 m a.s.l.The outlet is located at 1761 m a.s.l.Discharge measurements in daily resolution are available continuously since 1956.
Rhonegletscher is a medium-sized valley glacier and presently covers almost half of the catchment.Its size was 17.3 km 2 in 1980 and 15.9 km 2 in 2007 (Bauder et al., 2007).Several other small glacier and ice patches are present in the catchment (around 2.3 km 2 ), which are included in the total ice-covered area for discharge simulations.There are no meteorological measurement stations located within the catchment.richen lie downstream of the catchment outlet with a distance of 3.0 km and 7.4 km, respectively.The choice of the meteorological time series will be further discussed in Sect.4.2.1.The hypsometry of the catchment is derived from a digital elevation model with a resolution of 25 m (SwissTopo, 1995).For Rhonegletscher repeated topographical information documenting the changes in glacier surface elevation and area over the last decades are available from photogrammetry (Bauder et al., 2007).These digital terrain models were used to divide the glacier into five surface elevation bands with equal area.
A comprehensive data set of seasonal point mass balance observations is available Introduction

Conclusions References
Tables Figures

Back Close
Full the glacier and annual mass balance is determined using stakes drilled into the ice.Field data are available for the years 1979/1980 to 1981/1982 and for 2006/2007 and 2007/2008.For the individual surveys between 10 and 50 ablation measurements and up to 800 snow soundings were performed.
In addition, the repeated digital elevation models allow the determination of changes in glacier ice volume in periods of some years to a few decades (Bauder et al., 2007).These geodetic glacier mass balances are highly accurate and provide the change in long-term glacial storage of water over the entire 20th century.This data set was used by Huss et al. (2008) to derive monthly mass balances for Rhonegletscher using a distributed model calibrated by all available field data (see details in Sect.2.3).This data set is considered as the reference result for glacier mass changes over the last couple of decades and is used for final model validation (see Fig. 6d).

Hydrological model
The hydrological model GSM-SOCONT uses a semi-lumped simulation approach.It computes snow accumulation and snow-and ice melt per elevation bands (here 5 elevation bands for the glacier, 5 for the non-glacier part of the catchment).For each band, the reference precipitation and temperature time series are interpolated according to the difference between its mean elevation and the reference station altitude.For temperature, this interpolation uses a temperature lapse rate ρ [ • C m −1 ]; for precipitation, it uses a proportional increase with altitude, γ p [% m −1 ], of the amount observed at the reference meteorological station.In the original formulation, the aggregation state of precipitation at the mean elevation of the band is determined based on a simple temperature threshold T c [ To reduce the model parameter dependency on the spatial resolution (i.e. the number of elevation bands), we adopt here a slightly modified snowfall estimation method.We compute the amount of snowfall W (t,z) and of rainfall V (t,z) at a given time step t and a altitude z and integrate over the hypsometric curve of each elevation band.Introduction

Conclusions References
Tables Figures

Back Close
Full The average melt per elevation band, is computed based on a direct linear relation with positive air temperature (Hock, 2003) using a simple degree-day approach: where a s and a i are the degree-day factor for snow-respectively ice melt (mm d −1 • C −1 ), T m is the threshold temperature for melting that is set to 0  1) as a reference simulation.Huss et al. (2008) applied a distributed accumulation and melt model to the Rhonegletscher to obtain homogenous estimates of glacier mass balance in high temporal and spatial resolution.The model is driven by daily meteorological data and calculates the components of glacier surface mass balance for 25 m × 25 m grid cells.Thus, the spatial distribution of mass balance quantities is addressed in a more sophisticated way than in GSM-SOCONT.

Glaciological model
Snow and ice melt is computed using a distributed temperature-index model (Hock, 1999).Degree-day factors are varied in space as a function of potential direct solar radiation to account for the effects of slope, aspect and shading.Instead of simply applying an altitude dependent precipitation gradient, Huss et al. (2008) derive a spatial snow accumulation pattern from the detailed winter accumulation measurements and use this pattern to redistribute the precipitation input.Thus, effects of preferential snow deposition and wind-driven snow redistribution are taken into account.The amount of snowfall is estimated based on a linear transition from snowfall to rainfall between the thresholds of 0.5 • C and 2.5 • C. Measured precipitation is corrected using a multiplier Introduction

Conclusions References
Tables Figures

Back Close
Full so that the observed winter accumulation in the five seasons with direct field data is matched (Huss et al., 2008).
The parameters of the glaciological model are calibrated on the observed ice volume changes available for subdecadal to multidecadal periods, and the spatial mass balance variability is constrained by the seasonal in-situ field measurements (see Huss et al., 2008).Total annual discharge observations are used to validate the estimates for catchment precipitation, but no other discharge data entered the calibration procedure.

Optimization algorithm
If nothing else is stated, parameter optimization is completed with the global optimization algorithm called Queueing Multi-Objective Optimiser (QMOO) developed by Leyland (2002).For an application of this optimizer to hydrology, see Schaefli et al. (2004Schaefli et al. ( , 2005)).The algorithm has been designed to identify difficult-to-find optima and to solve far more complex problems than the ones presented in this paper, where we try to identify the optimal values for a few parameters (up to 14).We, therefore, assume that all identified parameter sets correspond to the best identifiable solution for a given calibration run.
3 Merging data and model

Theoretical background
Merging model and data is a crucial step in any model development and evaluation process.In hydrology, the main focus was, until recently, on how to integrate different data types into model parameter estimation (e.g.Seibert and McDonnell, 2002;Gupta et al., 1998).This focus is currently shifting to what we can learn about model structural deficiencies (e.g.Fenicia et al., 2008) or about the information content of different data sets (e.g.Winsemius et al., 2009;Reusser et al., 2009).In this paper, we combine Introduction

Conclusions References
Tables Figures

Back Close
Full different methods traditionally used in model calibration to extract as much information as possible from the available data sets.Thus, rather than aiming at simply estimating the globally best parameter set, we aim at "informing the model" about potential data and model structural deficiencies while identifying the most likely ranges of parameter values.
We call these most likely parameter ranges posterior or updated ranges as in classical Bayesian parameter estimation (e.g.Kavetski et al., 2006), since they result from updating the modeller's a priori knowledge about the parameter values with all available information about the system behavior.
If this updating includes several different types of data (as in the present case study), there are two possible approaches: we can either identify the posterior parameter range by confronting the model sequentially with different types of data or by confronting it simultaneously.As we discuss hereafter, the two different approaches will yield different types of results and, thus, extract different information from the data.

Model improvement through multi-objective optimization
Simultaneous incorporation of different types of reference data results in a Multi-Objective Optimization (MOO) problem, where for every reference data type an appropriate objective function is formulated (e.g.Gupta et al., 1998;Yapo et al., 1998).MOO tries to identify parameter sets that correspond to points on the so-called Paretooptimal frontier (POF), the limit in the objective function space beyond which it is not possible to improve one objective function without worsening the others (see an illustration for the case of 2 objective functions to be minimized in Fig. 1).
A real-world model with imperfect model structure and imperfect input data can generally not exactly reproduce any of the observed reference data sets.The trade-off between the different objective functions contains information about how well the model structure can explain the observed data.(This information is partly lost in sequential calibration where the result of each calibration step represents the joint information extracted from all previous steps).Introduction

Conclusions References
Tables Figures

Back Close
Full It is not straightforward to use this objective function space trade-off for model development or for statistical interpretation of the results (see Reichert and Mieleitner, 2009).The trade-off between the objective functions is related to the non-dominated volume of the POF (see Fig. 1), i.e. the volume of the objective function space between the POF and the origin.Since each objective function has its own scale, no absolute meaning can be attached to this non-dominated volume.However, if the system to be optimized is modified (e.g.changed model structure) while keeping the objective function definitions unchanged, then the POF with the lower non-dominated volume has a lower trade-off.This property of the POF can be used to improve the model structure (for examples, see Schaefli et al., 2004;Fenicia et al., 2007Fenicia et al., , 2008) ) but also to select the most suitable model input data: a simulation resulting from model M 1 with input Y 1 is judged to be better than a simulation from model M 1 with input Y 2 if its POF has at least one better end-point or a smaller non-dominated volume.
It is noteworthy that in MOO, there is always the risk that the used optimization algorithm was simply not able to find points on the true POF.The above assessment can, thus, be misleading for difficult to optimize problems (e.g. a very high number of parameters).A general guideline is that increasing the degree of freedom of a model should always result in a lower trade-off (see Fig. 1), since the model is more flexible.If this is not the case, the results should be interpreted with care.

Sequential merging
In hydrological modeling, sequential merging of the reference data with the model is either done in a formal Bayesian framework (e.g.Kuczera and Parent, 1998) or in a rejectionist framework (e.g.Freer et al., 2004).
Bayesian parameter estimation assumes that the data is distributed around some true value and that the model can predict this true value if the suitable parameter set is identified.If several different data sets are available, the parameters can be estimated in sequence.After having estimated a first parameter distribution with a first data set, every additional data set can be used to update the parameter distribution.Introduction

Conclusions References
Tables Figures

Back Close
Full This approach is precluded for two cases: (i) if the different data sets are incompatible, i.e. if they contain inconsistent information (this could namely arise if the field data contains systematic measurement errors); (ii) if the available data is sparse and not sufficient to reasonably formulate a statistical model (i.e.not enough data to develop and test the assumptions).Beven and Binley (1992) introduced the Generalized Likelihood Uncertainty Estimation (GLUE), partially to address the above problems.Later on, GLUE built the basis of a rejectionist framework (Freer et al., 2004), where acceptable parameter sets are selected among all a priori possible parameter sets by rejecting those that have too low model objective function performance or lie beyond some limits of acceptability (see, e.g.Liu et al., 2009).
While the formal and informal Bayesian approaches have been shown to be useful for model calibration on observed reference flux data (in particular discharge, see, e.g.Kavetski et al., 2006;Freer et al., 1996), its use with observed state variables is often precluded because these observations are sparse in time and in space (here we have for example data for only 5 time steps).
Since the problem at hand is typical for the calibration of hydrological models on times series of observed fluxes and some sparse observations of state variables, we propose here a model calibration approach combining elements of both of the above approaches and classical statistical hypothesis testing.

Proposed model calibration framework
We propose a step-wise calibration framework using MOO as well as sequential updating of an a priori parameter range.In a first step, we use MOO with the observed fluxes and state variables to improve the model structure based on all information that we can extract from the objective function trade-off; the objective functions for this optimization step are discussed in Sect.3.3.In a second and third step, we sequentially calibrate the resulting model.
This sequential calibration combines a statistical hypothesis testing approach to extract the available information from the state observations and proceeds with classical Introduction

Conclusions References
Tables Figures

Back Close
Full model calibration minimizing the sum of squared residuals between an observed and a simulated times series.As mentioned before, such a sequential parameter range updating approach only works if the different data sets do not contain inconsistent information.Here, this is ensured by the preliminary procedure of input data set selection and model structure updating.

Performance with respect to point observations
To assess the model performance with respect to seasonal glacier mass balance, we need to compare the simulated seasonal balance (average per elevation band) to the observed point balances.In the case of large sample sizes and assuming that the samples have been taken at representative locations within the elevation band, the straightforward method would be to estimate the spatial mean from the data and then minimize the residuals between this reference value and the simulated mean value.
Glaciological measurements consist typically of very small samples (i.e. 5 to 10 points per elevation band of a few hundred meters).Without any further source of information, it is highly questionable whether such a small sample contains enough information to estimate the spatial average.Assuming that the samples have been taken at points that reflect the range of seasonal balances, we can, however, at least test whether the observed sample is compatible with the simulated average value for the elevation band.We use a two-sided sign test of the null hypothesis that the sample data comes from an arbitrary continuous distribution with a given median m (e.g Gibbons and Chakraborti, 2003).This is a simple non-parametric test where the test statistics is computed as follows: count the number of samples s i that are lower than the median m, s i < m and the number of sample values s i > m.The test statistics equals to the smaller of the two numbers.This test statistics is known to follow a binomial distribution for n trials (where n is the Introduction

Conclusions References
Tables Figures

Back Close
Full sample size) and a probability of success of each trial of 0.5.We can reject the null hypothesis at the 0.05 level that the sample comes from a continuous distribution with median m if the p-value (p) of the test statistics is smaller than 0.025.This test is robust for small sample sizes.We used the data from the winters 2006/2007 and 2007/2008, which have a high spatial resolution, to verify that the observed point balances have sufficiently symmetric distributions to not bias the statistical test when we apply the above statistical test to the mean balance (the hydrological model only yields mean values) instead of the median.
The performance criterion for a band j and a year t y becomes: The performance criterion G(θ ) considering all available years and elevation bands becomes It is important to point out that the above performance criterion G can be used to select acceptable parameter sets but not to find a single optimum parameter set.This simply follows from the binary nature of g(j,t y ,θ ), which does not allow a further discriminate between acceptable sets.

Performance criteria for time series
For discharge, we use the classical Nash-Sutcliffe index (Nash and Sutcliffe, 1970) as a performance criterion where Q(t) [mm d −1 ] is the observed discharge on day t, Q the corresponding annual mean value, Q(t|θ ) is the simulated discharge given parameter set θ and n is the number of simulated time steps.( .denotes a value obtained from a model simulation, whereas .denotes an observed value).
Since for MOO we have to formalize the optimization problem as a minimization problem, we use the performance criterion f Q (θ ) for optimization: If time series of the glacier-wide annual mass balance are available, a similar performance criterion can be defined: where B(t y ) [mm y −1 ] is the observed glacier-wide mass balance for the year t y , B(t y |θ ) is the corresponding simulated value and n y is the number of available yearly values.

Sequential parameter range updating and final calibration
Once an optimal model structure has been identified, its parameters are sequentially updated starting with some a priori knowledge about the possible range of the param-

Reference data sets for parameter calibration
We use the following reference data sets for parameter calibration: - For the validation of the hydrological model, we complete these reference values with the glacier-wide annual balances simulated with the glaciological model of Huss et al. (2008) 4.2 MOO for modeling system improvement

Selection of meteorological input series
There is the typical problem that the available meteorological time series have been observed at points located outside the catchment.Assessing the information content of the available series for the modeling purpose is, thus, an essential modeling step.
Here, the overall goal is to simulate discharge and glacier mass balance.As discussed in Sect.3.1.1,analyzing the trade-off between the two corresponding objective functions for the different input series, is a valuable tool to identify the most suitable input time series.Figure 3a shows the trade-offs between the model performances for discharge simulation and for glacier-wide annual mass balance simulation using the different available input time series of precipitation and temperature.The 3 Pareto-optimal frontiers (POFs) show clearly that using the precipitation time series observed at Oberwald leads to much better discharge simulations and to a smaller trade-off between obtaining good results for discharge and for mass balance.Accordingly, we prefer this measurement station for this study (even if it stopped recording in 1999).Assessing in detail the differences in information content of the available meteorological series is beyond the scope of this paper.Introduction

Conclusions References
Tables Figures

Back Close
Full

Model structure modification
The GSM-SOCONT model (Schaefli et al., 2005) has 6 parameters that influence the glacier mass balance simulations: the ice and snow melt factors, a i and a s , the precipitation correction factor γ p , the temperature lapse rate ρ and the threshold temperature for rainfall/snowfall separation T c .In the original version, only the first 3 parameters are calibrated, the remaining were fixed, ρ = 0.0065 • C m −1 and T c = 0 • C. The original model, calibrated according to Schaefli et al. (2005) on discharge and glacier-wide annual balance for 3 hydrological years (1979/1980 to 1981/1982), cannot reproduce the seasonal mass balances.To illustrate this result, Fig. 2a shows the observed accumulation quantiles and the simulated winter accumulation per elevation band for one winter; the simulated mean value does not always lie between the observed 10% and 90% quantiles.This suggests that the inclusion of the winter balance observations adds additional information in the sense that it constrains the model differently than without using it.

New model structure for point balance simulation
An attempt of re-calibration of all of the above model parameters on the seasonal point balances, according to the performance criteria described in Sect.3.3, showed that GSM-SOCONT cannot reproduce both winter and annual point balances: for the winter balance, the smallest achievable value of G w (θ ) is 2 and for the annual balance G a (θ ) = 3.This means that for the winter balance, there are least two bands for which we can reject the hypothesis that the observed sample has the median value simulated by the model.In a first step, we tried to use a linear transition between rainfall and snowfall (e.g.Rohrer et al., 1994) and to calibrate the threshold parameters between which there is a mixture of snowfall and rainfall.Even if this makes the model more flexible, it is still not possible to achieve G w (θ ) = 0 and G a (θ ) = 0 simultaneously.Introduction

Conclusions References
Tables Figures

Back Close
Full We next tested the use of seasonally varying model parameters.We successively increased the number of seasonally varying parameters, starting with the temperature lapse rate.It is well known that temperature lapse rates show strong seasonal fluctuations, under the combined effect of dry and saturated adiabatic lapse rates and their dependence on temperature.The temperature decrease with altitude is on average lower during the cold season than during the warm season.This is also confirmed by an analysis of seasonal temperature variations at all meteorological stations within some tens of kilometers around Rhonegletscher, which showed that temperature lapse rates are strongest in June and reach a minimum from November to January.
We, thus, introduced the seasonal temperature lapse rates ρ w and ρ s (winter is defined as from 1 November to 30 April, which corresponds more or less to the dates of winter accumulation observation.)With this additional degree-of-freedom, the model can achieve G w (θ ) = 0 and G a (θ ) = 0, i.e. there are parameter sets that reproduce both the reference point data for winter accumulation and annual balance.This is a priori not surprising; we increased the degree-of-freedom and provided the model with more flexibility to reproduce observed data.It can now, for example, simulate snow melt during the winter season.This modification also reduces the performance trade-off between glacier-wide annual balance and discharge simulation (Fig. 3b).It is noteworthy that the performance with the Grimsel input station is still much worse, confirming our initial choice not to consider this station.The new model version has, however, still an important shortcoming: considering all good solutions (i.e.parameter sets) on the POF of Fig. 3b, the seasonal lapse rates show a distinct dependence on each other for good discharge parameter sets versus good mass balance parameter sets (see scatterplot enclosed in Fig. 3b).This implies that this model version needs different pairs of temperature lapse rates for mass balance simulation than for discharge simulation.
The seasonal lapse rates have, for evident reasons, a strong interaction with the precipitation input correction factor.A solution to overcome the above structural shortcoming is the introduction of seasonal input correction factors.As tests showed, this Introduction

Conclusions References
Tables Figures

Back Close
Full leads to complex parameter interactions with the rainfall/snowfall separation thresholds, which can be solved by modifying the input pre-processing: we replace the altitudinal precipitation correction factor with a solid input (snow) correction factor, γ s [% m −1 ] and an constant rainfall correction factor, γ r [% m −1 ]).
The new interpolation of precipitation as a function of elevation reads as where W (z) and V (z) are the snowfall and the rainfall at elevation z.This is also more meaningful from a physical perspective: rainfall and snowfall occur under different meteorological situations.Their average scaling with elevation can, therefore, be assumed to be quite different.
The new model structure has the nice property that we can fix the rainfall/snowfall separation thresholds (to 0 • C and 2 • C) and still obtain parameter sets that reproduce the reference winter accumulation and annual point balance.This result holds independent of the chosen meteorological input series (but, of course, for each series, we obtain different parameter sets).
We would like to add here that, in this area of the Alps, it is difficult to derive precipitation increase factors directly from observed data (e.g.Sevruk, 1997).Accordingly, it is tempting to think that the above input correction parameters are just "fetch" factors that allow a correction of the observed input for model calibration.The use of a separate snow accumulation correction factor is, however, in line with the finding of Sevruk (1997) that, in the Valais area, significant altitude gradients of precipitation can be found above 1500 m (but not below).This is also about the lower limit of the seasonal snowcover.For the Rhonegletscher, this observation is confirmed by the snow accumulation measurements that show a strong increase with elevation only for high elevations (Huss et al., 2008).
In summary, after confronting the model with the available data sets of winter accumulation and annual point balance, we retain the following model structure: -Fixed rainfall/snowfall separation thresholds ( 0• C and 2 • C).Introduction

Conclusions References
Tables Figures
-Constant ice melt factor a i , snow melt factor a s and snow accumulation correction factor γ s .

New model structure for glacier-wide balance and discharge
If we optimize this model structure on annual glacier-wide mass balance and discharge, there remains an important trade-off between achieving good mass balance and good discharge simulations (see Fig. 4a, model 11p).Introducing a separate summer snow melt factor a ss partly reduces this trade-off but good mass balance simulations lead to far too much winter discharge.Adding a separate summer snow accumulation correction factor considerably reduces the trade-off (see Fig. 4a, model 13p).However, there is still too much discharge in early spring for all good mass balance simulations (see Fig. 4b).This problem can be addressed by introducing a water retention capacity η s for the snow layer.In the original GSM-SOCONT, snow melt M s (t) is directly transferred to the water-runoff transfer module (two linear reservoirs for the glacier part, a linear and a nonlinear reservoir for the non-glacier part).In the new version, only if the ratio between melt water and snow layer depth is higher than η s , snow layer outflow occurs.
The final improved model structure, used in the rest of this paper, has 14 parameters to be calibrated: -Fixed rainfall/snowfall separation thresholds (0 • C and 2 • C) and linear transition between them.
-Ice melt factor a i , constant throughout the year.
-Winter and summer snow melt factors (a sw ,a ss ), -Winter and summer snow accumulation correction factors (γ w , γ s ).Introduction

Conclusions References
Tables Figures

Sequential calibration
We successively constrain the model parameter ranges from physically possible (a priori) ranges (see Table 2) to ones that are plausible given the point reference data (winter and annual point balances) to the final best parameter set for discharge and glacier-wide annual balance simulation.During this sequential updating, we fix all parameters to an initial guess (see Table 1) and vary only the ones corresponding to an updating step.

Winter point balance
We start with the parameters that influence winter accumulation, i.e. the winter accumulation correction factor γ w , the winter snow melt factor a sw and the temperature lapse rate ρ w .During this period, the other parameters have no influence on the model output.Figure 5b, d and f show the distribution of the parameter sets for which we cannot reject the null hypothesis that the observed samples have the simulated median value for all retained elevation bands and winters (Sect.4.1, a total of 100 000 parameter sets was randomly drawn in the initial priors of Table 2).The temperature lapse rates are very small and they show a certain correlation with the snow melt factors (linear correlation of −0.6, see Fig. 5i).This result is to be expected.Melt factors are known to be lapse-rate specific (e.g.Shea et al., 2009) since, to melt a given quantity of snow, the melt factor has to be higher if the temperature is lower (i.e. of the temperature lapse rate is more negative).For the ice melt factor, this relationship is obscured by the fact that ice melt only occurs if the glacier is snow-free.The other parameters do not show any pair-wise dependence structure.Introduction

Conclusions References
Tables Figures

Back Close
Full

Annual point balance
We reduce the prior for a sw , γ w and ρ w to the above identified ranges and draw another 100 000 parameter sets including the summer melt factor, a ss , the summer temperature lapse rate, ρ s , the summer accumulation correction factor γ s and the ice melt factor a i .The distributions of the parameter sets for which the null hypothesis cannot be rejected for all elevation bands and years are shown in Fig. 5 and summarized in Table 2.The summer snow melt factor a ss and the ice melt factor a i show a very strong dependence on each other (see Fig. 5h).This dependence shows the typical shape reported by previous studies (see, e.g.Schaefli et al., 2005, Fig. 4) or (Hock, 1999, Fig. 4).
Figure 6d shows the simulated annual glacier-wide balance corresponding to one of the acceptable parameter sets.This simulation is strikingly different from the reference simulation of Huss et al. (2008).This illustrates that the point mass balances contain some information but not enough to constrain the model.This is partly due to the fact that for annual point balance, only the lowest elevation bands have sufficiently large samples.

Discharge and glacier-wide balance
We complete a final parameter calibration minimizing f Q = 1 − N Q and f B , limiting the search space to the parameter ranges that are acceptable given the point balance observations.Given that neither discharge nor glacier-wide annual balance contain direct information on the temperature lapse rates, we fix their value to a previously identified possible parameter couple (see Table 1).Even under the final updated model structure, there remains a trade-off between f Q and f B (Fig. 6a), corresponding to a considerable range of possible system representations.This is illustrated in Fig. 6c and d that show the discharge and the glacier-wide mass balance simulations for all parameter sets on the Pareto-optimal frontier.The best performing parameter sets under f Q and f B are given in Table 1.The performance of the best f Q simulation in terms of mimicking the observed discharge is comparable to the calibration presented Introduction

Conclusions References
Tables Figures

Back Close
Full Figure 6d shows the simulated glacier-wide annual mass balance for all parameter sets on the POF.We compare this simulation to the one corresponding to the original model structure calibrated on f Q and on f B .Just as the original model, the new model strongly overestimates the negative cumulative mass balance at the end of the simulation period (further discussed in Sect.5.2) leading to a strong bias, where the bias between a simulated variable x and a reference variable y is defined as 100(x −y) y −1 .The original model shows a mean bias compared to the simulation of Huss et al. (2008) of 294%.In the new model, this is reduced to 101% for the best simulation under N Q and equals −33% for the best simulation under f B .
The variability of the monthly glacier-wide balance of the reference simulation is also better reproduced by the new simulation.This can be measured by the coefficient of variation, which corresponds to the standard deviation of a variable x divided by the mean of x.The coefficient of variation of the reference simulation is −14.9, the new model gives −7.1 under N Q and −13.3 under f B whereas the original model had a far too low value (−3.9).
Finally, Fig. 6b illustrates an important detail finding; in previous work, we used a warm-up period of 2 years for model calibration, assuming that this is sufficient to build-up the initial snowpack.Given the new model formulation, including the snowpack retention capacity, a much longer warm-up period is required since during the 3rd and 4th year, the built-up firn (snow that lasts more than one season) is still too small to retain the meltwater in early spring.Introduction

Conclusions References
Tables Figures

Back Close
Full

Multi-objective calibration for model improvement
The presented case study revealed the importance of considering flux and state variables for model development, i.e. of using several reference signals.First of all, for the choice of appropriate input series: considering discharge performance or mass balance performance individually, none of the available measurement stations would appear to contain more valuable information.Considering the trade-off between the two objective functions reveals that there is a considerable difference in information content.We believe that this type of analysis has great potential for hydrological modeling where assessing the (dis-)information content of input time series is still one of the big open questions.
During the model modification process, it is again the trade-off between the reproduction of these two model variables that provides clear indications about the value of adding a parameter or modifying a submodule.The multi-signal approach allows the detection of a subtle model improvement through the introduction of the snow layer retention capacity η s : adding this parameter does not lead to a better discharge simulation, i.e. we cannot find a parameter set under the new model structure (including η s ) that outperforms the best discharge simulation under the old model structure.But the best mass balance parameter sets do also a fairly good job for discharge simulation and vice-versa; the trade-off (see Fig. 1) between discharge performance and mass balance performance is reduced.
Our results also underline the value of analyzing the parameter sets associated with the simulation trade-offs, i.e. of analyzing the parameter sets corresponding to points on the Pareto-optimal frontier (Fig. 3): the strikingly different relationship between winter and summer lapse rate parameters for good discharge simulation versus good mass balance simulation was a very strong hint for a model structural deficiency.Again, this type of model diagnostics could be very valuable in other hydrological contexts.Introduction

Conclusions References
Tables Figures

Back Close
Full

Reproduction of glacier mass balance and water balance
As our final calibrated model demonstrates, a simple semi-lumped hydrological model can indeed reproduce glacier mass balance dynamics if there is enough reference data to identify the model structure and calibrate the model parameters.We can, however, not expect the hydrological model to reproduce the exact interannual glacier-wide mass balance dynamics of the more complex glaciological model, partly because we do not use the same meteorological input data.
The original GSM-SOCONT model systematically overestimates summer melt and underestimates winter accumulation, with respect to data (Fig. 2), as well as with respect to the reference simulation, even if the annual glacier-wide mass balance is fairly close to the reference simulation.This clearly highlights the need to include seasonal mass balance observations in the model development process.
The most important difference between the original GSM-SOCONT and the new model version becomes visible if we consider the individual mass balance terms (Table 3).The total area-averaged precipitation is much higher for the new model than for the original model, evaporation is lower.These differences in the mass balance terms can result in big differences in the relative contribution of glacier storage change to total annual runoff (Fig. 6e) in individual years.This relative contribution is not only a proxy for climate sensitivity but also a key variable for water management and in particular hydropower production (see Schaefli et al., 2007).
The question which of the two water balance representations is closer to the "truth" is extremely difficult to answer.There are estimated reference values on a 2 km by 2 km grid for evaporation and precipitation (see Table 3) but given the resolution and the complexity of the topography, these are just rough estimates for such a small catchment.A rather strong hint is, however, the following: for the years 1974-1990 (to which Table 3 refers to), the reference simulation of Huss et al. (2008)  Neither the original nor the new model can reproduce the annual glacier-wide mass balance simulations at the end of the simulation period.As discussed by Huss et al. (2009), this period corresponds to a period of intense melting where traditional temperature-index models might become unreliable.Huss et al. (2009) found in fact consistently negative degree-day factor anomalies for this period.The question how to account for such anomalies in temperature-index types hydrological models, especially for projections into the future, remains open to date.

Calibration on point balance observations
The proposed model calibration framework makes use of small samples of observed point mass balances (on average, around 20 samples per elevation band) to identify plausible parameter sets for which we cannot reject the hypothesis that the observed samples have the simulated values as median.
Our results suggest that even small samples of point mass balance observations contain valuable information to calibrate 4 additional hydro-meteorological model parameters with respect to the original GSM-SOCONT.The obtained trade-off for the reproduction of winter balance versus annual balance with the original model provides evidence that both data sets contain complementary information.The fact that this trade-off can be removed through simple model modifications suggests that the point data sets are not inconsistent.
Such observations are often deemed to be too sparse to be useful to calibrate a simple semi-lumped hydrological model.Our results, however, demonstrate that the data Introduction

Conclusions References
Tables Figures

Back Close
Full is indeed useful to identify a model structure that can reproduce the seasonal mass balance dynamics and furthermore, it constrains the plausible ranges of the parameters.This conclusion is at least partly unexpected: in fact, it is often assumed that winter accumulation data does not contain any information on temperature lapse rates arguing that in glacier catchments, the low winter temperatures lead to permanent snow fall independent of the temperature lapse rate.
The updated plausible parameter ranges contain other hints that support the value of the proposed method to extract information from the mass balance samples: (i) the temperature lapse rates are less strong in winter than in summer as expected for theoretical reasons (see Sect. 4.2.2) and (ii) the parameters do not show unexpected pair-wise dependencies, the dependence of a ss and a i shows the typical shape obtained in classical calibration on time series (Fig. 5).

Is glacier-wide balance data sufficient?
Our case study shows that using only discharge and annual glacier-wide mass balance data can lead to wrong intra-annual mass balance dynamics.And using only point mass balance observations can lead to wrong glacier-wide mass balance.Both results can be easily understood if we consider the general water balance equation: where F + j (t) is the influx of a spatial unit, F − t (t) the outflux and dS/dt the yearly storage change.In many catchments, it can be assumed that dS/dt is very small or even negligible.This does not hold, for evident reasons, in catchments with a significant glacier cover.The same storage change can be obtained with different compensations between the flux terms.For a reliable model calibration, observed reference data (or any other information about the behavior of the system) has to provide constraints (i) on dS/dt and either F + j (t) or F − t (t) or (ii) on F + t (t) and F − t (t).Introduction

Conclusions References
Tables Figures

Back Close
Full Glaciological data provides direct information on the different mass balance terms: winter accumulation observations on F + j (t), annual balance observations on F − j (t) (often observed only for low elevation bands) and annual glacier-wide mass balance estimations on dS/dt.Discharge data also contains information on these terms but obscured by the hydrological processes in the ice-free part; it namely does not contain any direct information on dS/dt given that there is an essentially unknown outflux: evaporation.
Since in a glacier catchment, any lack of snow accumulation during winter can be compensated by ice melt, discharge effectively contains little direct information on snow-or rainfall (i.e. on F + j (t)).This shows that for a reliable model calibration, we have to pick winter accumulation and one or two other elements of the above list.If we have a good model structure, winter accumulation data and discharge data could be sufficient to calibrate the model.In practice, however, annual balance data will be needed to compensate for imperfect model structure.The same holds for the use of winter accumulation with annual glacier-wide mass balance data.In any case, using only discharge and annual glacier-wide data is highly likely to lead to at least partly erroneous estimated fluxes.Model calibration only on discharge data, common practice in hydrology, can probably be assimilated to mere guessing in the context of prediction of high Alpine hydrology.

Conclusions
Getting the water balance right is a basic requirement for any hydrologic model -yet, in many environments, this is very difficult to achieve.In the case of glacier catchments, it is often assumed that some glacier-wide annual mass balance estimations, in addition to observed discharge, is good enough to obtain reliable estimates of the different water balance terms and to develop prediction models.In this paper, we present evidence that information on the seasonal mass balance is a pre-requisite to reliably calibrate a Introduction

Conclusions References
Tables Figures

Back Close
Full hydrological model and we demonstrate the value of a simple, but promising method to use small samples of point observations for this calibration.
The observed seasonal mass balance data could only be reproduced through a model structure modification.The identified new model structure shows some interesting features; it has seasonal lapse rates and a separate snow accumulation and rainfall correction factor.These last two parameters are required for the model to be sufficiently flexible to compensate differently for missing processes during summer and winter (e.g.raingauge undercatch, redistribution through wind etc.).In summary, we can say that a simple conceptual model, as the one presented here, can be a valuable tool to project the behavior of a glacier catchment but only if we have enough (seasonal) information to constrain the parameters that directly affect the mass balance.Since in most catchments around the world, virtually no observed ground-based data is available, future research could focus on the question of how to extract such information from remotely-sensed data.It would also be interesting to assess how the information content of seasonal mass balance data varies across hydro-climatological regions.
Glacier hydrology deals with systems where the main water storage term, the glaciers, can be directly observed -an invaluable advantage over other ecosystems.We showed in this paper possible ways of taking advantage of this direct data on storage and flux to select appropriate input time series and to improve the hydrologic model structure.We believe that such catchments offer ideal test cases for further developments in this direction.Since, in addition, these catchments are undergoing rapid change, they can also be used to assess the ability of hydrologic models to predict change.Introduction

Conclusions References
Tables Figures

-
Daily discharge, the years 1974-1982 for calibration, 1983-1999 for validation (performance criterion N Q ); -Observed winter point accumulation (direct measurements) for winters 1979/1980 to 1981/1982; for each year, we only use the 3 highest bands (to ensure that the ice melt parameter does not influence the model result).One sample (elevation band 4, winter 1981/1982) has less than 10 points and is excluded from the analysis.In total, we have 3 × 3 − 1 = 8 reference samples.Discussion Paper | Discussion Paper | Discussion Paper | Observed annual point balance (direct measurements) for the corresponding hydrological years.Only the two lowest elelvation bands have more than 10 observations, resulting in 6 reference samples.-Glacier-wide annual balance (performance criterion f B ): for the years 1979/1980 to 1981/1982, we have reference values estimated from direct measurements.
Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper |

-
Discussion Paper | Discussion Paper | Discussion Paper | Summer rainfall correction factor γ r .
Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper |inSchaefli et al. (2005) (Nash value of 0.92 and bias −0.01 for the calibration period, 0.90 and bias of 0.03 for the entire period 1976-1999).The rainfall/melt water-runoff transformation parameters of the non-glacier part have considerably different values then the original GSM-SOCONT, which is due to the modification of the transfer module with the introduction of η s .
Discussion Paper | Discussion Paper | Discussion Paper | corresponds to a mean glacier storage change of −260 mm w.e.(water equivalent) per year relative to the ice-covered area (around 17 km 2 ).Relative to the entire catchment area (38.9 km 2 ), this equals Discussion Paper | Discussion Paper | Discussion Paper | −110 mm w.e. a −1 , a value that is very close to the storage change simulated with the hydrological model for the same period (−100 mm w.e. a −1 ), whereas the original model gives a value of −250 mm w.e. a −1 .It is noteworthy that the hydrological model slightly underestimates the overall catchment water storage change since it does not simulate any evaporative losses on the glacier-covered catchment part.
Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Hingray, B., Schaefli, B., Mezghani, A., and Hamdi, Y.: Signature-based model calibration for hydrologic prediction in mesoscale Alpine catchments, Hydrolog.Sci.J., 55, 1002-1016, doi:10.1080/02626667.2010.505572,2010.8663 Hock, R.: A distributed temperature-index ice-and snowmelt model including potential direct solar radiation, J. Glaciol., 45, 101-111, 1999.8667Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Table Fig. 3. Pareto-optimal frontiers (POF) illustrating the trade-off between (a) model performance for the simulation of discharge and of annual glacier-wide balance (the mass balance criterion f B (θ) is normalized by the observed value); (b) as (a) but for the model version with seasonal lapse rates, the enclosed figure shows the points on the POF in the parameter space rather than in the objective function space (note: the calibration presented in Schaefli et al., 2005, has f B = 0.57 and f Q = 0.06.)

Fig. 4 .
Fig. 4. (a) Trade-off between discharge performance criterion and mass balance performance criterion for different versions of the modified GSM-SOCONT; 11p stands for 11 parameters (for parameters see Table 1; note: the figure does not show the tail end of the POF for the 11p model, the best performance for f Q is 0.06 as for the other model versions.(b) discharge simulations corresponding to all parameter sets of (a).

Fig. 5 .
Fig. 5. (a)-(g) Distribution of behavioral parameter sets for the seasonal point mass balances, (h) and (i) scatter plots of the parameters showing the strongest linear correlations.
Schaefli et al. (2005) depth of the snow cover at time step t.The transformation of firn (old snow that lasted more than a season) to ice is not accounted for.For comparison purposes, we use the parameter set fromSchaefli et al. (2005)(see Table 4. For each b[j,t|θ](t|θ v ), test the null hypothesis H 0 that the observed sample b[x,y,t|(x,y) ∈ j ] (point measurements) comes from a distribution having as median b[j,t|θ](t|θ v ) against the alternative hypothesis that it does not (see Sect. 3.3.1).
8. Use the updated parameter ranges as prior parameter ranges to calibrate the model on the available time series (discharge, glacier-wide mass balance) (see Sect. 3.3.2).

Table 1 .
(Schaefli et al., 2005)GSM-SOCONT parameter values; the default values are from(Schaefli et al., 2005).The calibrated values are for the final improved model structure with 14 parameters; the last column indicates in which model version the parameter is used, 8p stands for the original version with 8 parameters, > 8p stands for all model versions with more than 8 parameters.

Table 2 .
Walle and Rango, 2008)meter ranges for the generation of random parameter sets; if there is no updated range, this implies that the point mass balance data does not contain information for this parameter; the lower limit for ρ w and ρ s is the dry adiabatic lapse rate (e.g.DeWalle and Rango, 2008).Name UnitMin.prior Max.prior Min.updated Max.updated Ref. data on a 1km by 1km grid; S/ t = P − Q − E is the storage variation; for the last column, this value is put in brackets since it is computed from data sources referring to different periods; snowfall is indicated relative to total precipitation.