Calibration Approaches for Distributed Hydrologic Models Using High Performance Computing: Implication for Streamflow Projections under Climate Change

This study utilizes high performance computing to test the performance and uncertainty of calibration strategies for a spatially distributed hydrologic model in order to improve model simulation accuracy and understand prediction uncertainty at interior ungaged sites of a sparsely-gaged watershed. The study is conducted using a distributed version of the HYMOD hydrologic model (HYMOD_DS) applied to the Kabul River basin. Several calibration experiments are conducted to understand the benefits and costs associated with different calibration choices, including (1) whether multisite gaged data should be used simultaneously or in a step-wise manner during model fitting, (2) the effects of increasing parameter complexity, and (3) the potential to estimate interior watershed flows using only gaged data at the basin outlet. The implications of the different calibration strategies are considered in the context of hydrologic projections under climate change. Several interesting results emerge from the study. The simultaneous use of multisite data is shown to improve the calibration over a step-wise approach, and both multisite approaches far exceed a calibration based on only the basin outlet. The basin outlet calibration can lead to projections of mid-21st century streamflow that deviate substantially from projections under multisite calibration strategies, supporting the use of caution when using distributed models in data-scarce regions for climate change impact assessments. Surprisingly, increased parameter complexity does not substantially increase the uncertainty in streamflow projections, even though parameter equifinality does emerge. The results suggest that increased (excessive) parameter complexity does not always lead to increased predictive uncertainty if structural uncertainties are present.
The largest uncertainty in future streamflow results from variations in projected climate between climate models, which substantially outweighs the calibration uncertainty.


In an effort to advance hydrologic modelling and forecasting capabilities, the development and implementation of physically-based, spatially distributed hydrologic models have proliferated in the hydrologic literature, supported by readily available geographic information system (GIS) data and rapidly increasing computational power. Distributed hydrologic models can account for spatially variable physiographic properties and meteorological forcing (Beven, 2012), improving simulations compared to conceptual, lumped models for basins where spatial rainfall variability effects are significant (Ajami et al., 2004; Koren et al., 2004; Reed et al., 2004; Khakbaz et al., 2012; Smith et al., 2012) and for nested basins (Bandaragoda et al., 2004; Brath et al., 2004; Koren et al., 2004; Safari et al., 2012; Smith et al., 2012). The benefits of distributed modeling have been recognized by the US National Oceanic and Atmospheric Administration's National Weather Service (NOAA/NWS) and demonstrated in the Distributed Model Intercomparison Project (DMIP) (Reed et al., 2004; Smith et al., 2004, 2012, 2013). Importantly, distributed hydrologic models can evaluate hydrological response at interior ungaged sites, a benefit not afforded by conceptual, lumped models. The use of distributed hydrologic modelling for interior point streamflow estimation is particularly relevant for poorly gaged river basins in developing countries, where reliable predictions at interior sites are often required to inform water infrastructure investments. As international development agencies begin to integrate climate change considerations into their decision-making processes (e.g., Yu et al., 2013), these investments need to be robust under both current climate conditions and alternative climate regimes. Despite their roots in physical realism, distributed hydrologic models can suffer from substantial uncertainty. A major source of uncertainty originates from the proper identification
of parameter values that vary across the watershed, especially when observed streamflow data are only available at one or a few points. Parameters can be discretized across the watershed in several ways: uniquely for each grid cell (fully distributed), based on hydrologic response units (semi-distributed), or in the simplest case, a single parameter set for all model grid cells (lumped). With limited data, the parameter identification problem, particularly for the fully distributed case, can be impractical or infeasible (Beven, 2001). The parameterization challenge has spurred substantial advances in understanding appropriate calibration techniques for distributed hydrologic models. Many studies have attempted to reduce the dimensionality of the calibration problem to alleviate the issue of equifinality (Beven and Freer, 2001), which is the phenomenon whereby multiple parameter sets produce indistinguishable model performance. This work has found favorable results when the parametric complexity of the distributed model is aligned with the data available for calibration (Leavesley et al., 2003; Ajami et al., 2004; Eckhardt et al., 2005; Frances et al., 2007; Zhu and Lettenmaier, 2007; Cole and Moore, 2008; Pokhrel and Gupta, 2010; Khakbaz et al., 2012). There has also been extensive research exploring the use of multiple objectives and different operational procedures to understand parameter estimation tradeoffs and identifiability for distributed model calibration, with great success (Madsen, 2003; Efstratiadis and Koutsoyiannis, 2010; Li et al., 2010; Kumar et al., 2013). Despite these advances, important questions persist. It remains difficult to compare the uncertainty that emerges from different operational calibration procedures for multisite applications (i.e. whether gages in series should be used sequentially or simultaneously for calibration) and under different levels of parametric complexity. Due to the computational burden required to calibrate distributed models, this uncertainty is problematic to explore. Further, in poorly gaged basins, it is challenging to quantify the lost accuracy and increased uncertainty for interior flow estimation when a distributed model is calibrated only
at an outlet gage (which is often all that is available in developing country river basins). Many studies have reported that distributed models calibrated at the basin outlet are less accurate at interior locations (Anderson et al., 2001; Cao et al., 2006; Wang et al., 2012), but the extent of the error and uncertainty is unknown due to the computational expense needed to explore this issue. Finally, rarely have the implications of these calibration issues been explicitly examined for an alternative climate, which is required in climate change impact studies. This question has been explored for lumped, conceptual models (Wilby, 2005; Steinschneider et al., 2012), but has been difficult to evaluate for computationally expensive distributed models.
This study addresses the above research challenges by focusing on the following four questions: (1) how does the calibration procedure for using multisite data affect the accuracy and uncertainty of distributed models used for streamflow predictions at ungaged sites, (2) what effects does increased parameter complexity have on distributed model calibration and prediction, (3) how much degradation in model accuracy and increase in uncertainty can be expected for interior flow estimation based on a calibration procedure using only the basin outlet, and (4) how do different calibration formulations for a distributed model alter projections of streamflow at ungaged sites under climate change conditions? These questions are considered in an application of a distributed version of the daily HYMOD hydrologic model to the Kabul River basin in Afghanistan and Pakistan. To address these research questions, high performance computing is utilized to manage the computational burden that often hinders such explorations, a relatively recent technique employed in hydrological modeling research (Laloy and Vrugt, 2012; Zhang et al., 2013).

Study area
The Kabul River basin (67 370 km²) is a plateau surrounded by mountains located in the eastern central part of Afghanistan (Fig. 1). Water resources from the basin are shared by Afghanistan and Pakistan and serve as a water supply source for more than 20 million people. The shared use of transboundary water between these two countries is central in establishing regional water resources development for this area (Ahmad, 2010). It is crucial to develop tools that can support engineering plans for existing and potential water infrastructure to take full advantage of the water resources in the basin.

The streamflow regime can be classified as glacial with maximum streamflow in June or July and minimum streamflow during the winter season. Approximately 70 % of annual precipitation (475 mm) falls during the winter season (November-April). Glaciers and snow cover are the most important long-term forms of water storage and, hence, the main source of runoff during the ablation period for the basin. In total 5.7 % (3813 km²) of the basin is glacierized based on the Randolph Glacier Inventory version 3.2 (Pfeffer et al., 2014). The melt water from glaciers and snow produces the majority (75 %) of the total streamflow (Hewitt et al., 1989). In recent years, most of the world's mountain glaciers have shown negative mass balance and rapid decrease in glacier area and volume (Dyurgerov and Meier, 2005), while in the Himalayan region trends depend on location (Bolch et al., 2012). The vulnerability of glacial streamflow regimes to changes in temperature and precipitation (Stahl et al., 2008; Immerzeel et al., 2012) highlights the need to assess the impact of climate change on water resources in this area (Immerzeel et al., 2010, 2013; Molg et al., 2014; Radic et al., 2014).

Methods
The purpose of this study is to explore the implications of different calibration strategies and choices for a computationally expensive distributed hydrologic model. A variety of calibration experiments are conducted, with the results from preceding experiments informing choices made for subsequent ones. All calibration approaches are tested in terms of their ability to predict flows at interior site gages that were left out of the calibration process. In all cases, a genetic algorithm (GA) is used as the optimization method for model parameter calibration (Wang, 1991; Zhang et al., 2008; Kollat et al., 2012), and the objective function is based simply on the Nash-Sutcliffe efficiency (NSE) (Nash and Sutcliffe, 1970), which is by far the most utilized performance metric in hydrological model applications (Biondi et al., 2012). A multisite average of the NSE is used when evaluating performance across multiple sites. We fully recognize that the use of one objective, such as the NSE, is inferior to multi-objective approaches that can identify Pareto optimal solutions that provide good model performance across different components of the flow regime (Madsen, 2003; Efstratiadis and Koutsoyiannis, 2010; Li et al., 2010; Kumar et al., 2013). However, in this particular study daily hydrologic model simulations can only be compared against available monthly streamflow records, reducing the number of viable objectives against which to calibrate. That is, statistics representing peak flows, extreme low flows, and other daily flow regime characteristics often used in multi-objective optimization approaches are unavailable. We believe that the use of a monthly NSE value as a single objective, while coarse, does not inhibit our ability to provide insight into the research questions posed.
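As a minimal sketch of the objective described above, the following Python snippet aggregates simulated daily flows to monthly means, computes the NSE against monthly observations at each gage, and averages the scores across sites. The function names (`nse`, `monthly_mean`, `multisite_nse`), the dictionary-based data layout, the integer month labels, and the equal weighting of sites are illustrative assumptions rather than the study's actual implementation.

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / sum of squared deviations of obs."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def monthly_mean(daily, month_index):
    """Average daily values within each month.

    month_index holds one integer label (0..M-1) per day; this labeling
    scheme is a hypothetical convenience, not taken from the paper.
    """
    daily = np.asarray(daily, dtype=float)
    month_index = np.asarray(month_index)
    n_months = int(month_index.max()) + 1
    sums = np.bincount(month_index, weights=daily, minlength=n_months)
    counts = np.bincount(month_index, minlength=n_months)
    return sums / counts

def multisite_nse(obs_monthly, sim_daily, month_index):
    """Equal-weight average of monthly NSE over all gages in obs_monthly."""
    scores = [
        nse(obs_monthly[site], monthly_mean(sim_daily[site], month_index))
        for site in obs_monthly
    ]
    return float(np.mean(scores))
```

A GA would then maximize `multisite_nse` for a multisite calibration, or `nse` at a single gage for an outlet-only calibration.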
In this study, three levels of parameter complexity are considered: lumped, semi-distributed, and fully distributed formulations (Fig. 2). The different levels of parameter complexity are defined according to the spatial distribution of unique hydrologic model parameters. In the lumped formulation a single parameter set is applied to the entire basin. In the semi-distributed formulation, a unique parameter set is assigned to each sub-basin, defined based on the location of available streamflow gaging sites. The fully distributed parameter structure follows the spatial discretization of climate input grids, allowing a unique parameter set for each grid cell. No matter the parameterization scheme, the model structure follows the climate input grids, i.e. the hydrological water cycle within each grid cell is modelled separately.
The parameter complexity varies with the calibration experiment being conducted, but for each experiment, regardless of the parameterization, the GA optimization is repeated 50 times to explore parameter uncertainty. The considerable computational cost of performing such a large number of calibrations is managed using the parallel computing power provided by the Massachusetts Green High Performance Computing Center (MGHPCC), where several thousand processors are available.
In the first modeling experiment, we explore two calibration strategies for using multisite streamflow data, a stepwise and a pooled approach. In the stepwise calibration, parameters are calibrated for upstream gaged sub-catchments and subsequently fixed during calibration of downstream points, while for the pooled approach, parameters are calibrated for multiple sub-catchments simultaneously. Both approaches are assessed for the semi-distributed formulation. The better of the two methods is identified for use in the second experiment, where the effects of increased parameter complexity are tested in terms of streamflow prediction accuracy and uncertainty. In the third experiment, we consider the situation where the only gaged location available for calibration is the basin outlet. Here, the model is calibrated against the outlet gage under all levels of parameter complexity and is compared against the best combination of calibration strategy (step-wise or pooled) and parameter complexity (lumped, semi-distributed, or fully distributed) identified in the previous experiments. Finally, a subset of the calibration approaches deemed worthy of further investigation is compared in terms of projections of future streamflow under climate change to highlight how model calibration differences can alter the results of a climate change assessment for water resources applications. These experiments are described in further detail below.

Multisite calibration: stepwise and pooled approaches
In the first experiment, the semi-distributed parameterization concept is compared under alternative multisite calibration strategies, the stepwise and pooled calibration approaches. To conduct the stepwise calibration, a nested class of sub-basins is defined corresponding to multiple gaging stations. In the first step of the stepwise calibration, the optimization process is carried out with nested sub-basins at the lowest level (i.e., the most upstream sites). Once parameters of nested sub-basins are determined, the parameters are fixed, and the calibration procedure proceeds with nested basins at upper levels until parameters for the entire basin are determined.
In this particular application to the Kabul River basin, 5 gaged sub-basins were selected and the stepwise calibration procedure for those sub-basins followed this direction: Chitral → Gawardesh → Chaghasarai → Daronta → Dakah (Fig. S1 in the Supplement). The stepwise calibration approach involves a number of GA implementations corresponding to the number of gaging sites. The GA optimization was therefore carried out a total of 250 times in this application (50 optimization trials, each comprising GA implementations for the 5 sub-basin regions).
The pooled calibration strategy involves calibrating all parameters of the model domain simultaneously against multiple streamflow gages within the watershed. This approach seeks parameters that produce satisfactory model results at all gaging stations in a single implementation of GA optimization. That is, the GA searches the entire parameter space at once to maximize the average NSE across all sites. This operational feature reduces the processing time spent on the GA implementation compared to the stepwise calibration strategy. To identify the better of the two multisite calibration approaches, the comparison focused on streamflow prediction accuracy and calibration uncertainty at two interior site gages (Kama and Asmar) that were assumed to be ungaged (Fig. S1 in the Supplement), as well as for validation data at the basin outlet.
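The mechanics of the two strategies can be illustrated with a deliberately simple sketch. Everything here is hypothetical: a two-gage toy system with one runoff coefficient per sub-basin stands in for HYMOD_DS, an exhaustive grid search stands in for the GA, and the "observations" are synthetic and error-free. The point is only to show where parameters are frozen (stepwise) versus searched jointly against a mean NSE (pooled).

```python
import numpy as np

def nse(obs, sim):
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def simulate(a_up, a_down, p_up, p_down):
    """Toy model: runoff coefficients scale rainfall over two sub-basins in series."""
    q_up = a_up * p_up                # flow at the upstream gage
    q_out = q_up + a_down * p_down    # flow at the outlet gage
    return q_up, q_out

GRID = np.linspace(0.0, 1.0, 101)     # grid search stands in for the GA

def calibrate_stepwise(p_up, p_down, obs_up, obs_out):
    # Step 1: fit the upstream parameter against the upstream gage only.
    a_up = max(GRID, key=lambda a: nse(obs_up, simulate(a, 0.0, p_up, p_down)[0]))
    # Step 2: freeze a_up, then fit the downstream parameter against the outlet.
    a_down = max(GRID, key=lambda a: nse(obs_out, simulate(a_up, a, p_up, p_down)[1]))
    return float(a_up), float(a_down)

def calibrate_pooled(p_up, p_down, obs_up, obs_out):
    # Search both parameters at once, maximizing the mean NSE over both gages.
    def score(params):
        q_up, q_out = simulate(params[0], params[1], p_up, p_down)
        return 0.5 * (nse(obs_up, q_up) + nse(obs_out, q_out))
    best = max(((a, b) for a in GRID for b in GRID), key=score)
    return float(best[0]), float(best[1])

rng = np.random.default_rng(0)
p_up, p_down = rng.gamma(2.0, 5.0, 120), rng.gamma(2.0, 5.0, 120)
obs_up, obs_out = simulate(0.6, 0.3, p_up, p_down)   # synthetic observations
stepwise = calibrate_stepwise(p_up, p_down, obs_up, obs_out)
pooled = calibrate_pooled(p_up, p_down, obs_up, obs_out)
```

With error-free synthetic data both strategies recover the same parameters; differences of the kind reported in this study arise only when data and model structure are imperfect, so that upstream parameters frozen in early steps constrain the fit downstream.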

Increased parameter complexity
In the second experiment, the better of the two approaches (step-wise or pooled) identified in the first experiment is further tested with respect to the three different levels of parameter complexity. In addition to the semi-distributed parameter formulation considered in the first experiment, lumped and fully-distributed parameter formulations are calibrated for the selected approach to investigate the gain or loss arising from different levels of parameter complexity. Since the hydrologic model HYMOD employed in this study involves 15 parameters, the lumped version of the HYMOD_DS contains a single, 15-member parameter set applied to all model grid cells. The semi-distributed conceptualization of HYMOD_DS contains a single parameter set for each sub-basin, totaling 75 parameters. In the distributed parameterization the number of parameters increases dramatically. With 160 grid cells at 0.25° resolution, the number of parameters requiring calibration reaches 2400. As the number of parameters increases across the parameterization schemes, calibration becomes increasingly computationally expensive. The numbers of model runs used in the GA optimization algorithm for the lumped, semi-distributed, and distributed parameterization schemes are 15 000 (population size 150 × 100 generations), 75 000 (750 × 100), and 480 000 (2400 × 200), respectively. These population/generation sizes were supported using convergence tests for each calibration. Again, 50 separate GA optimizations were used to explore calibration uncertainties for each parameterization scheme. To give a sense of the computational burden of this experiment, we note that 50 trials of the HYMOD_DS calibration under the distributed conceptualization required 1000 processors over 7 days on the MGHPCC system.
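The parameter counts and model-run budgets quoted above follow from simple arithmetic, reproduced in the short sketch below. The variable names are illustrative; the figures (15 parameters per set, 5 sub-basins, 160 grid cells, the population and generation sizes, and 50 trials) are taken from the text, while the totals over 50 trials are derived here and not stated in the source.

```python
N_PARAMS_PER_SET = 15   # HYMOD_DS parameters in one parameter set
N_SUBBASINS = 5         # gaged sub-basins in the semi-distributed scheme
N_GRID_CELLS = 160      # 0.25-degree grid cells in the model domain
N_TRIALS = 50           # independent GA optimizations per scheme

# Number of calibrated parameters under each parameterization scheme.
param_counts = {
    "lumped": N_PARAMS_PER_SET,
    "semi": N_PARAMS_PER_SET * N_SUBBASINS,
    "dist": N_PARAMS_PER_SET * N_GRID_CELLS,
}

# (population size, generations) reported for each scheme.
ga_settings = {"lumped": (150, 100), "semi": (750, 100), "dist": (2400, 200)}

# Model runs in one GA optimization, and over all 50 trials.
runs_per_calibration = {k: pop * gen for k, (pop, gen) in ga_settings.items()}
total_runs = {k: N_TRIALS * n for k, n in runs_per_calibration.items()}

for scheme in param_counts:
    print(scheme, param_counts[scheme], runs_per_calibration[scheme], total_runs[scheme])
```

Under these figures the fully distributed scheme requires 24 million model runs across its 50 trials, which is consistent with the need for a large parallel system.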

Basin outlet calibration
The third experiment considers the situation where gaged data are available only at the basin outlet (Dakah) for calibration, a common situation when calibrating hydrologic models in data-scarce river basins. Here, we evaluate the potential of the basin outlet calibration to estimate interior watershed flows in terms of both accuracy and precision at all gaging stations. All levels of parameter complexity are considered for this calibration. The main purpose of this experiment is to compare the veracity of a distributed hydrologic model calibrated only using basin outlet data with results from multisite calibrations to better understand the degradation in model performance under data scarcity. Other than the use of an NSE objective only at the basin outlet, all other GA settings for each level of parameter complexity are the same as the settings used in the second experiment.

Climate change projections of streamflow
The fourth experiment investigates how the choice of calibration approach can alter the projections of future streamflow under climate change. To explore this question, streamflow simulations for the 2050s, defined as the 30-year period spanning from 2036 to 2065, are carried out using climate projections from the World Climate Research Programme's Coupled Model Intercomparison Project Phase 5 (CMIP5) (Taylor et al., 2012). A total of 36 different climate models run under two future conditions of radiative forcing (RCP 4.5 and 8.5) are used. Streamflow projections are developed for the basin outlet (Dakah) and two interior gages left out of the calibration (Kama and Asmar). By using 36 different general circulation models (GCMs) and 50 optimization trials for each calibration scheme, this analysis compares the uncertainty in future streamflow projections originating from uncertainty in different hydrologic model parameterization schemes and under alternative future climates. Streamflow projections are considered under all three parameterization schemes (lumped, semi-distributed, and fully distributed) for both the basin outlet model and the best multi-site calibration approach (step-wise or pooled). Multiple streamflow characteristics are evaluated, including monthly streamflow climatology, wet (April-September) and dry (October-March) season flows, and daily peak flow response. The differences and uncertainty in these metrics across calibration approaches will highlight the importance of calibration strategy for evaluating future water availability and flood risk.

Data
Gridded daily precipitation and temperature products with a spatial resolution of 0.25° were gathered for calendar years 1961-2007 from the Asian Precipitation Highly Resolved Observational Data Integration Towards Evaluation (APHRODITE) dataset (Yatagai et al., 2012). There has been some concern regarding underestimation of precipitation in APHRODITE for some regions of Asia (Palazzi et al., 2013); our preliminary data analysis (an intercomparison of precipitation products from 5 different databases) confirmed this for the Kabul River basin (shown in Fig. S2 in the Supplement). Thus, the APHRODITE precipitation was bias-corrected using the precipitation product from the University of Delaware global terrestrial precipitation (UD) dataset (Legates and Willmott, 1990). Daily series of bias-corrected APHRODITE precipitation were coupled with APHRODITE temperature for the 160 grid cells at 0.25° resolution to produce a climate forcing dataset for the distributed domain of the Kabul River basin model.
This study used the set of global climate change simulations from the CMIP5 multi-model ensemble (Taylor et al., 2012). Monthly climate outputs of GCMs were downscaled to a daily temporal resolution and 0.25° spatial resolution based on the bias-correction spatial disaggregation (BCSD) statistical downscaling method introduced by Wood et al. (2004).
Monthly streamflow observations for seven locations in the Kabul River basin (Fig. 1) were gathered for calendar years 1961-1980 from two data sources: the Global Runoff Data Centre (GRDC) database and the United States Geological Survey (USGS) database (Table 1). The available streamflow observations at each station were used for calibrating and validating the distributed hydrologic model (Fig. 3). Kama and Asmar stations are treated as ungaged sites and left out of the multisite calibrations in order to evaluate the model's ability to predict streamflow at interior ungaged sites. Furthermore, half of the record at the Dakah station, located at the basin outlet, is also used for validation purposes.
The Randolph Glacier Inventory version 3.2 (RGI 3.2) dataset (Pfeffer et al., 2014) was used to extract glacial coverage in the Kabul River basin, which totaled 5.7 % of the basin area (Fig. S3 in the Supplement). In the hydrological modeling process, the model needs to be informed by reliable estimates of the volume of water retained in glaciers, especially for future simulations under warming conditions. We followed the method proposed in Grinsted (2013) [...] individual glacier is estimated using the global digital elevation model (DEM) from the shuttle radar topography mission (SRTMv4) at 250 m resolution (Jarvis et al., 2008). Density of ice (0.9167 g cm⁻³) is applied to calculate glacier/ice cap volume in meters of water equivalent.

Distributed hydrologic model (HYMOD_DS)
In this study the lumped conceptual hydrological model HYMOD (Boyle, 2001) is coupled with a river routing model to be suitable for modelling a distributed watershed system. We name it HYMOD_DS, denoting the distributed version of HYMOD. Snow and glacier modules have been introduced to enhance the modelling process for glacier and snow covered areas within the Kabul River basin. The HYMOD_DS is composed of hydrological process modules that represent soil moisture accounting, evapotranspiration, snow processes, glacier processes and flow routing. The model operates on a daily time step and requires daily precipitation and mean temperature as input variables. The overall model structure of the HYMOD_DS and its 15 parameters are described in Fig. 4 and Table 2, respectively. Further details are provided below.
The HYMOD conceptual watershed model has been extensively used in studies on streamflow forecasting and model calibration (Wagener et al., 2004; Vrugt et al., 2008; Kollat et al., 2012; Gharari et al., 2013; Remesan et al., 2013). HYMOD is a soil moisture accounting model based on the probability-distributed storage capacity concept proposed by Moore (1985). This conceptualization represents a cumulative distribution of varying storage capacities (C) with the following function:

$$F(C) = 1 - \left(1 - \frac{C}{C_{max}}\right)^{B}, \quad 0 \le C \le C_{max}$$

where the exponent B is a parameter controlling the degree of spatial variability of storage capacity over the basin and C_max is the maximum storage capacity. The model assumes that all storages within the basin are filled up to the same critical level (C*(t)), unless this amount exceeds the storage capacity of that particular location. With this assumption, the total water storage S(t) contained in the basin corresponds to

$$S(t) = \frac{C_{max}}{B+1}\left[1 - \left(1 - \frac{C^{*}(t)}{C_{max}}\right)^{B+1}\right]$$

Consequently, two parameters are introduced for the runoff generation process, which has two components, surface runoff (Runoff_1) and subsurface runoff (Runoff_2), generated from precipitation P(t). A parameter (α) is introduced to represent how much of the subsurface runoff is routed over the fast (Q_fast) and slow (Q_slow) pathways. The potential evapotranspiration (PET) is derived based on the Hamon method (Hamon, 1961) and a bias correction factor (Coeff) is applied to the PET calculation.
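The storage relations of the probability-distributed concept can be checked numerically. The sketch below assumes the standard Moore (1985) forms, F(C) = 1 - (1 - C/C_max)^B and S = C_max/(B+1) · [1 - (1 - C*/C_max)^(B+1)], written with the symbols used above; the function names and the example parameter values are hypothetical and are not calibrated values from this study.

```python
def pdm_cdf(c, c_max, b):
    """Fraction of the basin whose storage capacity is less than or equal to c."""
    c = min(max(c, 0.0), c_max)
    return 1.0 - (1.0 - c / c_max) ** b

def pdm_storage(c_star, c_max, b):
    """Basin-average water storage when every store is filled to level c_star
    (or to its own capacity, if that is smaller)."""
    c_star = min(max(c_star, 0.0), c_max)
    return c_max / (b + 1.0) * (1.0 - (1.0 - c_star / c_max) ** (b + 1.0))

# Hypothetical example values, for illustration only.
C_MAX, B = 200.0, 0.5
full_storage = pdm_storage(C_MAX, C_MAX, B)   # equals C_MAX / (B + 1)

# Raising the critical level adds storage only over the unsaturated fraction,
# so dS/dC* should equal 1 - F(C*).
h = 1e-4
slope = (pdm_storage(80.0 + h, C_MAX, B) - pdm_storage(80.0 - h, C_MAX, B)) / (2 * h)
unsaturated_fraction = 1.0 - pdm_cdf(80.0, C_MAX, B)
```

The finite-difference slope matching `unsaturated_fraction` is a quick consistency check between the distribution function and the storage expression.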
The HYMOD_DS includes snow and glacier modules with separate runoff processes, i.e., the runoff from the glacierized area is calculated separately and added to runoff generated from the soil moisture accounting module coupled with the snow module. The implicit assumption here is that there is no interchange of water between soil layers and glacial area and that runoff from glacial areas is regarded as surface flow. The runoff from each area is weighted by its area fraction within the basin to obtain total runoff.

The time rate of change in snow and glacier volume governed by ice accumulation and ablation (melting and sublimation) is expressed by the Degree Day Factor (DDF) mass balance model (Moore, 1993; Stahl et al., 2008). The dominant phase of precipitation (snow vs. rain) is determined by a temperature threshold (T_th). The snow melt M_s and glacier melt M_g are calculated with DDF_s (T_s) and DDF_g (T_g) applied separately for the snow and glacier modules, respectively. To account for the higher melt rate of glacier ice than snow owing to its lower albedo (Konz and Seibert, 2010; Kinouchi et al., 2013), we introduced a parameter r > 1 to constrain DDF_g to be larger than DDF_s (i.e. DDF_g = r · DDF_s). For the rain that falls on the glacierized area, the glacier parameter K_g determines the portion of rain becoming surface runoff as a multiplier for the rainfall. The remaining rainfall is assumed to be accumulated to the glacier store.
The within-grid routing process for direct runoff is represented by an instantaneous unit hydrograph (IUH) (Nash, 1957), in which a catchment is depicted as a series of N reservoirs each having a linear relationship between storage and outflow with the storage coefficient of K_q. Mathematically, the IUH is expressed by a gamma probability distribution:

$$u(t) = \frac{1}{K_q\,\Gamma(N)}\left(\frac{t}{K_q}\right)^{N-1} e^{-t/K_q}$$

where Γ is the gamma function. The within-grid groundwater routing process is simplified as a lumped linear reservoir with the storage recession coefficient of K_s.
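To make the within-grid routing concrete, the following sketch evaluates the Nash-cascade IUH in its standard gamma-density form, u(t) = (t/K_q)^(N-1) · exp(-t/K_q) / (K_q · Γ(N)), discretizes it into a daily unit hydrograph, and convolves it with a runoff series. The midpoint discretization, the renormalization to unit volume, the 60-step truncation, and all function names are assumptions made for illustration; the study's own numerical scheme is not described in the text.

```python
import math

def nash_iuh(t, n, k):
    """Nash cascade IUH: gamma density with shape n and scale k (same time units as t)."""
    if t <= 0.0:
        return 0.0
    return (t / k) ** (n - 1.0) * math.exp(-t / k) / (k * math.gamma(n))

def unit_hydrograph(n, k, dt=1.0, n_steps=60):
    """Discretize the IUH with the midpoint rule and renormalize to unit volume."""
    ordinates = [nash_iuh((i + 0.5) * dt, n, k) * dt for i in range(n_steps)]
    total = sum(ordinates)
    return [u / total for u in ordinates]

def route(runoff, uh):
    """Convolve a runoff series with the unit hydrograph ordinates."""
    out = [0.0] * (len(runoff) + len(uh) - 1)
    for i, r in enumerate(runoff):
        for j, u in enumerate(uh):
            out[i + j] += r * u
    return out

# Hypothetical example: three reservoirs with a 2-day storage coefficient.
uh = unit_hydrograph(n=3.0, k=2.0)
routed = route([10.0, 0.0, 0.0], uh)   # a 10 mm pulse spread out over time
```

Because the ordinates are renormalized, the routed series conserves the input volume; larger N or K_q delays and flattens the response.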
The transport of water in the channel system is described using the diffusive wave approximation of the Saint-Venant equation (Lohmann et al., 1998):

$$\frac{\partial Q}{\partial t} = D\,\frac{\partial^{2} Q}{\partial x^{2}} - C\,\frac{\partial Q}{\partial x}$$

where C and D are parameters denoting wave velocity (Velo) and diffusivity (Diff), respectively.

Results and discussion
For the remaining part of the paper, we introduce the following shorthand: Lump, Semi, and Dist indicate the lumped, semi-distributed, and fully distributed parameterization schemes, and Outlet, Stepwise, and Pooled correspond to basin outlet, stepwise, and pooled calibrations. The comparison between different calibration strategies is based on the model performance evaluated with the NSE, as well as an alternative metric, the Kling-Gupta efficiency (KGE) (Gupta et al., 2009), which equally weights model mean bias, variance bias, and correlation with observations.
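For readers unfamiliar with the second metric, the sketch below computes the KGE in the form given by Gupta et al. (2009), KGE = 1 - sqrt((r - 1)² + (α - 1)² + (β - 1)²), where r is the linear correlation, α the ratio of simulated to observed standard deviation, and β the ratio of simulated to observed mean. The function name and the toy flow values are illustrative assumptions.

```python
import numpy as np

def kge(obs, sim):
    """Kling-Gupta efficiency with equal weight on correlation, variability, and bias."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    r = np.corrcoef(obs, sim)[0, 1]      # linear correlation
    alpha = sim.std() / obs.std()        # variability ratio
    beta = sim.mean() / obs.mean()       # mean (bias) ratio
    return 1.0 - np.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)

# Toy monthly flows (hypothetical values).
obs = np.array([12.0, 30.0, 85.0, 140.0, 90.0, 40.0])
perfect = kge(obs, obs)         # a perfect simulation scores 1
doubled = kge(obs, 2.0 * obs)   # correct timing, but mean and spread both doubled
```

In the doubled case the correlation is perfect, yet the score drops to 1 - sqrt(2), showing how the KGE penalizes bias and variability errors that a correlation-only measure would miss.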

Pooled calibration vs. stepwise calibration
This section reports the results from the first experiment comparing the stepwise and pooled calibration approaches for the semi-distributed model parameterization.
Figure 5 shows the comparison between the Semi-Stepwise and Semi-Pooled with boxplots representing the 50 trials of calibration. Under the stepwise calibration the results for 4 sub-basins (Chitral, Gawardesh, Chaghasarai, and Daronta) are optimal because there is no interaction between those sub-basins. However, the calibrated parameter sets of each sub-basin act as constraints in the last step of the Semi-Stepwise, resulting in the degradation of model skill at the basin outlet (Dakah) and two left-out gages (Asmar and Kama). This becomes apparent when comparing the Semi-Stepwise to the Semi-Pooled results. The model skill under the Semi-Pooled is similar to that from the Semi-Stepwise with respect to the 4 upstream sub-basins, but it outperforms the Semi-Stepwise at the verification gages. This is particularly true for the Asmar gage, which exhibits a downward bias and substantial variability in performance under the Semi-Stepwise. The Semi-Pooled results suggest that small sacrifices of model performance at certain sites can improve and stabilize basin-wide performance. Expected values of KGE from 50 calibrations are also provided (values in parentheses at the bottom of Fig. 5) and this performance metric leads to the same conclusion. Therefore, the Semi-Pooled was selected as the better multisite calibration strategy and is considered for further analyses in the following sections.

Pooled calibration with alternative parameterizations
Here we examine results for the three levels of parameter complexity applied to the pooled calibration approach. Figure 6 compares the pooled calibrations. Unsurprisingly, streamflow predictions from the Lump-Pooled have the lowest accuracy and largest uncertainty at the calibration sites, particularly for the Chaghasarai and Daronta sites. This demonstrates the well-known difficulty in representing flow characteristics of a spatially variable system with a homogeneous parameter set (Beven, 2012). The pooled calibration substantially improves with increasing parameter complexity at the calibration sites. Both the Semi-Pooled and Dist-Pooled produce NSE values above 0.8 for all calibration sites, with the Dist-Pooled showing somewhat higher performance, undoubtedly from its greater freedom to over-fit to the calibration data. However, the advantage of the Dist-Pooled with respect to streamflow predictions at validation sites becomes less clear. The Dist-Pooled shows marginally better predictions only at Kama, while the results are ambiguous at Dakah and Asmar. Overall, this suggests that the fully distributed conceptualization leads to over-fitting of the model as compared to the Semi-Dist conceptualization. We reached the same conclusion when examining the KGE values, which rise with greater parameter complexity at calibration sites but no longer follow this pattern strictly at validation sites. Interestingly, the Lump-Pooled performs well at the validation sites despite its poor performance at calibration sites. The Lump-Pooled does not show significant degradation in skill at Kama compared to the more complex parameterizations, and the flow prediction at Asmar actually exhibits the best performance of all three model variants. A partial reason for this unexpected result arises from different overlapping periods in the calibration and validation data (see Fig. 3). The periods used for the calibration for Chitral (1978–1981) and Gawardesh (1975–1978) have no overlap with the one for Asmar (1966–1971), which encompasses those two sub-basins. Instead, the validation at Asmar is mostly affected by the calibration to Dakah because of the 4 overlapping years (1968–1971) between those two sites. This explains why the Lump-Pooled shows high skill at Asmar despite the low skill at its sub-basins. However, the low model skill at Chaghasarai from the Lump-Pooled propagates to the validation result at Kama, as these two sites have a relatively long overlapping period (8 years from 1967 to 1974).
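As a reading aid for the NSE and KGE values discussed above, the following is a minimal Python sketch of the two skill scores. It assumes the standard definitions, with KGE in its three-component form (correlation, variability ratio, and bias ratio); the exact KGE variant used in this study is not restated in this section, and the function names and sample flows are illustrative assumptions only.

```python
import math

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus squared error over observed variance."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((s - o) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst

def kge(obs, sim):
    """Kling-Gupta efficiency in its three-component form (assumed variant)."""
    n = len(obs)
    mean_obs, mean_sim = sum(obs) / n, sum(sim) / n
    sd_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in obs) / n)
    sd_sim = math.sqrt(sum((s - mean_sim) ** 2 for s in sim) / n)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim)) / n
    r = cov / (sd_obs * sd_sim)      # linear correlation
    alpha = sd_sim / sd_obs          # variability ratio
    beta = mean_sim / mean_obs       # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

# Hypothetical monthly flows (not data from the study)
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [2.0, 4.0, 6.0, 8.0]   # perfectly correlated but biased high
print(nse(observed, simulated), kge(observed, simulated))
```

In this made-up case the simulation is perfectly correlated with the observations yet both scores are negative, which illustrates how a site such as Daronta can score below zero when flows are systematically overestimated.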

Limitations of the basin outlet calibration
In the third experiment the HYMOD_DS was calibrated only to data at the basin outlet under all levels of parameter complexity, and streamflow records for all 6 sub-basins, as well as flows at Dakah not used during calibration, were used for model validation. First, we consider the flows at Dakah. During the calibration period, all three parameterization schemes produce very accurate streamflow predictions with NSE (KGE) values above 0.95 (0.96) (Fig. 7). High accuracy holds even under the Lump-Outlet, which is somewhat surprising given the spatial heterogeneity of the basin. While NSE and KGE values at Dakah rise marginally with greater parameter complexity during calibration, this no longer holds during the validation period, suggesting no benefit from an increase in parameter complexity.
The validation results for the 6 sub-basins demonstrate the danger in relying on outlet data alone when calibrating a distributed model for flow prediction at interior points. Streamflow predictions at interior sites exhibit low accuracy and high uncertainty, with the worst performance at the Daronta site (all NSEs and KGEs are negative). Further examination (Fig. S4 in the Supplement) showed that the HYMOD_DS significantly overestimated streamflow at Daronta and underestimated flow at three sites in the eastern part of the basin (Chitral, Gawardesh, and Chaghasarai). Model performance at Kama and Asmar is somewhat better than at the other validation sites, although improvements are not the same across all parameterizations. The Lump-Outlet predictions at these sites still have low average accuracy (average NSE < 0.7 and average KGE < 0.6), while the Semi-Outlet exhibits large uncertainty in performance across the 50 optimization trials. Surprisingly, the Dist-Outlet shows promising results with high expected accuracy at Kama and Asmar (mean NSE (KGE) of 0.84 (0.71) and 0.90 (0.88), respectively) and comparable performance at many of the other sites. One exception is Gawardesh, where the Lump-Outlet outperforms the other model variants, although the reason for this is not immediately clear. Overall, the results indicate that any calibration based on basin outlet data should be used with substantial caution when predicting flows at interior basin sites.
After reviewing all of the calibration experiments, it becomes clear that the Semi-Pooled and Dist-Pooled calibrations provide more robust performance than the basin outlet calibrations due to their improved representation of internal hydrologic processes across the basin. To further compare these calibration strategies against one another, we evaluate the variability in optimal parameters resulting from the 50 trials of the GA. Figure 8 shows the coefficient of variation (CV) of Cmax (a parameter of the soil moisture accounting module) over the basin for all combinations of calibration approaches (outlet and pooled) and the 3 parameterization schemes. A clear pattern of increasing variability (high uncertainty in Cmax) emerges as parameter complexity increases for both the outlet and pooled calibration strategies.
That is, the semi- and fully-distributed parameterizations lead to significantly variable parameter sets that produce similar representations of the observed basin response. Figure 8 also suggests that the equifinality can be alleviated to an extent by pooling data across sites. The pooled calibration approaches consistently show lower variability in Cmax compared to the outlet calibration at the same level of parameter complexity. These results are relatively consistent across the remaining 14 HYMOD_DS parameters. The implications of parameter stability for streamflow projections under climate change are addressed in the next section.
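The coefficient of variation used in this comparison can be sketched as follows. This is a minimal illustration, assuming the CV is the sample standard deviation divided by the mean of the optimal parameter values across trials; the Cmax values, the number of trials shown, and the dictionary keys are hypothetical and are not taken from the study.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (assumed CV definition)."""
    return statistics.stdev(values) / statistics.mean(values)

# Hypothetical optimal Cmax values for one sub-basin from repeated GA trials
cmax_trials = {
    "outlet": [210.0, 480.0, 350.0, 620.0, 150.0],
    "pooled": [330.0, 360.0, 345.0, 380.0, 320.0],
}

cv_by_strategy = {
    name: coefficient_of_variation(values) for name, values in cmax_trials.items()
}
for name, cv in cv_by_strategy.items():
    print(f"{name}: CV = {cv:.2f}")
```

A lower CV for the pooled set in this toy example corresponds to the pattern described above, where pooling data across sites narrows the range of parameter values that fit the observations equally well.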


Climate change projections of streamflow with uncertainty
Here we explore how projections of future water availability and flood risk under climate change are influenced by the choice of calibration approach. For the Kabul River basin, the CMIP5 GCM projections of monthly total precipitation and mean temperature are shown in Fig. S5 in the Supplement. According to the CMIP5 ensemble, precipitation projections show no clear trend; the average change in monthly total precipitation fluctuates between −10 mm and 10 mm. On the other hand, temperature clearly shows an upward trend for both radiative forcing scenarios. The average changes in annual temperature are +2.2 °C and +2.8 °C for RCP4.5 and RCP8.5, respectively.
We first examine monthly streamflow climatology across four calibration strategies: the Semi-Pooled and Dist-Pooled (most promising calibration strategies), as well as the Lump-Outlet (as a baseline) and Dist-Outlet (the best outlet calibration strategy). Figure 9 shows the monthly streamflow predictions for the historical period and the 2050s under the RCP4.5 and RCP8.5 scenarios. The whisker bars indicate the range across the 50 calibration trials; for the future scenarios, the whisker bars are derived by averaging over the 36 different climate projections for each of the 50 trials. For the historical time period, all calibration schemes match the observed climatology at Dakah well, but monthly streamflow is underestimated in most months at Kama and Asmar under the basin outlet calibrations, particularly by the Lump-Outlet. The historical streamflow climatology from the outlet calibration strategies also tends to be highly uncertain for the months of June, July, August, and September, especially compared to the Semi-Pooled and Dist-Pooled.
Under future climate projections, the four calibration strategies show similar changes in climatology at Dakah, but the magnitudes of change are somewhat different. All calibration strategies suggest a reduction in streamflow for June, July, and August under both the RCP4.5 and RCP8.5 scenarios. Also, the peak monthly flow, which occurred in June or July in the historical period, is shifted to May at Dakah. However, the Lump-Outlet predicts less reduction of flow in June and July and a greater reduction in August and September as compared to the other three calibrations. Considering that all calibration schemes had similar levels of good performance at this site for both calibration and validation periods, it is notable that they project future streamflow climatology somewhat differently.
Future streamflow climatology at Kama and Asmar varies widely between the four calibration schemes, mostly an artifact of their historic differences (Fig. 9). Streamflow projections under the outlet calibration strategies tend to show large uncertainties at these two sites, particularly under the Lump-Outlet calibration. For three months, July through September, the outlet calibration and pooled calibration strategies provide substantially different insights about future water availability at Kama and Asmar. The outlet calibrations suggest less water with large uncertainties for those months as compared to the pooled calibrations. At Kama, the pooled calibrations suggest significant changes in the pattern of peak monthly flow timing under both RCP scenarios; instead of having a clear peak in July, streamflow from May to August shows similar amounts of water.
To further understand the sources of uncertainty in future water availability, we evaluate the separate and joint influence of uncertainties in parameter estimation and future climate on seasonal streamflow projections across all calibration schemes. Figure 10 shows the uncertainty of wet and dry seasonal streamflow at Dakah from three sources: (1) parameter uncertainty across the 50 trials, with future climate uncertainty averaged out for each trial, (2) future climate uncertainty across the 36 projections, with parameter uncertainty averaged out across the 50 trials, and (3) the combined uncertainty across all 1800 (50 × 36) simulations. The results suggest, somewhat surprisingly, that uncertainty reduction can be expected as parameter complexity increases and, less surprisingly, by applying pooled calibration approaches. Another clear point is that the uncertainty resulting from different climate change scenarios substantially outweighs that from parameter uncertainty.
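The three-way partition described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the array layout `flows[i][j]`, the function names, the use of the range as the spread measure, and the synthetic additive flows are all hypothetical and do not reproduce the study's data or its exact uncertainty metric.

```python
import statistics

def value_range(values):
    """Spread measure assumed here: maximum minus minimum."""
    return max(values) - min(values)

def decompose(flows):
    """flows[i][j] is the seasonal flow for calibration trial i and climate projection j."""
    n_trials, n_proj = len(flows), len(flows[0])
    # (1) parameter uncertainty: average out climate for each trial
    per_trial = [statistics.mean(row) for row in flows]
    # (2) climate uncertainty: average out parameters for each projection
    per_projection = [
        statistics.mean(flows[i][j] for i in range(n_trials)) for j in range(n_proj)
    ]
    # (3) combined uncertainty: every trial-projection pair
    combined = [q for row in flows for q in row]
    return {
        "parameter": value_range(per_trial),
        "climate": value_range(per_projection),
        "combined": value_range(combined),
        "n_combined": len(combined),
    }

# Synthetic flows: a small trial effect plus a larger projection effect
n_trials, n_proj = 50, 36
flows = [
    [100.0 + 0.2 * (i - 24.5) + 2.0 * (j - 17.5) for j in range(n_proj)]
    for i in range(n_trials)
]
result = decompose(flows)
print(result)
```

With these synthetic numbers the climate spread dominates the parameter spread, mirroring the qualitative finding reported for Dakah, although the magnitudes here carry no physical meaning.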
Up to this point, there has been little difference between the Semi-Pooled and Dist-Pooled model variants. These two versions were further analyzed with respect to extreme streamflow to see if distinguishing characteristics emerge. Distributed models have been shown to yield clear gains in predicting peak flows (Reed et al., 2004), and spatial variability in model parameters significantly influences runoff behavior (Brath and Montanari, 2000; Pokhrel and Gupta, 2011). The spatial variability of optimal parameters derived from the Semi-Pooled and Dist-Pooled is shown in Fig. S6 in the Supplement, with larger variability across all parameters for the Dist-Pooled than for the Semi-Pooled. To understand the effects of parameter variability and uncertainty on extreme event estimation, the 100 year flood event was calculated under the Semi-Pooled and Dist-Pooled for each of the 50 historic simulations and 1800 future simulations across both RCP scenarios. While no observed data are available against which to compare the results, an inter-model comparison is useful to distinguish the differences between the parameterization schemes. Projections of the 100 year flood, estimated using a Log-Pearson type III distribution fit to 30 years of annual peaks, differ somewhat between the Semi-Pooled and Dist-Pooled (Fig. 11). At the 3 validation sites, extreme floods are consistently larger under the Semi-Pooled than the Dist-Pooled, and the mean difference in the 100 year flood estimate between the two calibration approaches grows between the historic runs and the RCP4.5 and RCP8.5 scenarios. This suggests that the flood-generation process is fundamentally different between the two parameterizations, with the Semi-Pooled formalization magnifying the effect of climate change on extremes. Furthermore, there is substantially more uncertainty in the 100 year flood estimate under the Semi-Pooled. Figure 11 shows the combined uncertainty across both climate projections and calibrations, but this uncertainty is broken down further in that no daily data was ever used in the calibration of either model. It appears that a lack of model parsimony does not necessarily lead to greater uncertainty in model simulations under different climate conditions, somewhat counter to what would be expected of over-fit models. One possible reason for this result would be if increased parametric freedom somehow offset the effects of structural deficiencies in the model. However, further research is needed to investigate this issue.
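A 100 year flood estimate from a Log-Pearson type III fit can be sketched as below. The section does not state how the distribution was fitted, so this sketch assumes a method-of-moments fit to the base-10 logarithms of the annual peaks and uses the Wilson-Hilferty approximation for the frequency factor; the function name and the peak flows are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def lp3_quantile(annual_peaks, return_period=100.0):
    """Flood quantile from a Log-Pearson III fit (method of moments, assumed)."""
    logs = [math.log10(q) for q in annual_peaks]
    n = len(logs)
    m, s = mean(logs), stdev(logs)
    # Bias-corrected sample skew of the log-transformed peaks
    g = n * sum((x - m) ** 3 for x in logs) / ((n - 1) * (n - 2) * s ** 3)
    # Standard normal deviate for the non-exceedance probability 1 - 1/T
    z = NormalDist().inv_cdf(1.0 - 1.0 / return_period)
    if abs(g) < 1e-6:
        k = z  # zero skew reduces to the log-normal case
    else:
        # Wilson-Hilferty approximation of the Pearson III frequency factor
        k = (2.0 / g) * ((1.0 + g * z / 6.0 - g * g / 36.0) ** 3 - 1.0)
    return 10.0 ** (m + k * s)

# Hypothetical annual peak flows in m3/s (the study used 30 years of peaks)
peaks = [820.0, 950.0, 1100.0, 1230.0, 1310.0, 1480.0, 1600.0, 1750.0, 2100.0, 3400.0]
q10 = lp3_quantile(peaks, return_period=10.0)
q100 = lp3_quantile(peaks, return_period=100.0)
print(round(q10), round(q100))
```

Applying such a function to each of the 50 historic and 1800 future simulations would give the kind of ensemble of 100 year flood estimates that is compared between the two parameterizations, subject to the fitting assumptions named above.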

Conclusion
In this study we examined a variety of calibration experiments to better understand the benefits and costs associated with different calibration choices for a complex, distributed hydrologic model in a data-scarce region. The goal of these experiments was to provide insight regarding the use of multisite data in calibration, the effects of parameter complexity, and the challenges of using limited data for distributed model calibration, all in the context of projecting future streamflow under climate change. This study tested two multisite calibration strategies, the stepwise and pooled approaches, finding that the pooled approach using all data simultaneously provides improved calibration results. This suggests that small sacrifices of model performance at certain sites can improve and stabilize basin-wide performance. The pooled calibration substantially improves with increasing parameter complexity at the calibration sites, but the semi-distributed and distributed pooled calibrations yield similar streamflow predictions at the validation sites, suggesting over-fitting of the model under the fully distributed conceptualization.
It is difficult to expect hydrologic models to yield reliable streamflow estimates at interior locations of a watershed when calibration is only based on data at the basin outlet, yet this is all too common in hydrologic model applications. The pooled calibration approach is superior to the basin outlet calibration in terms of its ability to represent interior hydrologic response correctly. This study shows the danger in relying on an outlet calibration for interior flow prediction.

When the implications of the pooled calibration were tested in the context of climate change, applying the pooled calibration with semi-distributed and distributed parameter formulations showed clear gains in reducing uncertainties in predictions of monthly and seasonal water availability as compared to the basin outlet calibrations. Surprisingly, increased parameter complexity in the calibration strategies does not increase the uncertainty in streamflow projections, even though parameter equifinality does emerge. The results suggest that increased (excessive) parameter complexity does not always lead to increased uncertainty if structural uncertainties in the model are present. The semi-distributed pooled and distributed pooled calibrations are very similar for monthly streamflow projection, yet different for the projection of extreme flows, owing in part to differences in the spatial variability of optimal parameters, with the distributed pooled calibration showing less uncertainty for 100 year flood events. We evaluated the separate and joint influence of uncertainties in parameter estimation and future climate on projections of seasonal streamflow and the 100 year flood across calibration schemes and found that the uncertainty resulting from variations in projected climate between GCMs substantially outweighs the calibration uncertainty.
Successful automatic calibration algorithms for hydrologic models are based primarily on global optimization algorithms that are computationally expensive and require a large number of function evaluations (Kuzmin et al., 2008). Although the speed and capacity of computers have increased multi-fold in the past several decades, the time consumed by running hydrological models (especially complex, physically based, distributed hydrological models) is still a concern for hydrology practitioners. A single trial of parameter optimization of HYMOD_DS associated with 100 000 runs can take 28 days on a single processor (Fig. S7 in the Supplement). The use of high performance computing power was essential in this study to better understand the implications of different calibration choices and their associated uncertainty for streamflow projections.
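The computational burden quoted above can be put in perspective with a short back-of-the-envelope calculation. The sketch below uses only the figures given in the text (28 days for a 100 000-run trial, 50 trials per calibration experiment) and assumes, purely for illustration, that independent trials are distributed across workers with ideal scaling; the parallelization scheme actually used in the study is not described in this section, and the function names are hypothetical.

```python
import math

SECONDS_PER_DAY = 86_400

def seconds_per_run(days_per_trial, runs_per_trial):
    """Average time of one model run implied by the serial trial duration."""
    return days_per_trial * SECONDS_PER_DAY / runs_per_trial

def wall_clock_days(days_per_trial, n_trials, n_workers):
    """Ideal wall-clock time when whole trials are spread over workers (assumed scheme)."""
    batches = math.ceil(n_trials / n_workers)
    return batches * days_per_trial

per_run = seconds_per_run(28, 100_000)     # about 24 s per model run
serial = wall_clock_days(28, 50, 1)        # all 50 trials on one processor
parallel = wall_clock_days(28, 50, 50)     # one trial per worker
print(per_run, serial, parallel)
```

Under these assumptions, 50 serial trials would take 1400 days for a single calibration experiment, which is why the repeated-trial design depends on parallel computing resources.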
In the future, remote sensing and satellite information can be integrated into calibration approaches to develop more robust estimates of spatially distributed parameter values for distributed hydrological modeling. Significant progress has been made toward this end (Tang et al., 2009; Khan et al., 2011; Thirel et al., 2013). Future work will consider using advanced computing techniques to understand how such information can enhance the hydrologic simulation at ungaged sites and reduce the parameter uncertainty of distributed hydrologic models in data-scarce regions.
The Supplement related to this article is available online at doi:10.5194/hessd-11-10273-2014-supplement.

), which uses multivariate scaling relationships to estimate glacier and ice cap volume based on elevation range and area. Specifically, the scaling law including area and elevation range factors was applied to estimate glacier/ice cap volume when the glacier depth exceeded 10 m. Otherwise, glacier/ice cap volume was estimated with the area-volume scaling law. The elevation range spanned by each

Table 1. Streamflow gaging stations in the Kabul River basin. Dual station IDs are given for stations archived in both the USGS and GRDC databases.