Hydrology and Earth System Sciences (HESS; Hydrol. Earth Syst. Sci.), ISSN 1607-7938
Copernicus Publications, Göttingen, Germany
DOI: 10.5194/hess-20-2913-2016

Simultaneous calibration of hydrological models in geographical space

András Bárdossy, Yingchun Huang (yingchun.huang@iws.uni-stuttgart.de), Thorsten Wagener (https://orcid.org/0000-0003-3881-5849)

Institute for Modelling Hydraulic and Environmental Engineering, University of Stuttgart, Stuttgart, Germany
Department of Civil Engineering, Queen's School of Engineering, University of Bristol, Bristol, UK

Correspondence: Yingchun Huang (yingchun.huang@iws.uni-stuttgart.de)

Published 19 July 2016 in volume 20, issue 7, pages 2913–2928. Further article dates: 3 October 2015, 30 October 2015, 27 June 2016, 1 July 2016.

This work is licensed under a Creative Commons Attribution 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/
This article is available from https://hess.copernicus.org/articles/20/2913/2016/hess-20-2913-2016.html
The full text article is available as a PDF file from https://hess.copernicus.org/articles/20/2913/2016/hess-20-2913-2016.pdf

Hydrological models are usually calibrated for selected catchments
individually using specific performance criteria. This procedure assumes that
the catchments show individual behavior. As a consequence, the transfer of
model parameters to other ungauged catchments is problematic. In this paper,
the possibility of transferring part of the model parameters was
investigated. Three different conceptual hydrological models were considered.
The models were restructured by introducing a new parameter η which
exclusively controls water balances. This parameter was considered as
individual to each catchment. All other parameters, which mainly control the
dynamics of the discharge (dynamical parameters), were considered for spatial
transfer. Three hydrological models combined with three different performance
measures were used in three different numerical experiments to investigate
this transferability. The first numerical experiment, involving individual calibration
of the models for 15 selected MOPEX catchments, showed that it is difficult
to identify which catchments share common dynamical parameters. Parameters of
one catchment might be good for another catchment but not vice versa. In the
second numerical experiment, a common spatial calibration strategy was used.
It was explicitly assumed that the catchments share common dynamical
parameters. This strategy leads to parameters which perform well on all
catchments. A leave-one-out common calibration showed that in this case a
good parameter transfer to ungauged catchments can be achieved. In the third
numerical experiment, the common calibration methodology was applied for
96 catchments. Another set of 96 catchments was used to test the transfer of
common dynamical parameters. The results show that even a large number of
catchments share similar dynamical parameters. The performance is worse than
that obtained by individual calibration, but the transfer to ungauged
catchments remains possible. The performance of the common parameters in the
second experiment was better than in the third, indicating that the selection
of the catchments for common calibration is important.
Introduction
Hydrological models are widely used to describe catchment behavior and,
subsequently, for water management, flood forecasting, and other purposes.
Hydrological modeling is usually done for catchments with observed
precipitation and discharge data. The unknown (and partly not measurable)
parameters of a conceptual or, to some extent, physics-based model are adjusted
in a calibration procedure to reproduce the measured discharge from the
observed weather and catchment properties. Due to the high variability of
catchment properties and hydrological behavior,
this modeling procedure is usually performed individually for each catchment.
Different catchments are often modeled using different models. This great
variety of models and catchments makes a generalization of the description of
the hydrological processes very challenging.
Additionally, even for a selected model applied to a specific catchment, the
parameter identification is not unique. A great number of parameter vectors
might lead to very similar performance.
Moreover, because model calibration relies heavily on measured discharge,
estimating model parameters for ungauged basins is a big challenge.
Instead of model calibration, parameters have to be estimated on the basis of
other information. A decade of worldwide research has been devoted to runoff
prediction in ungauged basins (PUB). The PUB synthesis book
takes a comparative approach to learning from similarities
between catchments and summarizes a great number of interesting methods that
are being used for predicting runoff regimes in ungauged basins. Many
attempts have been made to develop catchment classification schemes to
identify groups of catchments which behave similarly.
However, the task is of great importance. The need for a widely accepted classification
system has been discussed, and it has been pointed out that a good
classification would help to model the rainfall–runoff process for ungauged catchments.
Comprehensive reviews of regionalization
methods for predicting streamflow in ungauged basins are available. Catchment similarity
can be determined by comparing the corresponding discharge series using
correlation or copulas. Much of
the variability in discharge time series is controlled by the weather
patterns. Therefore, it is likely that similarity in discharge is higher for
catchments with well correlated weather, which often requires geographical
closeness. However, discharge series produced by
catchments can be very different under different meteorological conditions.
Even the same catchment behaves differently in a dry and in a wet year. Due
to the different weather forcing, the above methods would consider the same
catchment in one time period as dissimilar to itself in another time period.
One can also define catchment similarity using hydrological models.
Catchments are similar if they can be modeled reasonably well by the same
model using the same model parameters. Due to observational
errors and specific features of the calibration period, the adjustment of the
model can be very specific to the observation period, leading to
overcalibration. To overcome such limitations, a
regional calibration approach has been suggested to
identify single parameter sets that perform well for all catchments within the
modeled domain. Iterative regional calibration has been reported to reduce
the uncertainty of most parameters.
Regional calibration can result in better temporal robustness than normal
individual calibration, and it provides an effective
approach for large-scale hydrological assessments.
The focus of this paper is to investigate if the transformation of
precipitation to discharge is possible independently of the weather. For this
purpose, the hydrological model parameters are separated into two groups:
(1) parameters describing the water balances, which are strongly related to
climate; and
(2) parameters describing the dynamics of the runoff triggered by weather.
The second group of parameters is supposed to be weather independent and
represent the focus of this paper. To simplify the problem, a single new
parameter η was introduced to describe water balance. This parameter is
conditional on the other model parameters and adjusts the long-term water balances.
The purpose of this paper is to investigate to what extent the different
catchments share a similar dynamical rainfall–runoff behavior and can be
modeled using the same model parameters, with the exception of the newly
introduced individualized water balance parameter η.
Hydrological models are usually judged according to the degree of reproducing
discharge dynamics and water balances. While water balances are mainly driven
by weather in terms of precipitation, temperature, radiation, and wind,
dynamics are controlled by catchment properties in terms of size, terrain,
slopes, soils, etc. Formation of landscapes as a result of long-time climate
is a quasi-equilibrium process. The hypothesis of this paper is that this
equilibrium is mirrored in a similar dynamic behavior. Thus, a large number
of catchments can be modeled by using the same dynamic parameters.
Three simple conceptual hydrological models combined with three different
performance measures are used to describe the rainfall–runoff behavior on
the daily timescale for a large number of catchments.
Location of the catchments selected for the experiments.
The following three numerical experiments, including calibration
and validation procedures, are carried out for different sets of selected catchments:
(1) The usual catchment-by-catchment calibration is carried out. In order to
test if dynamical model parameters are shared, the parameters are directly
transferred to all other catchments.
(2) Instead of the traditional catchment-by-catchment calibration, it is
assumed that the model parameters are similar for a set of catchments in a close
geographical setting. Thus, a simultaneous calibration of the models is carried
out and tested both in a gauged and an ungauged version.
(3) The geographical extent of the catchments used for simultaneous calibration
is expanded. A large number of catchments, assumed to be ungauged, are used for testing
the hypothesis.
The hypothesis is that the rainfall–runoff process can be described using
the same dynamical hydrological model parameters for a number of catchments.
The very different climatic conditions and water balances of the catchments
are considered by the newly introduced specific parameter η controlling
the long-term water balance of each catchment individually. The other model
parameters control the discharge dynamics on both short and long timescales.
These dynamical parameters are supposed to be shared despite the great
heterogeneity of the catchments. This procedure simplifies the hydrological
model parameter estimation for ungauged catchments, namely the procedure is
reduced to the estimation of a single parameter η, which can be related
to long-term water balances.
The paper is structured as follows: after the introduction, the investigation
area is described. This is followed by a description of the three conceptual
hydrological models and the three performance criteria used for calibration
and validation. In Sect. 4, the new model parameter η controlling
the water balance is introduced. In Sects. 5–7, three numerical
experiments are described and the results are presented, starting with the
individual calibration of the models and ending with a transfer of the model
parameters to randomly selected catchments. The paper concludes with
a discussion of the results.
Investigation area and available data
The study area is the eastern United States. Locations of the 196 catchments
used in this study are shown in Fig. . The catchments form
a subset of those used in the international Model Parameter Estimation Experiment (MOPEX)
project. Catchments range in size from 134 to 9889 km² and
exhibit aridity indices (ratios of long-term potential evapotranspiration to
precipitation) between 0.41 and 3.3, hence representing a heterogeneous
data set. Time series of daily streamflow, precipitation, and temperature
for all catchments were provided by the MOPEX project.
Catchments within this data set are minimally impacted by human influences.
Streamflow information within this data set was originally provided by
United States Geological Survey (USGS) gauges, while precipitation and
temperature were supplied by the National Climatic Data Center (NCDC). The
MOPEX data set has been used widely for hydrological model comparison studies
(see references in ).
Hydrological models and performance criteria
Three simple conceptual hydrological models were applied in this study. The
reason for this is that the great number of calibration and validation
experiments could only be performed with relatively simple model structures.
It is important to see if the results are similar for different models and
performance measures. In a subsequent study, spatially distributed models
will be considered.
HYMOD model
The HYMOD model is a conceptual rainfall–runoff model
derived from the Probability Distributed Model. The soil
moisture accounting module of HYMOD utilizes a Pareto distribution function
of storage elements of varying sizes. The storage elements of the catchment
are distributed according to a probability density function defined by the
maximum soil moisture storage CMAX and the parameter b describing the
distribution of the soil moisture stores. Evaporation from the soil
moisture store occurs at the rate of the potential evaporation, estimated
using the Hamon approach. After evaporation, the remaining rainfall and
snowmelt are used to fill the soil moisture stores. A routing module divides
the excess rainfall using a split parameter α which separates fluxes
amongst two parallel conceptual linear reservoirs meant to simulate the quick
and slow flow response of the system (defined by residence times kq and ks).
HBV model
The HBV model is a conceptual model originally developed at the
Swedish Meteorological and Hydrological Institute (SMHI).
Snow accumulation and melt, actual soil moisture, and runoff generation are
calculated using conceptual routines. Snow accumulation and melt are based
on the degree-day approach. Actual soil moisture is calculated by considering
precipitation and evapotranspiration. Runoff generation is estimated by
a nonlinear function of actual soil moisture and precipitation. The dynamics
of the different flow components at the subcatchment scale are conceptually
represented by two linear reservoirs. The upper reservoir simulates the near
surface and interflow in the subsurface layer, while the lower reservoir
represents the base flow. They are connected through a linear percolation
rate. Finally, there is a transformation function consisting of a triangular
weighting function with one free parameter for smoothing the generated flow.
Xinanjiang model (XAJ)
The Xinanjiang (XAJ) model was established in the early 1970s in China. This conceptual
rainfall–runoff model has been applied to a large number of basins in the
humid and semi-humid regions of China. The lumped version of the XAJ model
consists of four main components. The evapotranspiration is
represented by a three-layer soil moisture module which differentiates upper,
lower, and deeper soil layers. Runoff production is calculated based on
rainfall and soil storage deficit; a tension water capacity curve is introduced
to provide for a nonuniform distribution of tension water capacity
throughout the whole catchment. The runoff separation module separates the
determined runoff into three parts, namely surface runoff, interflow, and
groundwater. The flow routing module transfers the local runoff to the outlet
of the basin. In order to account for the precipitation that is contributed
from snowmelt, the degree-day snowmelt approach is added in this model. In
this study, the model has 16 parameters which can be adjusted using calibration.
Performance criteria
Model calibration depends strongly on the performance criteria used. In order
to obtain reasonably general results, three different criteria were selected
to evaluate model performance.
The Nash–Sutcliffe efficiency between the observed and
modeled flow is most frequently taken as the first evaluation criterion:
$$O^{(1)}:\quad \mathrm{NS} = 1 - \frac{\sum_{t=1}^{T}\left(Q_o(t)-Q_m(t)\right)^2}{\sum_{t=1}^{T}\left(Q_o(t)-\overline{Q_o}\right)^2}.$$
Here, Qo(t) is the observed discharge and
Qm(t) is the modeled discharge on a given day t. The
abbreviation NS is used subsequently for this performance measure.
The NS model performance criterion has often been criticized,
and several modifications and other criteria have been
suggested. One interesting suggestion was published in : the
authors suggest using a performance measure which accounts for the water
balances and the correlation of the observed and modeled time series
separately. Their approach was slightly modified and the following
performance criterion was introduced:
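$$O^{(2)}:\quad \mathrm{GK} = 1 - \left(\beta\,\frac{\sum_{t=1}^{T}\left(Q_o(t)-Q_m(t)\right)}{\sum_{t=1}^{T}Q_o(t)}\right)^2 - \left(1-r(Q_o,Q_m)\right)^2.$$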
Here, r(Qo, Qm) is the correlation coefficient
between the observed and modeled time series of discharge. β is
a weight to express the importance of the water balance. In our study,
β= 5 was selected. The reason for selecting this version of the coefficient is
that a model should produce good water balances and appropriate discharge
dynamics simultaneously. The quadratic form in Eq. () assures
that both aspects are considered, and the worse of them is dominant. The
abbreviation GK is used subsequently for this performance measure.
The Nash–Sutcliffe coefficient of the logarithm of the discharges
focuses on low flow conditions more than the traditional NS coefficient:
$$\mathrm{LNS} = 1 - \frac{\sum_{t=1}^{T}\left(\log Q_o(t)-\log Q_m(t)\right)^2}{\sum_{t=1}^{T}\left(\log Q_o(t)-\overline{\log Q_o}\right)^2}.$$
To equally concentrate on high and low flows, a combination of the original NS
and the logarithmic NS is used as a third measure:
$$O^{(3)}:\quad \mathrm{NS{+}LNS} = \frac{\mathrm{NS}+\mathrm{LNS}}{2}.$$
The abbreviation NS + LNS is used subsequently for this performance measure.
All three performance criteria are formulated such that the higher the value, the
better the model. The best possible value of each criterion is 1.
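
The three criteria can be written down compactly. The following is a minimal Python sketch, not the authors' code: the function names are hypothetical, the GK function assumes the quadratic form of the criterion (one minus the squared weighted relative volume error minus the squared value of 1 − r), and the logarithmic criterion assumes strictly positive discharges.

```python
import numpy as np


def nash_sutcliffe(q_obs, q_mod):
    """NS: one minus the squared error relative to the spread of the observations."""
    q_obs = np.asarray(q_obs, dtype=float)
    q_mod = np.asarray(q_mod, dtype=float)
    return 1.0 - np.sum((q_obs - q_mod) ** 2) / np.sum((q_obs - q_obs.mean()) ** 2)


def gk(q_obs, q_mod, beta=5.0):
    """GK (assumed quadratic form): penalizes volume error and lack of correlation."""
    q_obs = np.asarray(q_obs, dtype=float)
    q_mod = np.asarray(q_mod, dtype=float)
    volume_error = np.sum(q_obs - q_mod) / np.sum(q_obs)  # relative water balance error
    r = np.corrcoef(q_obs, q_mod)[0, 1]                    # correlation of the two series
    return 1.0 - (beta * volume_error) ** 2 - (1.0 - r) ** 2


def log_nash_sutcliffe(q_obs, q_mod):
    """LNS: NS applied to log-discharges (requires strictly positive flows)."""
    return nash_sutcliffe(np.log(q_obs), np.log(q_mod))


def ns_plus_lns(q_obs, q_mod):
    """Average of NS and LNS, weighting high and low flows equally."""
    return 0.5 * (nash_sutcliffe(q_obs, q_mod) + log_nash_sutcliffe(q_obs, q_mod))


# Illustrative series (made-up numbers): the model overestimates every flow by 10 %.
q_obs = np.array([1.0, 2.0, 4.0, 3.0, 1.5])
q_mod = 1.1 * q_obs
print(nash_sutcliffe(q_obs, q_mod), gk(q_obs, q_mod), ns_plus_lns(q_obs, q_mod))
```

In this example the modeled series is perfectly correlated with the observations, so GK is reduced only by the volume error, which with β = 5 is weighted heavily.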
Method
Model parameter to control water balance
Climatic conditions are of central importance for water balances. The
relationship of potential to actual evapotranspiration can differ strongly
due to water or energy limitations. This suggests that catchments might have
similar dynamical behavior but with different water balances. In order to
account for this, the model parameters could be separated to form two groups,
one group with parameters controlling the water balances and another
controlling the discharge dynamics. This separation of existing model
parameters is difficult, as they often simultaneously influence both
components. Instead of an artificial model-specific separation, a new
parameter η was introduced to all three models. This parameter controls
the ratio between daily potential and actual evapotranspiration depending on
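the available water and depends on the long-term water balance only. With
parameter η, the actual evapotranspiration is
$$E_{ta} = \begin{cases} E_{tp} & \text{if } \dfrac{SM}{CMAX} > \eta \\[2ex] \min\left(\dfrac{SM}{\eta\cdot CMAX}\,E_{tp},\ SM\right) & \text{else.} \end{cases}$$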
Here, SM is the actual soil water available for evapotranspiration.
CMAX is the maximum possible soil moisture. Etp stands
for the potential and Eta for the actual evapotranspiration, respectively.
The parameter η regulates the water balances in accordance with the
dynamical parameters. It can be calculated directly for each parameter
vector θ, which is necessary because η is meant to establish
correct water balances. Thus, parameter η depends on the catchment and on the
parameter vector θ. Let V_i^M(η, θ) and V_i^O denote the modeled and observed
long-term discharge volumes of catchment i. Here, f(η) = V_i^M(η, θ) is a
monotonically decreasing function of η. If the model can provide correct
long-term water balances then
$$V_i^M(1,\theta) < V_i^O < V_i^M(0,\theta).$$
As f(η) = V_i^M(η, θ) is continuous, there is a unique
η(θ) for which
$$V_i^M(\eta(\theta),\theta) = V_i^O.$$
If Eq. () is not fulfilled, then the parameter vector θ
is not appropriate for the model.
The parameter η is fitted individually for each θ, and in this way a correct water balance is assured for the calibration period.
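
Because the modeled long-term volume is a continuous, monotonic function of η, the fit can be done with a simple bracketing search. The sketch below is only an illustration under stated assumptions: the single-bucket model `simulated_volume`, its parameters `cmax` and `k`, and the synthetic forcing are hypothetical stand-ins for one of the three models with a fixed dynamical parameter vector θ, and bisection is used as a generic root finder, not necessarily the authors' procedure.

```python
def actual_et(etp, sm, cmax, eta):
    """Actual evapotranspiration following the piecewise rule for eta."""
    if sm <= 0.0:
        return 0.0                       # nothing to evaporate (also avoids 0/0 at eta = 0)
    if sm / cmax > eta:
        return etp                       # enough water: potential rate
    return min(sm / (eta * cmax) * etp, sm)


def simulated_volume(precip, etp, eta, cmax=100.0, k=0.05):
    """Hypothetical single-bucket model; returns the total discharge volume."""
    sm, volume = 0.5 * cmax, 0.0
    for p, e in zip(precip, etp):
        sm += p
        spill = max(sm - cmax, 0.0)      # saturation excess leaves as discharge
        sm -= spill
        sm -= min(actual_et(e, sm, cmax, eta), sm)  # ET cannot exceed storage
        drain = k * sm                   # linear outflow from the store
        sm -= drain
        volume += spill + drain
    return volume


def fit_eta(precip, etp, v_obs, tol=1e-8):
    """Bisection on [0, 1]; works for either monotonic direction of the volume."""
    def misfit(eta):
        return simulated_volume(precip, etp, eta) - v_obs

    lo, hi = 0.0, 1.0
    f_lo, f_hi = misfit(lo), misfit(hi)
    if f_lo * f_hi > 0.0:
        return None                      # observed volume not bracketed: reject theta
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = misfit(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid                     # root lies in the lower half
        else:
            lo, f_lo = mid, f_mid        # root lies in the upper half
    return 0.5 * (lo + hi)


# Synthetic forcing (made-up numbers): 10 mm of rain every fifth day, constant Etp.
precip = [10.0 if t % 5 == 0 else 0.0 for t in range(365)]
etp = [3.0] * 365
v_obs = simulated_volume(precip, etp, 0.4)   # pretend this is the observed volume
eta_fit = fit_eta(precip, etp, v_obs)
print(eta_fit, simulated_volume(precip, etp, eta_fit), v_obs)
```

Returning `None` when the observed volume is not bracketed mirrors the rule that such a parameter vector is not appropriate for the model.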
Experimental design
In this study, the ROPE algorithm was applied for model
parameter optimization. This optimization method yields
a predetermined number of good parameter vectors that lead to very similar
model performance, although the parameter vectors themselves are very heterogeneous. In this
study, each calibration yielded a convex set of 10 000 good parameter
vectors. Three numerical experiments on a large number of catchments were
carried out to investigate the transferability of the model parameters under
different calibration strategies. For a clear explanation and understanding
of the methods, the procedure and results for these three experiments are
presented in the following three sections.
Numerical experiment 1: individual calibration and parameter transfer
The first experiment is designed to test the transferability of the model
parameters under the usual individual calibration for each catchment.
As a first step, 15 catchments with reliable data and slightly varying
catchment properties in the eastern US were selected. Locations of
the selected gauges are marked as the red plus symbols in Fig. .
Table lists the basic catchment properties and
Table summarizes the meteorological conditions for the
selected 15 catchments, respectively. The tables
show that despite their geographical proximity, these catchments have quite
different climate and hydrographic properties.
For the 15 selected catchments, an individual calibration was performed using
all three models and all three performance measures. Data series from 1951
to 2000 were split up into five subperiods. This leads to 45 calibrations
(three models × three performance measures × five subperiods)
for each catchment. Each calibration yielded a convex set Gi of
good parameters for catchment i. A total of 10 000 parameter vectors were generated from each
of these sets. (Note that the corresponding parameter η
was estimated for each element of the parameter set separately.)
Let O_i^(j)(θ) denote the value of objective function j
for a parameter vector θ in catchment i. The best
objective function value for each individual catchment is denoted by
O_i^(j)*. The parameter sets display substantial equifinality as all of
them perform very similarly. For simplicity, we used the average value of the
10 000 performances to represent the simulation result for each catchment.
The left part of Fig. shows the mean values of the
objective function NS for the 10 000 parameter vectors for the calibration
period 1971–1980 for the three selected models (denoted as individual
calibration). As expected, the model performance varies across catchments.
The reasons for this are observation errors both in input and output as well
as a possible inability of the model to represent the main
hydrological processes reasonably well.
Catchment properties for the selected 15 catchments.
Performance of the individually calibrated and the common calibrated
models using NS as performance criterion.
The ranges of the model parameters are relatively large. As a first step, we
checked if the catchments have common parameter vectors. For each pair of
catchments (i, j), for the same performance measure and time period, the
intersection Gi ∩ Gj of the convex hulls of the good parameter sets
is empty, showing that there are no common best parameters. Judged by this result,
none of the catchments appear to be similar.
As a next step, the 10 000 generated best dynamical parameter vectors for
a given time period and hydrological model obtained for catchment i were
applied to model all other catchments using the same hydrological model and
time period. Note that the value of η is not transferred but adjusted to
the true long-term water balance. In the numerical experiments, we assume
that the long-term discharge volumes are known for all simulations.
This assumption highlights the issue of estimating the real water balance in
ungauged basins, which will be addressed in the discussion.
Figure shows the color-coded matrices for the mean NS
performance and GK performance of the three hydrological models using
transferred parameters for all 15 catchments for a calibration period (1971–1980).
Color-coded matrices for the mean model performance of the parameter
transfer for the selected 15 catchments. The upper panel used NS as
performance measure, the lower panel used GK as performance
measure.
The performance of the transferred parameter vectors displays a strongly
varying picture. While in some cases the catchments seem to share parameter
vectors with reasonably good performance, in other cases the transfer led to
weak performances. A further surprising fact is that none of the matrices are
symmetrical. One can see that some catchments are good donors as their
parameters are good for nearly all catchments, while others have parameters
which are hardly transferable.
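
The notions of good donors, good receivers, and asymmetry can be made concrete with a small transfer matrix. The sketch below is illustrative only: the 3 × 3 values are made up and are not taken from the paper, the function name is hypothetical, and the convention that rows are donors and columns are receivers is an assumption.

```python
import numpy as np


def donor_receiver_scores(transfer):
    """transfer[i, j]: mean performance on catchment j using parameters calibrated on i."""
    transfer = np.asarray(transfer, dtype=float)
    off_diag = ~np.eye(transfer.shape[0], dtype=bool)
    masked = np.where(off_diag, transfer, np.nan)   # ignore own-catchment calibration
    donor = np.nanmean(masked, axis=1)              # how well i's parameters work elsewhere
    receiver = np.nanmean(masked, axis=0)           # how well j accepts foreign parameters
    asymmetry = transfer - transfer.T               # zero everywhere for a symmetric matrix
    return donor, receiver, asymmetry


# Hypothetical values for three catchments (not results from the study).
transfer = np.array([[0.80, 0.55, 0.20],
                     [0.60, 0.75, 0.50],
                     [0.65, 0.58, 0.70]])
donor, receiver, asymmetry = donor_receiver_scores(transfer)
print("donor scores:   ", donor)
print("receiver scores:", receiver)
print("asymmetry:\n", asymmetry)
```

In this made-up example the third catchment is a reasonable donor but a poor receiver, which is the kind of pattern an asymmetric matrix expresses.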
The asymmetry of the parameter transition matrices cannot be explained by
catchment properties. Two different catchments seem to share well-performing
parameters if calibrated on one catchment and no common good parameters if
calibrated on the other one. Take the catchments 1 and 12 with the NS
performance as an example. For all three models, parameters calibrated for
catchment 1 are not suitable for catchment 12, but parameters of catchment 12
perform reasonably well for catchment 1. From the observation data, we found
that catchment 12 is under relatively dry climate conditions during the
calibration period. We also found, from the simulated hydrographs, that the
parameter sets calibrated on catchment 1 could not adequately capture the
dynamic behavior of catchment 12 as the low flows were underestimated for
most of the time and the peak flows were obviously overestimated. The
matrices for NS show different performances with different models. In
general, the HBV model performs the best. The average value of the matrix is
0.62 for HBV, 0.55 for HYMOD, and 0.54 for XAJ. Furthermore, the correlations
of transferred model performance between different models are all greater
than 0.7. From the viewpoint of parameter transferability, the three models
perform similarly: if a parameter transfer is reasonable from catchment i
to j for one model, then it is also reasonable for the other models. The
results for the GK performance differ from those of the NS performance. Here,
the XAJ model seems to give the generally best transferable parameters.
Parameter vectors from other catchments generally fail to perform on
catchment 15 across all three models.
The difference of the transferability for these two performance measures
could be explained by different focuses; while NS is mainly focusing on the
squared difference between the observed and modeled discharge, GK focuses on
water balances and good timing, and the combination of NS and LNS is strongly influenced by low flow
events. It is interesting to observe that catchment 12 is a very bad receiver
for model parameters for NS, while it is an excellent receiver for GK. This
means that different events have different influence on the performance.
A possible explanation for the asymmetry is the fact that the catchments have
different weather forcing in the calibration period. It could be that runoff
events which are most important for a performance measure occur in the
calibration period frequently in one catchment leading to good
transferability, and seldom in the other, causing weak transferability of the
parameters from one catchment to another.
The transferability of the model parameters was also tested for an
independent validation period between 1991 and 2000.
Figure shows the corresponding color-coded results for NS
as performance measure. The matrices are similar to those obtained for
calibration. Catchment 12 remained a bad receiver but a good donor, indicating
that the bad performance is unlikely to be caused by observation errors.
Further, for some columns the off-diagonal elements are larger than the
diagonal ones which is a sign of a possible overcalibration of models.
To investigate the influence of climate on calibration, the hydrological
models calibrated for different time periods using the same model and
performance measure were compared. As the different time periods represent
different climate conditions, the calibrations led to different parameter
sets. As a comparison, the differences in calibrated model parameters using
the same model and performance measure for different catchments were
compared. As an example, the left part of Fig.
shows two calibrated parameters of the HYMOD model for catchment 13 on three
different 10-year time periods. The right part of
Fig. shows the same parameters obtained by
calibration for three different catchments (7, 8 and 13) during the time period
1951–1960. The structural similarity of the two scatterplots suggests that
the difference between the different catchments is comparable to the
difference between the different time periods. In hydrological modeling, it
is usually assumed that model parameters are constant over time, assuming no
significant change in climate or other characteristics. The results, however,
show that the assumption that parameters are the same over space is not completely
unrealistic. The figures even suggest that there might be parameter vectors
which perform reasonably well for all 15 catchments. As a next step, an
experiment to test this assumption was devised.
Color-coded matrices for the mean NS model performance of the
parameter transfer for the validation period for the selected
15 catchments.
Scatterplots for two selected HYMOD parameters (CMAX and α)
obtained via model calibration using NS as performance measures. Left panel:
for catchment 13 (black: 1951–1960, blue: 1971–1980, and red: 1991–2000);
Right panel: for catchments 7 (red), 8 (blue), and 13 (black) for
1951–1960.
Numerical experiment 2: simultaneous calibration
For many pairs of catchments, the parameter transfer worked reasonably
well. As a next step, we investigated if there are parameters which perform
reasonably well for all catchments. As seen in the previous section, none of
the catchments share optimal parameters. Therefore, common suboptimal
parameters have to be found.
In order to identify parameter vectors which perform simultaneously well for
each catchment, the hydrological models were calibrated for all 15 catchments
simultaneously. The simultaneous calibration of the model for all catchments
is a multi-objective optimization problem. The goal is to find parameter
vectors which are almost equally good for all catchments with no exception.
As the models perform differently for the different catchments due to data
quality and catchment particularities, the performance was measured through
the loss in performance compared to the usual individual calibration. Thus,
the objective function was formulated using the formulation of the compromise
programming method :
$$R^{(j)}(\theta) = \sum_{i=1}^{n}\left(O_i^{(j)*} - O_i^{(j)}(\theta)\right)^{p}.$$
Here, index i indicates the catchment number and index j indicates the
type of the individual performance measure specified in Eqs. (),
(), and (). The goal in this objective function is
to minimize R(j). Here, p is the so-called balancing factor. The
larger the value of p is, the more the biggest loss in performance contributes to the
common performance. In order to obtain parameters which are good for all
catchments, a relatively high p= 4 was selected for all three performance measures.
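
The effect of the balancing factor can be seen in a small numerical example. The sketch below is a minimal illustration with made-up performance values; the function name `common_loss` is hypothetical. It compares a candidate with moderate losses everywhere against a candidate that is optimal on two catchments but poor on the third.

```python
import numpy as np


def common_loss(best_individual, performance, p=4):
    """Compromise programming objective: sum over catchments of the loss to the power p."""
    loss = np.asarray(best_individual, dtype=float) - np.asarray(performance, dtype=float)
    return float(np.sum(loss ** p))


# Best individually calibrated performance for three catchments (made-up numbers).
best = np.array([0.80, 0.75, 0.70])

balanced = best - np.array([0.10, 0.10, 0.10])   # loses a little on every catchment
uneven = best - np.array([0.00, 0.00, 0.25])     # perfect on two, poor on one

for p in (1, 4):
    print(p, common_loss(best, balanced, p), common_loss(best, uneven, p))
```

With p = 1 the uneven candidate has the smaller total loss, whereas with p = 4 the largest single loss dominates and the balanced candidate is preferred, which is the behavior wanted when the parameters should be good for all catchments.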
As in the individual calibration, the ROPE algorithm was used for the
simultaneous calibration. For each model and time period, the optimized parameter sets H(j)
perform well on all catchments simultaneously. The left
part of Fig. compares the performance of the individually
calibrated and the common calibration for the 15 selected catchments using NS
as performance criterion. As expected, the results show that the individual
calibrations led to better performances, but the joint parameter vectors
perform reasonably well for all catchments.
Mean NS model performance of the calibration, individual parameter
transfer, and for the leave-one-out transfer for the selected 15 catchments
for the calibration time period 1971–1980. Left panel: HBV, right panel:
HYMOD.
Mean NS model performance of the calibration, individual parameter
transfer, and for the leave-one-out transfer for the selected 15 catchments
for the validation time period 1991–2000. Left panel: HBV, right panel:
HYMOD.
As the goal of modeling is not the reconstruction of already observed data,
the performances on a different validation period (1991–2000) were also
compared. The right part of Fig. shows the mean model
performances for the 15 individually calibrated and the common calibrated
data sets. The observation that parameter vectors obtained through common
calibration may outperform individual on-site calibration may also indicate
the weakness of the calibration process for an individual catchment, which
should ideally be able to identify the best parameter set.
Runoff hydrographs for catchment 14 obtained using individual and
leave-one-out common calibrations of HBV using the GK performance
measure.
Runoff hydrographs for catchment 5 obtained using individual and
leave-one-out common calibrations of HBV using the NS performance
measure.
These results indicate that instead of transferring model parameters from
a single catchment, a parameter transfer might perform better if the
parameters obtained through common calibration on all other catchments are
used. In order to test this kind of parameter transfer, a set of simple
leave-one-out calibrations was performed. For each catchment i in turn, the
hydrological models were calibrated simultaneously on the remaining 14
catchments, leading to 15 simultaneous calibrations. The resulting common
model parameters were then applied to the catchment that was left out. The
performance of the models on the left-out catchments in the calibration
period is reasonably good in all cases.
Figure shows the result of HBV and HYMOD using the NS
performance measure. It compares the performance of the parameters obtained
via individual calibrations (red x mark), parameter transfers from other
catchments individually (blue plus), and the transfer of the common parameters
obtained by the leave-one-out procedure (green diamond). The performance of
the common parameters is weaker than that of the individual
calibration but better than that of many of the individual
parameter transfers. To test the potential of the transferability
of the common parameters, a validation period was used.
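The leave-one-out procedure described above can be sketched as a simple loop. This is only a schematic outline: `calibrate_common` and `evaluate` are hypothetical placeholders for the simultaneous calibration (e.g. with ROPE) and for a performance measure such as NS, and the toy stand-ins used here are not hydrological models.

```python
from typing import Callable, Dict, Hashable, List


def leave_one_out(
    catchments: List[Hashable],
    calibrate_common: Callable[[List[Hashable]], object],
    evaluate: Callable[[object, Hashable], float],
) -> Dict[Hashable, float]:
    """Calibrate on all catchments but one, then score the left-out one."""
    scores = {}
    for target in catchments:
        donors = [c for c in catchments if c != target]  # remaining catchments
        params = calibrate_common(donors)                # common parameters
        scores[target] = evaluate(params, target)        # transfer to target
    return scores


# Toy stand-ins (assumptions for illustration): each catchment has one
# "optimal" scalar parameter, the common calibration returns the donor mean,
# and the score is the negative squared distance to the target's optimum.
optimum = {i: float(i) for i in range(1, 16)}  # 15 catchments


def toy_calibrate(donors):
    return sum(optimum[c] for c in donors) / len(donors)


def toy_evaluate(params, target):
    return -((params - optimum[target]) ** 2)


scores = leave_one_out(list(optimum), toy_calibrate, toy_evaluate)
```

With 15 catchments the loop performs 15 common calibrations, each on 14 donors, and the left-out catchment never contributes to the parameters that are evaluated on it.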
Figure shows the results for the validation time period
1991–2000. In this case, the common calibration performs very well. For
HYMOD, it outperforms the parameter vectors obtained by individual
calibration for 6 out of the 15 catchments. For the other catchments, the
loss in performance is relatively small. Note that this good performance of
the common models was obtained without using any information from the target
catchment. The transfer of parameters obtained from individual calibrations
on other catchments shows a highly heterogeneous picture, as described in
experiment 1. The transferred common calibration is better than most of these
performances. Further, note that the results of experiment 1 show that there
is no explanation for why certain transfers work well and others do not. Thus, for
the transfer of model parameters to ungauged catchments, common calibration
seems to be a reasonable method.
In order to illustrate how model parameters of the leave-one-out common
calibration perform in validation, two hydrographs are presented.
Figures and show a part of the observed, the
modeled, and the common calibration transferred hydrographs for a randomly
selected parameter set obtained by individual calibration and leave-one-out
common calibration of HBV for catchments 5 and 14. While for catchment 5 the
common calibration leads to a hydrograph which is slightly better than that
obtained by individual calibration, in the second case for catchment 14 the
performance is reversed. However, in both cases the common parameters, which
were obtained without using any observations of the catchment, perform
surprisingly well.
Numerical experiment 3: extension to other catchments
The results of the previous experiment suggest that even more catchments
might share parameters which perform well on all. The 15 catchments used in
experiments 1 and 2 are, however, to some extent similar and thus cannot
necessarily be considered as representative of a great number of other
catchments. Thus, for the third experiment, 192 catchments of the MOPEX
data set were considered. Of them, 96 were randomly selected for common
calibration (marked as blue circles in Fig. ); the other 96
catchments were used as receivers to test the performance of the common
parameters (marked as green triangles in Fig. ). The HBV model
using three selected performance measures was considered in this experiment.
For each of the 192 catchments, an individual model calibration was carried
out using 1971–1980 as the calibration period. Common calibration of the HBV
model was performed for the selected 96 catchments in the same way as in
experiment 2, using all three performance measures.
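The random split into donor and receiver catchments can be sketched as follows. The integer catchment identifiers, the function name, and the seed are assumptions for illustration; the text does not state how the random selection was implemented.

```python
import random


def split_donors_receivers(catchment_ids, n_donors, seed=0):
    """Randomly pick n_donors catchments; the rest act as ungauged receivers."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    ids = list(catchment_ids)
    donors = set(rng.sample(ids, n_donors))
    receivers = [c for c in ids if c not in donors]
    return sorted(donors), receivers


# 192 hypothetical catchment identifiers, split into 96 donors and 96 receivers.
donors, receivers = split_donors_receivers(range(192), 96)
```

The receivers are used only for testing, so they must not overlap with the donor set used for the common calibration.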
Histograms of the NS model performance of HBV for the 96 selected
(donor) catchments. Left panel: calibration period (1971–1980), right panel:
validation period (1991–2000).
Histograms of the NS model performance of HBV for the 96 test
(ungauged) catchments. Left panel: calibration period (1971–1980), right
panel: validation period (1991–2000).
As a first step, the model performances for the individual and common
calibration were compared. As expected and already seen in experiment 2, the
performance for the common calibration is lower than the individual one for
HBV using all performance measures. For example, the mean performance NS over
all 96 catchments drops from 0.69 to 0.50. When one applies the models for
the validation period 1991–2000, the mean performance of the individually
calibrated models is 0.65, while for the common calibration the mean increases
slightly relative to the calibration period, to 0.51.
Figure shows the histograms of the performance
NS for the calibration and validation periods for the individual and the
common calibrations. Results indicate the robustness of the common
calibration. The transfer to the 96 assumed ungauged catchments shows very
similar performance for the common parameters as for the catchments selected
for common calibration. Figure shows the histograms of
the performance NS for the individual calibration and the transfer for the
assumed ungauged catchments. It can be seen clearly from the histogram that
there is very little difference between the performance for the gauged and
the ungauged catchments. In 90 % of catchments, the common calibration
works reasonably well, even for the ungauged cases. The common parameters
describing runoff dynamics of all 192 catchments indicate that there is
a high degree of similarity of these catchments.
Comparing the results of the common calibration using the 96 catchments to
those obtained using the 15 catchments, one can observe that increasing the
number of catchments considered for the common calibration led to a decrease
in performance. The common parameter sets calibrated on 15 catchments in
reasonable geographic proximity perform better than the parameter sets
calibrated on 96 catchments. If one is interested in finding model parameters
for a specific ungauged catchment, the common calibration using a more
careful selection of the donor set of catchments is likely to lead to good
parameter transfers.
The water balances of the 192 catchments differ, leading to different
η parameters. Figure shows the distribution of η
values for three randomly selected good common parameter sets for the HBV model
using NS as a performance measure for the calibration time period. The curves
show clearly that, for the same catchment, η differs between
dynamical parameter sets. Also, due to the differences in water
balance, different catchments require different η values to control actual
evapotranspiration. Furthermore, across all 192 catchments, the parameter η
follows a very similar tendency for the different dynamical parameter sets.
Figure plots the mean η value against the ratio of the
long-term actual evapotranspiration to potential evapotranspiration (Eta/Etp)
for each catchment. It shows a strong negative correlation (-0.72) between
η and Eta/Etp.
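As a sketch of how such a correlation can be computed, the following example averages η over the common parameter sets for each catchment and evaluates the Pearson correlation coefficient with Eta/Etp. The data are synthetic and generated from an invented relation purely for illustration; they are not the values of this study, and the array names are assumptions.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float(np.sum(xc * yc) / np.sqrt(np.sum(xc ** 2) * np.sum(yc ** 2)))


rng = np.random.default_rng(42)
n_catchments, n_param_sets = 192, 3

# Synthetic Eta/Etp ratios and eta values (invented relation, NOT study data).
eta_ratio = rng.uniform(0.3, 0.9, size=n_catchments)
eta = 2.0 - 1.5 * eta_ratio[:, None] + rng.normal(0.0, 0.2, size=(n_catchments, n_param_sets))

mean_eta = eta.mean(axis=1)  # mean eta per catchment over the parameter sets
r = pearson_r(mean_eta, eta_ratio)
print(round(r, 2))
```

The hand-written `pearson_r` agrees with `numpy.corrcoef(mean_eta, eta_ratio)[0, 1]`, which could be used instead.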
Discussion
Robust parameter sets
The three experiments were carried out in such a way that a set of parameters
(usually represented by 10 000 individual parameter sets) was used. This
leads to a considerable fluctuation of the results. Modelers often prefer to
use single parameter vectors. If a single parameter vector is desired, then
according to , the deepest parameter set (which
represents the most central point of the whole set of parameter vectors) is the most
likely candidate to be robust. This study also indicates that the deepest
parameter set performs slightly better than the mean of the parameter sets considered.
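One way to identify a deepest parameter set is sketched below. It assumes half-space (Tukey) depth as the depth function, which is not specified in this section, and approximates it with random projections; the approximation is an upper bound on the exact depth and is not necessarily the procedure used in the study. The point cloud is synthetic.

```python
import numpy as np


def approx_halfspace_depth(points, n_directions=500, seed=0):
    """Approximate the half-space depth of every point within its own cloud.

    For each random direction, a point's one-sided count is the number of
    points on its side of the projection (itself included). The depth is
    the minimum count over both sides and all directions.
    """
    rng = np.random.default_rng(seed)
    pts = np.asarray(points, dtype=float)
    n, d = pts.shape
    directions = rng.normal(size=(n_directions, d))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    proj = pts @ directions.T                           # shape (n, n_directions)
    ranks = proj.argsort(axis=0).argsort(axis=0) + 1    # 1..n along each direction
    at_or_below = ranks
    at_or_above = n - ranks + 1
    return np.minimum(at_or_below, at_or_above).min(axis=1)


# Synthetic "good" parameter vectors plus one extreme vector (illustration only).
rng = np.random.default_rng(1)
cloud = np.vstack([rng.normal(size=(200, 3)), [[50.0, 50.0, 50.0]]])

depth = approx_halfspace_depth(cloud)
deepest = int(np.argmax(depth))  # index of the most central parameter vector
```

Points on the boundary of the cloud receive a depth of 1, while the deepest point lies near the center of the set, which is why it is expected to be less sensitive to the particularities of the calibration period.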
Distribution of water balance parameter η for three randomly
selected common parameter vectors obtained via HBV using the NS performance
measure for 192 selected catchments.
Scatterplots of mean η value and ratio of actual
evapotranspiration to potential evapotranspiration for 192 selected
catchments.
The discharge coefficient of the catchments selected for the
experiments.
Variability and estimation of η
As defined, the water-balance-related parameter η is specific for each
catchment and each model parameter vector. Therefore, each individual
catchment has a large variation in η for the calibrated 10 000 parameter
sets. Also, for the same set of good parameters that match different water
balances, different catchments always require very different η values to
control actual evapotranspiration. Parameter η is estimated for each catchment
individually because it controls the water balance and can be estimated at other catchments. The
remainder of the parameters (the dynamic ones) are regionally calibrated (all
catchments are given the same parameter set). Therefore, only η varies
between catchments. As η is specific for each parameter vector,
regionalization of η directly is not feasible and η remains
different for different parameter vectors after regionalization. In the
numerical experiments, in order to estimate water balance parameter η,
the long-term discharge volumes were treated as known variables for both
gauged and ungauged catchments. For application in practical systems, the long-term
discharge volumes have to be estimated for ungauged catchments. This
problem is not explicitly treated in this paper. The estimation of parameter η
is a limitation of the presented simultaneous calibration approach.
Regionalization of long-term discharge volumes is a prerequisite for the
application in ungauged basins. For the study area, the discharge
coefficients which relate discharge volumes to (known) precipitation show
quite a smooth spatial behavior as shown in Fig. . Thus, the
regionalization of this parameter does not seem to be an extremely complicated task
in this particular region. According to the previous analysis of η, for
each common dynamical parameter set, one can have a possible estimator of η
for a certain catchment based on the regionalization of discharge
coefficients. The potential application of this approach in other regions
needs to be investigated in future work.
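A minimal sketch of such an estimator is given below. It assumes that the target long-term discharge volume is obtained as the regionalized discharge coefficient times the known precipitation, and that η is then found by bisection so that the simulated long-term volume matches this target. The bisection, the function names, and the linear toy model are assumptions for illustration; the section does not describe how η was solved for.

```python
def estimate_eta(simulate_volume, target_volume, lo, hi, tol=1e-6, max_iter=200):
    """Find eta in [lo, hi] such that simulate_volume(eta) equals target_volume.

    simulate_volume is a hypothetical callable that runs the model with one
    fixed dynamical parameter set and returns the long-term discharge volume.
    It is assumed to be continuous and monotone in eta on [lo, hi].
    """
    f_lo = simulate_volume(lo) - target_volume
    f_hi = simulate_volume(hi) - target_volume
    if f_lo * f_hi > 0:
        raise ValueError("target volume is not bracketed by [lo, hi]")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = simulate_volume(mid) - target_volume
        if abs(f_mid) < tol or (hi - lo) < tol:
            return mid
        if f_lo * f_mid <= 0:
            hi = mid              # root lies in the lower half
        else:
            lo, f_lo = mid, f_mid  # root lies in the upper half
    return 0.5 * (lo + hi)


# Toy stand-in for the model (NOT a hydrological model): discharge volume
# decreases linearly with eta. All numbers are hypothetical.
precipitation = 1000.0        # long-term precipitation, mm per year
discharge_coefficient = 0.4   # regionalized value for the ungauged catchment


def toy_volume(eta):
    return precipitation - 500.0 * eta


target = discharge_coefficient * precipitation
eta_hat = estimate_eta(toy_volume, target, lo=0.0, hi=2.0)
```

Because η is specific to each dynamical parameter vector, this step would have to be repeated for every common parameter set that is transferred to the ungauged catchment.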
Prediction in ungauged basins
The results of this study supported the general finding of
and , where the
simultaneous calibration led to weaker model performance than the individual
one for both calibration and validation time periods. The loss of model
performance in validation is smaller than that in calibration. When applied
to ungauged catchments, the simultaneous calibration shows more robustness
than the individual one. Simultaneous calibration of models in geographical
space offers a good possibility for the runoff prediction in ungauged basins.
Compared with traditional regionalization methods, only the water balance
parameter η has to be estimated based on the regionalization of
discharge coefficients.
The hydrographs show that high flows are often underestimated
and low flows are probably overestimated. This kind of phenomenon has also
been detected in previous regional calibration studies
. This behavior is mainly due to
the uncertainty of model structure and the low spatial and temporal
resolutions of both models and input variables .
Conclusions
In this paper, the transfer of the dynamical parameters of hydrological
models was investigated. A new model parameter η controlling the actual
evapotranspiration was introduced to cope with the clear differences in water
balances due to water or energy limitations. Three hydrological models were
used in combination with three different performance measures in three
numerical experiments on a large number of catchments.
The individual calibration and transfer results indicate that models are
often overfitted during calibration. The parameters are sometimes more
specific for the calibration time period and their relation to catchment
properties seems to be unclear. This makes parameter transfers or parameter
regionalization based on individual calibration difficult. The common spatial
calibration strategy, which explicitly assumed that catchments share
dynamical parameters, was tested on 15 catchments and 96 catchments,
respectively. The common calibration provides an effective way to
identify parameter sets which work reasonably for all catchments within the
modeled domain. Testing the parameters on an independent time period shows
that common parameters perform comparably well to those obtained using
individual calibration. The transfer of the common parameters to model
ungauged catchments works well. The performance of common parameters on a
small number of catchments (15) was better than on a large number of
catchments (96) covering a large spatial scale. This indicates that the performance
of the common parameters depends strongly on the selection of the catchments
used to assess them and a reasonable geographic proximity of the catchments
might be a good choice for common calibration. The results of the experiments
were similar for all three hydrological models, independently of the
choice of performance measure. Note, however, that the common parameters
corresponding to the different performance measures differ considerably.
Common behavior is dependent on how one evaluates the performance of the models.
The fact that many catchments share common parameters which describe their
dynamical behavior does not mean that they have the same dynamical behavior.
The model output highly depends on the parameter η which varies from
catchment to catchment and also as a function of the other model parameters
describing dynamical behavior. Common parameters offer a good possibility for
the prediction of ungauged catchments; only the parameter η, which
controls the long-term water balances, has to be estimated individually. This,
however, can be done using other modeling approaches including regionalization methods.
In this study, all the models were tested on the daily timescale. The
results show that many catchments behave similarly, in that the same dynamical
parameter sets perform reasonably well for all of them. This means that
hydrological behavior on the daily scale is mainly dominated by precipitation
characteristics and actual evapotranspiration, and we believe that differences
in catchment properties have more significant effects on smaller temporal
scales (e.g., hourly). Results also indicate that the differences in catchment
properties cannot be captured well by simple lumped model parameters.
The Supplement related to this article is available online at doi:10.5194/hess-20-2913-2016-supplement.
Acknowledgements
The study of the second author (Yingchun Huang) was supported by China
Scholarship Council. The authors gratefully acknowledge two anonymous
reviewers for their invaluable and constructive suggestions, and thank
Stephen Kwakye for proofreading the manuscript.
Edited by: R. Merz
Reviewed by: R. Arsenault and one anonymous referee
References
Ali, G., Tetzlaff, D., Soulsby, C., McDonnell, J. J., and Capell, R.: A
comparison of similarity indices for catchment classification using a
cross-regional dataset, Adv. Water Resour., 40, 11–22, 2012.
Andréassian, V., Le Moine, N., Perrin, C., Ramos, M.-H., Oudin, L.,
Mathevet, T., Lerat, J., and Berthet, L.: All that glitters is not gold: the
case of calibrating hydrological models, Hydrol. Process., 26, 2206–2210, 2012.
Archfield, S. A. and Vogel, R. M.: Map correlation method: Selection of a
reference streamgage to estimate daily streamflow at ungaged catchments,
Water Resour. Res., 46, W10513, doi:10.1029/2009WR008481, 2010.
Bárdossy, A.: Calibration of hydrological model parameters for ungauged
catchments, Hydrol. Earth Syst. Sci., 11, 703–710, doi:10.5194/hess-11-703-2007, 2007.
Bárdossy, A. and Singh, S. K.: Robust estimation of hydrological model
parameters, Hydrol. Earth Syst. Sci., 12, 1273–1283, doi:10.5194/hess-12-1273-2008, 2008.
Bergström, S. and Forsman, A.: Development of a conceptual deterministic
rainfall-runoff model, Nord. Hydrol., 4, 174–190, 1973.
Beven, K. J.: Uniqueness of place and process representations in hydrological
modelling, Hydrol. Earth Syst. Sci., 4, 203–213, doi:10.5194/hess-4-203-2000, 2000.
Beven, K. J. and Freer, J.: Equifinality, data assimilation, and data uncertainty
estimation in mechanistic modelling of complex environmental systems using
the GLUE methodology, J. Hydrol., 249, 11–29, 2001.
Blöschl, G., Sivapalan, M., Wagener, T., Viglione, A., and Savenije, H. E.:
Runoff Prediction in Ungauged Basins: Synthesis across Processes, Places and
Scales, Cambridge University Press, Cambridge, 2013.
Boyle, D. P., Gupta, H. V., Sorooshian, S., Koren, V., Zhang, Z., and Smith,
M.: Toward Improved Streamflow Forecasts: Value of Semidistributed Modeling,
Water Resour. Res., 37, 2749–2759, 2001.
Duan, Q., Schaake, J., Andreassian, V., Franks, S., Goteti, G., Gupta, H.,
Gusev, Y., Habets, F., Hall, A., Hay, L., Hogue, T., Huang, M., Leavesley,
G., Liang, X., Nasonova, O., Noilhan, J., Oudin, L., Sorooshian, S., Wagener,
T., and Wood, E.: Model Parameter Estimation Experiment (MOPEX): An overview
of science strategy and major results from the second and third workshops,
J. Hydrol., 320, 3–17, 2006.
Falcone, J. A., Carlisle, D. M., Wolock, D. M., and Meador, M. R.: GAGES: A
stream gage database for evaluating natural and altered flow conditions in
the conterminous United States: Ecological Archives E091-045, Ecology, 91, 621, 2010.
Fernandez, W., Vogel, R., and Sankarasubramanian, A.: Regional calibration of a
watershed model, Hydrolog. Sci. J., 45, 689–707, 2000.
Gaborit, É., Ricard, S., Lachance-Cloutier, S., Anctil, F., Turcotte, R.,
and Polat, A.: Comparing global and local calibration schemes from a differential
split-sample test perspective, Can. J. Earth Sci., 52, 990–999, 2015.
Grigg, D.: The logic of regional systems 1, Ann. Assoc. Am. Geogr., 55, 465–491, 1965.
Gupta, H., Kling, H., Yilmaz, K., and Martinez, G.: Decomposition of the mean
squared error and NSE performance criteria: Implications for improving
hydrological modelling, J. Hydrol., 377, 80–91, 2009.
Hrachowitz, M., Savenije, H. H., Blöschl, G., McDonnell, J. J., Sivapalan,
M., Pomeroy, J., Arheimer, B., Blume, T., Clark, M. P., Ehret, U., Fenicia, F.,
Freer, J. E., Gelfan, A., Gupta, H. V., Hughes, D. A., Hut, R., Montanari, A.,
Pande, S., Tetzlaff, D., Troch, P. A., Uhlenbrook, S., Wagener, T., Winsemius,
H., Woods, R. A., Zehe, E., and Cudennec, C.: A decade of Predictions in Ungauged
Basins (PUB) – a review, Hydrolog. Sci. J., 58, 1198–1255, 2013.
McDonnell, J. and Woods, R.: On the need for catchment classification, J. Hydrol.,
299, 2–3, 2004.
McIntyre, N., Lee, H., Wheater, H., Young, A., and Wagener, T.: Ensemble
predictions of runoff in ungauged catchments, Water Resour. Res., 41, W12434,
doi:10.1029/2005WR004289, 2005.
Moore, R. J.: The probability-distributed principle and runoff production at
point and basin scales, Hydrolog. Sci. J., 30, 273–297, 1985.
Nash, J. and Sutcliffe, J.: River flow forecasting through conceptual models.
1. A discussion of principles, J. Hydrol., 10, 282–290, 1970.
Oudin, L., Kay, A., Andréassian, V., and Perrin, C.: Are seemingly
physically similar catchments truly hydrologically similar?, Water Resour. Res.,
46, W11558, doi:10.1029/2009WR008887, 2010.
Parajka, J., Blöschl, G., and Merz, R.: Regional calibration of catchment
models: Potential for ungauged catchments, Water Resour. Res., 43, W06406,
doi:10.1029/2006WR005271, 2007.
Razavi, T. and Coulibaly, P.: Streamflow prediction in ungauged basins: review
of regionalization methods, J. Hydrol. Eng., 18, 958–975, 2012.
Ricard, S., Bourdillon, R., Roussel, D., and Turcotte, R.: Global calibration
of distributed hydrological models for large-scale applications, J. Hydrol. Eng.,
18, 719–721, 2012.
Samaniego, L., Bárdossy, A., and Kumar, R.: Streamflow prediction in ungauged
catchments using copula-based dissimilarity measures, Water Resour. Res., 46,
W02506, doi:10.1029/2008WR007695, 2010.
Sawicz, K., Wagener, T., Sivapalan, M., Troch, P. A., and Carrillo, G.:
Catchment classification: empirical analysis of hydrologic similarity based
on catchment function in the eastern USA, Hydrol. Earth Syst. Sci., 15,
2895–2911, doi:10.5194/hess-15-2895-2011, 2011.
Schaefli, B. and Gupta, H.: Do Nash values have value?, Hydrol. Process.,
21, 2075–2080, 2007.
Sivakumar, B. and Singh, V. P.: Hydrologic system complexity and nonlinear dynamic
concepts for a catchment classification framework, Hydrol. Earth Syst. Sci.,
16, 4119–4131, doi:10.5194/hess-16-4119-2012, 2012.
Sivapalan, M.: Prediction in ungauged basins: a grand challenge for theoretical
hydrology, Hydrol. Process., 17, 3163–3170, 2003.
Toth, E.: Catchment classification based on characterisation of streamflow and
precipitation time series, Hydrol. Earth Syst. Sci., 17, 1149–1159,
doi:10.5194/hess-17-1149-2013, 2013.
Wagener, T., Boyle, D. P., Lees, M. J., Wheater, H. S., Gupta, H. V., and
Sorooshian, S.: A framework for development and application of hydrological
models, Hydrol. Earth Syst. Sci., 5, 13–26, doi:10.5194/hess-5-13-2001, 2001.
Wagener, T., Sivapalan, M., Troch, P., and Woods, R.: Catchment classification
and hydrologic similarity, Geogr. Compass, 1, 901–931, 2007.
Zeleny, M.: Multiple Criteria Decision Making, McGraw-Hill, New York, USA, 1981.
Zhao, R. J. and Liu, X.: The Xinanjiang model, in: Computer Models of Watershed
Hydrology, Water Resources Publications, Littleton, Colorado, USA, 215–232, 1995.