Understanding the projection performance of hydrological models under contrasting climatic conditions supports robust decision making, which highlights the need to adopt time-varying parameters in hydrological modeling to reduce performance degradation. Many existing studies model the time-varying parameters as functions of physically based covariates; however, a major challenge remains in finding effective information to control the large uncertainties that are linked to the additional parameters within the functions. This paper formulated the time-varying parameters for a lumped hydrological model as explicit functions of temporal covariates and used a hierarchical Bayesian (HB) framework to incorporate the spatial coherence of adjacent catchments to improve the robustness of the projection performance. Four modeling scenarios with different spatial coherence schemes and one scenario with a stationary scheme for model parameters were used to explore the transferability of hydrological models under contrasting climatic conditions. Three spatially adjacent catchments in southeast Australia were selected as case studies to examine the validity of the proposed method. Results showed that (1) the time-varying function improved the model performance but also amplified the projection uncertainty compared with the stationary setting of model parameters, (2) the proposed HB method successfully reduced the projection uncertainty and improved the robustness of model performance, and (3) model parameters calibrated over dry years were not suitable for predicting runoff over wet years because of a large degradation in projection performance. This study improves our understanding of the spatial coherence of time-varying parameters, which will help improve the projection performance under differing climatic conditions.

Long-term streamflow projection is an important part of effective water resources planning because it can predict future scarcity in water supply and help prevent floods. Streamflow projections typically involve the following: (i) calibrating hydrological model parameters with partial historical observations (e.g., precipitation, evaporation, and streamflow); (ii) projecting streamflow for periods outside those used for model calibration; and (iii) evaluating the model projection performance with certain criteria. One of the most basic assumptions of this process, that the calibrated model parameters are stationary and can be applied to predict catchment behaviors in the near future, has been widely questioned (Brigode et al., 2013; Broderick et al., 2016; Chiew et al., 2009, 2014; Ciais et al., 2005; Clarke, 2007; Cook et al., 2004; Coron et al., 2012; Deng et al., 2016; Merz et al., 2011; Moore and Wondzell, 2005; Moradkhani et al., 2005, 2012; Pathiraja et al., 2016, 2018; Patil and Stieglitz, 2015; Westra et al., 2014; Xiong et al., 2019; Zhang et al., 2018).

Many previous studies have explored the transferability of stationary parameters to periods with different climatic conditions. They have concluded that hydrological model parameters are sensitive to the climatic conditions of the calibration period (Chiew et al., 2009, 2014; Coron et al., 2012; Merz et al., 2011; Renard et al., 2011; Seiller et al., 2012; Vaze et al., 2010). For instance, Merz et al. (2011) calibrated model parameters using six consecutive 5-year periods between 1976 and 2006 for 273 catchments in Austria and found that the calibrated parameters representing snow and soil moisture processes showed a significant trend in the study area. Other studies have found that degradation in model performance was directly related to the difference in precipitation between the calibration and verification periods (Coron et al., 2012; Vaze et al., 2010). One proposal for managing this problem is to calibrate model parameters in periods with similar climatic conditions to the near future, but future streamflow observations are unavailable. Thus, it is still necessary to reduce the magnitude of performance loss and improve the robustness of the projection performance using calibrated parameters based on the historical records, even though the climatic conditions in the future may be dissimilar to those used for model calibration.

Several recent studies have found that hydrological models with time-varying parameters exhibited a significant improvement in their projection performance compared with those using stationary parameters (Deng et al., 2016, 2018; Westra et al., 2014). The functional method is one of the most promising ways to model time-varying parameters and has proven effective in improving the model projection performance (Guo et al., 2017; Westra et al., 2014; Wright et al., 2015). This method models the time-varying parameter(s) as function(s) of physically based covariates (e.g., a temporal covariate and the Normalized Difference Vegetation Index). Generally, the hydrological model is run with various assumed functions, and the best functional forms of the time-varying parameters are obtained by comparing the evaluation criteria. However, a major challenge for the application of the functional method remains in finding effective information to control the large uncertainties that are linked to the additional parameters describing these regression functions.

The similarity of adjacent catchments has been verified, along with the validity of controlling the estimation uncertainty of model parameters (Bracken et al., 2018; Cha et al., 2016; Cooley et al., 2007; Lima and Lall, 2009; Najafi and Moradkhani, 2014; Sun and Lall, 2015; Sun et al., 2015; Yan and Moradkhani, 2015). The level of similarity of different catchments is known as spatial coherence. For instance, Sun and Lall (2015) used the spatial coherence of trends in annual maximum precipitation in the United States and successfully reduced the parameter estimation uncertainty in their on-site frequency analysis. In general, there are three methods to consider the spatial coherence between different catchments in parameter estimation. The first is no pooling, which means every catchment is modeled independently and all parameters are catchment-specific. The second is complete pooling, which means all parameters are considered to be common across all catchments. The third is the hierarchical Bayesian (HB) framework, also known as partial pooling, in which some parameters are allowed to vary by catchment and some parameters are assumed to be drawn from a common hyper-distribution across the region that consists of different catchments. Among these three approaches, the HB framework has been proven to be the most efficient method to incorporate the spatial coherence to reduce the estimation uncertainty because it has the advantage of shrinking the local parameter toward the common regional mean and including an estimation of its variance or covariance across the catchments (Bracken et al., 2018; Sun and Lall, 2015; Sun et al., 2015).
In the field of hydrological modeling, most preceding studies were focused on no-pooling models that neglect the spatial coherence between catchments (Heuvelmans et al., 2006; Lebecherel et al., 2016; Merz and Bloschl, 2004; Oudin et al., 2008; Singh et al., 2012; Tegegne and Kim, 2018; Xu et al., 2018); little attention has been paid to the HB framework. Thus, we want to fill this gap and explore the applicability of the spatial coherence through the HB framework in hydrological modeling with the time-varying parameters.
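
The difference between no pooling, complete pooling, and partial pooling can be illustrated with a toy normal-normal model. The sketch below is not the GR4J setup used in this paper: the per-catchment estimates, their standard errors, and the regional spread `tau` are made-up values, and `partial_pool` is a hypothetical helper that applies the textbook precision-weighted shrinkage formula with `tau` assumed known.

```python
import numpy as np

def partial_pool(estimates, std_errors, tau):
    """Shrink local estimates toward a regional mean (normal-normal model, tau assumed known)."""
    estimates = np.asarray(estimates, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)
    # Regional mean: average weighted by the inverse of the total variance se^2 + tau^2
    weights = 1.0 / (std_errors**2 + tau**2)
    regional_mean = np.sum(weights * estimates) / np.sum(weights)
    # Weight kept on the local estimate: near 1 when the local estimate is precise
    keep = tau**2 / (tau**2 + std_errors**2)
    pooled = keep * estimates + (1.0 - keep) * regional_mean
    return pooled, regional_mean

# Hypothetical estimates of one parameter in three adjacent catchments
local = np.array([0.8, 1.0, 1.6])
se = np.array([0.2, 0.2, 0.6])

no_pooling = local                                   # each catchment on its own
complete_pooling = np.full_like(local, np.sum(local / se**2) / np.sum(1.0 / se**2))
partial, regional_mean = partial_pool(local, se, tau=0.3)
print(no_pooling, complete_pooling, partial)
```

In this toy setting the third catchment, which has the largest standard error, is pulled furthest toward the regional mean, which is the shrinkage behavior attributed to the HB framework above.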

The objectives of this paper were to (1) verify the effect of the time-varying model parameter scheme on model projection performance and uncertainty analysis compared with stationary model parameters, (2) verify the projection performance of a scheme that considers the spatial coherence of adjacent catchments through the HB framework compared with spatial incoherence, and (3) compare the model projection performance for different climatic transfer schemes.

The rest of the paper is organized as follows. Section 2 outlines the methodology employed in this study, including the differential split-sample test (DSST) for segmenting the historical series, the hydrological model, and the two-level HB framework for incorporating spatial coherence from adjacent catchments. Section 3 presents the information on the study area and data. The results and discussion are described in Sect. 4. Section 5 summarizes the main conclusions of the study.

The methodology is outlined by a flowchart in Fig. 1, and is summarized as
follows:

A temporal parameter transfer scheme is implemented (described in Sect. 2.1) using a classic DSST procedure in which the available data are divided into wet and dry years.

A daily conceptual rainfall–runoff model is used (outlined in Sect. 2.2).

A two-level HB framework is used to incorporate spatial coherence in hydrological modeling (described in Sect. 2.3). The process layer (first level) of the framework models the temporal variation in the model parameters using a time-varying function, while the prior layer (second level) models the spatial coherence of the regression parameters in the time-varying function. Four modeling scenarios with different spatial coherence schemes and one scenario with a stationary scheme for the model parameters are used to evaluate the transferability of hydrological models under contrasting climatic conditions.

Likelihood function and parameter estimation methods are applied (outlined in Sect. 2.4).

Evaluation criteria are used to assess the model performance for the various model scenarios (described in Sect. 2.5).

Flow chart of the methodology for integrating inputs from spatially coherent catchments and temporal variation of model parameters into a hydrological model under contrasting climatic conditions (wet and dry years).

To verify the projection performance of the rainfall–runoff model under contrasting climatic conditions (wet and dry years), a classic DSST using annual rainfall records was adopted.

Two separate steps were needed to implement the DSST. The first step was to define the “dry years”. The method used to define the dry years was adopted from Saft et al. (2015); it is a rigorous identification method that treats autocorrelation in the regression residuals, undertakes global significance testing, and defines the start and end of the droughts individually for each catchment. Saft et al. (2015) tested several algorithms for dry-year delineation, which considered different combinations of dry run length, dry run anomaly, and various boundary criteria, and found that the identified dry years showed only marginal dependence on the algorithm and that the main results were robust to the choice of algorithm. The detailed procedure can be found in Saft et al. (2015) and is summarized as follows.

First, the annual rainfall anomaly was calculated relative to the annual
mean, and the anomaly series was divided by the mean annual rainfall and
smoothed with a 3-year moving window. Second, the first year of the
drought was taken as the start of the first negative 3-year anomaly period.
Third, the exact end date of the dry years was determined through analysis
of the unsmoothed anomaly data from the last negative 3-year anomaly. The
end year was identified as the last year of this 3-year period unless (i)
there was a year with a positive anomaly

In the second step, the wet years were defined as the complement of the dry years in the historical records. A similar approach to define the dry and wet years was used by Fowler et al. (2016).
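
The first two steps of the dry-year delineation, and the definition of wet years as the complement, can be sketched as follows. This is a simplified illustration rather than the full algorithm of Saft et al. (2015): the function names are hypothetical, the rainfall series is synthetic, the longest negative run is taken as the drought, and the end-year refinement based on the unsmoothed anomalies is omitted, so the returned period may include bordering years that the full method would trim.

```python
import numpy as np

def smoothed_anomaly(annual_rain, window=3):
    """Relative rainfall anomaly and its moving-window mean."""
    rain = np.asarray(annual_rain, dtype=float)
    anomaly = (rain - rain.mean()) / rain.mean()
    kernel = np.ones(window) / window
    # mode="valid" keeps full windows only; element i covers years i .. i+window-1
    return anomaly, np.convolve(anomaly, kernel, mode="valid")

def longest_dry_run(years, annual_rain, window=3):
    """Return (start_year, end_year) of the longest run of negative smoothed anomalies, or None."""
    years = np.asarray(years)
    _, smooth = smoothed_anomaly(annual_rain, window)
    best, run_start = None, None
    # A trailing non-negative sentinel closes a run that reaches the end of the record
    for i, value in enumerate(np.append(smooth, 0.0)):
        if value < 0 and run_start is None:
            run_start = i
        elif value >= 0 and run_start is not None:
            if best is None or (i - 1 - run_start) > (best[1] - best[0]):
                best = (run_start, i - 1)
            run_start = None
    if best is None:
        return None
    # First year of the first negative window to last year of the last negative window
    return int(years[best[0]]), int(years[best[1] + window - 1])

# Synthetic record: four wetter years, five drier years, three wetter years
years = list(range(2000, 2012))
rain = [1000] * 4 + [700] * 5 + [1000] * 3
dry_period = longest_dry_run(years, rain)
wet_years = [y for y in years if not (dry_period[0] <= y <= dry_period[1])]
print(dry_period, wet_years)
```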

In the DSST method, the model parameters calibrated in the wet years were
evaluated in the dry years, and vice versa. In addition, criteria (i.e.,
NSE

The hydrological model used in this study is the GR4J (modèle du
Génie Rural à 4 paramètres Journalier), which is a lumped
conceptual rainfall–runoff model (Perrin et al., 2003). The original
version of the GR4J model (Fig. 2) comprised four parameters (Perrin et
al., 2003): production store capacity (

Schematic diagram of the GR4J rainfall–runoff model adopted by
Perrin et al. (2003). In the figure,

The GR4J model is a parsimonious but efficient model. It has been used successfully over a wide range of hydro-climatic conditions around the world, including the crash testing of model performance under contrasting climatic conditions (Coron et al., 2012) and the simulation of runoff to revisit deficiencies caused by insufficient model calibration (Fowler et al., 2016). For example, Fowler et al. (2016) verified that conceptual rainfall–runoff models were more capable under changing climatic conditions than previously thought. These characteristics make the GR4J particularly suitable as a starting point for implementing modifications and/or improving predictive ability under changing climatic conditions.

In this study, various versions were constructed for evaluating the
projection capabilities of models for contrasting climatic conditions (wet
and dry years), and for considering the temporal variation and spatial
coherence of parameter

As described in the literature (Pan et al., 2019; Perrin et al., 2003;
Renard et al., 2011; Westra et al., 2014), parameter

Thus, for any catchment

For a heterogeneous region that is distinctly nonuniform in climatic and
geologic conditions, different catchments within the region typically have
different catchment storage capacities and different values of production
store capacity

In this study, independent Gaussian prior distributions were used for the
amplitude

Five modeling scenarios (Table 1) were carried out to assess the effect of
the spatial coherence on the time-varying function. Different levels of spatial
coherence of

Different spatial coherence scenarios for amplitude


The objective function and parameter inference methods were used to derive the posterior distribution of all unknown quantities, as illustrated below.

For a specific catchment, the model parameters were calibrated to minimize
the following objective function, which was adopted from Coron et al. (2012):

Coron et al. (2012) showed that this objective function performed well.
In this function, the combination of

In the case of multiple catchments, the objective function of the HB
framework was the product of Eq. (3) and the conditional probability of spatial
coherence of regression parameters

The uniform distribution is used as the prior distribution for
hyper-parameters and spatially irrelevant parameters. Meanwhile, spatially
relevant parameters are sampled from the Gaussian distributions. Because the
prior distribution has no impact on the final evaluation of different
scenarios, the prior distributions are not presented in Eq. (5). The likelihood
functions defined in Eqs. (3) and (5) pose a computational challenge because
their dimensionality grows (primarily related to the number of
catchment-specific parameters) with the number of catchments considered. The
unknown quantities, including model parameters (

Five criteria were used to assess the projection performance during the
verification periods.

The first criterion was NSE

The second criterion is the BIAS, one of the most popular indexes for
measuring the deviation between the modeled runoff and the observations;
it is also a component of the objective function in Eq. (3).
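
The first two criteria can be sketched in a few lines. The exact variants used in this study are not reproduced here, so two assumptions are made: `nse` is the standard Nash–Sutcliffe efficiency with an optional, hypothetical `transform` argument (for example `np.sqrt`) for variants computed on transformed flows, and `bias_ratio` is taken as the ratio of total simulated to total observed runoff, which is consistent with a perfect value of 1.

```python
import numpy as np

def nse(obs, sim, transform=None):
    """Nash-Sutcliffe efficiency; 1 is a perfect fit, 0 matches the observed mean."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    if transform is not None:
        # Optional flow transformation applied to both series (assumption)
        obs, sim = transform(obs), transform(sim)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def bias_ratio(obs, sim):
    """Total simulated runoff divided by total observed runoff (assumed definition)."""
    return float(np.sum(sim) / np.sum(obs))

# Synthetic daily runoff (mm/d) and a simulation that overestimates by 10 %
observed = np.array([1.2, 0.8, 3.5, 2.1, 0.6])
simulated = 1.1 * observed
print(nse(observed, simulated), nse(observed, simulated, np.sqrt), bias_ratio(observed, simulated))
```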

The third criterion is the deviance information criterion (DIC), which
was defined by Spiegelhalter et al. (2002). It is a widely used
measure designed for Bayesian model comparison and is a Bayesian
alternative to the standard Akaike information criterion. The DIC value for
a Bayesian scenario is obtained as follows:
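
In its standard form (Spiegelhalter et al., 2002), the DIC is the posterior mean deviance plus an effective number of parameters, where the deviance is minus twice the log-likelihood and the effective number of parameters is the posterior mean deviance minus the deviance at the posterior mean. The sketch below applies this standard definition to a toy Gaussian problem with one unknown mean; the `dic` helper, the data, and the posterior samples are illustrative assumptions and not the likelihood used in this study.

```python
import numpy as np

def dic(log_lik, samples):
    """Return (DIC, p_D) from posterior samples and a log-likelihood function."""
    samples = np.asarray(samples, dtype=float)
    # Deviance D(theta) = -2 log L(theta), evaluated at every posterior sample
    deviances = np.array([-2.0 * log_lik(theta) for theta in samples])
    d_bar = deviances.mean()
    # Deviance evaluated at the posterior mean of the parameters
    d_at_mean = -2.0 * log_lik(samples.mean(axis=0))
    p_d = d_bar - d_at_mean          # effective number of parameters
    return d_bar + p_d, p_d

# Toy problem: Gaussian data with unknown mean and unit variance
rng = np.random.default_rng(0)
y = rng.normal(2.0, 1.0, size=50)

def log_lik(mu):
    return -0.5 * np.sum((y - mu) ** 2) - 0.5 * y.size * np.log(2.0 * np.pi)

# Under a flat prior the posterior of the mean is N(mean(y), 1/n), sampled directly here
posterior = rng.normal(y.mean(), 1.0 / np.sqrt(y.size), size=5000)
dic_value, p_d = dic(log_lik, posterior)
print(dic_value, p_d)
```

With one free parameter, `p_d` should come out close to 1. When scenarios are compared, a lower DIC indicates a better trade-off between fit and complexity.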

The fourth and fifth criteria are the mean annual maximum flow (MaxF,
mm d

To evaluate the model performance, we used daily precipitation (mm d

The attributes of the southeastern Australian catchments are shown in Table 2 and Fig. 3. The IDs of these catchments are 225219 (Glencairn station on the Macalister River: mean annual rainfall, potential evapotranspiration, and runoff are 1106, 1184, and 368 mm, respectively), 405219 (Dohertys station on the Goulburn River: mean annual rainfall, potential evapotranspiration, and runoff are 1171, 1196, and 420 mm, respectively), and 405264 (D/S of Frenchman Ck Jun station on the Big River: mean annual rainfall, potential evapotranspiration, and runoff are 1408, 1160, and 465 mm, respectively). As shown in Fig. 3, these catchments are adjacent to each other. All catchments experienced a severe multiyear drought around the end of the millennium. Saft et al. (2015) identified that the rainfall–runoff relationship in these catchments was altered during the long-term drought.

Comparison of catchment attributes in terms of mean annual rainfall (mm), mean annual evaporation (mm), and mean annual runoff (mm) for 1976–2011.

Locations of study catchments in Victoria, Australia. The catchment IDs are 225219 (Macalister River catchment), 405219 (Goulburn River catchment), and 405264 (Big River catchment).

Results from the DSST were used to assess the model projection performance for five scenarios under contrasting climatic conditions. First, a DSST was conducted in each catchment to divide original records into wet and dry years. Then, the projection performance for the five scenarios and associated parameter uncertainties were evaluated using the criteria described above.

As illustrated in Table 3 and Fig. 4, the drought definition method identified similar dry-year characteristics in the three catchments, with the same drought start (1997) and end (2009) years. The dry period therefore lasted 13 years in every catchment. The mean dry-year anomaly was most severe in the Macalister catchment (225219), with an 11.70 % reduction, while the other two catchments experienced reductions of 11.16 % (405219) and 11.14 % (405264).

Drought identification results for the catchments.


The identified dry years in all catchments. The annual anomaly is defined as a percentage of the mean annual rainfall.

In terms of changes in rainfall, the catchments had an average reduction of 11 % from the wet years to the dry years (Table 3). Meanwhile, these catchments experienced a 26.3 % decrease in runoff during the dry years, which is much more severe than the reduction in rainfall. Similar findings can be derived from a comparison of the runoff coefficients of the different periods; that is, all catchments experienced a decrease in their runoff coefficients during the dry years.
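
The link between the two reductions and the runoff coefficient can be checked with simple arithmetic. The snippet below uses only the regional averages quoted above (11 % and 26.3 %), so the result is a rough regional approximation and not a value reported for any individual catchment.

```python
# Regional-average reductions from wet to dry years, as fractions
rain_reduction = 0.11
runoff_reduction = 0.263

# Runoff coefficient C = runoff / rainfall, so the dry-to-wet ratio of C is
# (1 - runoff_reduction) / (1 - rain_reduction)
coefficient_ratio = (1.0 - runoff_reduction) / (1.0 - rain_reduction)
coefficient_change_pct = (coefficient_ratio - 1.0) * 100.0
print(f"{coefficient_change_pct:.1f} %")
```

Under these averaged figures, the runoff coefficient falls by roughly 17 % between the wet and dry years, which is consistent with a decrease in runoff that outpaces the decrease in rainfall.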

As shown in Figs. 5a, 6a, and 7, the calibrated model parameters
yielded a good simulation performance over the calibrated periods for all
criteria. For example, the mean NSE

NSE

NSE

Long-term simulation BIAS of

Figure 5 shows the NSE

Similarly, Fig. 6 illustrates the NSE

Comparing the DIC results for both DSST schemes in Tables 4 and 5, the
best DIC value is achieved by scenario 3, which incorporates the spatial
coherence of both regression parameters and is the most complex scenario in
the comparison. This finding is consistent with the results obtained by using the
NSE

Comparison of five scenarios in terms of the deviance information criterion (DIC) when model parameters were calibrated in the wet years and verified in the dry years.

Comparison of five scenarios in terms of the deviance information criterion (DIC) when model parameters were calibrated in the dry years and verified in the wet years.

Tables 6 and 7 illustrate the performance of high and low flows during the
verification period in terms of MaxF and MinF estimates for the median
projected streamflows in both DSST schemes. As shown in Table 7, for the
projection of the high-flow part, scenario 3 exhibits the best performance in
all catchments among five scenarios under the scheme of calibrating in the
dry years and verifying in the wet years. For the projection performance in
the other DSST scheme (Table 6), scenario 3 has the best projection
performance in the high-flow part in catchment 225219 and is the second-best
scenario in the other two catchments. This indicates that the incorporation of
spatial coherence of both amplitude

Comparison of the projection performance of median flows during the
verification period associated with the mean annual maximum flow (MaxF,
mm d

Note: (1) the data from 1976 were used for model warm-up to reduce the impact of the initial soil moisture conditions during the calibration period and are not counted in the table; (2) the scenarios with bold values are labeled as the best scenario for projecting the streamflow during the verification periods, and the values from these scenarios have the least absolute percentage difference from the observed values.

Comparison of the projection performance of median flows during the
verification period associated with the mean annual maximum flow (MaxF,
mm d

Note: (1) the data from 1997 were used for model warm-up to reduce the impact of the initial soil moisture conditions during the calibration period and are not counted in the table; (2) the scenarios with bold values are labeled as the best scenario for projecting the streamflow during the verification periods, and the values from these scenarios have the least absolute percentage difference from the observed values.
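
The selection rule in the table notes, in which the best scenario is the one with the least absolute percentage difference from the observed value, can be sketched as follows. The helper names, the flow series, and the scenario values are hypothetical; MaxF is computed as the mean of the annual maxima of daily flow, and MinF is assumed here to be the analogous mean of the annual minima.

```python
import numpy as np

def mean_annual_extreme(years, daily_flow, reducer=np.max):
    """Mean over years of the annual maximum (or minimum) daily flow."""
    years = np.asarray(years)
    daily_flow = np.asarray(daily_flow, dtype=float)
    per_year = [reducer(daily_flow[years == y]) for y in np.unique(years)]
    return float(np.mean(per_year))

def rank_by_abs_pct_diff(observed, simulated):
    """Return the scenario closest to the observed value and all percentage differences."""
    diffs = {name: abs(value - observed) / abs(observed) * 100.0
             for name, value in simulated.items()}
    return min(diffs, key=diffs.get), diffs

# Synthetic two-year record with three daily values per year (mm/d)
years = np.array([2001] * 3 + [2002] * 3)
obs_flow = np.array([1.0, 5.0, 2.0, 3.0, 9.0, 4.0])
obs_maxf = mean_annual_extreme(years, obs_flow, np.max)
obs_minf = mean_annual_extreme(years, obs_flow, np.min)

# Hypothetical MaxF values projected by two scenarios
best, diffs = rank_by_abs_pct_diff(obs_maxf, {"scenario_a": 6.0, "scenario_b": 7.5})
print(obs_maxf, obs_minf, best, diffs)
```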

Figure 7 shows the BIAS estimates for the median of the posterior
distribution of model parameters for all modeling scenarios across all
catchments when transferability between the wet and dry years was examined.
Although BIAS was a component of the objective function (Eq. 3), the
10-year rolling average BIAS still deviated considerably from a value of 1
for all the scenarios in the two DSST schemes. The median estimates of the
posterior distribution in both scenarios performed well in the NSE

Posterior distributions of the regression parameters (

Posterior distributions of the regression parameters (

The uncertainty of the parameters was characterized by the posterior
distribution of the regression parameters and was derived by the MCMC
iteration. As mentioned in Sect. 2.3.2, amplitude

In summary, by combining the results of parameter uncertainty estimation and
model projection performance evaluation, the incorporation of spatial
coherence successfully improved the robustness of the projection performance
in both DSST schemes by controlling the estimation uncertainty of amplitude

In this study, a two-level HB framework was used to incorporate the spatial
coherence of adjacent catchments to improve the hydrological projection
performance of sensitive time-varying parameters for a lumped conceptual
rainfall–runoff model (GR4J) under contrasting climatic conditions. First,
a temporal parameter transfer scheme was implemented, using a DSST procedure
in which the available data were divided into wet and dry years. Then, the
model was calibrated in the wet years and evaluated in the dry years, and
vice versa. In the first level of the proposed HB framework, the most
sensitive parameter in the GR4J model, i.e., the production storage capacity
(

The precipitation, potential evapotranspiration, and streamflow data of the studied catchments in south-eastern Australia are taken from publicly available data (

The supplement related to this article is available online at:

All of the authors helped to conceive and design the analysis. ZP and PL performed the analysis and wrote the paper. SG, JX, JC, and LC contributed to the writing of the paper and made comments.

The authors declare that they have no conflict of interest.

The numerical calculations were done on the supercomputing system in the Supercomputing Center of Wuhan University. The authors would like to thank the editor and anonymous reviewers for their comments, as well as Chong-Yu Xu at the University of Oslo for proofreading an earlier version of the paper, which helped improve the quality of the paper.

This research has been supported by the National Key Research and Development Program (grant no. 2018YFC0407202), the National Natural Science Foundation of China (grant nos. 51861125102 and 51879193), the Natural Science Foundation of Hubei Province (grant no. 2017CFA015), and the Innovation Team in Key Field of the Ministry of Science and Technology (grant no. 2018RA4014).

This paper was edited by Fabrizio Fenicia and reviewed by two anonymous referees.