Two key sources of uncertainty in projections of future runoff for climate
change impact assessments are uncertainty between global climate models
(GCMs) and within a GCM. Within-GCM uncertainty is the variability in GCM
output that occurs when a scenario is run multiple times, with each run
having slightly different, but equally plausible, initial conditions. The
limited number of runs available for each GCM and scenario combination
within the Coupled Model Intercomparison Project phase 3 (CMIP3) and phase 5
(CMIP5) data sets limits the assessment of within-GCM uncertainty. In this second of
two companion papers, the primary aim is to present a proof-of-concept
approximation of within-GCM uncertainty for monthly precipitation and
temperature projections and to assess the impact of within-GCM uncertainty
on modelled runoff for climate change impact assessments. A secondary aim is
to assess the impact of between-GCM uncertainty on modelled runoff. Here we
approximate within-GCM uncertainty by developing non-stationary stochastic
replicates of GCM monthly precipitation and temperature data. These
replicates are input to an off-line hydrologic model to assess the impact of
within-GCM uncertainty on projected annual runoff and reservoir yield. We
adopt stochastic replicates of available GCM runs to approximate within-GCM
uncertainty because large ensembles, hundreds of runs, for a given GCM and
scenario are unavailable, other than the Climate

Comparison between MIROCM(1) GCM estimates of mean and standard deviation of annual precipitation and mean annual temperature of 20C3M data and stochastically generated values for six worldwide catchments. Generated values are based on 100 replicates, each 151 years long.

This study is part of a research project that seeks to enhance our understanding of the uncertainty of future annual river flows, leading to more informed decision-making for the sustainable management of scarce water resources. This is the second of two papers examining the uncertainty of streamflow estimates derived from global climate models (GCMs). In the first paper, McMahon et al. (2015) assessed the adequacy of GCMs from phase 3 of the Coupled Model Intercomparison Project (CMIP3; Meehl et al., 2007) to simulate observed values of mean annual precipitation, standard deviation of annual precipitation, mean annual temperature, monthly patterns of precipitation and temperature, and Köppen climate classification. Five GCMs (HadCM3, MIROCM, MIUB, MPI and MRI; see Table 1 of McMahon et al. (2015) for full GCM names) were selected as better performing GCMs for use in this second paper.

In this paper we address a significant limitation to characterising the uncertainty of future runoff: the lack of sufficient GCM runs of the historical (20C3M) and future (e.g. A1B) scenarios. Modelling historical runoff involves numerous uncertainties (Peel and Blöschl, 2011) including uncertainties in observed input data used to drive the hydrologic model (Andréassian et al., 2004; McMillan et al., 2011), observed data against which the hydrologic model is calibrated (Di Baldassarre and Montanari, 2009; McMillan et al., 2010), the calibration method and objective function adopted (Efstratiadis and Koutsoyiannis, 2010) and the hydrologic model structure itself (Andréassian et al., 2009; Vogel and Sankarasubramanian, 2003). Additional uncertainty is introduced when modelling future runoff through (1) assuming the hydrologic model calibration applies into the future (Chiew et al., 2014), (2) assuming a bias correction for adjusting GCM data developed over the observed period applies into the future and (3) differences in future climate projections between GCMs and within a GCM. Recent investigations into uncertainty introduced at different stages of the model train, from GCM to hydrologic model, for climate change impact assessments include Bosshard et al. (2013), Dobler et al. (2012), Hingray and Saïd (2014), Kay et al. (2009), Lafaysse et al. (2014), Prudhomme and Davies (2009a, b), Steinschneider et al. (2012), Teng et al. (2012) and Woldemeskel et al. (2014).

The uncertainty between GCM projections of future climate can be assessed through analysis of runs from a wide range of GCMs, such as those available from CMIP3 and being collated within the Coupled Model Intercomparison Project phase 5 (CMIP5). Our selection of five better performing GCMs from CMIP3 in the companion paper (McMahon et al., 2015) is an attempt to reduce between-GCM uncertainty by removing poorly performing GCMs from the analysis conducted in this paper. Our primary aim in this proof-of-concept paper is to present an approximation of within-GCM uncertainty, which is the variability in GCM output that occurs when a scenario is run multiple times, with each run having slightly different, but equally plausible, initial conditions. Although the importance of within-GCM uncertainty for climate change impact assessments has been highlighted by Tebaldi and Knutti (2007), Hawkins and Sutton (2009, 2011) and Deser et al. (2012, 2014), to date it has received little attention in the hydrological climate change impact literature. Here we develop an approximation of within-GCM uncertainty and apply it to a climate change impact assessment for future runoff and reservoir yield.

The magnitude of within-GCM uncertainty for a metric like mean annual
precipitation can be assessed directly from GCM output if a large enough
ensemble of runs from a GCM for a given emission scenario is available. The
number of GCM runs required to adequately assess uncertainty depends upon
the metric of interest and the level of confidence adopted. For example, for
a given level of confidence an extreme value metric will require many more
runs than a mean to obtain a reliable estimate. For a more detailed
discussion of this issue see Salas (1992). Currently, large ensembles of
runs from each GCM and scenario are unavailable. In the CMIP3 data set most
GCMs have a single run of a given scenario from which a direct assessment of
within-GCM uncertainty is impossible. In terms of ensemble members, CMIP5 is
an improvement over CMIP3 in that more runs of each scenario are being
reported for each GCM. However, the number of runs per GCM and scenario
combination in CMIP5 is still of the order of 3 to 10, rather than the
hundreds of runs required for adequate estimation of within-GCM uncertainty
of some metrics. The Climate

Previous assessments of the impact of within-GCM uncertainty on runoff have been limited by the lack of available GCM runs. For example, when investigating sources of uncertainty in the climate change impact on hydrology, Chen et al. (2011) were limited to 5 runs with different initial conditions from the MRI GCM. Similarly, Velázquez et al. (2013) were limited to five runs from one GCM and three runs from a second in their comparison of the uncertainty due to hydrologic models and within-GCM uncertainty. Prudhomme and Davies (2009a) sought to overcome the limited number of GCM runs by introducing a seasonal block-resampling technique to estimate natural climate variability via 100 bootstrap replicates of observed and GCM time series. In Prudhomme and Davies (2009b) they applied seasonal block-resampling to a 30-year baseline period and future period to assess whether climate change impacts were significantly different to baseline climate variability. However, seasonal block-resampling is unable to address inter-decadal variability, as noted by Kay et al. (2009), or periods with significant trend as the bootstrap replicates will scramble any inter-decadal variability or trend. Finally, Hingray and Saïd (2014) and Lafaysse et al. (2014) adopted a stochastic approach whereby they generated 100 stochastic replicates from each of 6 statistical downscaling models for each of 11 runs from 5 GCMs (Hingray and Saïd, 2014), or 12 runs from 6 GCMs (Lafaysse et al., 2014). They used this multi-model ensemble of stochastic replicates to investigate the magnitude of within- and between-GCM uncertainty for the Durance catchment in France.

In this proof-of-concept paper, we develop an approximation of within-GCM uncertainty using non-stationary stochastic replicates of GCM monthly precipitation and temperature data that seeks to preserve any inter-decadal variability and trend. Unlike Hingray and Saïd (2014) and Lafaysse et al. (2014), whose replicates were produced by the statistical downscaling model, here we stochastically replicate the original GCM runs prior to downscaling. Estimating uncertainty in a time-series metric via stochastic modelling of a time series is standard hydrologic practice (Hipel and McLeod, 1994). A stochastic model is fitted to the time series of interest and an ensemble of time-series replicates with the same stochastic properties as the original series is generated. The metric of interest is calculated for each ensemble member and the metric uncertainty is estimated from the distribution of metric values. In this paper we stochastically replicate the GCM output data, then use an ensemble of stochastic replicates as input to an off-line hydrologic model to estimate an ensemble of future runoff projections, from which we estimate the variability in mean and variance of annual runoff. Finally, the ensemble of future runoff projections is used to investigate the impact of within- and between-GCM uncertainty on future reservoir yield.
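The general procedure of fitting a stochastic model, generating an ensemble of replicates and reading the metric uncertainty from the distribution of metric values can be illustrated with a minimal sketch. It assumes a stationary Gaussian AR(1) model and a synthetic annual series; the function names (`fit_ar1`, `generate_ar1`), the parameter values and the choice of the standard deviation as the metric are illustrative assumptions, not the models or data used in this paper.

```python
import numpy as np

def fit_ar1(x):
    """Estimate mean, standard deviation and lag-1 autocorrelation of a series."""
    x = np.asarray(x, dtype=float)
    mean, std = x.mean(), x.std(ddof=1)
    z = x - mean
    rho = np.sum(z[1:] * z[:-1]) / np.sum(z * z)
    return mean, std, rho

def generate_ar1(mean, std, rho, n_steps, n_replicates, rng):
    """Generate replicates from a stationary Gaussian AR(1) model."""
    out = np.empty((n_replicates, n_steps))
    z = rng.standard_normal(n_replicates)      # start in the stationary distribution
    innovation_scale = np.sqrt(1.0 - rho ** 2)
    for t in range(n_steps):
        z = rho * z + innovation_scale * rng.standard_normal(n_replicates)
        out[:, t] = mean + std * z
    return out

rng = np.random.default_rng(42)

# Hypothetical "original" series: 150 synthetic annual values (not GCM data).
original = generate_ar1(1000.0, 200.0, 0.3, 150, 1, rng)[0]

# Fit the model, generate 100 replicates of the same length, compute the metric.
mean, std, rho = fit_ar1(original)
replicates = generate_ar1(mean, std, rho, original.size, 100, rng)
metric = replicates.std(axis=1, ddof=1)        # metric of interest per replicate

# The spread of the metric across replicates approximates its uncertainty.
lower, median, upper = np.percentile(metric, [5, 50, 95])
print(f"standard deviation: {median:.0f} (5-95 % range {lower:.0f} to {upper:.0f})")
```

Any other time-series metric can be substituted for the standard deviation by changing the line that computes `metric`.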

In this paper we model runoff in an off-line hydrologic model rather than adopt GCM generated runoff. Arora (2001) demonstrated the quality of GCM runoff mainly depends on the quality of GCM precipitation, with any bias in precipitation amplified in the resulting runoff. In the companion paper (McMahon et al., 2015), we assessed GCM bias in reproducing observed precipitation conditions and found substantial biases for all GCMs; thus, we would expect significant bias in runoff generated by a GCM. Furthermore, Sperna Weiland et al. (2012) found that runoff estimates from an external hydrologic model generally outperformed GCM runoff estimates. However, Sperna Weiland et al. (2012) noted that when the GCM Land Surface Scheme is specifically tuned to reproduce observed runoff and a routing scheme is added then GCM runoff becomes more acceptable. We also use the terms streamflow and runoff interchangeably and adopt depth (in millimetres) as a measure of flow rather than a volume unit.

Following this introduction, in Sect. 2 we outline the approximation methodology (ensemble empirical mode decomposition (EEMD), stochastic data generation, quantile–quantile bias correction of precipitation and temperature, precipitation–evapotranspiration–runoff modelling and uncertainty in reservoir yield) and related literature. We test our stochastic within-GCM uncertainty approximation for the largest ensemble of GCM runs in the CMIP3 data set for a given GCM and scenario in Sect. 3. In Sect. 4 results of applying the methodology to output from five GCMs identified in the companion paper (McMahon et al., 2015) are presented and discussed. Conclusions from the analysis and discussion are presented in Sect. 5. Further details about the precipitation–evapotranspiration–runoff model, source code and example input and output are provided in the Supplement.

The methodology to approximate within-GCM uncertainty and assess the impact of within- and between-GCM uncertainty on future runoff and reservoir yield is shown in Fig. 1. Five better performing GCMs were identified in the companion paper for use in this paper through a literature review and assessment of how well CMIP3 GCMs reproduced observed mean annual precipitation, annual temperature and average monthly precipitation and temperature at GCM grid cell scales (McMahon et al., 2015). The five GCMs identified were HadCM3, MIROCM(1), MIUB(1), MPI(1) and MRI(3), where the number in brackets refers to the run number for that GCM in the CMIP3 data set (see McMahon et al. (2015), Table 1 for full GCM names). As part of the analysis in McMahon et al. (2015) catchment average values of concurrent monthly precipitation and temperature for the 20C3M and A1B emissions scenarios were extracted from each GCM in the CMIP3 data set. The catchment average was calculated for each catchment and GCM combination by determining the proportion of catchment area associated with each GCM grid cell and performing an area weighted average of the GCM data for each month. Catchment average precipitation and temperature from the five better performing GCMs are used throughout this paper.
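The area-weighted catchment average described above can be sketched as follows. The function name `catchment_average`, the three grid cells and their area proportions are hypothetical and serve only to illustrate the arithmetic.

```python
import numpy as np

def catchment_average(grid_values, area_fractions):
    """Area-weighted catchment average for each month.

    grid_values: array of shape (n_months, n_cells) with GCM grid cell values.
    area_fractions: proportion of catchment area associated with each grid cell.
    """
    values = np.asarray(grid_values, dtype=float)
    weights = np.asarray(area_fractions, dtype=float)
    if values.ndim != 2 or values.shape[1] != weights.size:
        raise ValueError("expected one area fraction per grid cell")
    weights = weights / weights.sum()          # guard against rounding in the fractions
    return values @ weights

# Hypothetical catchment straddling three GCM grid cells.
fractions = [0.5, 0.3, 0.2]
monthly_precipitation = [[100.0, 80.0, 60.0],  # month 1, one value per grid cell (mm)
                         [20.0, 40.0, 10.0]]   # month 2
average = catchment_average(monthly_precipitation, fractions)
print(average)
```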

Outline of process to approximate within-GCM uncertainty of future runoff and reservoir yield. The companion paper is McMahon et al. (2015).

An ideal assessment of within-GCM uncertainty would involve analysis of hundreds of runs of a single GCM for a given scenario with each run having slightly different, but equally plausible, initial conditions. Each run in this ideal ensemble would have a different sequence of monthly values and a different overall trend. The extent to which the monthly sequence and overall trend differ from one run to the next represents the within-GCM uncertainty. In this paper we do not seek to approximate the overall trend, as this information is best provided by a GCM responding to an emissions scenario. Here we approximate differences in the monthly sequence around the trend by using stochastic data generation. To achieve this we de-trend the catchment average GCM data, stochastically replicate the de-trended series and add the trend to the stochastic data to form a stochastic replicate of the GCM data for the entire period of GCM record. In this way we approximate the uncertainty around the overall trend, but not the uncertainty in the trend. Therefore, the approximation presented here represents an underestimate of the true within-GCM uncertainty as the trends used are restricted to those available in GCM runs in the CMIP3 data set. This stochastic methodology is a temporary solution for approximating within-GCM uncertainty until sufficient GCM runs become available to directly estimate within-GCM uncertainty from a large ensemble of GCM runs.
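The de-trend, replicate and re-add sequence can be sketched in a few lines. This is only an illustration under stated assumptions: a quadratic polynomial stands in for the EEMD residual used in this paper, a Gaussian AR(1) model stands in for the stochastic generator, and the input series is synthetic. The names `split_trend` and `replicate_about_trend` are hypothetical.

```python
import numpy as np

def split_trend(series, degree=2):
    """Stand-in de-trending: polynomial fit (the paper uses the EEMD residual)."""
    t = np.arange(series.size, dtype=float)
    trend = np.polyval(np.polyfit(t, series, degree), t)
    return trend, series - trend

def replicate_about_trend(series, n_replicates, rng, degree=2):
    """Replicate the de-trended series with AR(1), then add the trend back."""
    series = np.asarray(series, dtype=float)
    trend, detrended = split_trend(series, degree)
    std = detrended.std(ddof=1)
    rho = np.sum(detrended[1:] * detrended[:-1]) / np.sum(detrended ** 2)
    z = rng.standard_normal(n_replicates)
    replicates = np.empty((n_replicates, series.size))
    for t in range(series.size):
        z = rho * z + np.sqrt(1.0 - rho ** 2) * rng.standard_normal(n_replicates)
        replicates[:, t] = trend[t] + std * z  # same trend for every replicate
    return trend, replicates

rng = np.random.default_rng(7)

# Synthetic annual temperature-like series with a non-linear upward trend.
years = np.arange(231)
gcm_series = 15.0 + 1.0e-4 * years ** 2 + rng.normal(0.0, 0.5, years.size)

trend, replicates = replicate_about_trend(gcm_series, 100, rng)
ensemble_mean = replicates.mean(axis=0)
```

Because every replicate shares the single trend of the original run, the ensemble varies around the trend but not in the trend, which is why the approach underestimates the true within-GCM uncertainty.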

The procedure adopted here to approximate within-GCM uncertainty for a
catchment consists of the following steps (see Fig. 1):

(1) De-trend the 20C3M and A1B catchment average GCM monthly precipitation and temperature data using EEMD. EEMD also allows any low-frequency signals in the time series to be identified.

(2) Generate stochastically at a monthly time step

(3) Add the appropriate trend to the time series for each replicate of monthly precipitation and temperature.

(4) Bias correct both the precipitation and the temperature time series using the quantile–quantile procedure.

(5) Calibrate the precipitation–evapotranspiration–runoff monthly model (PERM) for each catchment using observed precipitation, temperature and runoff data.

(6) Model runoff using PERM and the bias-corrected stochastic replicates of GCM monthly precipitation and temperature.

(7) Compute mean annual runoff (MAR), standard deviation of annual runoff (SDR), the lag-1 serial correlation of annual runoff (lag-1) and hypothetical reservoir yield for each replicate.

(8) Estimate the within- and between-GCM uncertainty in MAR, SDR and lag-1 and hypothetical reservoir yield based on the 100 replicates.
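The computation of MAR, SDR and the lag-1 serial correlation of annual runoff for each replicate can be sketched as below. The sketch assumes monthly runoff depths in millimetres arranged as one row per replicate and whole years of data; the function name `annual_runoff_metrics` and the example values are hypothetical. Reservoir yield is omitted because it depends on the Gould–Dincer Gamma procedure described later.

```python
import numpy as np

def annual_runoff_metrics(monthly_runoff):
    """Return MAR, SDR and lag-1 serial correlation for each replicate.

    monthly_runoff: array of shape (n_replicates, n_years * 12), depths in mm.
    """
    monthly = np.asarray(monthly_runoff, dtype=float)
    n_replicates, n_months = monthly.shape
    if n_months % 12 != 0:
        raise ValueError("expected whole years of monthly data")
    annual = monthly.reshape(n_replicates, n_months // 12, 12).sum(axis=2)
    mar = annual.mean(axis=1)                   # mean annual runoff
    sdr = annual.std(axis=1, ddof=1)            # standard deviation of annual runoff
    dev = annual - mar[:, None]
    lag1 = np.sum(dev[:, 1:] * dev[:, :-1], axis=1) / np.sum(dev ** 2, axis=1)
    return {"MAR": mar, "SDR": sdr, "lag1": lag1}

# Hypothetical single replicate: three years with 10, 20 and 30 mm every month.
example = np.repeat([10.0, 20.0, 30.0], 12)[None, :]
metrics = annual_runoff_metrics(example)
print(metrics)
```

Applied to all 100 replicates for a GCM, the spread of each returned array gives the approximation of within-GCM uncertainty for that metric.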

GCM projections of precipitation and temperature are non-stationary in the mean, whereas existing stochastic data generation techniques generally deal with stationary data. To apply existing stochastic methods we therefore de-trend the GCM monthly precipitation and temperature data using EEMD.

The original empirical mode decomposition (EMD) algorithm, introduced by Huang et al. (1998), is an adaptive spectral analysis technique that is robust when applied to non-linear and non-stationary data. EMD decomposes a time series into a set of intrinsic mode functions (IMFs) and a residual. Each IMF is a zero-mean fluctuation in which the frequency and amplitude may vary within a given IMF. Subsequent IMFs represent progressively lower frequency fluctuations. The EMD residual captures any trend in a time series, which may be an unresolved low-frequency fluctuation with an average period longer than the period of record or a linear or non-linear trend. The nature of the EMD residual is not assumed prior to running the algorithm; rather, it is a data-driven output. More recently, Wu and Huang (2009) proposed ensemble EMD (EEMD), a noise-assisted data analysis procedure, as an improvement over the original EMD. In EEMD, an ensemble of EMD trials is obtained by adding white noise of finite amplitude to the time series prior to each EMD run. The IMFs and residual from each trial are grouped by IMF order into ensembles and the average of each IMF group and the average residual yield the EEMD result. Because the white noise is different for each EMD trial, during averaging the noise cancels out as the ensemble size increases. The purpose of the noise is to change the ordering of local maxima and minima within the time series, thus generating a different EMD outcome in each trial. Details are given in Wu and Huang (2009) and an application to the Southern Oscillation Index is presented by Peel et al. (2011b) and to Australian monthly rainfall and temperature by Srikanthan et al. (2011).

For this analysis the relevant features of EEMD are the residual, which
represents the time-series trend, and any low-frequency signal in the GCM
data. Some GCMs reproduce features of the El Niño–Southern Oscillation
(ENSO) and associated low-frequency variability (van Oldenborgh et al.,
2005). For GCMs with an ENSO signal in precipitation we would like to
maintain this information in the stochastic replicates. To identify low-frequency signals in GCM data we follow Wu and Huang (2004) and compare each
set of EEMD results against a white noise model. Low-frequency IMFs (average
period

EEMD was applied to 20C3M and A1B precipitation and temperature data and the
residual (trend) identified. For temperature data all IMFs are summed
together to form a de-trended time series ready for stochastic replication.
For precipitation data, where a low-frequency signal is not present, all IMFs
are summed together to form a de-trended time series ready for stochastic
replication. Where a low-frequency precipitation signal is identified, all
IMFs with an average period

In this EEMD analysis we use a rational spline EMD (Pegram et al., 2008) with
a tension parameter

In this step we approximate uncertainty around the GCM trend by generating
stochastic replicates of de-trended GCM catchment average time series of
concurrent monthly precipitation and temperature. In order to preserve any
cross-correlation between the precipitation and temperature series and their
auto-correlations, the Matalas (1967) multi-site stochastic data generation
procedure was adopted. In order to preserve any low-frequency precipitation
information, the generation procedure also needs to be able to simulate both
high- and low-frequency time series. To achieve this we adapt the method of
McMahon et al. (2008) who used EMD to decompose 6-month precipitation
data into intra- and inter-decadal components, replicated each component
separately, and then combined the component replicates to form the
6-month precipitation replicate. In this way their stochastic replicates
were able to reproduce observed multi-year dry periods. Replicating intra-
and inter-decadal components separately was possible in McMahon et al. (2008)
as IMFs from EMD, and EEMD, are orthogonal to each other. In this
paper we use EEMD to identify any low-frequency component (

The first step in the data generation process is to remove the trends (one
for precipitation and one for temperature) identified through EEMD analysis
in Sect. 2.2 from the monthly precipitation and temperature time series.
If the GCM precipitation does not contain a low-frequency component then
there are two separate time series to replicate concurrently: (1) the
de-trended temperature (sum of EEMD IMFs), and (2) the de-trended
precipitation (sum of EEMD IMFs). If GCM precipitation does contain a low-frequency component then the de-trended precipitation is divided into a high-frequency component (sum of EEMD IMFs with average period

For sites without a low-frequency precipitation component, the following
lag-1 auto-regressive, AR(1), model is appropriate:

To take into account the skewness in a time series,

The matrices

For sites with a low-frequency precipitation component, an AR(2) model is
used to incorporate the low-frequency component. A general multi-site AR(2)
model takes the following form:

Because of problems with inverting matrices that are not positive
semi-definite, and because only the low-frequency precipitation is AR(2), a
contemporaneous form of the model in Eqs. (12) and (13) is used (Hipel and McLeod, 1994):

Matrix

As mentioned above, to ensure the generated monthly precipitation data
preserved the annual characteristics, the generated monthly precipitation
data were nested in an annual AR(1) model.
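To illustrate how a lag-1 multi-site generator preserves both the cross-correlation between two series and their auto-correlations, the following sketch uses the standard textbook form of the Matalas (1967) model, X_t = A X_{t-1} + B e_t, with A = M1 M0^{-1} and B B^T = M0 - M1 M0^{-1} M1^T, where M0 and M1 are the lag-0 and lag-1 correlation matrices. It is not the authors' implementation: the skewness treatment, the AR(2) low-frequency component and the nesting in an annual model are omitted, the input series are synthetic, and all function names and parameter values are assumptions.

```python
import numpy as np

def standardise(x):
    """Convert each column to zero mean and unit variance."""
    return (x - x.mean(axis=0)) / x.std(axis=0)

def fit_lag1_multisite(standardised):
    """Estimate A and B of X_t = A X_{t-1} + B e_t from standardised series."""
    x = np.asarray(standardised, dtype=float)
    n = x.shape[0]
    m0 = x.T @ x / n                           # lag-0 correlation matrix
    m1 = x[1:].T @ x[:-1] / (n - 1)            # lag-1 correlation matrix
    a = m1 @ np.linalg.inv(m0)
    # Cholesky raises LinAlgError if the matrix is not positive definite.
    b = np.linalg.cholesky(m0 - a @ m1.T)
    return a, b, m0, m1

def generate_lag1_multisite(a, b, n_steps, rng, warm_up=100):
    """Generate one replicate; warm-up steps remove the zero initial state."""
    k = a.shape[0]
    state = np.zeros(k)
    out = np.empty((n_steps, k))
    for t in range(-warm_up, n_steps):
        state = a @ state + b @ rng.standard_normal(k)
        if t >= 0:
            out[t] = state
    return out

rng = np.random.default_rng(3)

# Synthetic stand-in for de-trended precipitation (column 0) and temperature
# (column 1), negatively cross-correlated; the "true" parameters are hypothetical.
true_a = np.array([[0.3, 0.0], [0.1, 0.6]])
true_b = np.linalg.cholesky(np.array([[1.0, -0.4], [-0.4, 1.0]]))
source = standardise(generate_lag1_multisite(true_a, true_b, 5000, rng))

a, b, m0, m1 = fit_lag1_multisite(source)
generated = generate_lag1_multisite(a, b, 20000, rng)

# Compare correlation structure of the generated and source series.
gen = standardise(generated)
gen_m0 = gen.T @ gen / gen.shape[0]
gen_m1 = gen[1:].T @ gen[:-1] / (gen.shape[0] - 1)
print(np.round(m0, 2), np.round(gen_m0, 2))
```

The Cholesky step fails when the estimated matrix is not positive definite, which is one practical form of the matrix problems mentioned above.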

Comparison between MIROCM(1) GCM estimates of mean and standard deviation of monthly precipitation and mean monthly temperature of 20C3M data and stochastically generated values for catchment 6304. Generated values are based on 100 replicates, each 151 years long.

The stochastic model was tested by applying the above procedure to monthly
precipitation and temperature data for 20C3M from the MIROCM GCM after the
data were subjected to EEMD analysis. Table 1 summarises the performance of
the stochastic procedure to replicate annual data for six catchments
covering a range of climate types worldwide. Five of the catchments were
modelled by an AR(1) process, whereas catchment 6304 required an AR(2) model
because it exhibited a low-frequency precipitation component. Table 2
summarises the performance of the stochastic procedure to replicate monthly
data for catchment 6304. Overall, the stochastic model performed
satisfactorily at the monthly and annual timescales. As a general rule one
would expect the value of the input parameters (GCM in this study) to be
within

Prior to using GCM data, or stochastic replicates of GCM data, in a climate change impact assessment, any bias between GCM and observed conditions needs to be corrected. The extent of bias in the GCM precipitation and temperature data is reported in the companion paper (McMahon et al., 2015). For example, the mean annual precipitation (MAP) data for MIUB(3) compared with CRU MAP data at the GCM grid scale exhibit a slope of 0.69 on logarithmic scales, which indicates the GCM overestimates low MAP and underestimates high MAP. Mean annual temperatures are much less biased and require only a small amount of bias correction.

Ehret et al. (2012) presented a detailed review of bias correction and discussed the associated assumptions and implications of applying bias correction to GCM or regional climate model data. Many procedures are available for bias correction, with techniques falling into two categories: dynamical downscaling and statistical downscaling. Dynamical downscaling procedures are sophisticated and resource intensive (Tisseuil et al., 2010) and are impractical for applying to globally distributed catchments and a range of GCMs as proposed in this study. In keeping with the proof-of-concept nature of this paper we adopt a simple empirical-statistical downscaling and error correction approach that is appropriate for bias correcting catchment average monthly (not daily) GCM outputs for input into a lumped (not spatially distributed) hydrologic model. We did not adopt the delta change method, also known simply as daily scaling (Chiew, 2010), where the observed series is scaled by the relative difference between future and baseline conditions, as delta change would not make full use of the re-ordering of precipitation and temperature events provided by the stochastic replicates. Rather, we adopted quantile–quantile bias correction, or quantile mapping, as discussed in Themeßl et al. (2012) and Bárdossy and Pegram (2011). The basis of the quantile–quantile bias correction is a comparison of the empirical cumulative distribution functions (ECDF) of the observed data and the GCM data for a common period. Here the common period is the observed catchment record and the concurrent period of GCM data from the 20C3M scenario. The difference between observed and GCM ECDFs for a given value provides the bias correction. Here we also adopt the frequency adaptation method discussed in Themeßl et al. (2012) for when the GCM series has a higher frequency of zero values than the observed series.
The issue of new extremes, values outside the range of the GCM and observed data during the period in which the bias correction is established, was also investigated by Themeßl et al. (2012). We adopt option QMv1a of Themeßl et al. (2012), which takes the bias correction at the highest (lowest) quantile and applies that correction to all new upper (lower) extremes. In our analysis we establish and apply a bias correction for each calendar month (12 corrections in all), rather than a single correction for the whole of record at each catchment.
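A minimal sketch of a quantile–quantile correction established separately for each calendar month is given below. It assumes an additive correction read from the difference between observed and GCM empirical quantiles, series that start in January and span whole years, and continuous data without ties. The frequency adaptation for zero precipitation values is not implemented. Values beyond the calibration range receive the correction of the highest or lowest quantile, in the manner described for option QMv1a, which here follows from the end-value behaviour of `numpy.interp`. All function names and the synthetic temperature-like data are hypothetical.

```python
import numpy as np

def fit_correction(obs, gcm):
    """Correction (observed minus GCM quantile) at each sorted GCM value."""
    gcm_sorted = np.sort(np.asarray(gcm, dtype=float))
    probs = np.linspace(0.0, 1.0, gcm_sorted.size)
    obs_quantiles = np.quantile(np.asarray(obs, dtype=float), probs)
    return gcm_sorted, obs_quantiles - gcm_sorted

def apply_correction(values, gcm_sorted, correction):
    """Add the interpolated correction; np.interp holds the end corrections
    constant outside the calibration range (new extremes)."""
    values = np.asarray(values, dtype=float)
    return values + np.interp(values, gcm_sorted, correction)

def fit_monthly(obs, gcm):
    """One correction per calendar month (series assumed to start in January)."""
    return [fit_correction(obs[m::12], gcm[m::12]) for m in range(12)]

def apply_monthly(series, corrections):
    series = np.asarray(series, dtype=float)
    corrected = np.empty_like(series)
    for m, (gcm_sorted, correction) in enumerate(corrections):
        corrected[m::12] = apply_correction(series[m::12], gcm_sorted, correction)
    return corrected

rng = np.random.default_rng(1)
n_years = 40
season = 8.0 * np.sin(2.0 * np.pi * np.arange(12) / 12.0)

# Synthetic observed and GCM monthly temperatures for a common period.
obs = 15.0 + np.tile(season, n_years) + rng.normal(0.0, 3.0, 12 * n_years)
gcm = 17.0 + 0.8 * np.tile(season, n_years) + rng.normal(0.0, 2.0, 12 * n_years)

corrections = fit_monthly(obs, gcm)
corrected = apply_monthly(gcm, corrections)

# A January value 5 units above the calibration maximum (a new extreme).
jan_sorted, jan_correction = corrections[0]
new_extreme = apply_correction([jan_sorted[-1] + 5.0], jan_sorted, jan_correction)
```

In an impact assessment the corrections fitted over the common period would then be applied to each stochastic replicate through `apply_monthly`.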

An assumption of using this bias correction is that the correction applies
into the future under different conditions. This assumption is supported by
Teutschbein and Seibert (2013) who found the quantile–quantile method
performed best out of six alternate bias corrections in differential split
sample tests for non-stationary conditions. The quantile–quantile bias
correction is applied to the precipitation and temperature stochastic
replicates (100 replicates of

To convert GCM monthly precipitation and temperature into runoff, PERM was developed specifically to meet the requirements for hydrologic modelling in this project. PERM is a simple lumped (not spatially distributed) conceptual precipitation–runoff model run on a monthly time step with five parameters to be optimised. The time step was dictated by the availability of monthly streamflow data and concurrent precipitation and temperature data. Further details about the precipitation–evapotranspiration–runoff model, source code and example input and output are provided in the Supplement.

Structure of the monthly conceptual precipitation–evapotranspiration–runoff model (PERM) where the five calibration parameters are highlighted in bold.

Locations of the initial 699 catchments and the final sub-set of 17 catchments.

The structure of PERM is shown in Fig. 2 with the parameters to be
calibrated highlighted in bold. As Fig. 2 indicates, monthly
precipitation is either added to the interception store (if the monthly mean
daily temperature is

Two other hydrologic processes – impervious area runoff and deep recharge – were considered for inclusion in PERM. The inclusion of impervious area was considered unnecessary. With respect to deep seepage, the reviews of Petheram et al. (2002), Scanlon et al. (2006) and Crosbie et al. (2010) suggest the maximum effect could be, on average, equivalent to 5 % of the long-term average annual precipitation. From the results of these reviews and taking into account the model time step, the available data and the fact that the parameters in PERM are calibrated, it was concluded that incorporating deep seepage would yield little benefit to the modelling exercise.

PERM is run on a monthly time step and calibrated against observed annual
runoff. Details of the calibration are set out in the Supplement.
In summary, an objective function, defined as the sum of squared
differences between the estimated and observed annual runoff, was minimised
with penalties applied to the objective function to ensure the calibrated
model approximately reproduced the mean and coefficient of variation of
observed annual runoff. An automatic pattern search optimisation method was
used to calibrate the model (Hooke and Jeeves, 1961; Monro, 1971) with
10 different parameter sets used as starting points to increase the likelihood
of finding the global optimum of parameter values. A

Details of the 17 selected catchments.

An objective of our study is to examine the within-GCM uncertainty in runoff
estimated from GCM projections of precipitation and temperature. To examine
this uncertainty we need to minimise any uncertainty in future runoff due to
poor hydrologic model calibration. Therefore, a sub-set of the
699 catchments that exhibited minimum error as a result of the calibration
process was selected for further analysis. Several criteria were used to assess
the adequacy of the PERM calibration for selecting the catchments. These
criteria included the following: the annual Nash–Sutcliffe efficiency (NSE) (Nash and
Sutcliffe, 1970), between observed and modelled runoffs, was

Details of the modelling performance of PERM for the 699 catchments are
presented in the Supplement. For the 17 selected
catchments the difference between the average modelled and observed MAR is

A key assumption of using PERM, or any hydrologic model, to model future runoff is that the calibrated parameters are appropriate for the future climatic conditions. Where future climatic conditions are similar to the observed calibration period, then this assumption is likely to hold. If climatic conditions differ from the calibration period, then there is no evidence to support this assumption. However, in terms of the analysis conducted in the next section this assumption is a pragmatic one that may well affect the bias of future runoffs but should have less impact on the range of uncertainty.

The 17 catchments modelled by PERM are unregulated catchments and do not
have an existing reservoir on which to base our analysis. Therefore, we need
to assume a hypothetical reservoir for each catchment. Many procedures exist
to estimate reservoir yield from a hypothetical storage (see McMahon and
Adeloye, 2005). For the purposes of this analysis we require a method that
is simple to apply as there are 100 replicates of future runoff generated by
PERM from 5 GCMs for the 17 selected catchments. Here we adopt the
Gould–Dincer Gamma (G-DG) procedure for estimating reservoir yield, which is
defined as (McMahon and Adeloye, 2005; Petheram et al., 2008)

Testing our stochastic approximation of within-GCM uncertainty requires multiple runs from a single GCM for a given scenario from which to estimate within-GCM uncertainty and compare against our stochastic results. In the CMIP3 data set the Community Climate System Model (CCSM) GCM has the most runs (seven) for the 20C3M and A1B scenarios. In this section we test the ability of our stochastic methodology to approximate within-GCM uncertainty for the CCSM GCM using the seven available runs for the period 1870–2100 (20C3M and A1B emissions scenarios).

A comparison of within-GCM uncertainty based on seven runs from the CCSM GCM
and the stochastic approximation of within-GCM uncertainty for (a) annual
precipitation and (b) annual temperature for the Herbert River at Gleneagle,
Australia,
is shown in Fig. 4. The CCSM runs and stochastic replicates presented in
Fig. 4 are not bias corrected. In each plot the maximum, median and
minimum annual value for a given year are shown for the seven CCSM runs and
are compared with the maximum, median and minimum of the 700 (7

Within-GCM uncertainty for the Herbert River at Gleneagle
based on seven runs from the CCSM GCM compared with the stochastic
approximation of within-GCM uncertainty for un-bias corrected

In this section we present and discuss results from the methodology described in the previous section to approximate within-GCM uncertainty of precipitation and temperature from five GCMs and assess the consequent impact of these uncertainties on estimated runoff and reservoir yield at 17 catchments for two 30-year periods – 1965–1994 (20C3M emissions scenario) and 2015–2044 (A1B emissions scenario).

Box plots of 30-year mean annual

To assist interpretation of within- and between-GCM uncertainty results for 17 catchments and 5 GCMs subsequently presented in tables and figures, we present results for an example catchment, the Herbert River at Gleneagle in Australia, in Fig. 5. The box plots of MAP (Fig. 5a) and MAT (Fig. 5b) are presented for two 30-year periods for each GCM. These box plots represent our approximation of within-GCM uncertainty of MAP and MAT. The box represents the inter-quartile range of MAP (MAT) from the 100 bias-corrected stochastic replicates of GCM precipitation (temperature). The median MAP (MAT) is represented by the bar across the box and the box-plot whiskers represent the maximum and minimum MAP (MAT) from the 100 replicates. The range of within-GCM uncertainty of MAP (Fig. 5a) is similar for all GCMs except MIROCM(1), where the inter-quartile and maximum–minimum range are approximately 50 % larger. The range of within-GCM uncertainty of MAT (Fig. 5b) is similar for all GCMs.
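The box-plot construction described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `box_plot_summary` and the synthetic MAP values are assumptions introduced here. The whiskers follow the text (maximum and minimum of the replicates), not the 1.5 × IQR convention used by many plotting libraries.

```python
import numpy as np

def box_plot_summary(replicate_means):
    """Summarise replicate 30-year means as drawn in the box plots:
    whiskers at the minimum and maximum, box at the inter-quartile range,
    bar at the median. (Hypothetical helper, not from the paper.)"""
    values = np.asarray(replicate_means, dtype=float)
    q1, median, q3 = np.percentile(values, [25, 50, 75])
    return {
        "min": float(values.min()),
        "q1": float(q1),
        "median": float(median),
        "q3": float(q3),
        "max": float(values.max()),
    }

# Synthetic stand-in for 100 replicate MAP values (mm/yr); NOT data from the paper.
rng = np.random.default_rng(0)
synthetic_map = rng.normal(loc=1000.0, scale=40.0, size=100)
summary = box_plot_summary(synthetic_map)
print(summary)
```

Applying the same function to each GCM and each 30-year period would give the values needed to draw one panel of such a figure.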

The box plots in Fig. 5 can also be used to assess between-GCM uncertainty through differences between GCMs in the range of within-GCM uncertainty and differences in the direction of change between 30-year-period box plots. All GCMs have an increasing trend in MAP over time for this catchment except HadCM3 (Fig. 5a), whereas all GCMs show a similar increasing trend in MAT over time (Fig. 5b).

Also shown in Fig. 5 is a Raw symbol plotted next to each box plot. These MAP and MAT values are calculated from bias-corrected original CMIP3 GCM runs and are the only values of MAP and MAT available for this combination of catchment, GCM and scenario if stochastic replication is not used. In a traditional climate change impact assessment, without stochastic replication, the Raw values are all that are available for analysis and the magnitude of uncertainty associated with them is unknown. Figure 5 shows that the range of within-GCM uncertainty associated with Raw values of MAT is smaller than for MAP.

Figure 5 can also be used to check whether our stochastic methodology is performing well at this catchment. Our stochastic methodology generates statistically similar replicates of each 20C3M and A1B GCM run, from which we calculate MAP and MAT over two 30-year periods to obtain our box plots. If our methodology is performing well, we would expect the Raw values from the original GCM runs to fall within our box-plot range, which they do in all cases. It should be noted that the true within-GCM uncertainty range for MAP and MAT will be larger than that shown by our box plots, since we have only replicated the uncertainty around the GCM trend and not the uncertainty in the trend itself.
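The consistency check described above (does the Raw value fall inside the replicate range?) can be sketched as follows. The helper names, the year range of the synthetic series and the synthetic values are all assumptions made for illustration; only the two 30-year windows come from the text.

```python
import numpy as np

def period_mean(years, annual_values, start_year, end_year):
    """Mean of annual values over start_year..end_year (inclusive).
    annual_values may be 1-D (one series) or 2-D (replicates x years)."""
    years = np.asarray(years)
    annual_values = np.asarray(annual_values, dtype=float)
    mask = (years >= start_year) & (years <= end_year)
    return annual_values[..., mask].mean(axis=-1)

def raw_within_replicate_range(raw_value, replicate_values):
    """True if the Raw value lies between the replicate minimum and maximum."""
    values = np.asarray(replicate_values, dtype=float)
    return bool(values.min() <= raw_value <= values.max())

# Synthetic example: 100 replicates of annual precipitation (mm/yr).
# The year range and the numbers are illustrative, NOT taken from the paper.
rng = np.random.default_rng(1)
years = np.arange(1961, 2051)
replicates = rng.normal(1000.0, 150.0, size=(100, years.size))
replicate_map = period_mean(years, replicates, 1965, 1994)
raw_map = float(np.median(replicate_map))  # stand-in for a Raw GCM value
print(raw_within_replicate_range(raw_map, replicate_map))
```

A Raw value outside the range would suggest that the replicates do not reproduce the statistics of the run they were generated from.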

In Table 4 within-GCM uncertainty results are presented for the five GCMs
over the period 1965–1994 at the 17 catchments for six variables – MAP,
SDP, MAT, MAR, SDR and lag-1.
Here the six variables have been calculated for each stochastic replicate
and the results are presented as the mean

Although Table 4 provides absolute values of within-GCM uncertainty for each
combination of catchment, GCM and variable, it is difficult to draw
conclusions from this table. Therefore, in Tables 5, 6 and 8 we express
within-GCM uncertainty in relative form as the standard deviation of the
100 replicate estimates as a percentage of the mean of the 100 replicate
estimates. If the 100 replicate values are normally distributed, then
approximately 95 % of the values will be within

Variation and uncertainty in key hydrologic statistics for the five highest ranking GCMs and 17 catchments, based on 100 replicates of de-trended 20C3M for the period 1965–1994.

Relative within-GCM uncertainty of mean and standard deviation of annual precipitation, mean annual temperature and mean, standard deviation and lag-1 serial correlation of annual runoff. Relative uncertainty is the standard deviation of the 100 replicate estimates as a percentage of the mean replicate estimate for each GCM during the period 1965–1994 (20C3M). The average of the 17 catchment relative uncertainty values is presented, except for lag-1 annual runoff which is the average of the 17 standard deviations.

Relative within-GCM uncertainty of mean and standard deviation of annual precipitation, mean annual temperature and mean, standard deviation and lag-1 serial correlation of annual runoff. Relative uncertainty is the standard deviation of the 100 replicate estimates as a percentage of the mean replicate estimate for each GCM during the period 2015–2044 (A1B). The average of the 17 catchment relative uncertainty values is presented, except for lag-1 annual runoff which is the average of the 17 standard deviations.

Within-GCM uncertainty (mm yr

In this sub-section we present and discuss within- and between-GCM
uncertainty results for annual precipitation (MAP and SDP) and temperature
(MAT). In Table 5 a summary is presented of the within-GCM uncertainty
results shown in Table 4. The uncertainty results in Table 5 are in relative
form (standard deviation as a percentage of the mean), except for lag-1
where the standard deviation is used, and are the average uncertainty across
the 17 catchments for each GCM. Averaging relative uncertainty values across
the catchments allows differences in within-GCM uncertainty between GCMs to
be examined and the average uncertainty across all GCMs for a given variable
of interest to be estimated. For MAP within-GCM uncertainty varies between
GCMs from 3.4 to 4.6 % and the average across the five GCMs is
4.1 %. Given a normal distribution of MAP values across the 100 replicates
this translates into 95 % of MAP values being within

A similar set of results to those in Table 5 is presented in Table 6 for the 30-year period 2015–2044 (A1B). On average across the five GCMs, within-GCM uncertainty of MAP, SDP and MAT is similar between the two periods. However, differences in uncertainty between GCMs are apparent across the two periods. For example, for MAP the maximum and minimum values in Tables 5 and 6 are from MIROCM(1) (4.6 and 4.9 %) and MIUB(1) (3.4 and 3.4 %), respectively. Note that the uncertainties in Table 6 are for the projected values of precipitation, temperature and runoff in 2015–2044, not for their changes between the earlier and the later periods. Regional climate change projections often present uncertainties in the projected changes in variables; here we present the uncertainties in the projected values, as these are important for the projected runoff and reservoir yield.

Within-GCM uncertainty (mm yr

Within-GCM uncertainty results for MAP, SDP and MAT from Table 4 are also
summarised in Figs. 6 to 8, where results from the 17 catchments are
plotted for each GCM. Figure 6 shows that within-GCM uncertainty in MAP varies
from below 20 mm yr

In this sub-section we present and discuss within- and between-GCM
uncertainty results for annual runoff (MAR, SDR and lag-1). For each GCM and
catchment, 100 bias-corrected stochastic replicates of monthly

Within-GCM uncertainty (

Box plots of 30-year

Within-GCM uncertainty results for MAR and SDR averaged across the
17 catchments for each of the 5 GCMs and expressed in relative form (standard
deviation as a percentage of the mean) are shown in Table 5 for the 30-year
period 1965–1994 (20C3M). Also shown in Table 5 are the standard deviation
and the lag-1 serial correlation of runoff. In Table 5 within-GCM
uncertainty of MAR varies between GCMs from 8.1 to 10.9 % and the
average across the GCMs is 9.7 %. Although the 100 MAR values may not be
normally distributed we would expect roughly 95 % of the MAR values to be
within

Within-GCM uncertainty of MAR, SDR and lag-1 serial correlation for the 30-year period 2015–2044 (A1B) in Table 6 is similar to the results shown in Table 5. However, differences in uncertainty between GCMs are apparent across the two periods. For example, for MAR the minimum values in Tables 5 and 6 are from MIUB(1) (8.1 and 8.0 %), whereas the second highest MAR uncertainty in Table 5 and the maximum value in Table 6 are from HadCM3 (10.8 and 13.0 %).

Within-GCM uncertainty (mm yr

Within-GCM uncertainty results for MAR and SDR from the 17 catchments and
5 GCMs in Table 4 are summarised in Figs. 10 and 11. Figure 10 shows the
within-GCM uncertainty in MAR varies from below 10 mm yr

Within-GCM uncertainty (mm yr

Average and within-GCM uncertainty of reservoir yield (mm yr

In this sub-section we present and discuss within- and between-GCM
uncertainty results for reservoir yield. The impact of within-GCM
uncertainty on reservoir yield is shown in Fig. 12 for the Herbert River
at Gleneagle. The Gould–Dincer Gamma method was used to estimate average
annual yield from a hypothetical reservoir of capacity equal to 3

Relative within-GCM uncertainty of reservoir yield for hypothetical
reservoirs of 1

Box plots of average reservoir yield for the 30-year periods
1965–1994 (20C3M) and 2015–2044 (A1B) for five GCMs. Reservoir yield is
estimated using the Gould–Dincer Gamma reservoir storage model for reservoir
size

Table 7 lists reservoir yield results for the 17 catchments, 5 GCMs and two
hypothetical reservoir capacities (1

The yield uncertainties in Table 7 due to within-GCM uncertainty are
expressed as a percentage of the mean yield and averaged across the
17 catchments in Table 8. For the larger storage,

Within-GCM uncertainty (mm yr

The main driver of uncertainty in reservoir yield is the variability of
annual reservoir inflows. Figure 14 shows the relationship between
uncertainty in reservoir yield, for

Climate change impact assessments for future hydrology are subject to significant uncertainties. The contribution of within-GCM uncertainty to total uncertainty has not been well quantified due to the limited number of GCM runs available for each GCM and scenario combination. In this paper we developed a methodology to approximate within-GCM uncertainty of precipitation and temperature projections using non-stationary stochastic data generation. Our methodology is a contribution toward quantifying within-GCM uncertainty and provides an objective approach for communicating the uncertainty in climate change impact assessments in a quantitative manner. In a proof-of-concept application of our procedure we estimated the impact of within-GCM uncertainty on annual runoff and reservoir yield, which can inform water resources engineers and management decision makers about the uncertainty in climate change impacts in the short- to medium-term planning horizon. For the research community our stochastic data generation methodology provides a way to assess within-GCM uncertainty on a temporary basis until the number of GCM runs for a given GCM and scenario combination becomes adequate to estimate within-GCM uncertainty directly from GCM runs.

Within-GCM uncertainty (mm yr

In our proof-of-concept application, we de-trended GCM projections of monthly precipitation and temperature from five better performing CMIP3 GCMs (HadCM3, MIROCM, MIUB, MPI and MRI) identified in the companion paper (McMahon et al., 2015). We stochastically replicated the de-trended series 100 times, combined the replicates with their respective trends and applied a bias correction to the replicates. Within-GCM uncertainty of precipitation and temperature was assessed using the stochastic replicates from each GCM for two periods: (1) 1965–1994 (20C3M scenario), and (2) 2015–2044 (A1B scenario) at 17 catchments distributed around the world. At each catchment, within-GCM uncertainty was estimated as the standard deviation of the replicate values divided by the mean replicate value. The uncertainty value for a given GCM was taken as the average of the 17 catchment values for that GCM. Within-GCM uncertainty of mean annual precipitation varied from 3.4 to 4.9 % between GCMs over the two periods and averaged approximately 4.1 % across the five GCMs. For the standard deviation of annual precipitation the average within-GCM uncertainty (14.3 %) was 3–4 times larger than for mean annual precipitation, while within-GCM uncertainty of mean annual temperature was smaller (1 %).
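The relative uncertainty measure summarised above (standard deviation of the replicate values as a percentage of their mean, then averaged over catchments) can be written compactly. This is a sketch under stated assumptions: the function names are hypothetical, and the use of the sample standard deviation (`ddof=1`) is a choice made here, since the text does not say which estimator was used.

```python
import numpy as np

def relative_uncertainty_percent(replicate_values):
    """Standard deviation of replicate values as a percentage of their mean.
    Works on one catchment (1-D) or on a catchments x replicates array (2-D).
    ddof=1 (sample standard deviation) is an assumption, not stated in the paper."""
    values = np.asarray(replicate_values, dtype=float)
    return 100.0 * values.std(axis=-1, ddof=1) / values.mean(axis=-1)

def gcm_average_uncertainty(catchment_by_replicate):
    """Average the per-catchment relative uncertainties for one GCM."""
    return float(np.mean(relative_uncertainty_percent(catchment_by_replicate)))

# Toy input: 2 catchments x 3 replicates (illustrative values only).
toy = [[90.0, 100.0, 110.0],
       [180.0, 200.0, 220.0]]
print(relative_uncertainty_percent(toy))
print(gcm_average_uncertainty(toy))
```

For the lag-1 serial correlation the paper reports the standard deviation itself, not a percentage of the mean, so the division step would be dropped for that variable.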

The stochastic replicates were input to a calibrated hydrologic model (PERM)
to estimate future projections of annual runoff. The impact of within-GCM
uncertainty on mean annual runoff varied from 8.0 to 13.0 % between
GCMs over the two periods and averaged approximately 9.9 % across the five
GCMs. The uncertainty in the standard deviation of annual runoff varied from
16.0 to 20.1 % between GCMs and averaged approximately 17.5 % across
the five GCMs. The within-GCM uncertainty in precipitation and temperature
is amplified in the runoff through the precipitation–runoff relationship.
Summary statistics for the two periods were estimated from each annual
runoff series (100 per catchment) and used in the Gould–Dincer Gamma method
to estimate reservoir yield from two hypothetical reservoir capacities
(1

In this analysis, between-GCM uncertainty was limited to small differences in within-GCM uncertainty for a given variable and differences in trend between the two 30-year periods analysed. Differences between GCMs are not larger here because bias correction was applied. The quantile–quantile bias correction forces the mean and variance of the GCM precipitation and temperature data over the observed period to match the observed mean and variance. Thus, a significant source of between-GCM uncertainty, namely the bias of each GCM in mean and variance, has been removed.
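To illustrate why a quantile–quantile correction removes differences in mean and variance between GCMs, the following is a generic empirical quantile-mapping sketch. It is not the authors' implementation, whose details are not given in this section; the function name, the plotting-position formula and the toy numbers are all assumptions.

```python
import numpy as np

def quantile_map(gcm_reference, observed, gcm_values):
    """Generic empirical quantile mapping (illustrative, not the paper's code).
    Each GCM value is assigned its non-exceedance probability under the GCM
    reference-period distribution, then replaced by the observed value at
    that same probability."""
    gcm_sorted = np.sort(np.asarray(gcm_reference, dtype=float))
    obs_sorted = np.sort(np.asarray(observed, dtype=float))
    # Mid-point plotting positions (an assumed choice).
    p_gcm = (np.arange(1, gcm_sorted.size + 1) - 0.5) / gcm_sorted.size
    p_obs = (np.arange(1, obs_sorted.size + 1) - 0.5) / obs_sorted.size
    # Values outside the reference range are clamped to the end quantiles.
    p = np.interp(np.asarray(gcm_values, dtype=float), gcm_sorted, p_gcm)
    return np.interp(p, p_obs, obs_sorted)

# Toy data: a GCM that is biased low relative to observations (made-up numbers).
gcm_ref = np.array([1.0, 2.0, 3.0, 4.0])
obs = np.array([10.0, 20.0, 30.0, 40.0])
corrected = quantile_map(gcm_ref, obs, gcm_ref)
print(corrected.mean(), obs.mean())
print(corrected.std(), obs.std())
```

In this toy case the corrected reference-period series reproduces the observed values exactly, so its mean and variance match the observations regardless of the bias of the original GCM series, which is the effect described in the text.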

A significant implication of our results is that within-GCM uncertainty is
important when interpreting climate change impact assessments. Although the
variables calculated from the stochastic replicates and hydrologic modelling
of the replicates are not strictly normally distributed, a rough guide to
the magnitude of within-GCM uncertainty is to double the values reported
above (

The amplification of precipitation and temperature within-GCM uncertainty in
runoff and reservoir yield has significant implications for interpreting
climate change impact assessments of these variables. For example, in
Fig. 5 (MAP, MAT), Fig. 9 (MAR, SDR) and Fig. 12 (3

Finally, we expect our results are an underestimate of the true within-GCM
uncertainty due to our stochastic method only approximating the uncertainty
around the overall GCM trend and not the uncertainty in the GCM trend
itself. To obtain a true estimate of within-GCM uncertainty requires
analysis of many (

This research was financially supported by Australian Research Council grants LP100100756 and FT120100130, Melbourne Water and the Australian Bureau of Meteorology. Lionel Siriwardena, Sugata Narsey and Ian Smith assisted with extraction and analysis of CMIP3 GCM data. We acknowledge the modelling groups, the Program for Climate Model Diagnosis and Intercomparison (PCMDI) and the WCRP's Working Group on Coupled Modelling (WGCM) for their roles in making available the WCRP CMIP3 multi-model data set. Support of this data set is provided by the Office of Science, US Department of Energy. We also acknowledge the contribution of two anonymous reviewers whose comments and suggestions improved the paper. Edited by: A. Shamseldin