A simple tool for refining GCM water availability projections, applied to Chinese catchments

There is a growing desire for reliable 21st-century projections of water availability at the regional scale. Global climate models (GCMs) are typically used together with global hydrological models (GHMs) to generate such projections. GCMs alone are unsuitable, especially if they have biased representations of aridity. The Budyko framework represents how water availability varies as a non-linear function of aridity and is used here to constrain projections of runoff from GCMs, without the need for computationally expensive GHMs. Considering a Chinese case study, we first apply the framework to observations to show that the contribution of direct human impacts (water consumption) to the significant decline in Yellow River runoff was greater than the contribution of aridity change by a factor of approximately 2, although we are unable to rule out a significant contribution from the net effect of all other factors. We then show that the Budyko framework can be used to narrow the range of Yellow River runoff projections by 34 %, using a multi-model ensemble and the high-end RCP8.5 emissions scenario. This increases confidence that the Yellow River will see an increase in runoff due to aridity change by the end of the 21st century. Yangtze River runoff projections change little, since aridity biases in GCMs are less substantial. Our approach serves as a quick and inexpensive tool to rapidly update and correct projections from GCMs alone. This could serve as a valuable resource when determining the water management policies required to alleviate water stress for future generations.


Introduction
Climate change is a global problem, but the impacts and associated vulnerability are not homogeneous. There is therefore a demand for robust projections of changes in regional climate, particularly water availability. At the largest scales, the majority of the literature on projected changes in aridity suggests a global land-drying tendency (Dai, 2013; Cook et al., 2014; Scheff and Frierson, 2015): a consequence of ubiquitous increases in potential evapotranspiration (E_p) but mixed signals in precipitation (P). This has been challenged, however, by some recent studies (Roderick et al., 2015; Greve et al., 2017; Scheff et al., 2017). At the river catchment scale, direct human impacts (non-climatic, human interventions directly affecting the partitioning of P into runoff, Q, and evapotranspiration, E) are already having a significant, but poorly quantified, effect on water availability (Nilsson et al., 2005; Gerten et al., 2008; Destouni et al., 2013; Haddeland et al., 2014). For the Indus River catchment, Haddeland et al. (2014) showed that current direct human impacts on water availability (decreases due to water consumption for irrigation) are expected to be greater in magnitude than end-of-21st-century climatic impacts on water availability. Increasing E due to irrigation is commonly observed in heavily populated catchments, especially across southern and eastern Asia (Gordon et al., 2005).
The literature on future water availability projections has typically been framed around the net atmospheric supply of water versus the net demand for water resulting from direct human impacts (land-use change, dam construction and reservoir operation, and surface water and groundwater consumption for irrigation). Recent studies have considered (1) the projected human water demand using integrated assessment models, with water supply fixed to present condi- [...]

Published by Copernicus Publications on behalf of the European Geosciences Union.
Using GCM output alone in hydrological projections is not considered suitable, because GCMs have coarse resolution; simplified land surface schemes; and, crucially, biases in simulating hydrological cycle components. The usual approach for generating hydrological projections is to use bias-corrected and downscaled GCM output to force offline GHMs (Wood et al., 2004). Using GHMs in addition to GCMs greatly increases the computational expense of a study. Here, we propose an approach for refining projections of water availability from GCMs participating in phase 5 of the Coupled Model Intercomparison Project (CMIP5). We use Q as a measure of water availability. The term "refine" is used in the sense that we expect to generate projections of Q on an improved physical footing, compared to using GCM output directly. The approach uses model-simulated aridity and a bias correction, within the Budyko framework (Budyko, 1974). We do not consider future human water demand, only future net atmospheric supply of water, of which aridity is a key determinant.
We consider a simple water balance and assume that changes in storage are negligible:

$$P = E + Q. \qquad (1)$$

GCM variables require bias correction to be of value at catchment scales (Schewe et al., 2014). Bias-correcting a GCM-simulated future Q or E is a complex process. To illustrate this point, we introduce the Budyko framework (Budyko, 1974). Within this framework the partitioning of (annual to long-term mean) P into Q and E scales as a non-linear function of aridity. Aridity, within the Budyko framework, is the dimensionless ratio of E_p to P. The evaporative index, the dimensionless ratio of E to P, is dependent on E_p/P. The relationship is represented by the non-linear Budyko curve, which is constrained by the physical limits of the atmospheric demand for water (E < E_p; the red dashed 1 : 1 line in Fig. 1) and the atmospheric supply of water (E < P; the blue dashed horizontal line in Fig. 1).
Figure 1. [...] Eq. (3). The ω = 2.6 ± 1 curves are also shown (dot-dashed grey curves bounding the shaded region). The atmospheric supply limit (E < P; horizontal dashed blue line) and atmospheric demand limit (E < E_p; diagonal dashed red line) are shown. Energy-limited conditions are represented to the left of the vertical dashed black line (E_p/P < 1), and water-limited conditions are represented to the right (E_p/P > 1).

The original deterministic and non-parametric Budyko formula was developed using data mainly from European river catchments (Budyko, 1974):

$$\frac{E}{P} = \left\{ \frac{E_p}{P} \tanh\!\left( \frac{P}{E_p} \right) \left[ 1 - \exp\!\left( -\frac{E_p}{P} \right) \right] \right\}^{1/2}. \qquad (2)$$

The trajectory taken in the Budyko space due to a change in P, E_p, or E is dependent on the initial values of these three fluxes (the mean state) (van der Velde et al., 2014). Therefore, an accurate representation of the observed climatology is important in any modelling study looking at hydrological projections, especially since changes in variables are typically small compared with climatology. If the present-day aridity is biased, then the future-minus-present changes in runoff (ΔQ) and evapotranspiration (ΔE) will also be biased, even if the future-minus-present changes in precipitation (ΔP) and potential evapotranspiration (ΔE_p) are correctly simulated. Section S1 in the Supplement further demonstrates this point with an example that, although arbitrary, is illustrative of the magnitude of aridity biases in CMIP5 models.
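The dependence on the mean state can be illustrated with a short numerical sketch. It assumes the one-parameter Fu form of the Budyko curve with an illustrative ω = 2.44 and applies identical changes in P and E_p to two different baselines. The function names and the round baseline values (loosely based on the Yellow River catchment means quoted later in the text) are assumptions made for illustration; this is not the example from Sect. S1 of the Supplement.

```python
def fu_runoff(p, ep, omega):
    """Runoff Q = P - E, with E/P taken from the one-parameter Fu curve.

    Algebraically, Q = (P**omega + Ep**omega)**(1/omega) - Ep.
    """
    return (p ** omega + ep ** omega) ** (1.0 / omega) - ep


def runoff_change(p0, ep0, dp, dep, omega):
    """Future-minus-present runoff change for a given baseline and forcing."""
    return fu_runoff(p0 + dp, ep0 + dep, omega) - fu_runoff(p0, ep0, omega)


OMEGA = 2.44            # illustrative value only
DP, DEP = 0.35, 0.25    # the same changes are applied to both baselines (mm/day)

# Baseline close to observations (semi-arid) versus a wet-biased baseline.
dq_observed_like = runoff_change(1.10, 2.45, DP, DEP, OMEGA)
dq_wet_biased = runoff_change(2.18, 2.75, DP, DEP, OMEGA)

print(f"dQ, observation-like baseline: {dq_observed_like:.3f} mm/day")
print(f"dQ, wet-biased baseline:       {dq_wet_biased:.3f} mm/day")
```

Although the imposed changes in P and E_p are identical, the two baselines yield different runoff changes, because the curve is non-linear in aridity.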
Recent work has shown that aridity can only explain part of the differences between catchments (e.g. Zhang et al., 2001). This has led to the derivation of a number of parametric forms of the Budyko curve. One of the more popular forms is the Fu equation (Fu, 1981; Zhang et al., 2004), a one-parameter function expressed as

$$\frac{E}{P} = 1 + \frac{E_p}{P} - \left[ 1 + \left( \frac{E_p}{P} \right)^{\omega} \right]^{1/\omega}, \qquad (3)$$

where ω is an empirical parameter that is calibrated against local data. The traditional Budyko curve (Eq. 2) corresponds to ω = 2.6 in Eq. (3) (Yang et al., 2008).
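As a minimal sketch, the two curves can be evaluated side by side. The code assumes the standard published forms of the Budyko (1974) and Fu equations; the function names are illustrative.

```python
import math


def budyko_curve(aridity):
    """Evaporative index E/P from the non-parametric Budyko (1974) curve."""
    return math.sqrt(
        aridity * math.tanh(1.0 / aridity) * (1.0 - math.exp(-aridity))
    )


def fu_curve(aridity, omega):
    """Evaporative index E/P from the one-parameter Fu equation."""
    return 1.0 + aridity - (1.0 + aridity ** omega) ** (1.0 / omega)


for phi in (0.5, 1.0, 2.0, 4.0):
    print(f"Ep/P={phi:.1f}  Budyko={budyko_curve(phi):.3f}  "
          f"Fu(2.6)={fu_curve(phi, 2.6):.3f}")
```

Both curves stay below the supply limit (E/P < 1) and the demand limit (E/P < E_p/P), and the Fu curve with ω = 2.6 tracks the non-parametric curve closely.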
Here, an attempt is made to utilise biased but plentiful GCM output without the need for GHMs. We apply our approach to two major catchments in China, the Yangtze and the Yellow. There is a wealth of literature that uses the Budyko framework to understand water changes in other Chinese basins (Yang et al., 2007; Xu et al., 2014; Liang et al., 2015). The Yangtze and Yellow rivers dominate the wetter south and drier north, respectively (Fig. 2). The spatial variability in P means that the north of the country, which is poleward of the east Asian monsoon rains, is more water-stressed than the south. This is exacerbated by the fact that the north has 65 % of the total arable land in China (Piao et al., 2010).
The mismatch in water supply versus water demand could be the reason behind the stark decline in Yellow River streamflow (the temporally lagged, spatial integral of upstream Q) seen in recent years (Yang et al., 2004).The contributions of climate change (which incorporates not only aridity change but also changes in seasonality, snow dynamics, storminess, and many other factors; Gudmundsson et al., 2016) and direct human impacts to this "drying up" are widely discussed in the recent literature (Wang et al., 2006;Piao et al., 2010;Miao et al., 2011).Most studies suggest a significant contribution of direct human impacts, including afforestation and land-use change (Huang et al., 2003;Liu et al., 2008;Zhang et al., 2008;Qiu et al., 2011), although methods and attributed contributions vary.
We can also use the Budyko framework to quantify the contribution of aridity change alone (changes in P and E p only) to the 20th-century decrease in Q in the Yellow River catchment.Qualitative agreement with previous studies will serve to validate the use of the Budyko framework in refining 21st-century projections, while hopefully shedding new light on 20th-century observed changes.The 20th-century Budyko framework estimate is compared with an estimate of Q simulated by an offline land surface model (LSM) that does not include a representation of direct human impacts, with the exception of land-use change.If these estimates reconcile, it suggests that the Budyko framework is suitable for this attribution, since Q simulated by the LSM should largely reflect changes in P and E p only.Further, we ask if the difference between the total change in Q and the component attributed to aridity change, for the Yellow River catchment, is in close agreement with a simple estimate of the change in Q due to direct human impacts.

Section 2.1 details the observed and modelled data used, and Sect. 2.2 describes the methodology. Results are presented in Sect. 3, first applying the Budyko framework to the 20th-century observed water availability (Sect. 3.1), before extending the approach to constrain 21st-century model projections of water availability (Sect. 3.2). We finish with a discussion (Sect. 4) and conclusions (Sect. 5). The two-pronged approach of this paper is summarised in Fig. 3. It shows how the approaches share the same theory and use many of the same equations but are independent in their objectives. However, the 20th-century application of the Budyko framework supports the suitability of the 21st-century application, as indicated (Fig. 3).

Figure 3. Flowchart of the two-pronged approach (surviving panel text). Introducing ideas (Sect. 1): the partitioning of precipitation (Eq. 1) and the non-parametric and one-parameter versions of the Budyko curve (Eqs. 2 and 3, respectively). 3) Estimate Q_h using time series of water consumption derived from time series of Chinese irrigated area. 4) Separate the measured runoff changes into Q_a, Q_h and a residual term using Eq. (7). 5) Use Q as simulated by an LSM to test the calculation of Q_a. 2) Use Eq. (5) to calibrate ω, using observed P, E_p and E (E is calculated using observed P and Q; Eq. 1). 3) Bias-correct P and E_p using Eq. (10).

20th-century historical changes
We use the Dai et al. (2009) Global River Flow and Continental Discharge Dataset to calculate observed Q for the Yangtze and Yellow River catchments. This dataset aimed to use the farthest downstream gauging station (to maximise spatial representation) that had good temporal coverage. Q is calculated by dividing river discharge at a gauging station by the upstream catchment area. In keeping with many other hydrological studies we use annual mean values throughout, but we consider the water year (October-September). Data are available for October 1950 to September 2000.
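The conversion from gauged discharge to catchment-mean runoff is a unit calculation, sketched below. The function names are illustrative, and labelling a water year by the calendar year in which it ends is an assumption (it is consistent with October 1950 to September 2000 covering 1951-2000, but the convention is not stated explicitly in the text).

```python
SECONDS_PER_DAY = 86_400


def runoff_depth_mm_per_day(discharge_m3_s, catchment_area_km2):
    """Convert gauged discharge (m^3 s^-1) to catchment-mean runoff (mm day^-1)."""
    area_m2 = catchment_area_km2 * 1.0e6
    metres_per_day = discharge_m3_s * SECONDS_PER_DAY / area_m2
    return metres_per_day * 1000.0


def water_year(year, month):
    """Label an October-September water year by the calendar year in which it ends."""
    return year + 1 if month >= 10 else year


# Example with round, hypothetical numbers: 1000 m^3/s draining 100 000 km^2.
example_q = runoff_depth_mm_per_day(1000.0, 100_000.0)
```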
To ensure an accurate comparison between observed P and Q, we produce high-resolution catchment masks on a 0.5° × 0.5° grid to match that of the P dataset used (Fig. 2). We select the latest Climatic Research Unit (CRU) high-resolution P dataset, CRU TS3.23 (Harris et al., 2014). The interpolated version of the dataset is used, which offers complete global terrestrial coverage. This allows for direct comparison with the spatial and temporal coverage of observed Yangtze and Yellow River Q. By restricting our analysis of observations to 1951-2000, we find that our conclusions are not sensitive to using either the interpolated version or raw version of the precipitation dataset (see Sect. S2 and Fig. S1 in the Supplement). We then calculate E as P − Q (Eq. 1).
Likewise, we use the CRU TS3.23 E_p dataset (0.5° × 0.5° resolution), which is estimated from variables such as temperature, vapour pressure, cloud cover, and wind speed, using a variant of the Penman-Monteith equation. This E_p estimator is computed from variables that are often poorly observed, both spatially and temporally. An energy-only E_p estimator would be preferable (Sheffield et al., 2012; Milly and Dunne, 2016), but the required observations are not available.
We also use Q output from the Lund-Potsdam-Jena (LPJ) LSM (Sitch et al., 2003; Osborne et al., 2015). This is forced over the 1951-2000 historical period with observed CRU P, as well as other observed CRU climate variables (Harris et al., 2014) and changing CO2 concentrations (more details are given in Sect. S3). The run used in our primary analyses was also driven by historical land-use changes, calculated from the History Database of the Global Environment (HYDE) (Klein Goldewijk and Verburg, 2013). A separate run excludes the HYDE dataset, so that we are able to test the sensitivity to land-use changes. Assuming that any sensitivity is minimal, we only comment on this separate run briefly. Simulated Q is available at a monthly frequency at 0.5° × 0.5° resolution. The LPJ LSM is chosen from a multi-model ensemble that forms the TRENDY intercomparison project (Sitch et al., 2015) because it simulates a long-term mean runoff coefficient (Q/P) that is closest to that observed for both major Chinese river catchments (not shown).

21st-century projected changes
We use data from 34 GCMs participating in CMIP5 (Taylor et al., 2012). These are listed in Sect. S4. We consider data for historical and two 21st-century Representative Concentration Pathway emissions scenario (RCP4.5 and RCP8.5; 2006-2100) experiments. Only one ensemble member was used for each model and experiment (the first: r1i1p1). Simulated data are regridded to 0.5° × 0.5° resolution and masked to the two Chinese river catchments. We calculate Q as P − E (Eq. 1).
An energy-only E_p estimator is used for CMIP5 models. E_p, being a hypothetical construct, is not a standard output of CMIP5 models. We follow recent work (e.g. Greve et al., 2014; Greve and Seneviratne, 2015; Milly and Dunne, 2016) and estimate E_p directly from net surface radiation (R_n):

$$E_p = \frac{R_n}{\lambda}, \qquad (4)$$

where λ is the latent heat of vaporisation (λ ≈ 2.45 MJ kg⁻¹). This simple energy-only E_p estimator has been shown to perform well compared to more complicated estimators, particularly under significant climate change (Sheffield et al., 2012).
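In practice the estimator is mostly a unit conversion. The sketch below assumes that net surface radiation is supplied in W m⁻² and that E_p is simply R_n divided by λ, with no additional scaling coefficient; the function name is illustrative.

```python
LATENT_HEAT_MJ_PER_KG = 2.45   # latent heat of vaporisation quoted in the text
SECONDS_PER_DAY = 86_400


def potential_evapotranspiration(net_radiation_w_m2):
    """Energy-only E_p (mm day^-1) from net surface radiation (W m^-2).

    1 kg of water per m^2 corresponds to a depth of 1 mm, so dividing the daily
    energy input (MJ m^-2 day^-1) by the latent heat (MJ kg^-1) gives mm day^-1.
    """
    energy_mj_m2_day = net_radiation_w_m2 * SECONDS_PER_DAY / 1.0e6
    return energy_mj_m2_day / LATENT_HEAT_MJ_PER_KG


# A hypothetical annual mean net radiation of 100 W m^-2.
example_ep = potential_evapotranspiration(100.0)
```

With these assumptions, 100 W m⁻² corresponds to roughly 3.5 mm day⁻¹.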

20th-century historical changes
The Budyko framework can be used to estimate the aridity change contribution to the overall change in Q. We first have to calibrate ω against local data for each catchment. Using the observed annual mean P, E, and E_p for 1951-1960, ω is calculated as the value that minimises the mean squared errors between the observed annual mean E/P ratios and those modelled using Eq. (3), for each catchment. Following Li et al. (2013) the objective function is

$$\mathrm{Obj} = \min_{\omega} \sum_{i} \left[ \left( \frac{E}{P} \right)_{\mathrm{obs},\,i} - \left( \frac{E}{P} \right)_{\mathrm{mod},\,i} \right]^{2}, \qquad (5)$$

where i is the year, and the subscripts obs and mod denote the observed ratio and the ratio modelled using Eq. (3), respectively. The period 1951-1960, in this context, is considered to be representative of natural Q (minimal water consumption or regulation by human activities). There will be some direct human impacts on Q at this time, with a substantial Chinese land area equipped for irrigation even in the 1950s (Freydank and Siebert, 2008), although it does predate major dam construction; the Sanmenxia dam was the first major dam in the Yellow River catchment and was completed in 1960. In calculating the E_p/P (aridity change) contribution to the change in E/P (Eq. 3) we take ω to be constant over the period 1951-2000. Our results are not qualitatively affected by the length of period chosen to represent natural Q (analyses are repeated for 5-, 15-, and 20-year periods, all starting in 1951).
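The calibration step can be sketched as a one-dimensional least-squares search. The version below assumes the standard Fu form for the modelled E/P and uses a plain grid search over ω; the search bounds, the step, and the synthetic ten-year record are illustrative assumptions, and the optimiser actually used by the authors is not specified in the text.

```python
def fu_curve(aridity, omega):
    """Evaporative index E/P from the one-parameter Fu equation."""
    return 1.0 + aridity - (1.0 + aridity ** omega) ** (1.0 / omega)


def calibrate_omega(p, ep, e, omega_min=1.01, omega_max=10.0, step=0.01):
    """Return the omega on a regular grid that minimises the squared E/P errors."""
    n_steps = int(round((omega_max - omega_min) / step))
    best_omega, best_cost = None, float("inf")
    for k in range(n_steps + 1):
        omega = omega_min + k * step
        cost = sum(
            (e_i / p_i - fu_curve(ep_i / p_i, omega)) ** 2
            for p_i, ep_i, e_i in zip(p, ep, e)
        )
        if cost < best_cost:
            best_omega, best_cost = omega, cost
    return best_omega


# Synthetic ten-year record generated with a known omega (not real data).
true_omega = 2.3
p_obs = [1.0, 1.2, 0.9, 1.1, 1.3, 1.0, 0.8, 1.2, 1.1, 1.0]
ep_obs = [2.4, 2.3, 2.6, 2.5, 2.2, 2.4, 2.7, 2.3, 2.4, 2.5]
e_obs = [p * fu_curve(ep / p, true_omega) for p, ep in zip(p_obs, ep_obs)]
calibrated = calibrate_omega(p_obs, ep_obs, e_obs)
```

A grid search is slow but has no convergence pitfalls for a single bounded parameter; a bounded scalar minimiser would be a natural replacement.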
We use ω values of 1.74 and 2.29 for the Yangtze and Yellow River catchments, respectively. Combining Eq. (1) with Eq. (3) gives

$$Q_a = P \left[ 1 + \left( \frac{E_p}{P} \right)^{\omega} \right]^{1/\omega} - E_p, \qquad (6)$$

where Q_a is the runoff due to aridity change (changes in P and E_p only), and so ω is taken to be constant. This separates aridity change from changes in all other climatic factors, as well as changes in all non-climatic factors. All other climatic and non-climatic factors are integrated by ω. This aridity change component is sometimes referred to as the natural Q in other studies (Wang et al., 2006). However, this can be misleading since changes in P and E_p include both changes due to natural variability and, potentially, human-induced changes (Zhang et al., 2007; Dai, 2013).
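The algebra behind combining the water balance with the Fu curve is short enough to write out. The following is a sketch that assumes the standard Fu form, E/P = 1 + E_p/P − [1 + (E_p/P)^ω]^(1/ω), for Eq. (3):

```latex
\begin{align}
Q_a &= P - E \\
    &= P - P\left\{ 1 + \frac{E_p}{P}
         - \left[ 1 + \left( \frac{E_p}{P} \right)^{\omega} \right]^{1/\omega} \right\} \\
    &= P\left[ 1 + \left( \frac{E_p}{P} \right)^{\omega} \right]^{1/\omega} - E_p
     = \left( P^{\omega} + E_p^{\omega} \right)^{1/\omega} - E_p .
\end{align}
```

The last form makes the limits easy to check for ω > 1: Q_a tends to P as E_p tends to zero, and Q_a tends to zero as E_p/P becomes large.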
We also estimate the runoff due to direct human impacts (Q_h) for the Yellow River catchment only, since previous work suggests that Q_h contributes significantly to the measured runoff (Q_m) here (Wang et al., 2006; Miao et al., 2011). Time series of water consumption are derived to estimate Q_h. Water consumption is defined as the water withdrawn for human use that leaves a catchment (Xu et al., 2010). Agricultural sector irrigation accounts for a large proportion of total water consumption and, in turn, Q_h. A year 2000 water consumption estimate of 0.082 mm day⁻¹ for the Yellow River catchment (48 % of the 1951-1960 mean Q_m) (Xu et al., 2010) is scaled with a 1951-2000 time series of Chinese irrigated area (Freydank and Siebert, 2008). Irrigated area in China increased 3-fold between 1951 and 2000, and we assume that Yellow River catchment irrigated area has changed in proportion with national changes. Accurate quantification of past (and even present) water consumption is immensely difficult, but using estimates of past irrigated area offers a means of making pseudo-quantitative statements about Q_h.
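The scaling described above amounts to multiplying the year 2000 consumption by the ratio of irrigated area in each year to that in 2000. The sketch below uses a hypothetical irrigated-area index with a 3-fold rise (not the Freydank and Siebert data), and treating Q_h as the negative of consumption is an assumption made for illustration.

```python
def consumption_series(consumption_ref, irrigated_area, ref_year):
    """Scale a reference-year water consumption by relative irrigated area."""
    area_ref = irrigated_area[ref_year]
    return {
        year: consumption_ref * area / area_ref
        for year, area in irrigated_area.items()
    }


# Hypothetical irrigated-area index with a 3-fold rise from 1951 to 2000.
area_index = {1951: 1.0, 1960: 1.5, 1970: 2.0, 1980: 2.5, 1990: 2.7, 2000: 3.0}

consumption = consumption_series(0.082, area_index, ref_year=2000)  # mm/day

# Consumption removes water from the river, so its runoff contribution is negative.
q_h = {year: -value for year, value in consumption.items()}
```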
With the change in runoff due to aridity change defined as ΔQ_a, the measured change in runoff (ΔQ_m) can be approximated as the sum of ΔQ_a, the change in runoff due to direct human impacts (ΔQ_h), and the change in runoff due to all other climatic and non-climatic factors besides aridity change and direct human impacts (ΔQ_o):

$$\Delta Q_m = \Delta Q_a + \Delta Q_h + \Delta Q_o, \qquad (7)$$

with changes over the historical period (1951-2000) calculated as the linear trend. We note that our conclusions are not affected by instead using the difference between either 10- or 20-year means at the beginning and end of the historical period. The Budyko framework can only separate the contribution of aridity change to the measured decrease in Yellow River runoff from the contribution of all other factors besides aridity change (time-varying ω), represented by the residual (ΔQ_m − ΔQ_a; Eq. 7). The parameter ω integrates all other factors, so a significant residual represents a significant net contribution from these factors. Changes in climatic factors besides aridity (such as seasonality, snow dynamics, and storminess) and non-climatic factors besides direct human impacts, such as land surface characteristics and the physiological response of plants to increasing CO2 (CO2 fertilisation, CO2 stomatal closure, and water-use efficiency), could all play a role. However, the previous literature suggests that ΔQ_h has been significant in the Yellow River catchment. We therefore decompose the residual in Eq. (7) into a component due to direct human impacts and a component due to all other factors besides both aridity change and direct human impacts. Since water is being diverted from the river and heavily consumed, we expect ΔQ_h to be negative.
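A minimal sketch of the two operations used here: an ordinary least-squares trend expressed per century, and the residual obtained by rearranging the runoff-change budget. The function names and the synthetic series are illustrative assumptions.

```python
def linear_trend_per_century(years, values):
    """Ordinary least-squares slope, converted from per-year to per-century."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    sxx = sum((x - mean_x) ** 2 for x in years)
    return 100.0 * sxy / sxx


def residual_other_factors(dq_measured, dq_aridity, dq_human):
    """Rearranged budget: dQ_o = dQ_m - dQ_a - dQ_h."""
    return dq_measured - dq_aridity - dq_human


# Synthetic series declining by exactly 0.002 mm/day per year (not real data).
years = list(range(1951, 2001))
runoff = [0.30 - 0.002 * (year - 1951) for year in years]
trend = linear_trend_per_century(years, runoff)
```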
We reconcile Q a with Q simulated by the LPJ LSM.Although the LSM is unable to simulate water resources with the complexity of a GHM, it does include a representation of some of the factors integrated by ω, particularly nonclimatic factors such as changes in land use and land cover, the response of stomata to rising CO 2 concentrations, CO 2 fertilisation, and soil moisture controls on transpiration (see Sect.S3 and Sitch et al., 2015).The representation of these other factors means that we do not truly compare like for like when reconciling Q a with Q simulated by the LPJ LSM.However, we still expect aridity change to be the dominant driver of runoff in the LPJ LSM and so define the change in runoff simulated by the LPJ LSM as Q a l .That is to say, Q a l should be dominated by changes in P and E p and show strong agreement with Q a .We specifically test the sensitivity to land-use changes since they are excluded in a separate run of the LPJ LSM model.This is the only change between the two runs, so we can elucidate the influence of land-use changes by simply taking the difference between them.

21st-century projected changes
Equation (6) is also used to constrain projections of Q in CMIP5 models, instead substituting P with a corrected P (P′) and E_p with a corrected E_p (E_p′):

$$Q^{*} = P' \left[ 1 + \left( \frac{E_p'}{P'} \right)^{\omega} \right]^{1/\omega} - E_p', \qquad (8)$$

where Q*, the Budyko-corrected runoff, is calculated for the period 1951-2100. An asterisk (rather than a prime) is used to show that Q has been corrected using the Budyko framework and not directly using a simple bias correction. The bias correction technique chosen to calculate P′ and E_p′ is covered in Sect. 3.2. This is because the results of exploratory data analyses on ΔP and ΔE_p, and how these relate to climatology biases across the CMIP5 models, will inform the choice of correction technique. In Eq. (8) we use ω values calculated using observed data and Eq. (5) for the 1951-2000 period (1.77 and 2.44 for the Yangtze and Yellow River catchments, respectively). We compare Q* with the original CMIP5 model-simulated Q, calculated as P − E. Data for Q are also directly available for 28 of the 34 GCMs. Conclusions should not be sensitive to using either direct Q output or water-balance-derived Q if changes in storage are negligible. Bring et al. (2015), however, showed evidence for long-term systematic changes in water storage in some CMIP5 models. Although not a primary analysis, it is sensible to test the sensitivity of our results to the choice of Q.
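As a sketch of how the corrected inputs would be turned into Budyko-corrected runoff, the function below assumes the Fu-based runoff expression, Q = P[1 + (E_p/P)^ω]^(1/ω) − E_p, evaluated with corrected P and E_p. The input values are hypothetical annual means, not CMIP5 output; only the ω values are taken from the text.

```python
def budyko_corrected_runoff(p_corrected, ep_corrected, omega):
    """Q* from bias-corrected P and E_p, assuming the Fu-based runoff form."""
    ratio = ep_corrected / p_corrected
    return p_corrected * (1.0 + ratio ** omega) ** (1.0 / omega) - ep_corrected


OMEGA = {"Yangtze": 1.77, "Yellow": 2.44}  # 1951-2000 values quoted in the text

# Hypothetical corrected annual means (mm/day) for two years of one model.
p_prime = [1.10, 1.45]
ep_prime = [2.45, 2.70]
q_star = [
    budyko_corrected_runoff(p, ep, OMEGA["Yellow"])
    for p, ep in zip(p_prime, ep_prime)
]
```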

20th-century historical changes
The drying of the Yellow River has been one of the most notable aspects of hydrological change in China over recent decades (Yang et al., 2004;Piao et al., 2010).There has been a significant negative linear trend in Yellow River Q between 1951 and 2000 (−0.26 ± 0.06 mm day −1 century −1 , p < 0.05; range is the 5 %-95 % range, taken as ±1.64 SD), while the decrease in P over the equivalent period is not significant at the 95 % (p = 0.05) confidence level (−0.17 ± 0.21 mm day −1 century −1 ) (Fig. 4).The decrease in Q is particularly notable since about 1970.Despite a substantial human water demand in the second half of the 20th century there has been a slight, non-significant, increase in Q in the Yangtze River catchment (0.04 ± 0.29 mm day −1 century −1 ) that is closely matched by a slight, non-significant, increase in P (0.02 ± 0.34 mm day −1 century −1 ).
The Yangtze River shows no tendency to shift towards a distinct new area of the Budyko space between 1951 and 2000 (Fig. 5).The Yellow River, however, seems to shift towards larger E/P values (smaller Q/P ).Within the Budyko framework this could be expected under a shift towards greater aridity (larger E p /P values), or increases in ω.A systematic shift towards greater aridity is not obvious in Fig. 5.There is a significant positive linear trend in Yellow River E/P between 1951 and 2000 (0.22±0.05 per century), but the positive trend in E p /P (0.52 ± 0.50 per century) is only significant at the 90 % (p = 0.10) confidence level.This suggests that all other factors (ω) may also be a key driver of changes in E/P over this time period in the Yellow River catchment.Given this evidence and the significant negative linear trend in Yellow River Q, we investigate further the contributions of aridity change and all other factors to the decrease in Q. Q a is noticeably different to Q m for the Yellow River catchment (−0.07 ± 0.08 and −0.26 ± 0.06 mm day −1 century −1 , respectively), with a significantly less negative trend ( Q a − Q m is equal to 0.19 ± 0.07 mm day −1 century −1 , p < 0.05) (Fig. 6).We reconcile our Q a calculations with Q a l .The linear trends are statistically consistent (−0.07 ± 0.08 and −0.05 ± 0.06 mm day −1 century −1 for Q a and Q a l , respectively).This also holds when considering the LPJ LSM run without land-use changes, for which Q a l is −0.06 ± 0.06 mm day −1 century −1 .Our results are not sensitive to fixed or varying land use.
If aridity change and direct human impacts have dominated the measured change in Yellow River runoff, so that the change in runoff due to all other factors is negligible, from Eq. ( 7) we get Q a ≈ Q m − Q h .We calculate Q h as −0.11 ± 0.01 mm day −1 century −1 for the Yellow River (note that the uncertainty range is artificially small due to the limited temporal resolution of the irrigated-area time series of Freydank and Siebert, 2008).Therefore, Q m − Q h (−0.15 ± 0.07 mm day −1 century −1 ) does not fully reconcile our estimates of Q a and Q a l (−0.07 ± 0.08 and −0.05 ±  0.06 mm day −1 century −1 , respectively).Q h only accounts for 59 % and 54 % of Q m − Q a and Q m − Q a l , respectively.This imbalance could suggest a significant contribution from Q o , or be explained by an underestimate of the year 2000 water consumption.We calculate the year 2000 water consumption that balances Q a = Q m − Q h to be 0.140 mm day −1 , a 70 % increase on the estimate of Xu et al. (2010).This closely matches a year 2000 water consumption estimate by Zhu et al. (2003) of 0.137 mm day −1 .Calculating the relative contribution of aridity change to the measured decrease in Yellow River runoff as ( Q a / Q m )×100 % returns a value of 27 %.Using the two estimates of year 2000 water consumption of 0.082 and 0.137 mm day −1 , the relative contribution of direct human impacts to the measured decrease in Yellow River runoff (( Q h / Q m ) × 100 %) is 43 % and 71 %, respectively.
We account for between 70 % and 98 % of Q m with Q a + Q h , using the low and high water consumption estimates, respectively.Using this information with Eq. ( 7), we could suggest that the contribution from Q o is either negligible (using the high water consumption estimate) or significant (using the low water consumption estimate).Instead, it shows that there is considerable uncertainty in quantifying water consumption and, in turn, the contribution of Q h to Q m .Nevertheless, the close agreement of Q a and Q a l suggests that direct human impacts have played a larger role than aridity change in causing the water availability crisis in the Yellow River catchment.The contribution of direct human impacts would appear to be greater, by a factor of approximately 2, than the contribution of aridity change.It is worth remembering that Q a will reflect not only natural variability but also human-induced changes; E p has increased due to human-induced warming (Dai, 2013), and P has changed due to various anthropogenic forcings (Osborne and Lambert, 2014; Burke and Stott, 2017).
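The percentages quoted above can be approximately reproduced from the rounded trends. The sketch assumes that ΔQ_h scales linearly with the year 2000 water consumption estimate; because the inputs are rounded to two decimal places, individual percentages can differ from the published values by about one percentage point.

```python
DQ_M = -0.26       # measured trend (mm/day per century)
DQ_A = -0.07       # aridity-change component
DQ_H_LOW = -0.11   # direct human impacts, from the 0.082 mm/day estimate

# Assumption: dQ_h scales linearly with the year 2000 consumption estimate.
DQ_H_HIGH = DQ_H_LOW * 0.137 / 0.082


def percent_of_measured(component):
    """Relative contribution of a component to the measured trend, in percent."""
    return 100.0 * component / DQ_M


aridity_pct = percent_of_measured(DQ_A)
human_low_pct = percent_of_measured(DQ_H_LOW)
human_high_pct = percent_of_measured(DQ_H_HIGH)
residual_low = DQ_M - DQ_A - DQ_H_LOW   # remainder attributed to other factors
```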

21st-century projected changes
From the Budyko framework, changes in Q are dependent not only on changes in P, E_p, and E but also on the initial values of these three fluxes. This means that we should view Q projections cautiously if there are biases in key hydrological cycle variables in CMIP5 models. Consistent with previous work (Chen and Frauenfeld, 2014), we find that the spatial pattern of P over China is reproduced by CMIP5 models but annual mean P is overestimated in most regions, compared to CRU climatology. This is evident in the multi-model mean P bias (Fig. 7), with the greatest wet biases seen in the western parts of the Yangtze and Yellow River catchments (the eastern Tibetan Plateau).
As a result of these P biases most CMIP5 models do not fall in the same region of the Budyko space as observations for the Yellow River catchment (Fig. 8).Although P is overestimated in the Yangtze River catchment for 1951-2000 (3.78 ± 0.97 and 2.74 mm day −1 for CMIP5 and observations, respectively), there is little multi-model mean bias in E p /P (0.88 ± 0.28 and 0.85 for CMIP5 and observations, respectively), implying that E p is also overestimated (3.24 ± 0.47 and 2.30 mm day −1 for CMIP5 and observations, respectively).In contrast, there is considerable multi-model mean bias in E p /P in the Yellow River catchment (1.35 ± 0.52 and 2.27 for CMIP5 and observations, respectively), with models (on average) simulating a humid rather than a semi-arid climate zone, according to a widely used aridity classification (Middleton and Thomas, 1997).In fact, only one of 34 models considered simulates an aridity greater than 2.0 (MRI-CGCM3).This misrepresentation is a result of a significant overestimate of P for 1951-2000 (2.18 ± 0.91 and 1.10 mm day −1 for CMIP5 and observations, respectively) and a less biased simulation of E p (2.75 ± 0.43 and 2.45 mm day −1 for CMIP5 and observations, respectively).
Figure 9 shows the multi-model mean ΔP and ΔE_p (using 1980-1999 and 2080-2099 as present-day and future climates, respectively) in RCP8.5. Consistent with the previous literature (Chen and Frauenfeld, 2014), P increases in CMIP5 projections throughout China, with significant increases across most of the Yellow River catchment. P also increases across the Yangtze River catchment, although fewer models simulate significant increases here. As discussed by Greve and Seneviratne (2015), significant E_p increases are ubiquitous. CMIP5 multi-model mean ΔP is 0.36 ± 0.56 and 0.35 ± 0.30 mm day⁻¹ for the Yangtze and Yellow River catchments, respectively. Respective values for ΔE_p are 0.33 ± 0.23 and 0.25 ± 0.19 mm day⁻¹. Although we expect model-simulated ΔQ and ΔE to be erroneous due to climatology biases (as highlighted in the hypothetical example in Sect. S1), we assume that ΔP and ΔE_p in the CMIP5 multi-model ensemble are not dependent on the biases in climatology described above (Fig. 7).
Figure 10 shows how ΔP and ΔE_p relate to the climatology of P (P̄) and E_p (Ē_p), respectively, across the 34 CMIP5 models. There are weak but significant correlations between ΔP and P̄ in RCP8.5 for both the Yangtze and Yellow River catchments, but significance is lost with the exclusion of an outlying model in each case. The weak but significant correlation between ΔP and P̄ in RCP4.5 for the Yangtze River catchment is also dependent on an outlying model. There is little evidence for significant correlations between ΔE_p and Ē_p.
If there were strong evidence for relationships between ΔP and P̄ and/or ΔE_p and Ē_p, then a simple multiplicative correction, applied to catchment annual mean P and E_p, would be appropriate (Hempel et al., 2013). For P,

$$P' = P_{\mathrm{GCM}} \times \frac{\overline{P}_{\mathrm{CRU}}}{\overline{P}_{\mathrm{GCM}}}, \qquad (9)$$

where the subscript GCM is an individual model from the CMIP5 ensemble, the subscript CRU is the observed data, and P′ is the corrected P. The period 1980-1999 is used to calculate the climatologies. Using a multiplicative correction factor preserves the relative rather than absolute trends in model-simulated P and E_p. CMIP5 P and E_p biases for the Yangtze and Yellow River catchments are, on average, positive and substantial (Fig. 10). As such, the correction factor in Eq. (9) is, on average, less than unity. A multiplicative correction factor would therefore narrow the ranges of ΔP and ΔE_p across the CMIP5 ensemble. The assumption that we can use absolute ΔP and ΔE_p from CMIP5 models seems valid in the absence of strong evidence for relationships between ΔP and P̄ and/or ΔE_p and Ē_p. We instead use a simple additive correction (Hempel et al., 2013). Temporally constant offsets (the absolute differences between observed and simulated climatologies) are added to model-simulated P and E_p. For P,

$$P' = P_{\mathrm{GCM}} + \left( \overline{P}_{\mathrm{CRU}} - \overline{P}_{\mathrm{GCM}} \right). \qquad (10)$$

We adjust P and E_p in the 34 CMIP5 models for 1951-2100 to eliminate the biases in simulating the observed CRU climatologies, while retaining absolute ΔP and ΔE_p. Positivity constraints on P and E_p can render additive corrections inappropriate, but this is not a problem at the spatial (catchment) and temporal (annual) resolutions considered here.

Q (as simulated by the CMIP5 models and calculated using P − E) differs considerably from Q* (Fig. 11) as calculated with Eq. (8), particularly for the Yellow River catchment. The Budyko-corrected future-minus-present change in runoff (ΔQ*; recall that the future-minus-present change is the mean of 2080-2099 minus the mean of 1980-1999) is similar to ΔQ for the Yangtze River catchment in both RCP4.5 and RCP8.5 across the CMIP5 ensemble (Table 1). In the Yellow River catchment (RCP8.5) the multi-model mean ΔQ* matches that of the multi-model mean ΔQ (both 0.09 mm day⁻¹). The 5 %-95 % range, however, is reduced by 34 % (±0.14 to ±0.09 mm day⁻¹). Similar results are found with RCP4.5, with little change in the multi-model mean from 0.07 to 0.06 mm day⁻¹ but a decrease of 35 % in the 5 %-95 % range from ±0.11 to ±0.07 mm day⁻¹ for ΔQ and ΔQ*, respectively. These findings are not sensitive to using directly simulated runoff instead (Fig. 11 and Table 1). The small differences between Q and Q*, and ΔQ and ΔQ*, for the Yangtze River catchment are expected given that CMIP5 models broadly fall in the correct region of the Budyko space (Fig. 8). For the Yellow River catchment, the [...]

Figure caption (fragment): The symbols represent observed data, with darker shades for the more recent years. ω values are calculated for the 1951-2000 period using Eq. (5).
Table 1. CMIP5 model-simulated (ΔQ) and CMIP5 Budyko-corrected (ΔQ*) future-minus-present runoff changes (mm day⁻¹) for 2080-2099, relative to 1980-1999. The multi-model mean and 5 %-95 % ranges across the individual models are listed (based on a Gaussian assumption). For comparison, values for a subset of 28 (from 34) CMIP5 models for which Q is directly simulated are also shown. CMIP5 model directly simulated future-minus-present runoff changes (ΔQ_direct) are used to verify the suitability of calculating Q as P − E (water-balance-derived).
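The difference between the additive and multiplicative bias corrections discussed in this section can be illustrated with a short sketch. It assumes that the additive offset is the observed minus the simulated climatology and that the multiplicative factor is their ratio; the function names and all numerical values are hypothetical and chosen only for illustration.

```python
from statistics import mean


def additive_correction(series, clim_model, clim_obs):
    """Add a constant offset: observed minus simulated climatology."""
    offset = clim_obs - clim_model
    return [value + offset for value in series]


def multiplicative_correction(series, clim_model, clim_obs):
    """Scale by a constant factor: observed over simulated climatology."""
    factor = clim_obs / clim_model
    return [value * factor for value in series]


# Hypothetical annual mean P (mm/day) for one model with a wet bias.
p_baseline = [3.0, 3.2, 2.8, 3.0]  # stands in for 1980-1999
p_future = [3.4, 3.6, 3.2, 3.4]    # stands in for 2080-2099
clim_model = mean(p_baseline)      # simulated climatology
clim_obs = 2.0                     # hypothetical observed climatology

series = p_baseline + p_future
n = len(p_baseline)

raw_change = mean(p_future) - mean(p_baseline)

add = additive_correction(series, clim_model, clim_obs)
mul = multiplicative_correction(series, clim_model, clim_obs)

add_change = mean(add[n:]) - mean(add[:n])  # equals raw_change
mul_change = mean(mul[n:]) - mean(mul[:n])  # raw_change scaled by the factor
```

Both corrections remove the climatological bias over the baseline period. The additive form leaves the absolute change untouched, whereas the multiplicative form shrinks it whenever the model is biased high, which is why the choice between them depends on whether projected changes scale with the simulated mean state.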

Discussion
Before using the Budyko framework in tandem with CMIP5 output, we considered whether it could be used to quantify the contribution of aridity change to the measured decrease in Yellow River runoff between 1951 and 2000. Encouragingly, for both the Yangtze and Yellow River catchments, the Q trend due to aridity change was found to be near-identical to that simulated using the LPJ LSM (which is forced by observed P and E_p). This suggests that the Budyko framework is suitable for determining the relative contribution of aridity change to the measured decrease in Yellow River runoff, calculated as 27 %. Therefore, the relative contribution of all other factors besides aridity to the measured decrease in Yellow River runoff is expected to equal 73 %.
With time series of water consumption derived using low and high year-2000 water consumption estimates, the component due to direct human impacts is calculated as 43 % and 71 %, respectively. Therefore, we can account for nearly all of the measured decrease in Yellow River runoff (98 %) using aridity change and the high consumption estimate alone, but we stress that such estimates are highly uncertain. We are not able to dismiss a significant contribution from the net effect of all other factors (besides aridity and direct human impacts), which ranges from 2 % to 30 %. Given that the estimate of the contribution of aridity change appears to be the most robust result, we can instead state that the majority of the measured decrease in Yellow River runoff appears to be due to direct human impacts and all other factors. Also, despite the uncertain water consumption estimates, the contribution from direct human impacts is greater than the contribution from aridity change by a factor of approximately 2. Other studies have estimated the climate change (all non-human) and human components. Miao et al. (2011) attribute 55 % of the reduction in Yellow River water discharge to humans, with Wang et al. (2006) giving a value of 49 %, compared to our range of 43 % to 71 %. Note that these studies use different methods and periods to estimate the contributions of the two components but focus on the second half of the 20th century. Our estimate of the component due to direct human impacts is consistent with these previous estimates, although we add detail by finding that this contribution is markedly greater than the contribution from aridity change alone.
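The percentages quoted above follow from simple bookkeeping, sketched below. The sketch assumes that the three components (aridity change, direct human impacts, and all other factors) add linearly to 100 % of the measured decrease; the variable names are illustrative only.

```python
# Contributions to the measured decrease in Yellow River runoff (per cent),
# as quoted in the text.
aridity = 27.0
human_low, human_high = 43.0, 71.0

# Everything that is not aridity change.
residual = 100.0 - aridity

# Net effect of all other factors, for the two consumption estimates.
other_with_low_consumption = residual - human_low
other_with_high_consumption = residual - human_high

# Share explained by aridity change plus the high consumption estimate.
explained_high = aridity + human_high

# Ratio of the human contribution to the aridity contribution.
ratio_low = human_low / aridity
ratio_high = human_high / aridity

print(f"other factors: {other_with_high_consumption:.0f} % "
      f"to {other_with_low_consumption:.0f} %")
print(f"explained with high estimate: {explained_high:.0f} %")
print(f"human/aridity ratio: {ratio_low:.1f} to {ratio_high:.1f}")
```

The ratio spans roughly 1.6 to 2.6, which is the basis for describing the human contribution as greater by a factor of approximately 2.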
Although estimates of water consumption are highly uncertain, there are also uncertainties in our estimate of the aridity change contribution to Q change. This estimate, as well as runoff simulated by the LPJ LSM, relies on an uncertain observed E_p dataset (see Sect. 2.1). An energy-only E_p estimator is expected to be more appropriate (Milly and Dunne, 2016) but is not available because of insufficient observed data. Meanwhile, the observed P dataset is likely to contain biases and inhomogeneities (Osborne and Lambert, 2014). Many grid boxes in China are poorly gauged (some not at all) in the period investigated (see Fig. S1), especially in the mountainous Tibetan Plateau region, where P is scarce but highly variable (Adam et al., 2006). These are largely insurmountable obstacles facing all hydroclimatological studies.
Within the Budyko framework all climatic and non-climatic factors besides aridity are integrated by the ω parameter. In Eq. (7) we separate this "residual" into a component due to direct human impacts and a component due to all other climatic and non-climatic factors besides aridity change and direct human impacts. The low water consumption estimate means that we are not able to dismiss a significant contribution from the net effect of all other factors. Support for a negligible contribution from all other factors comes from the strength of agreement between Q_a and Q_a^l. This is because the LPJ LSM includes a realistic representation of vegetation, which has been shown to be a useful indicator of these other factors that are integrated by ω, although this may only hold for larger catchments (Li et al., 2013; see Sect. S3). Further, Fig. S3 shows that CMIP5 models simulate no obvious changes in ω over the second half of the 20th century.
In estimating direct human impacts from just water consumption there remains the possibility that other direct human impacts could account for a significant contribution to the decrease in Yellow River runoff. We present evidence that the contribution from land-use change is negligible. On the other hand, catchment runoff can abruptly decrease during the filling of large reservoirs following dam construction, causing anomalously low annual runoff. Following filling, runoff should return to pre-dam levels, and such projects are only thought to affect seasonal water storage and not introduce trends in long-term runoff. Rather, dam and reservoir construction facilitates access to water resources and leads to more water withdrawal and consumption. The influence of dams and reservoirs is likely accounted for in the water consumption estimates (Biemans et al., 2011).
The agreement between the Budyko framework and the LPJ LSM for the observed period also increases our confidence in using the Budyko framework for projections. The CMIP5 Budyko-corrected projected changes in runoff rely on the assumption that 21st-century changes in P and E_p are not dependent on existing climatology biases in CMIP5 models. Across the CMIP5 multi-model ensemble we did not find compelling evidence for such relationships, which supports this assumption (Fig. 10). This is broadly consistent with expectations, given recent research showing that the "wet gets wetter, dry gets drier" paradigm (Held and Soden, 2006) does not hold over global land surfaces (Greve et al., 2014; Greve and Seneviratne, 2015). However, the mean state can undoubtedly have some influence on the simulated changes in P and E_p due to land-atmosphere feedbacks (Berg et al., 2016). We note that when correcting E_p (Eq. 10, with P replaced with E_p) we calculate the correction offset as the observed climatology (Penman-Monteith estimator) minus the model-simulated climatology (energy-only estimator). Using these different estimators will likely introduce some error in the calculation. It is also important to note some potential limitations of using Eq. (7) to separate the measured decrease in Yellow River runoff into various components. This approach assumes a linear relationship and therefore that the individual components are independent. Padrón et al. (2017) showed that cross-correlations exist between many of the factors suggested to influence runoff through ω. Testing for dependencies between Q_h and other components is unfortunately limited by the poor temporal resolution of the irrigated-area time series of Freydank and Siebert (2008). Although we find that interannual variations in Q_a and the residual Q_h + Q_o are correlated (correlation of −0.35), this correlation is weak and reverses sign when considering multi-year means. Further, our approach considers long-term trends/changes in runoff, which means that any dependencies at shorter timescales should not influence conclusions.
In calculating Q* (Eq. 8), ω values are calculated for the 1951-2000 period, using Eq. (5), then taken to be constant for the period 1951-2100. While the relationships of variations in Q with variations in such catchment-specific parameters are understood (Roderick and Farquhar, 2011; Gudmundsson et al., 2016), the full complexity of the influence of changes in catchment properties on these parameters is not. However, Li et al. (2013) showed that, for large catchments, the long-term averaged annual vegetation coverage explains as much as 63 % of the variance in the catchment-specific ω. With 21st-century increases in total vegetation coverage projected (Schneck et al., 2015), we expect this parameter will increase in magnitude. This is found to be the case in the CMIP5 multi-model ensemble, and these increases in ω need to be included when verifying the Budyko framework on the CMIP5 models themselves (see Sect. S3 and Figs. S2-S4). The influence of changes in ω on projected changes in Q is small compared to the influence of correcting E_p and P (see Sect. S3 and Fig. S5). Demonstrating this, the CMIP5 multi-model mean ΔQ* for the Yellow River catchment in RCP8.5 with constant ω (0.09 ± 0.09 mm day⁻¹) is not significantly different from the CMIP5 multi-model mean ΔQ* for the Yellow River catchment in RCP8.5 with time-varying ω (0.07 ± 0.08 mm day⁻¹). Therefore, our conclusions are not sensitive to the choice of ω (constant or time-varying).
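The constant-ω calculation described above can be sketched as follows. Equations (3), (5) and (8) are not reproduced in this section, so the sketch assumes the widely used one-parameter Fu-type form of the Budyko curve, E/P = 1 + E_p/P − [1 + (E_p/P)^ω]^(1/ω), for which ω = 2.6 approximates the traditional curve. The function names, the bisection routine and all input values are hypothetical.

```python
def evaporative_index(aridity, omega):
    """Assumed Fu-type Budyko curve: E/P as a function of E_p/P and omega."""
    return 1.0 + aridity - (1.0 + aridity ** omega) ** (1.0 / omega)


def runoff(p, ep, omega):
    """Long-term mean runoff Q = P - E, with E taken from the curve."""
    return p * (1.0 - evaporative_index(ep / p, omega))


def fit_omega(p, ep, e, lo=1.0001, hi=20.0, tol=1e-10):
    """Bisection for the omega that reproduces an observed long-term E/P.

    Relies on E/P increasing monotonically with omega for fixed aridity.
    """
    target = e / p
    aridity = ep / p
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if evaporative_index(aridity, mid) < target:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Hypothetical catchment means in mm/day (illustrative, not the paper's data).
p_obs, ep_obs, e_obs = 1.2, 2.6, 1.0
omega = fit_omega(p_obs, ep_obs, e_obs)

# Hypothetical bias-corrected means for two 20-year periods, with omega
# held constant between them.
q_present = runoff(1.2, 2.6, omega)
q_future = runoff(1.4, 2.8, omega)
delta_q_star = q_future - q_present
```

Holding ω fixed means that the projected runoff change responds only to the corrected P and E_p, which is the sense in which the refined projections account for aridity change alone.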
We show that aridity change (changes in P and E_p only) is of greatest importance in shaping projected changes in runoff in CMIP5 models, and all other factors (ω) play a secondary role. We expect our CMIP5 Budyko-corrected Q projections to be substantially more reliable than the original CMIP5 model-simulated Q projections. In the case of the Yellow River catchment, the 5 %-95 % range of the future-minus-present (2080-2099 minus 1980-1999) change in Q is reduced by 34 % and 35 % in RCP8.5 and RCP4.5, respectively. Importantly, constraining Q projections using the Budyko framework increases confidence that the Yellow River catchment will see increases in Q by the end of the 21st century: the best-guess (CMIP5 multi-model mean) change of 0.09 mm day⁻¹ is significantly different from zero at the 90 % confidence level. Greater confidence in the range of Yellow River catchment water availability projections could be of great value to policymakers. More generally, the Budyko framework serves as an inexpensive tool to rapidly update projections from biased GCM simulations without the need for offline GHMs. However, further research is needed. Specifically, we believe that an ensemble of GHMs, driven by at least one set of bias-corrected and downscaled GCM projections, should be used as a means of verification.
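A multi-model mean and 5 %-95 % range under a Gaussian assumption, and the resulting narrowing of the range, could be computed along the following lines. This is a sketch under stated assumptions: the half-width is taken as 1.645 sample standard deviations, which may differ in detail from the calculation behind Table 1, and the two five-member ensembles are hypothetical.

```python
from statistics import mean, stdev

Z_95 = 1.645  # 95th percentile of the standard normal distribution


def gaussian_range(changes):
    """Ensemble mean and 5 %-95 % half-width, assuming a Gaussian spread."""
    return mean(changes), Z_95 * stdev(changes)


def range_reduction(half_width_raw, half_width_corrected):
    """Percentage by which the corrected range is narrower than the raw one."""
    return 100.0 * (1.0 - half_width_corrected / half_width_raw)


# Hypothetical runoff changes (mm/day) for a five-member ensemble,
# before and after correction; both have the same mean.
dq_raw = [-0.05, 0.02, 0.09, 0.16, 0.23]
dq_corrected = [0.01, 0.05, 0.09, 0.13, 0.17]

mean_raw, half_raw = gaussian_range(dq_raw)
mean_corr, half_corr = gaussian_range(dq_corrected)
reduction = range_reduction(half_raw, half_corr)
```

Note that applying `range_reduction` to the rounded half-widths quoted in the text need not reproduce the reported percentages exactly, presumably because those were computed from unrounded values.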
Most applications of the Budyko framework consider spatial rather than temporal variations. Berghuijs and Woods (2016) demonstrate that spatial and temporal variations are not necessarily tradable. We stress that the Budyko framework is not employed here to robustly determine interannual variability in water availability but is instead used to understand long-term trends (Sect. 3.1) or the difference between 20-year means at the end of the 20th and 21st centuries (Sect. 3.2).

Conclusions
We have demonstrated how the Budyko framework can be used to place water availability projections from readily available GCM output onto a more physical basis by correcting for biases in aridity, using the example of the Yangtze and Yellow River catchments in China. The approach is inexpensive, does not require offline GHMs, and could be used to provide rapid updates on water availability projections for new GCM scenarios. Wherever GCMs simulate significant biases in representing observed aridity, we expect to generate significantly altered projections. In the Yellow River catchment, considerable negative biases in simulated aridity lead to a substantial narrowing of the range of future GCM projections. In catchments where GCMs simulate positive biases, we would expect to see broadening of the range of GCM projections. Meanwhile, in the Yangtze River catchment, simulated aridity biases are small, meaning that projections are little changed by our approach.
We stress again that these refined water availability projections account for aridity change only. In the hypothetical case where future aridity change is known, the projected Q will not be realised due to the effect of all other factors, especially highly uncertain future changes in direct human impacts (these are not represented in CMIP5 models). Current human impacts on Q are possibly greater than end-of-21st-century aridity change impacts on Q in the Yellow River catchment (Haddeland et al., 2014). Therefore, the current water shortages are not likely to be alleviated without improved agricultural practices and water management. Importantly though, reducing the range of water availability projections gives planners an improved idea of what needs to be done to reduce water stress in the Yellow River catchment for future generations. Moreover, our conclusions underline the need for imminent action and highlight the fact that increases in Q due to aridity change will not offer much relief in the absence of serious and concerted action to minimise direct human impacts.
Chinese authorities have recently attempted to alleviate the drying in the north of China by diverting water there from the wetter south (the South-to-North Water Diversion Project). It remains to be seen whether this will reduce the imbalance in atmospheric water supply and human water demand across China and whether it could even place additional water stress on the more resilient south (Barnett et al., 2015). Generating refined water availability projections in these two key river catchments should underpin decisions made on future engineering projects.
Hydrol. Earth Syst. Sci., 22, 6043-6057, 2018, www.hydrol-earth-syst-sci.net/22/6043/2018/
Data availability. The CMIP5 data can be accessed via the Web portal https://esgf-node.llnl.gov/projects/esgf-llnl/ (Taylor et al., 2012). For the observed datasets, CRU TS3.23 P and E_p can be found at http://www.cru.uea.ac.uk/data/ (Harris et al., 2014), and the Global River Flow and Continental Discharge Dataset Q can be found at http://www.cgd.ucar.edu/cas/catalog/surface/dai-runoff/ (Dai et al., 2009). Data from the LPJ LSM experiments are available from the authors upon request.
Author contributions. JMO and FHL designed the study and discussed results. JMO performed the research, analysed the data and wrote the manuscript.
Competing interests. The authors declare that they have no conflict of interest.

Figure 1. The traditional Budyko curve (solid black curve), corresponding to ω = 2.6 in Eq. (3). The ω = 2.6 ± 1 curves are also shown (dot-dashed grey curves bounding the shaded region). The atmospheric supply limit (E < P; horizontal dashed blue line) and atmospheric demand limit (E < E_p; diagonal dashed red line) are shown. Energy-limited conditions are represented to the left of the vertical dashed black line (E_p/P < 1), and water-limited conditions are represented to the right (E_p/P > 1).

Figure 2. Precipitation climatology for 1961-1990 using the CRU precipitation dataset. This dataset is spatially interpolated, using available in situ observations, to give complete global land coverage. The location of the Yangtze and Yellow River catchments within China is shown.

Figure 3. Schematic of how the Budyko framework is used to improve our understanding of 20th-century historical changes and 21st-century projected changes.

Figure 4. Observed runoff and precipitation anomalies for the Yangtze (a) and Yellow (b) River catchments for 1951-2000, relative to 1961-1990. The dot-dashed lines show linear fits to the time series.

Figure 5. The evaporative index against aridity for the Yangtze (red) and Yellow (blue) River catchments. The symbols represent observed annual mean data for 1951-2000 with darker shades for the more recent years. The traditional Budyko curve is fitted, corresponding to ω = 2.6 in Eq. (3).

Figure 6. Runoff anomalies for the Yangtze (a) and Yellow (b) River catchments for 1951-2000, relative to 1961-1990. Shown are measured runoff (Q_m), runoff due to aridity change (Q_a), runoff simulated by the LPJ LSM (Q_a^l), and (for the Yellow River only) the difference between Q_m and runoff due to direct human impacts (Q_h). The dashed lines show linear fits to the time series.

Figure 7. Absolute (a) and relative (b) multi-model mean precipitation climatology bias for 1961-1990. The location of the Yangtze and Yellow River catchments within China is shown. Desert regions (< 200 mm yr⁻¹), as determined from CRU climatology (1961-1990), are masked in white.

Figure 8. The evaporative index against aridity for the Yangtze (a) and Yellow (b) River catchments. The shaded blue regions represent the density of CMIP5 annual mean data for the 1951-2000 period, with darker shades meaning more data in a given region of the Budyko space. The symbols represent observed data, with darker shades for the more recent years. ω values are calculated for the 1951-2000 period using Eq. (5).

Figure 9. Multi-model mean future-minus-present changes (2080-2099 minus 1980-1999) in P (a) and E_p (b) in RCP8.5. Stippling indicates where fewer than 50 % of the CMIP5 models show significant change, as determined with a t test comparing present-day and future climates. Absence of stippling indicates where more than 50 % of the models show significant change and more than 80 % of the significant models agree on the sign. Grey indicates where more than 50 % of the models show significant change but fewer than 80 % of the significant models agree on the sign. This method follows Tebaldi et al. (2011). Desert regions (< 200 mm yr⁻¹), as determined from CRU climatology (1961-1990), are masked in white for P.
Table 1). Also shown, for comparison, are box plots for a subset of 28 (from 34) CMIP5 models for which Q is directly simulated (not limited to being calculated as P − E). The unfilled box plot shows CMIP5 model directly simulated runoff for 2080-2099.