Flow pathways and nutrient transport mechanisms drive hydrochemical sensitivity to climate change across catchments with different geology and topography

Hydrological processes determine the transport of nutrients and passage of diffuse pollution. Consequently, catchments are likely to exhibit individual hydrochemical responses (sensitivities) to climate change, which are expected to alter the timing and amount of runoff, and to impact in-stream water quality. In developing robust catchment management strategies and quantifying plausible future hydrochemical conditions it is therefore equally important to consider the potential for spatial variability in, and causal factors of, catchment sensitivity, as it is to explore future changes in climatic pressures. This study seeks to identify those factors which influence hydrochemical sensitivity to climate change. A perturbed physics ensemble (PPE), derived from a series of global climate model (GCM) variants with specific climate sensitivities was used to project future climate change and uncertainty. Using the INtegrated CAtchment model of Phosphorus dynamics (INCA-P), we quantified potential hydrochemical responses in four neighbouring catchments (with similar land use but varying topographic and geological characteristics) in southern Ontario, Canada. Responses were assessed by comparing a 30 year baseline (1968–1997) to two future periods: 2020–2049 and 2060–2089. Although projected climate change and uncertainties were similar across these catchments, hydrochemical responses (sensitivities) were highly varied. Sensitivity was governed by quaternary geology (influencing flow pathways) and nutrient transport mechanisms. Clay-rich catchments were most sensitive, with total phosphorus (TP) being rapidly transported to rivers via overland flow. In these catchments large annual reductions in TP loads were projected. Sensitivity in the other two catchments, dominated by sandy loams, was lower due to a larger proportion of soil matrix flow, longer soil water residence times and seasonal variability in soil-P saturation. Here smaller changes in TP loads, predominantly increases, were projected. These results suggest that the clay content of soils could be a good indicator of the sensitivity of catchments to climatic input, and reinforces calls for catchment-specific management plans. Published by Copernicus Publications on behalf of the European Geosciences Union. 5126 J. Crossman et al.: Flow pathways and nutrient transport mechanisms drive hydrochemical sensitivity


Introduction
Phosphorus (P) is an essential nutrient in riverine and lotic ecosystems (Jarvie et al., 2002), however when present in concentrations surplus to requirements it can result in eutrophication, where death and decay of excess aquatic plant matter leads to reduction in stream oxygen concentrations (Nicholls, 1995;Jarvie et al., 2006) which affects fish spawning and survival (Evans, 2006).High P loads can also bring about alterations in algal species composition, leading to dominance of blue-green bacteria, some of which produce toxic compounds (Chorus and Bartram, 1999), and blooms can lead to increases in water quality treatment costs (Smith, 2003).Eutrophication of many water bodies around the world is now a pressing concern making nutrient monitoring and management increasingly important.
There are numerous sources and pathways of P to rivers, which may travel either in the dissolved (DP) or particulate (PP) form.As some of the PP may be rapidly converted to bioavailable DP, the movement of both forms (total phosphorus; TP) is generally monitored.Delivery to streams may be direct through sewage effluent, or more diffuse through soils (overland flow and leaching).Leaching can be a significant P transport mechanism where the ability of soil to adsorb P is low (e.g.shortly after fertiliser applications), and where leachate rarely comes into contact with adsorption sites (e.g. through macropore flow) (Haygarth et al., 1998;Hooda et al., 1999;Borling, 2003).
Nutrient transport pathways are influenced by hydrology, which is in turn driven by topography, geology and climate.Changes in climate, therefore, have far reaching implications for the future of catchments, with increasing temperatures and altered precipitation patterns affecting the timing and magnitude of runoff and soil moisture, changing lake levels, groundwater availability and river discharge regimes (Gleik, 1989;Bates et al., 2008).It follows, therefore, that individual catchments are likely to respond to climate change in different ways (van Roosmalen et al., 2007).The extent to which the hydrology and nutrient concentrations of a catchment respond to alterations in climate drivers can be termed its "sensitivity" to climate change (Bates et al., 2008).In developing robust management strategies and quantifying plausible future ranges of hydrology and water quality it is important to explore possible changes in climatic and non-climatic pressures (Whitehead et al., 2009), but equally so to consider the potential for spatial variability in, and causal factors of, catchment sensitivity.
A complicating factor when determining future conditions is climatic uncertainty.This stems from a wide range of sources, including natural variability, the inability to predict future emissions of greenhouse gases, and an imprecise understanding of climate systems (modelling uncertainty) (Goddard and Baethgen, 2009).The incomplete knowledge of the physics of climate systems introduces a significant source of uncertainty into global climate models (GCMs), known as parameter uncertainty (Wilby and Harris, 2006;Meehl and Stocker, 2007), whereby GCMs developed by different scientists may predict various outcomes under similar forcing conditions.It is therefore important to explore a range of different plausible futures, to provide a better understanding of the vulnerability of catchments (McSweeney and Jones, 2010), including possible threshold effects.Only recently have credible techniques become available to more fully quantify the effect of physical or parametric uncertainty on future projections.These methods include multimodel-and perturbed physics ensembles (PPEs).The former explores uncertainties stemming from differences in model structure by combining outputs from multiple GCMs (Collins et al., 2006).The latter alters the physical parameterisation of a single GCM, within scientifically accepted ranges, to create an array of model variants with specific climate sensitivities (UKCP, 2012).Using a PPE, alterations are made to the GCM in a systematic way, and the effects of different sensitivities on climate projections can be directly quantified (Collins et al., 2006); therefore a wider range of physically plausible future climates can be explored (Mc-Sweeney and Jones, 2010).
This study explores the underlying determinants of catchment hydrochemical sensitivity to climate change, examining if any particular catchment characteristics might be used as an indicator of that sensitivity.A comparison of hydrochemical sensitivity is made across four major subcatchments of a large lake in southern Ontario, Canada (Lake Simcoe), by applying five variants of the regionally downscaled Met Office Hadley Centre PPE to a range of catchments with varying characteristics, using process-based hydrochemical models.Objectives include (a) comparing the hydrological and water quality sensitivity (to climate uncertainty) across catchments with different quaternary geology and topography and (b) generating likelihood estimates of future water quality, having accounted for uncertainty in GCM parameters.A greater understanding of catchment sensitivity is intended to facilitate adaptive management strategies, which could look to reduce the extent of a catchment's hydrochemical response to climatic change.Such an approach would lessen reliance upon the current "project-then-act" paradigm, which relies heavily upon the accuracy of individual climate projections.The Simcoe region was chosen for study as 50 years of research had previously been undertaken into the sources and pathways of phosphorous.Agricultural practices, urbanisation, wetland drainage and sewage effluent have all been linked to increasing P levels in the lake (Evans et al., 1996;LSRCA, 2009, Jin et al., 2013), and associated with depleted oxygen concentrations and decreasing populations of trout, herring and whitefish (LSEMS, 1995). 2 Methods

Site description
Lake Simcoe is situated in southern Ontario, approximately 45 km north of Toronto, Canada.It drains north into the Severn River, and ultimately into Georgian Bay (Fig. 1).The total catchment area is 2914 km 2 (with a lake area of 720 km 2 ), comprised of 21 subcatchments.The four focus catchments (referred to in this study as Holland, Pefferlaw, Beaver and Whites) are situated in the south, and contribute some of the most significant P-inputs to the Lake (Winter et al., 2002(Winter et al., , 2007)).The combined east and west branches of the Holland form the largest catchment, with an area of 614 km 2 .The Pefferlaw and Beaver are a similar size at 444 and 325 km 2 ; and the Whites is the smallest at 85 km 2 .All catchments are characterised by a dominance of agricultural land use (Table 1), though the Beaver and Whites have the highest coverage at over 65 %.Major crops in the area are alfalfa, corn (for grain), soybeans and winter wheat (Statistics Canada, 2011).The Holland has the greatest degree of urbanisation at ∼ 20 %.There are three sewage treatment works (STWs) in the Holland, two in the Beaver, and one in the Pefferlaw.Being much smaller, there are no STWs within the boundaries of the Whites catchment.
A proportion of the headwaters of each catchment are located on the Oak Ridges Moraine, an important area of groundwater recharge (Johnson, 1997), characterised by steep slopes, high infiltration capacity and relatively low surface overland flow (LSRCA, 2012a).The lower reaches of each catchment flows through the Peterborough drumlin fields, Algonquin Plains and Rolling Plains (Johnson, 1997).These are characterised by low slopes, and clayey-till soils (LSRCA, 2012a).This transition in topography and quaternary geological permeability creates areas of waterlogging and marshlands proximal to the Lake (Miles, 2012).
Differences in catchment dominant quaternary geology leads to variations in surface overland flow and soil water residence times (Table 2), and the catchments can be split into two distinct typologies; those with a high proportion of runoff, and rapid throughflow (the Beaver and the Whites) (Miles, 2012); and those transporting water more slowly through the catchment (the Holland and the Peffer- law) (LSRCA, 2010b), with less runoff, and slow rates of soil matrix flow.The Beaver and Whites have a high coverage of low-permeability clayey soils (Soil Landscapes of Canada Working Group, 2010), and a high density of tile drains; characteristics which are often associated high runoff rates and macropore flow (Huang, 1995;Burt and Pinay, 2005) respectively.The saturation excess threshold of these soils is low, particularly in the northern end of the catchment, where wetlands dominate (Miles, 2012).Slower water transport in the Holland and Pefferlaw might be attributed to a higher soil storage capacity, due to the lower coverage of clayey soils, greater dominance of sandy loams and, in the Pefferlaw, low density of tile drains and a low percentage of impervious surfaces (LSRCA, 2012b).Additionally, in the Holland, extensive seasonal drainage of wetlands is performed (LSRCA, 2010b).Wetland drainage increases soil bulk density, which results in greater soil water retention capacity, and reduction of hydraulic conductivity and of overland flow (Burke, 1967(Burke, , 1975;;Schlotzhauer and Price, 1999).Over 50 % of the Holland's former marshlands are now drained in spring for  agricultural purposes, resulting in extensive modification to the catchment hydrology (Nicholls and MacCrimmon, 1975;LSRCA, 2010b).Urbanisation has had little influence on surface overland flow (LSRCA, 2010b), and has been attributed to township locations in the more permeable sediments of the Oak Ridges Moraine, where high infiltration capacities counteract the impact of impervious surfaces (Oni et al., 2014).
The average 30 year climate of the study regions  has a distinct seasonal cycle with average peak temperatures in the summer (June-August) of between 17 to 20 • C. Minimum temperatures occur in winter (December-February) of between −4 to −6 • C. Precipitation patterns are highly localised (Smith, 2010), but are generally greatest in summer (June-August) and in autumn (September-November) with between 2.9 and 3.4 mm of precipitation per day, and least in spring (March-May) with between 2.0 and 2.3 mm per day.There is a strong seasonal input of snow to the Simcoe catchment, with local meteorological stations recording an average of 92.3 cm of snow falling during winter, and 26.5 cm during spring.Annual average in-stream TP concentrations range from 0.03 mg L −1 in the Beaver to 0.14 mg L −1 in the Holland.

Model description
A process based, spatially distributed model was chosen for this study as it was considered more suitable for studying impacts of climate change on hydrochemistry than empirical alternatives (Adams et al., 2013).Empirical models are developed using correlative relationships, based upon a mechanistic understanding, and thus reflect current relationships between dependent and independent variables.Their applicability under future conditions is therefore questionable (Leavesley, 1994).Process-based models, however, have a greater focus on representing the underlying physical processes that describe system behaviour (Adams et al., 2013).As model parameters have a specific physical meaning, which can be defined for future altered catchment states (Bathurst and O'Connell, 1992)these models can be applied with more confidence outside the range of data under which they were developed, and hence have a greater ability to deliver credible projections under a changing climate (Leavesley, 1994).
The dynamic, process-based INCA-P model (INtegrated CAtchment model of Phosphorus dynamics) has been applied to over 40 catchments across Europe and North America (Wade et al., 2002a;Whitehead et al., 2011Whitehead et al., , 2014;;Jin et al., 2013;Baulch et al., 2013 inter alia).It uses a semidistributed approach to simulate the flow of water and nutrients through the terrestrial system to river reaches, differentiated by land use type.The model can simulate fully branched river networks, with unlimited numbers of tributaries and stream orders.Information flows through the model from the individual process equations, via the subcatchment comprised of up to six land use types, to a network of multiple reaches and tributaries.A full mass balance is imposed at each level (Supplement SI 1).
The flow of water and phosphorus through INCA is modelled through four storage zones (SI 2), using a series of detailed process equations, solved using numerical integration routines (Wade et al., 2002a(Wade et al., , b, 2007a)).By analysing phosphorus inputs from diffuse sources, sewage treatment works (STWs), sediment interactions and biological processes (Wade et al., 2002a) the model estimates daily values for a range of variables both in the terrestrial and aquatic phase.Terrestrial model outputs used in this study include nutrient export coefficients, and concentrations of both soil water total dissolved phosphorus (TDP) and labile P. Aquatic outputs include flow, and concentrations of in-stream total phosphorus (TP), TDP, and particulate phosphorus (PP).
INCA-P requires a daily input time series of precipitation, temperature, hydrologically effective rainfall (HER) and soil moisture deficit (SMD).Precipitation and temperature are obtained from local meteorological stations, and HER and SMD from the rainfall-runoff model Nordic HBV (Hydrologiska Byråns Vattenbalansavdelningen) (Saelthun, 1995).There are five main storage components within HBV, and the physical processes operating within and between these zones are represented through simplified mathematical expressions which can be adjusted through a series of parameters to within recommended ranges, to attain a best fit compared with observed river flow records (Oni et al., 2011).An individual HBV model set-up was used for each of the study catchments.

Model calibration
Calibration periods were selected that maximise available spatial resolution and temporal longevity of observed data, and as a result, the calibration period for each catchment varied according to availability of data (SI 3).Data was not reserved for independent validation as these are open, dynamic and natural systems, and no amount of independent data can account for a system's potential to change in unanticipated ways (Oreskes et al., 1994); i.e. validation by independent data does not prove accuracy of future predictions any more than calibration.Again, the use of a process-based model (as opposed to empirical) gives higher confidence that models will perform adequately under future conditions (Leavesly, 1994;Adams et al., 2013).Lengthening the calibration period has been shown to enhance model performance and has proven the most effective method in applying water quality models to predictive studies (Larssen et al., 2007); hence the chosen method was to calibrate models using the widest possible range of conditions.
Whilst the Holland, Pefferlaw and Beaver were all calibrated for over a decade (20, 17 and 12 years respectively), the short observed record available for the Whites (2010-2012) limited calibration to three years.Intensive monitoring over 17 sites within the Whites, however, enabled a detailed calibration across the whole catchment.Given the highly localised nature of rainfall patterns within the Simcoe region (Smith, 2010), catchment-specific observed daily temperature, precipitation and flow data was used for each HBV model (PWQMN, 2009;LSEMS, 2013;LSRCA, 2010a;Environment Canada, 2013;D. Woods, personal communication, 2013), and the derived HER and SMD were used to complete the daily time series required for INCA-P.For each catchment, the most proximal meteorological station data was used (Fig. 1).Where station record length was insufficient, the nearest available records were used and adjusted to the local baseline conditions using bias correction techniques of Futter et al. (2009).
Each of the four study catchments was subdivided into their constituent subcatchment network using ArcHydro GIS software, and a hydrological network developed from a 2 m vertical resolution digital elevation model (DEM) (Global LandCover Facility, 2002) (SI3).Land uses were grouped into five cover classes; urban, intensive agricultural, nonintensive agricultural, wetlands and forest, derived from Ecological Land Classification of Ontario data (Ontario Ministry of Natural Resources, 2007) (Table 1).Parameter inputs for model calibration were then calculated.As a distributed model, INCA can be calibrated individually to each land use, subcatchment and river reach within the catchment, giving extensive flexibility where conditions vary significantly within local areas.As there are over 107 parameters within the INCA model, plausibility of calibration values was assessed through a variety of methods including field measurements, GIS and digital elevation assessments, literature values, expert judgement and model performance statistics.For those model parameters which have the greatest impact on model output, it is especially important to obtain accurate calibration values.Within INCA-P these parameters include stream velocity, Freundlich coefficients, and soil P concentrations (Crossman et al., 2013c;Lepistö et al., 2013).Details of methods used to derive values for these (and other) parameters are given below, with average values for each catchment given in Table 2.
P-inputs from fertilisers (kg P day −1 ) were calculated using the methods of Wade et al. (2007b) and Baulch et al. (2013), from local recommended P-application rates (OMAFRA, 2009) and crop types (Statistics Canada, 2011).Inputs of P from livestock waste were calculated using current livestock assessment numbers (Statistics Canada, 2011) combined with Ontario livestock phosphorus nutrient coefficients (Bangay, 1976).Based on OMAFRA recommendations, model fertiliser inputs were extended over a 60 day period to intensively cropped areas, and a 120 day period to non-intensive agricultural lands beginning in late April (Baulch et al., 2013;Whitehead et al., 2011).Plant uptake in these agricultural areas was set to a maximum of 100 kg ha −1 yr −1 , based on previous INCA-P applications to the Simcoe region (Jin et al., 2013).In addition to effluent from STWs (Table 2), a large proportion of households in rural areas contribute P loads through septic systems.Input loads were determined using methods of Scott and Winter (2006) and Paterson et al. (2006), calculated as a function of annual P input per person, combined with septic system usage.Data used included GIS information on households connected to septic systems (Louis Berger Group Inc., 2010), household population data (Statistics Canada., 2011), and the average TP load from excretion (2.6 g TP person −1 day −1 ; Stephens, 2007).As the efficiency of P removal is estimated at 57 % (Whitehead et al., 2011), only 43 % of the calculated septic TP load was input to the model.
Initial soil P concentrations were based on values measured by Fournier et al. (1994), and soil equilibrium coefficients based on laboratory-derived equilibrium phosphorus concentration (EPC 0 ) and Freundlich isotherm values for different land use and soil types (Peltouvouri, 2006;Väänänen, 2008;Koski-Vähälä, 2001).These values were applied to the dominant soil types of each land use category within INCA (Olding et al., 1950;Soil Landscapes of Canada Working Group, 2010).Average catchment values for EPC 0 ranged from 0.01 in the Beaver to 0.08 in the Holland, and are consistent with those used by Baulch et al. (2013) and Whitehead et al. (2011).An overview of key input variables and data sources is given in Table 2, with model structures provided in SI 3.

Climate modelling using perturbed physics ensemble
The PPE used in this study is designed to quantify uncertainty in model predictions (QUMP) (McSweeney and Jones, 2010), and was developed from the HADCM3 global climate model (GCM), under the SRES-A1B scenario.Whilst ideally the widest range of ensemble member climate sensitivities would be explored, it is important that the models used continue to represent regional climatic behaviours (Collins et al., 2006).Therefore, as suggested by McSweeney and Jones (2010), a subset of five of the available 17 members were chosen, which cover the widest range of climate sensitivities, but which also accurately reproduce baseline temperature and precipitation patterns (Table 3).In the Simcoe basin, ensemble members with higher sensitivities were less representative of regional behaviours; and a maximum appropriate sensitivity of 4.0 was determined (Table 3).The five members were dynamically downscaled by the Institute for Energy, Environment and Sustainable Communities (IEESC, 2012) to 25 km 2 grid cells using the regional climate model PRECIS (Providing Regional Climates for Impacts Studies).Complete details of the model components within PRECIS and of its application to southern Ontario Figure 3.Comparison of monthly precipitation data from 5 ensemble members (Q0-15), regionally downscaled using PRECIS, with observed meteorological station data over 30 year baseline period .
are given in Jones et al. (2004), andIEESC (2012).PRECIS was run from 1968 to 2100, and daily outputs of mean temperature and total precipitation obtained for each ensemble member.Output grid cells from PRECIS were selected for analysis which covered the respective Simcoe catchments.At this resolution most catchments fall within a single grid cell; where two or more PRECIS outputs were available for a single catchment, that which was geographically closest to the catchment's observed monitoring station was selected.Individual grid cells were used for each catchment to reflect the highly localised nature of Simcoe precipitation patterns.
Whilst GCM outputs are not intended to be weather forecasts it is important that each ensemble member provides plausible climate information (Futter et al., 2009).The standard climatic baseline for southern Ontario has been established as the 30-year period of 1968-1997(IEESC, 2012)).Therefore to establish the suitability of each scenario, monthly average comparisons for both temperature and precipitation were made between QUMP outputs and observed data over this baseline period.Although reproduction of temperature by QUMP members was highly successful (Fig. 2), with an average monthly R 2 value of 0.99, precipitation was less accurate with an R 2 of only 0.23 (Fig. 3).The distinctive seasonal patterns of minimum winter, and maximum summer precipitation values are well represented by ensemble members Q0, Q13 and Q15; whereas ensemble members Q3 and Q10 present inverse seasonal patterns.Whilst removing these less accurate members from the study would restrict the range of future conditions applied in the analysis, use of many of the available bias correction techniques might also be considered questionable for a PPE study.Linear scaling, local intensity scaling, variance scaling and power transformations, for example, all perturb future regional climate model (RCM) simulations to account for biases detected during the baseline period (Lenderink et al., 2007) and in doing so would alter the future change of each ensemble member to different extents.This would render the study incapable of directly assessing impacts of climate variability (originating from internal GCM parameter uncertainty) upon catchment responses.
In contrast to the above, however, the Delta Change ( Change) method perturbs a time series of observed meteorological data to match projected future changes.Where there are discrepancies between baseline observed and baseline GCM data, this method assumes that the RCM is better able to simulate relative change than it is to provide absolute values (Hay et al., 2000).In the case of the PPE it also preserves the original relationship between QUMP ensemble members.Change is an established technique for directly assessing impacts of climate change, and was the primary method used to generate future scenarios in the U.S National Assessment (Hay et al., 2000).It was chosen here in order to ensure plausible projections for the Simcoe region under all ensemble members, whilst incorporating the maximum range of conditions in the study.Whilst all bias correction methods have their limitations and assumptions, it should be noted that delta change specifically does not account for changes in future event frequency (Teutschbein and Seibert, 2012).The same, however, is true of the linear-scaling method.In truth, all correction algorithms assume that the same correction factor will apply under future climate conditions.Despite their various limitations, however, simulation performance is greatly improved following the correction for bias (Teutschbein and Seibert, 2012).
Change was applied individually to each study catchment, using the associated gridded PRECIS outputs and observed meteorological data.Monthly Change was calculated as the average monthly difference between each ensemble baseline  and a future 30-year period; future periods selected for comparison were 2020-2049 (the 2030s), and 2060-2089 (the 2070s).These differences were applied to the observed meteorological conditions of the baseline period, so that each month in the observed data set underwent the same degree of change in climate as was originally demonstrated by the particular ensemble member.The standard formulae were used to determine the Change for future daily temperature (T f , daily) and future daily precipitation (P f , daily) (Akhtar et al., 2008), where additive change factors are used for temperature, and multiplicative for precipitation: where T o,daily and P o,daily are the observed daily temperature and precipitation; T f,monthly and P f,monthly are the mean monthly QUMP simulated future temperature and precipitation; T b,monthly and P b,monthly are the mean monthly QUMP simulated baseline temperature and precipitation.Monthly change values of precipitation (%) and temperature ( • C) are provided in SI 4 and SI 5 for reference.The new change-adjusted temperature and precipitation time series were then passed through respective HBV catchment models to derive associated SMD and HER.Eleven time series were generated for each catchment; five ensemble members over two different future 30 year time periods, and one baseline 30-year period of observed meteorological data.Finally, these time series of temperature, precipitation, SMD and HER were used as input for the INCA-P models to generate future values of water quality (TP) and flow.

Data analysis
Future change in variables (TP concentrations, TP loads and hydrology) were assessed as a percentage difference between 30 year baseline and 30 year future time periods.In this way, the impact of potential INCA-P model deficiencies during the calibration period were minimised, if model performance accuracy is presumed to be similar under both current and future scenarios.The ensemble cumulative distribution function (CDF) of monthly changes in each variable (temperature, precipitation, flow, and TP) was calculated to  the Holland, reported outflow performance statistics are an average (mean) of statistics from the two individual branches.* * Due to the different temporal resolution of sampling vs. modelled data "observed loads" are not directly comparable with modelled loads.Both are calculated using monthly means of flow and total phosphorus data, however, although degree of load variance explained by the model can be determined (R 2 ), a direct quantification of over or underestimation (model error) cannot be calculated.
identify the likelihood of a change being less than a certain amount, having accounted for the projections from all ensemble members.Uncertainty was calculated as the range of values projected by ensemble members at a specific likelihood level (Fig. 4), and "average uncertainty" was calculated as the mean value of uncertainty across all likelihood levels.The term "likelihood" is used in contrast to probability, where likelihood estimates give the odds of an event given specific data, and probability requires all outcomes to be accounted for.Due to the random and open character of natural systems, it is impossible to account for all possible outcomes, and therefore probability cannot be used in the statistical sense.Whilst the term "subjective probability" would also be appropriate (UKCP, 2010), likelihood is more commonly used.
The full spectrum of catchment responses projected (both hydrological and chemical) was assessed by looking at the range of catchment results (i.e. the maximum catchment value minus the minimum catchment value).In addition, catchment sensitivity metrics were derived to assess catchment responses independent of between-catchment differences in climate drivers.The ensemble average catchment response variable (monthly change in hydrology -m 3 s −1 -and phosphorus concentration -mg L −1 ) was divided by monthly change in precipitation (mm), giving an output of "catchment response per unit change in precipitation".A similar metric was not established for unit change in temperature due to insignificant differences in this driving variable between catchments.Sensitivities were compared across catchments to provide an indicator as to which underlying factors might contribute to their responses.

INCA-P model calibration
Model outputs were compared to observed flow data (PWQMN, 2009;LSEMS, 2013;LSRCA, 2010aEnvironment Canada, 2013;D. Woods, personal communication, 2013), and grab samples of TP.Although long term TP records were collated from a number of different sources (Table 2), all providers used similar methods of water quality analysis (colorimetric analysis following a digestion on the whole water sample).A summary of model performance statistics is provided in Table 4, an example of daily model output is given in Fig. 5, and the spatial variability of model fit throughout the catchment is provided in Fig. 6.All model performance statistics represent monthly averages, where model average error (MAE) is calculated as in Crossman et where A t o is the modelled value at time step t, and A t m is the observed value at time step t.
During calibration periods there was considerable spatial variability in TP concentration model performance statistics within each catchment (Fig. 6).In general, model accuracy was greatest in catchment outflows and lower in tributaries.In the Holland the greatest flow accuracy was achieved near the outflow, with R 2 = 0.95, and MAE = −24.8%.The accuracy of TP concentrations was also highest in the main chan- = 0.34) with an error of −23.4 %.This need not be interpreted as a poor model fit as accuracy varied significantly with season, where MAE is as low as −13.0 % in winter, but as high as −36.0 % in spring and can be attributed to difficulty in representing occasional peaks in TP during March and April.Model accuracy in reproducing annual loads varies from good to excellent with an R 2 of 0.61 to 0.71.
In the Beaver, the greatest model flow accuracy was found in the upstream tributaries, where flow R 2 = 0.98, and MAE was as low as −15.3 %.R 2 of TP concentrations = 0.79 and MAE was +4.4 %.The explained variance for TP load was also high (up to 0.82).At the downstream extent, flow variability was simulated well, with R 2 = 0.85, though MAE was +33.5 %.Although the fit to TP concentrations at the outflow was poor (R 2 = 0.05 and MAE = −58.4%) that of TP loads remained good (R 2 = 0.72).Despite the shorter duration of observed data available for the Whites, the explained variance at the outflow for TP concentrations, flow and TP loads were 0.92, 0.29 and 0.77, respectively.Average annual flow error at this site was only 3.0 %.The model does underpredict TP concentrations with an outflow MAE of −26.3 %, which is lowest during autumn and spring (−6.6 and 7.6 %, respectively) and highest during summer (−36.2 %).In headwaters and tributaries, R 2 of flow and TP loads of up to 0.90 and 0.94 were achieved.Success in modelling TP concentrations was highly spatially variable (Fig. 6), with a maximum R 2 = 0.47.
In addition to the monthly model performance statistics, model responses to precipitation and seasonal snowmelt events were assessed at catchment outflows (Figs. 7 and 8).High R 2 and low model errors during daily analysis of both the rising and recession limbs of hydrographs in each study catchment demonstrate the accuracy of, and give confidence in, the runoff simulations performed by the HBV-INCA model chain.Similarly, there was high seasonal accuracy of snowmelt responses (Fig. 8), increasing support for the models' ability to simulate flow pathways.
TP soil export coefficients (kg TP km −2 yr −1 ) were calculated and compared with previous studies to assess the plausibility of simulated nutrient transport pathways, to determine the relative contributions of TP from each land use type, and to identify catchments with noteworthy P export.INCA source apportionment outputs (kg TP km −2 ) were divided over the years of model runs (kg TP km −2 yr −1 ).Values were highly variable between sites, with catchment averages ranging from 3.4 kg km −2 yr −1 in the Beaver, to 13.1 kg km −2 yr −1 in the Holland (Table 5).These findings are consistent with previous studies of these catchments  ( Thomas and Sevean, 1985;Winter et al., 2007;Baulch et al., 2013), although higher exports have previously been reported for the Holland (Winter et al., 2007).This could be attributed to the larger spatial domain used in this study (covering both East and West branches of the river), plus a recent decline in TP exports due to implementation of best management practices.
In each catchment, the highest exports of TP originated from areas of agricultural land use, ranging from 3.7 kg km −2 yr −1 (Beaver) to 16.1 kg km −2 yr −1 (Holland).The lowest exports of TP originated from sewage treatment works, with the lowest outputs from the Whites (no output) and the greatest from the Holland (0.8 kg km −2 yr −1 ).It is notable that in the Holland exceptionally high export coefficients were found in wetlands (14.4 kg km 2 yr −1 ).This is consistent with drainage of former wetlands and use for intensive agriculture.The percentage contribution of each source to TP output from the catchment was calculated by multiplying each coefficient by land use area, and (in the case of STWs) by calculating the load differences resulting from models run with and without STW inputs (Table 5).In the Whites, Beaver and Pefferlaw, agriculture was by far the major source of TP (ranging from 64.3 to 69.9 %).In the Holland, although agricultural sources dominate, there is also a large contribution from urban areas and STW (14.0 and 5.5 %, respectively).
Terrestrial model outputs indicate significant seasonal variability in soil TDP concentrations within the agricultural areas of each catchment, with large increases in summer following fertiliser additions (SI 6), and associated increases in the soil labile pool, followed by reductions in winter and spring.These simulated processes are consistent with those described by Haygarth et al. (1998) and Borling (2003).cesses with existing system understanding gives confidence that catchment nutrient transport pathways are accurately represented.

Likelihood of change
Projected changes in temperature and precipitation for 10, 50 and 90 % likelihood levels indicate that changes will become progressively more extreme and more likely between the 2030s and 2070s (SI 7).Notably, there is very little difference between catchments in projected climatic drivers; by 2070 at the 50 % likelihood level there is a cross catch-ment range of only 0.1 • C (temperature) and 4.1 % (precipitation).The average temperature uncertainty within all catchments was low (Table 6), at < 1 • C in 2030, and < 2 • C in 2070.Average precipitation uncertainty was higher, at < 19 % in 2030 and < 23 % in 2070.Of note is the similarity in uncertainty between catchments (Fig. 9a-d), with a cross-catchment range in average temperature uncertainty of only 3.0 × 10 −2 (2030) and 0.1 (2070) (Table 6), and crosscatchment range in average precipitation uncertainty of only 3.5 % (2030) and 1.1 % (2070) (Table 6).

Seasonality of change
Seasonal changes in climate drivers (SI 8, SI 9) demonstrate that the largest increases in temperature generally occur during winter.Average seasonal changes in temperature were similar between catchments, with a maximum crosscatchment range occurring during winter (0.2 • C in 2070).
The Pefferlaw was, however, consistently projected to experience the greatest winter increases (+3.0 • C in 2030 and +5.9 • C in 2070), and the Whites the least (+2.9

Likelihood of change
Similar to temperature and precipitation, the likelihood and extent of flow changes increased between the 2030s and 2070s (SI 10).There were, however, markedly greater differences in projected flow changes between catchments than for corresponding climatic drivers.At the 50 % likelihood level there is a cross catchment range of 23.4 % by 2070, where projected changes are significantly smaller in the Holland and Pefferlaw (−5.8 and −8.2 %) compared to the Whites and Beaver (−29.2 and −20.4 %) respectively (Fig. 10).The average flow uncertainty within each catchment was also generally higher than that of temperature and precipitation, reaching up to 43.2 % (2030) and 40.1 % (2070) (Table 6).
In contrast to climatic drivers, average uncertainty in projected flow varied considerably between catchments, with a cross-catchment range of 19.9 % (2030) and 12.5 % (2070), and was consistently largest in the Whites and lowest in the Pefferlaw.

Seasonality of change
Seasonal changes in HER and flow (SI 11, SI 12) demonstrate that, similar to precipitation, the largest increases occur during winter, with reductions occurring during summer and autumn.In spring, however, in contrast to the increases projected for precipitation, large reductions are projected for both HER and flow.Similar to precipitation, the maximum cross-catchment range occurred in winter, though seasonal variability in HER and flow between catchments was notably greater than that of climatic drivers, with a winter HER range of 39.0 % ( 2030  HER occur in the Whites, and the smallest in the Holland and Pefferlaw. Changes in soil moisture deficit (SMD) were also analysed.Annually, SMD increased in all catchments, though increases in the Whites and Beaver (up to 45.8 %) were up to four times greater than those in the Holland and Pefferlaw (up to 11.0 %).Seasonally, the largest SMD increases were projected during spring, and were consistently higher in the Beaver and Whites than those in the Holland and Pefferlaw.

Sensitivity to change
To determine the sensitivity of catchment HER to precipitation input, the HER generated per unit change in precipitation was calculated (Table 7).On average, the Holland and Pefferlaw yield the least HER in response to changes in precipitation, whilst the Beaver and Whites generate the most.In both the 2030s and 2070s a greater proportion of HER flowed as surface runoff in the Beaver and Whites than in the Holland and Pefferlaw (Table 7), where soil matrix flow contributions were particularly high (up to 94.3 % in the Holland).The difference between catchments increased between 2030 and 2070, where proportions of surface runoff decreased in the Holland and Pefferlaw, and increased in the Beaver and Whites.Proportions of overland and subsurface flow varied by land use type, with highest runoff proportions in saturated wetlands and urban areas, and lowest in forests and intensively farmed (and drained) agricultural lands.

Water quality
Ensemble average projections (accounting for system uncertainty) of total annual TP loads for the 2070's were projected to decrease in the Beaver and Whites by −14.7 and −13.1 %, respectively, but lower reductions, and even some increases were projected for the Pefferlaw and Holland, of −1.0 and 14.7 %, respectively (Table 8).Projections varied considerably between catchments with a range of 14.6 % (2030) to 29.4 % (2070).

Likelihood of change
Again, projected changes in monthly TP concentrations and loads at the 10, 50 and 90 % likelihood levels amplify between 2030 and 2070 (SI 13) and indicate that larger changes are more likely by 2070.The cross-catchment range of TP loads is considerable, and at 32.7 % (2070 at the 50 % likelihood level) is wider than that of all other variables.Similarly there is a larger difference between catchments in TP concentration changes than for corresponding climatic drivers (8.9 % by 2070).Reductions in TP concentration are projected at the 50 % likelihood level, and are more extreme in the Whites and Beaver than in the Holland and Pefferlaw.Similarly, at the 50 % likelihood level markedly higher monthly reductions in TP loads are projected for the Whites and Beaver (−39.7 % by 2070) compared to the Holland and Pefferlaw (−11.2 by 2070).
In some catchments (Holland and Pefferlaw) the average uncertainty for TP concentrations was lower than that of the corresponding climatic drivers.However, similar to flow, uncertainty varied significantly between catchments (Table 6), with a cross-catchment range of 15.0 % (2030) and 16.5 % (2070).In contrast, the average uncertainty for TP  loads within each catchment was markedly higher than corresponding catchment drivers, flow and TP concentrations, at 30.9 % in 2030, and 20.4 % in 2070.Average uncertainty of TP concentrations was consistently highest in the Whites and Beaver (Table 6).Similarly, uncertainty in TP loads is highest in the Whites and lowest in the Pefferlaw (Fig. 11).

Seasonality of change
Seasonal changes in TP concentrations demonstrate that the largest increases are projected during winter (up to +7.6 %) and the largest reductions during summer and autumn (up to −26.6 %) (Fig. 12), and during this period are positively associated with changes in HER and flow.In spring, however, only in the Beaver and Whites did projected TP reductions (up to −9.0 %) correspond with projected reductions in HER and flow.In contrast, increases in spring TP concentrations are projected for the Holland and Pefferlaw (up to 3.9 %), and are negatively associated with projected hydrological changes (flow reductions).Accordingly, the spring crosscatchment range of projections was high (9.3 % in 2030 and 12.9 % in 2070).Ensemble average seasonal changes in TP concentration varied to a greater extent between catchments than did corresponding seasonal climatic drivers with a maximum cross-catchment range in autumn of 22.3 % (2070).
Seasonal changes in TP loads demonstrate that, similar to HER and flow, increases are projected during winter and reductions projected during spring, summer and autumn.Similar to hydrological and climatic variables, the maximum −15.1 and −20.9 % in 2070).This large cross-catchment difference in seasonal balances likely accounts for a significant portion of the difference in projections of annual loads between the Pefferlaw and Holland (increases) and Beaver and Whites (reductions) (Table 8).

Sensitivity to change
An analysis of change in TP (mg L −1 ) per unit change in precipitation was undertaken, to determine catchment sensitivity to climate drivers (Table 9).The Beaver and Whites had a higher sensitivity to changes in precipitation throughout the year (58.3 and 142.7 µg L −1 TP generated per unit of change in precipitation), compared to the Holland and Pefferlaw (22.1 and 10.6 µg L −1 TP per unit change in precipitation).In the Beaver and Whites the majority of changes in TP export occurred during spring and winter, in contrast to the Holland and Pefferlaw where greater changes occurred during summer and autumn.

Likelihood of future changes in hydrochemistry
Climate change and variability present long term water management challenges both now, and persisting into the future.This study takes that uncertainty in our knowledge of climate systems into consideration, and results indicate that changes in climatic drivers (temperature and precipitation) become progressively more likely and extreme between the 2030s and 2070s, increasingly perturbing catchment hydrology and water quality from present conditions.Notably, although the changes in climatic drivers were similar between four neigh-bouring study catchments, responses in HER and flow varied considerably between sites.With a significant seasonal snowmelt influence, catchments such as those within the Lake Simcoe basin may be very sensitive to climatic inputs (Barnett et al., 2005), and even small changes in climate could have large implications for catchment hydrology (Hamlet et al., 2005;Barnett et al., 2005), specifically with respect to the magnitude and timing of snowfall and snowmelt.The PPE projected greatest increases both in temperature and in precipitation to occur during winter, where larger quantities of precipitation previously falling as snow could in future fall as rain, likely resulting in an earlier snowmelt of reduced magnitude (Regonda et al., 2005;Crossman et al., 2013a, b).Projected soil moisture deficits (SMD) suggest that whilst in winter these higher inputs of rain (as opposed to snow) and an earlier snowmelt will limit rises in SMD, in spring a marked SMD increase is to be expected, due in part to the earlier depletion of frozen water stores.
Seasonally, the direction of change in projected HER and flow did not always match those of climatic drivers.Although each indicated that changes will be more extreme and likely between the 2030s and 2070s, spring reductions in HER and flow contrast markedly to projected precipitation increases.HER and flow are affected by a number of variables, including evapotranspiration and preceding soil moisture conditions.It is likely, therefore, that spring reductions in HER and flow are due to the increases in spring SMD, the magnitude of which was not offset by increases in spring rainfall.
The response of water quality (TP concentrations and TP loads) to climatic drivers also varied between catchments.Although, similar to climate drivers and hydrology, larger changes were more likely between 2030 and 2070, the direction of change was not consistent between catchments.Seasonally, in the Beaver and Whites the direction of change in TP concentration was consistently positively associated with that of HER and flow; whereas in the Holland and Pefferlaw a negative association was demonstrated in spring.These different hydrological and water quality responses to very similar climatic drivers are significant in terms of annual TP loads, where large annual reductions were projected for the Beaver and Whites, compared to much smaller reductions, and even some increases, projected in the Holland and Pefferlaw.Differences in hydrological flow pathways and mechanisms of nutrient delivery are a possible explanation for these different responses.

Catchment characteristics driving seasonal differences in hydrochemical responses
Model calibrations demonstrated that in every catchment TDP soil water concentrations increased in summer following fertiliser applications, which is consistent with observations of Haygarth et al. (1998) where fertiliser additions and plant decomposition enriched near-surface soil stores.
Research has shown that when TDP is present in excess of plant requirements, it can be transported from these surface pools to increase in-stream concentrations via overland and subsurface flow of soil water in highly fertilised organic soils, with low phosphorus sorption capacities i.e. shortly after the application of fertilisers in areas of shallow water tables e.g.wetlands (Borlin, 2003).By late autumn, the surface pools of TDP may be exhausted, and lower concentrations of TDP are delivered through subsurface flows, though soils have the capacity to release some P in an attempt to reestablish an equilibrium.Generally, soil water that leaches into streams in spring and winter will be less P-enriched than it was during the summer months; the manner in which this process is represented by the models is shown in Fig. SI6 (in the Supplement).
Where catchments, such as the Beaver and Whites, have significant overland and macropore flow contributions, or tile drainage (where rapid flow mobilises PP, and bypasses Pbinding sites in the soils; Hooda et al., 1999), TP can be delivered directly to the streams by erosion in spring (Heathwaite and Dils, 2000), and a positive relationship between HER and TP is maintained.Under future climates, spring flow and TP concentrations are therefore reduced simultaneously.However, where catchment pathways are dominated by soil matrix flow (Table 7), the dilute leachate enhanced by spring meltwater has a significant impact on in-stream TP concentrations (Borling, 2003), and a negative relationship between HER and TP is created.When future spring HER is reduced (and less dilute leachate delivered to streams), the in-stream TP concentrations increase.This effect may be further enhanced in the Holland and Pefferlaw catchments due to dilution of sewage treatment work contributions (point sources).
The large reductions in annual TP loads of the Whites and Beaver are projected due to large scale reductions in flow and TP concentrations in spring, summer and autumn, combined with only small scale increases in winter flow and TP concentrations.In contrast, spring flow reductions in the Holland and Pefferlaw enhance TP concentrations, which in combination with larger scale winter flow and TP concentrations (and smaller scale summer flow reductions), leads to projections of annual increases in TP loads.
The earlier thawing of frozen soils in spring under a changing climate may also influence nutrient flow pathways.In the Holland and Pefferlaw, earlier thawing was projected to lead to an increased proportion of subsurface flow, enhancing interactions with the soil matrix, and may be an additional causal factor behind the dilution of the spring and winter leachate.In the Beaver and Whites, however, with large macropores and tile drainage, soil freeze-thaw is projected to have less impact on infiltration (as observed by Harris, 1972).In addition, as water movement through the soil profile in the Beaver and Whites is rapid, any changes in subsurface flow generated during an earlier spring melt would have a limited effect on soil water concentrations reaching the stream.A final consideration is the drainage of wetlands (polders) in the Holland River during spring (Burke, 1967(Burke, , 1975)), where artificial drainage of soils can significantly increase levels of P lost through subsurface leaching (McDowell et al., 2001).This practice might therefore contribute to the negative HER associated with TP during spring, where the addition of artificial drainage waters with high TP concentrations will in future be less diluted by flows with a lower spring melt contribution.

Impact of catchment sensitivity compared to future climate uncertainty
In all catchments, uncertainty in temperature is low (across the range of QUMP scenarios generated by the PPE), and similar between catchments.As is common amongst climate change studies, uncertainty in precipitation is higher (Hawkins and Sutton, 2011;Maskey et al., 2004), though remains relatively consistent between sites.Importantly, hydrological (HER and flow) uncertainty is considerably higher than corresponding climatic drivers, and varies more significantly between catchments.The catchment-specific response can be defined by the quantity of HER generated per unit of precipitation input (Table 7).This relative index of HER sensitivity to climate change normalises the catchment response from differences in original driving forces (precipitation).On average, the Beaver and Whites catchments demonstrated a greater HER sensitivity to precipitation change (> 0.9 mm) than the Holland and Pefferlaw (< 0.6 mm).This sensitivity might be attributed to shorter soil water residence times in the Beaver and Whites, whereby more rapid soil drainage through macropores and tile drains results in prompt responses of soil moisture content to changes in precipitation; and the dominance of overland flow results in greater exposure to evaporation and rising temperatures.As a result, reductions in summer and autumn precipitation have a greater impact upon SMD in the Beaver and Whites (as indicated by model results) and thus upon HER and flow.Similarly, reductions in spring-melt waters within these catchments have a greater impact on SMD, as soils with a dominant overland flow component would previously have responded most rapidly and significantly to snowmelt input (Dunne and Black, 1971).In contrast, in the Holland and Pefferlaw projected changes in SMD are lower, and the impact of a changing climate upon hydrology less extreme, due to a higher soil storage capacity, longer soil water residence times, and a greater proportion of water directed through the soil matrix.Slow passage of waters through soils in the Holland and Pefferlaw reduces its response rate to precipitation changes and meltwater reductions, and its exposure to evaporation (Dunne and Black, 1971;van den Hurk et al., 2004), meaning soil moisture deficits can be recharged with a lesser affect upon HER and stream flow.The hydrological sensitivity of the Pefferlaw is higher than that of the Holland, but is consistent with the difference in soil water residence times and clay content between the two catchments (Tables 2 and 7).
The sensitivity of TP concentrations to changing climate also varies considerably between catchments, with a greater response of TP per unit change in precipitation in the Whites and Beaver (up to 142.7 µg L −1 ) than in the Holland and Pefferlaw (up to 22.1 µg L −1 ).In Simcoe this variance between catchments appears to be associated with flow pathways and timings of nutrient export which may act as a buffer to uncertainty within some catchments.For instance, in the Holland and Pefferlaw, the majority of changes in TP export are associated with changes in future summer and autumn precipitation (Table 9), despite there being smaller absolute changes in water volumes during these periods.This is likely due to the high mobility of soil TP in the Holland and Pefferlaw at this time of fertiliser applications, with high initial P soil concentrations, and limited P sorption capacities.As the uncertainty of future climate change is lowest during this period, these are the catchments for which the lowest TP uncertainty is calculated.During the periods of high climate uncertainty (winter and spring), TP mobility is low, and the effects of the climate change uncertainty upon water quality are buffered.In the Beaver and Whites, a much larger proportion of the changes in TP export occurs during winter and spring.This is likely due to the high runoff rates and macropore flow, which facilitates movement of TP directly to the stream during all seasons.Climate uncertainty is highest during the large projected spring and winter changes, therefore uncertainty in future TP concentrations are higher in these catchments.Given that nutrient loads are directly derived from a combination of flow and TP concentrations, it follows that the uncertainty from each propagates up into the load estimations.As such, uncertainty of future TP loads is also greatest in the Whites.
It is evident then that the future certainty in both hydrology and water quality depends not only upon climatic inputs (between which there is little difference in these catchments), but also largely upon the relative sensitivity of a catchment to changing drivers.Results from this study suggest that catchment sensitivity is strongly influenced by quaternary geology, seasonal P-inputs and P-saturation thresholds.Sensitivity to climatic drivers was highest in clayey catchments, dominated by overland flow, with little influence of P-saturation (Beaver and Whites); it was lowest in sites with greater contributions from soil matrix flow, and where seasonal variability in soil P-saturation had a marked influence on water quality (Holland and Pefferlaw).It is likely that these characteristic drivers of sensitivity will be applicable to other catchments.For instance, van den Hurk (2004) determined that future inter-annual variability in wintertime runoff in the Rhine was smallest in catchments with high storage capacities, whilst van Roosmalen et al. (2007) determined from a range of sites in Denmark with differing quaternary geology, that sites with clayey soils responded to precipitation change with the highest increases in stream discharge and overland flow proportions.The consistency of driving forces behind catchment sensitivity to climate change requires further investigation, but indicates that the clay content of soils might be used as an indicator of catchment hydrochemical sensitivity to climate change.

Study limitations and further research
By analysing outputs from a range of physical structures of a single global climate model (a perturbed physics ensemble), this study explores the sensitivity of catchments to uncertainty in climate system behaviour and parameter values of global climate models.Whilst this enables an examination of a wider range of plausible future projections than could be investigated through a multi-model ensemble approach, the PPE used is derived only from the HADCM3 model.A wider range of plausible futures could be generated by perturbing the physics of additional GCMs (Collins et al., 2006).Currently, however, PPEs are available for only a limited number of GCMs.
As discussed previously, the choice of GCM bias correction method impacts both climatic and hydrochemical outputs.The Change method was selected as it preserves the impact of perturbing the physical structure of the climate model (i.e. it maintains the difference between PPE ensemble members).This method, however, does not account for potential changes in distribution of wet and dry days, and outputs from this study should not be used in assessments of future flood or drought frequency.
Finally, parametric uncertainty within both INCA-P and the rainfall-runoff model HBV might also influence model outputs.In this study, care was taken to accurately calibrate the most sensitive parameters.Within HBV these have previously been identified as parameters associated with partitioning precipitation into runoff and soil moisture (Seibert, 1997;Uhlenbrook et al., 1999).The assessment of hydrological responses to individual storm events and seasonal snowmelt events (Figs. 7 and 8) enhance confidence in runoff partitioning within these model applications.Within INCA-P, sensitive parameters have been identified as Freudlich soil coefficients and stream flow velocities (Crossman et al., 2013c;Lepistö et al., 2013).Laboratory and field values specific to each catchment were used for these parameters.The similarity of model process responses (including soil water TDP responses to fertiliser inputs) to those in the literature, and of model soil TP export coefficients to those calculated by Winter et al. (2007), gives confidence in model representation of reality.While models were calibrated using these measured and calculated values, there are over 107 parameters within INCA, and parametric uncertainty is an issue that should be considered (Wade et al., 2001;Starrfelt and Kaste, 2014).Although beyond the scope of this study, future work will explore the extent to which parametric uncertainty might influence assessments of catchment sensitivity.
Whilst there are a range of uncertainties associated with hydrochemical modelling, process based modelling does provide the best method of understanding catchment responses to possible future conditions (Jin et al., 2013).These results provide a useful basis for the development of catchment-specific management approaches.The overall outcome of this study indicates that management strategies might focus on internal processes dynamics that define the extent of hydrochemical responses to changes in climatic input, and as such reinforces the call for catchment-specific management plans.In the Holland and Pefferlaw, where the majority of P export appears to be derived from leachate during summer and autumn, a reduction in labile P concentrations and an increase in soil sorption capacity would be beneficial.It has been shown that soil P levels can take over 20 years to fully recover from extensive periods of excess fertiliser applications (McCollum, 1991;McLauchlan, 2006), but may gradually be achieved through methods such as avoidance of nutrient applications in excess of plant requirements (Whitehead et al., 2011), and establishing nutrient vulnerable zones where use is no longer required.In the Beaver and Whites, where the majority of P export appears to be derived from surface runoff and soil erosion, a more appropriate strategy might include efforts to reduce rates of surface runoff from fertilised fields, through injecting fertiliser directly to the root zone, constructing vegetated riparian areas (Sharpley et al., 2001), restricting fertiliser applications to periods of higher soil moisture deficit (Westerman and Overcash, 1980;Sharpley et al., 2001), and limiting tile drainage.

Conclusions
Accounting for climate modelling uncertainty, this study suggests that in the Beaver and Whites, large scale reductions in annual TP loads could result from reductions in flow and TP concentrations in spring, summer and autumn, combined with minor increases in winter.In contrast, much smaller reductions and even some increases in annual TP loads are expected in the Holland and Pefferlaw, due to much smaller scale reductions in summer and autumn flow and TP concentrations, combined with larger scale increases in winter, and enhanced spring TP concentrations.These findings indicate that all changes will become more extreme and more likely between 2030 and 2070.
Uncertainty in future projections of climate change had different impacts on uncertainty of future hydrological and water quality projections within each catchment.This impact depended upon (a) catchment characteristics which determine the sensitivity of a catchment to climate drivers and (b) the degree of projected climate change within each catchment.Influential catchment characteristics include quaternary geology, seasonal P-inputs and P-saturation thresholds.Catchment sensitivity to climatic drivers was relatively high where catchments had large proportions of overland flow and a direct association between TP concentrations, loads and HER was established.Catchment sensitivity was lower in catchments with larger proportions of soil matrix flow, and where seasonal variability in soil P-saturation influenced water quality; here internal catchment processes had a greater impact on hydrology and water quality.
The different ranges of uncertainty demonstrated by the catchments indicate that in water quality management it is important to consider every catchment as a distinct hydrological unit.Effectiveness of management strategies will vary depending on the dominant nutrient sources and P transport mechanisms in each area.Significantly, although the hydrology and water quality of the Whites and Beaver Catchments are more sensitive to changes in climate, the majority of projections under the PPE indicate reductions in TP concentrations and loads.Although the Holland and Pefferlaw are less sensitive to change, the dominant response is an increase in annual TP loads; as such the latter catchments might be seen as priority target areas for nutrient management.

Figure 1 .
Figure1.Map of 4 study catchments within the Simcoe Basin, southern Ontario, demonstrating PRECIS (Providing Regional Climates for Impacts Studies) RCM (regional climate model), available meteorological stations, and catchment boundaries.

Figure 2 .
Figure2.Comparison of monthly temperature data from 5 ensemble members (Q0-15), regionally downscaled using PRECIS, with observed meteorological station data over 30 year baseline period.

Figure 4 .
Figure 4. Calculations of (a) monthly change in temperature and (b) projected uncertainty from cumulative distribution functions of perturbed physics ensemble temperature data of the Holland River in 2030.

Figure 5 .
Figure 5. Example of INCA model output.Calibration results from Holland River comparing modelled and observed data of (a) flow at East Branch outflow, (b) total phosphorus (TP) concentrations at East Branch outflow, (c) TP concentration at West Branch outflow.

Figure 6 .
Figure 6.Spatial variability in model accuracy of total phosphorus (TP) concentrations throughout INCA models calibrated across multiple reaches, illustrating relative model error in (a) Holland, (b) Pefferlaw, (c) Beaver and (d) Whites catchments.Data represents average TP concentrations over the calibration period.

Figure 7 .
Figure 7. Analysis of INCA model fit to event data at outflow of (a) Holland, (b) Pefferlaw, (c) Beaver and (d) Whites catchments.Rising and recession limb at analysed separately for R 2 and model average error (MAE) to determine model performance under precipitation events.All events were analysed during the same four day period of intense precipitation in November 2011.Although event curves can be used to confirm model representation of reality, it should be noted they should not be used to compare response times between catchments.

Figure 8 .
Figure 8. Analysis of model response to observed spring melt in (a) Holland, (b) Pefferlaw, (c) Beaver and (d) Whites catchments.Results analysed during spring melt of 2010.
The consistency of modelled nutrient and hydrological pro-

Figure 9 .
Figure 9. Between-catchment comparison of the CDFs from the perturbed physics ensemble of (a) monthly change in temperature by 2030, (b) monthly change in temperature by 2070, (c) monthly change in precipitation by 2030, (d) monthly change in precipitation by 2070 (CDF: cumulative distribution function).

Figure 10 .
Figure 10.Between-catchment comparison of the CDFs from the perturbed physics ensemble of (a) monthly change in flow by 2030, (b) monthly change in flow by 2070, (c) monthly change in total phosphorus (TP) concentration by 2030, (d) change in TP concentration by 2070 (CDF: cumulative distribution function).

Figure 11 .
Figure 11.Between-catchment comparison of the CDFs from the perturbed physics ensemble of (a) change in total phosphorus loads (TP) by 2030, (b) change in TP loads by 2070 (CDF: cumulative distribution function).

Figure 12 .
Figure 12.Box plot of seasonal changes in total phosphorus concentrations across all perturbed physics ensemble members (one box per ensemble member) in (a) Holland, (b) Pefferlaw, (c) Beaver and (d) Whites.

Table 1 .
Land use cover of the four study catchments within the Lake Simcoe catchment reported to 1 decimal place (1 d.p.).

Table 2 .
Details of key parameters and data sources used in model calibration (1 d.p.).

Table 3 .
Collins et al., 2006)ty in Model Predictions (QUMP) members listed in order of climate sensitivity under a doubling of CO 2 concentrations (fromCollins et al., 2006).Bold members indicate those selected for use in the study.

Table 4 .
Model performance statistics presented for locations of highest calibration accuracy and at downstream extent (for which corresponding observed flow, TP and loads data were available).Results are reported to 2 d.p.For individual performance statistics of each tributary, see Fig.6.Due to lack of observed data available for model performance assessment in the convergence zone of the East and West branch of *

Table 5 .
Export coefficients (kg km −2 yr −1 ) of total phosphorus from diffuse and point sources, and contributions of each source as a percentage of total phosphorus export (1 d.p.).

Table 6 .
A measure of average uncertainty of temperature, precipitation, flow, TP concentrations and loads across the perturbed physics ensemble, where "average uncertainty" is calculated as the mean value of uncertainty across all probability levels within the cumulative distribution function (1 d.p.).

Table 7 .
Calculation of quantity of hydrologically effective rainfall (HER) (mm) generated per mm of precipitation (1 d.p.), and comparison with catchment characteristics (hydrological and quaternary geology).

Table 8 .
Annual change in total phosphorus loads between baseline and future periods, for all ensemble members of the perturbed physics ensemble.Includes ensemble average (EA) change (1 d.p.).

Table 9 .
Estimation of average catchment sensitivity of water quality to uncertainty in precipitation: projected change in total phosphorus (mg L −1 ) per mm change in precipitation averaged over 2030's and 2070's (1 d.p.) highlighting (in bold) the seasons of greatest sensitivity.