Using Variograms to Detect and Attribute Hydrological Change

There have been many published studies aiming to identify temporal changes in river flow time series, most of which use monotonic trend tests such as the Mann–Kendall test. Although robust to both the distribution of the data and incomplete records, these tests have important limitations and provide no information as to whether a change in variability mirrors a change in magnitude. This study develops a new method for detecting periods of change in a river flow time series, using temporally shifting variograms (TSVs) based on applying variograms to moving windows in a time series and comparing these to the long-term average variogram, which characterises the temporal dependence structure in the river flow time series. Variogram properties in each moving window can also be related to potential meteorological drivers. The method is applied to 91 UK catchments which were chosen to have minimal anthropogenic influences and good quality data between 1980 and 2012 inclusive. Each of the four variogram parameters (range, sill and two measures of semi-variance) characterise different aspects of the river flow regime, and have a different relationship with the precipitation characteristics. Three variogram parameters (the sill and the two measures of semi-variance) are related to variability (either day-today or over the time series) and have the largest correlations with indicators describing the magnitude and variability of precipitation. The fourth (the range) is dependent on the relationship between the river flow on successive days and is most correlated with the length of wet and dry periods. Two prominent periods of change were identified: 1995–2001 and 2004–2012. The first period of change is attributed to an increase in the magnitude of rainfall whilst the second period is attributed to an increase in variability of the rainfall. The study demonstrates that variograms have considerable potential for application in the detection and attribution of temporal variability and change in hydrological systems.


Introduction
Increasing scientific agreement on climate change (IPCC, 2013) has been parallelled by a rise in the number of studies investigating the potential impacts on various aspects of the Earth system, economies and society. One projected impact from climate change is a change in river flow dynamics, in particular changes in the magnitude, seasonality and variability of river flows which could have major impacts on the management of water resources and flood risk (e.g. Hirabayashi et al., 2013;Gosling and Arnell, 2013) on a global scale. For the UK the potential impact of climate change on water resources and flooding has recently been reviewed by Watts et al. (2015). Examining future changes in river flow is a focus for many modelling studies. However, the uncertainties inherent in scenario-based future projections (Prudhomme et al., 2003) highlight the need for observational evidence of change (Huntington, 2006).
Being able to detect and attribute changes in observed data is challenging, particularly in systems which are the result of complex (often non-linear), interactions between several processes (e.g. precipitation, evapotranspiration, storage and transport within a catchment). Further levels of complexity are added due to temporal changes in catchment characteristics (e.g. land cover and land management), anthropogenic modification of rivers (e.g. abstraction, impoundments and channel modifications) and changes in the location and hydrometric performance of gauging stations.
Previous studies have shown trends of increases and decreases in observed river flow for individual catchments, but at the regional to national scale the picture is more complex and regional patterns are often not spatially coherent (as noted for Europe, e.g. Madsen et al., 2014) and results are dependent on the methods and the study periods used. In the UK, significant heterogeneity in streamflow trends has Published by Copernicus Publications on behalf of the European Geosciences Union.

2396
A. Chiverton et al.: Using variograms to detect and attribute hydrological change been reported, with trends of different sign occurring in some catchments in close proximity (Hannaford and Buys, 2012). These spatial and temporal differences in published results of change detection studies are an obstacle to efforts to develop appropriate adaptation responses, particularly when there is a lack of congruency with scenario-based projections for the future. This has led to calls for fresh approaches to change detection, as highlighted by several recent synthesis reviews (e.g. Burn et al., 2012;Merz et al., 2012; and the IAHS decade "Panta Rhei" ("everything flows") which aims to reach an improved understanding of the changing dynamics in the water cycle (Montanari et al., 2013). This paper describes one such new avenue for change detection, namely temporally shifting variograms.

Review of previous approaches to change detection
Detection of environmental change is a huge area of research which cannot easily be reflected in an introduction. More extensive reviews of change detection methods in hydrology are available (e.g. Yue et al., 2012) and there are textbooks on trend testing in the environmental sciences in general (e.g. Chandler and Scott, 2011). The overview below will give the reader a flavour of the methods which are available, with a brief critique, to set the new method described in Sect. 1.2 in context. The choice of change detection method clearly depends on the users' aims and available data.
The majority of hydrological change detection studies use monotonic trend tests such as Mann-Kendall (details of which can be found in Yue et al., 2012) which are influenced by the amount of autocorrelation in the data as well as by the start and end points of periods to which the trends tests are applied (Hannaford et al., 2013;Chen and Grasby, 2009). This is particularly problematic when the gauging stations have relatively short records starting in a dry or wet period. For example, the UK gauging station network was largely built in the 1960s when the North Atlantic Oscillation Index (NAOI) was in a strong negative phase resulting in conditions for the UK which were drier than much of the following record. Furthermore, monotonic trend tests only provide information as to whether change has occurred over the time period being investigated and no information is gained as to the type (e.g. abrupt or gradual) or the timing of change. This is a major limitation as it makes it difficult to link a simple monotonic trend in streamflow to trends in potential drivers of change (i.e. changes in meteorological conditions or catchment properties). A further weakness of current change detection methods is that they often use indicators of flow selected a priori to characterise a particular aspect of the flow regime (e.g. the Q 95 ; 7 day minimum flow; frequency of peaks-over-threshold), which potentially introduces bias by selecting a pre-determined aspect of the flow regime.
Another approach to change detection is change-point analysis, which can be used to identify the temporal location where change occurs (e.g. Beaulieu et al., 2012, applied change-point analysis to climate variables and Jandhyala et al., 2013, reviews change-point analysis including a plethora of studies which investigated change points in the Nile river flow time series). Change-point analysis identifies the temporal location at which one or more properties of the river flow time series change abruptly (e.g. a change in the magnitude, variability or autocorrelation, etc.) but is associated with several limitations. Firstly, there is increased uncertainty about change points detected close to the start or end of the time series (due to a higher risk of false detection). Secondly, the method only detects changes in one aspect of the time series (e.g. linear trend, magnitude, variability or autocorrelation). Finally, although change-point analysis is designed to detect abrupt changes there is, in practice, great difficulty in discriminating between trends and abrupt changes (as demonstrated by Rougé et al., 2013). Jarušková (1997) provides a cautionary review of change-point detection methods for river flow data.
An alternative approach to change detection is through analysis of periodicities. There is a wide range of methods available for decomposition of time series into various components (e.g. Fourier methods, empirical mode decomposition, wavelets; see for example Labat, 2005 andSang, 2013). These approaches can detect complex non-linear patterns of variability and do not require the selection of indicators as they are normally based on the whole time series. However, such approaches normally characterise periodicities over a range of scales, rather than changes over time. It is hard to relate the change in spectral shape to the hydrological regime (Smith et al., 1998). This is indicated by recent studies in the UK which applied these methods and did not go beyond looking at the high-level drivers, particularly the NAOI (e.g. Sen, 2009;Holman et al., 2011). Similarly, Kumar and Duffy (2009) use single spectral analysis to look at the precipitation-temperature-river flow relationship. This analysis enabled the authors to link the identified temporal changes to the southern oscillation as well as large anthropogenic influences (dam building and pumping), but did not investigate how changes in different aspects of the precipitation regime (e.g. seasonality and magnitude) influence the river flow time series.

The proposed new method
Here a novel and fundamentally different methodology for detection of hydrological change is introduced using variograms that are applied to moving windows in a river flow time series (hereafter, temporally shifting variograms, TSVs). The TSV method provides insights into how river flow dynamics evolve through time, without relying on fixed study periods or pre-determined flow indicators. This enables streamflow changes to be linked explicitly with external drivers (e.g. meteorological forcing). Variograms are able to capture the temporal dependence structure of the river flow (i.e. on average, how dependent river flow on a partic-ular day is on river flow on the preceding days). The temporal dependence structure is closely related to the amount of variability at different temporal scales in the time series and, as it is influenced by catchment characteristics (Chiverton et al., 2015), it enables inferences to be made about the precipitation-to-flow relationship in a catchment.
As previously noted in the introduction there are several methods of identifying temporal changes in river flow and a large range of indicators which could also be investigated using a moving window. The TSV has key advantages over existing methods. Firstly, the variogram can be thought of as a composite indicator which provides information about a range of aspects in the river flow time series, hence enabling a range of possible temporal changes in river flow dynamics (e.g. standard deviation and seasonality) to be captured. Variograms can also detect changes in daily river flow which other indicators may not be able to (e.g. changes in variability at a range of time scales). Furthermore the variogram is calculated using daily flow data and does not rely on the user extracting pre-conceived aspects of the river flow regime via the calculation of indicators (e.g. annual or seasonal averages, minimum or maximum flow). This enables the whole flow regime to be investigated, rather than much of the daily flow information being discarded, as is the case when calculating some indicators (e.g. annual 7 day minimum flow).
It is worth noting that there are a range of stochastic techniques which can characterise the basic autocorrelation structure of data (e.g. AR, ARIMA, etc.). These classical time series analysis approaches have been widely used to investigate hydrological behaviour (e.g. Salas et al., 1982;Montanari et al., 1997;Chun et al., 2013). Such approaches characterise temporal dependence and can also in principle be applied in moving windows (e.g. AR1 applied in 20-year moving windows by Pagano and Garen, 2005). A limitation with the classical models is that the user has to select the appropriate AR and MA parameters, a potentially subjective process, which will vary between catchments. In practice, they have not been widely used to examine changes in temporal dependence through time.
The method we propose uses variograms to characterise the autocorrelation so that the AR parameter does not need to be specified. Furthermore, variograms are designed to handle missing data which is common in river flow time series. The variogram has several defined parameters (e.g. nugget, sill and range) which characterise different aspects of the autocorrelation structure that can be used in moving window change analysis. This enables changes in several aspects of the river flow regime to be analysed in parallel.
Conventionally most trend analysis studies focus on change detection, and attribution is often based on qualitative reasoning and relies on published work to support the hypothesis (Merz et al., 2012). The TSV method enables changes in river flow (associated with changes in variogram parameters) to be quantitatively related to meteorological characteristics. This work is an attempt to provide a formal "proof of consistency" (Merz et al., 2012) that river flow changes can be associated with changes in meteorological drivers. This is an important new development, as few published studies of streamflow change have sought to explain observed patterns through links to precipitation. We acknowledge that this does not amount to full attribution without "proof of inconsistency" with other drivers (e.g. land use change), but it does provide a solid foundation for such attribution studies. In principle, the method could be used with a wider range of drivers, both natural and anthropogenic, if temporal data on, e.g. land-use change, were also available.
This study has the following objectives: develop a novel change detection method (TSV) to detect both linear and non-linear changes throughout the river flow regime; test the performance of the method by imposing artificial changes to a river flow time series; identify patterns of temporal change in rivers for a set of 94 catchments in the UK; and explain the contribution of precipitation to the detected variability in variogram parameters. This paper is structured as follows: Sect. 2 describes the data employed; Sect. 3 details the TSV method; Sect. 4 tests the TSV method using artificially perturbed river flow time series; Sect. 5 identifies the periods of change across the 94 UK catchments and Sect. 6 investigates the meteorological drivers.

Catchment selection
Near-natural UK benchmark network catchments, with only modest net impacts from artificial influences, were chosen (Bradford and Marsh, 2003). These catchments are deemed to have good data quality and therefore artificial influences will be limited. Furthermore, only catchments with a record length of 33 years or more  of daily river flow data with less than 5 % missing data were considered. Nested catchments with similar flow regimes were excluded.
This data set was used in a previous study which classified UK catchments into four classes according to their average temporal dependence structure (Chiverton et al., 2015). One of these classes was excluded from the present study; this comprises catchments which have high infiltration and storage, hence with distinctly different precipitation-to-flow relationships than the rest of the catchments. In particular, Chiverton et al. (2015) demonstrated that these catchments have very long-lasting temporal autocorrelation of over a year, largely due to the influence of groundwater storage, instead of weeks to a few months like the other catchments. To avoid this very different catchment response time overly influencing results, catchments which overlay highly productive aquifers were removed (mainly in the south-east of England). This resulted in 94 catchments, shown in Fig. 1.

Precipitation characteristics
Daily catchment-averaged precipitation values were calculated from CEH-GEAR, a 1 km 2 gridded precipitation data set (Tanguy et al., 2014) derived using the method outlined in Keller et al. (2015). From this data, characteristics which represent different aspects of the precipitation regime were calculated (Table 1).

The temporally shifting variograms methodology
Before going into the details of the method it is important to point out that this paper is not aiming to ascribe the behaviour in the global variogram as the definitive expression of the temporal dependence structure. This paper develops a method which identifies differences between variogram parameters at different time scales that represent significant changes in the temporal dependence structure that are due to meteorological drivers (or, theoretically, anthropogenic influences e.g. land management change, although this is not considered here; see also Sect. 6).
The methodology consists of four steps, as follows: transformation of river flow data for analysis using variograms (Sect. 3.1); creation of variograms for each catchment (Sect. 3.2); detection of periods of change in streamflow using TSV (Sect. 3.3); and, analysis of the influence of meteorological drivers using Pearson correlation and multiple linear regression methods (Sect. 3.4).

Data transformation
An overview of how the river flow time series has been de-seasonalised and standardised (steps 1 to 5) is provided here, but in-depth discussion can be found in Chiverton et al. (2015).
1. The river flow data were in-filled, using the equipercentile linking method (Hughes and Smakhtin, 1996), to remove periods of missing data. This was required to improve the de-seasonalisation (step 3).

2.
A log-transform of the time series was undertaken to stabilise the variance and create a near normal distribution. Values of zero were replaced by 0.001 m 3 s −1 prior to transformation. It should be noted that a variogram could be created for a river flow time series which has not been logged; however, the user would need to take care in the fitting to ensure: (a) the variogram fits the data well and (b) the shape of the variogram is not overly influenced by extreme values.
3. Seasonality was removed using Fourier representation. This was done to avoid exaggerating the temporal dependence. The de-seasonalising was carried out using the "deseasonalize" package in R, see Hipel and McLeod (2005) and Chandler and Scott (2011) for further details and illustrative examples.
4. The infilled data from step 1 were removed. The infilled data were solely used for the de-seasonalisation (step above). Since the in-filled data are associated with a greater uncertainty than the measured data, they are removed from the subsequent analysis as variograms are well suited to handling missing data.
5. Flow data were standardised for each catchment by subtracting the mean and dividing by the standard deviation of the time series. Standardising enables comparison of catchments with different magnitudes of flow.

Creating variograms
The temporal dependence structure can be represented by a one-dimensional temporally averaged variogram (see Chandler andScott, 2011, or Webster andOliver, 2007, for a detailed background on variograms). Based on the transformed, The maximum number of successive days for which the precipitation is above/below the threshold. Average length of precipitation above or below 1 mm day −1 days The average number of successive days for which the precipitation is above/below the threshold. Only periods of time greater than 2 days were analysed. Winter/summer precipitation ratio unitless The mean rainfall in December, January and February divided by the mean rainfall for June, July and August. Autumn/spring precipitation ratio unitless The mean rainfall in September, October and November divided by the mean rainfall for March, April and May.
de-seasonalised standardised flow data, an empirical semivariogram was calculated for each catchment using the average squared difference between all pairs of values which are separated by the corresponding time lag (Eq. 1 which calculated the semi-variance): where h is the lag time, Y (t i ) is the value of the transformed data at time t i and (N − h) is the number of pairs with time lag h. A variogram model was then fitted using the variofit function (from the geoR package in R and the Cressie method; Cressie, 1985) to the empirical semi-variogram to enable the following parameters to be calculated ( Fig. 2): the nugget, which is the y intercept, represents a combination of measurement error and sub-daily variability; the sill is defined as the semi-variance where the gradient of the variogram is zero. A zero gradient indicates the limit of temporal dependence and is an indicator of the total amount of temporally auto-correlated variance in the time series. The partial sill is the sill minus the nugget and shows the temporally dependent component, used herein as the sill. The range is the lag time at which the variogram reaches the sill value. Autocorrelation (gradient of the variogram) is essentially zero beyond the range. The practical range is the smallest distance beyond which covariance is no more than 5 % of the maximal covariance (time it takes to reach 95 % of the sill) (Journel and Huijbregts, 1978). As the variogram is only asymptotic to the horizontal line which represents the sill, the practical range is used herein as the range.

Detection of change in streamflows using TSV
The fundamental premise of the TSV approach is that variograms are applied in moving windows through a time series, to determine the extent to which variogram properties (which characterise the autocorrelation structure) change through time. To examine how unusual these changes are in the context of the observed streamflow record, the method determines whether variogram properties in each window are outside thresholds which encompass the 5-95 % of expected values based on the original 32-year average variogram. Periods of change (compared to the 32-year average variogram) were thus detected for the 94 catchments using the following method, applied to each catchment: 2. Calculate upper and lower thresholds (the 5th and 95th percentiles of the 1000 variograms). Several thresholds were tested and the 5th and 95th percentiles were chosen as these were found to detect an appropriate number of threshold exceedances throughout the time series.
3. Calculate parameters (see below for details) for variograms applied to 5-year overlapping moving windows (shifting by 1 year) from the original (de-seasonalised and standardised) river flow data. The values for the 5year moving windows were compared with the spread of expected values (between the 5th and the 95th percentiles) for the 32-year average variogram to see if they were above, below or inside the thresholds. Different sized windows between 1 and 10 years were analysed; 5-year overlapping windows were found to be long enough to obtain a good fitting variogram whilst being short enough not to characterise the average behaviour of the system.
Four variogram parameters were calculated. Firstly, the sill and range were calculated. However, as the data used are relatively high frequency (daily) and good quality, the value for the nugget is low (although not zero as there is measurement error and sub-daily variability) and the 5th percentile is zero. Therefore, the nugget cannot be handled in the same way as the other variogram parameters (i.e. decreases below the lower bound cannot be investigated). Instead, a new parameter, the 3-day average semi-variance (3 DASV) (average of the first three points of the semi-variogram) was defined and used to investigate changes in very short term temporal dependence. A further parameter was defined, the half-range average semi-variance (HRASV) (average of the points up to half the practical range to provide information on the intermediate temporal variability (between the 3 DASV and the sill, which is the total amount of auto-correlated variability).
It is acknowledged that there is uncertainty surrounding the variogram calculated from the river flow data. Part of the uncertainty comes from river flow measurement and part from the fitting of the variogram model. Due to the number of catchments and moving windows it is beyond the scope of this paper to do a full uncertainty analysis as discussed in Marchant and Lark (2004). Therefore a stability test was carried out in order to verify if the changes detected in the TSV method are caused by a change in the autocorrelation structure or by a few extreme points influencing how the variogram model fits the data. This is usually undertaken by doing a split test. However, due to the requirement of having a large data set to calculate the variogram, splitting the 5-year moving window in two was not deemed appropriate. Instead each data point in the 5-year moving window was randomly assigned to one of ten equal sized groups. The variogram was then fitted to the data 10 times, each time removing the data from one of the groups meaning that the variogram was fitted to 90 % of the data. This resulted in 10 values for each variogram parameter which were calculated using 90 % of the data. These points were then plotted against the variogram parameters which were calculated using 100 % of the data to provide an indication as to the stability of the variogram parameter estimates.

Relating change to the meteorological drivers
Having established patterns of temporal variability using the TSV approach, the potential meteorological drivers behind the detected changes in the variogram parameters were identified before being used to calculate how much of the change they explain.
Firstly, Pearson's product-moment correlation was calculated between the time series of each of the four variogram parameters and the time series of precipitation characteristics, calculated over the same time window. These results were used to determine the likely drivers behind each variogram parameter.
Secondly, multiple linear regression (MLR) was undertaken in order to determine how much variance in the variogram parameters could be explained by a combination of different precipitation characteristics. As precipitation characteristics are correlated with each other, a procedure which penalises extra model parameters was required. Stepwise regression, which tests whether parameters are significantly different from zero, has limitations -in particular, it can lead to bias in the parameters, over-fitting and incorrect significance tests (see Whittingham et al., 2005, for an in-depth discussion). In addition, the number and order of the potential parameters can influence the final model (Burnham and Anderson, 2002). Instead, information theory (IT) based on Akaike's information criterion (AIC) was used to analyse how much information is added by each characteristic. For each catchment the model with the lowest AIC score was used to obtain the R 2 value which provides an indication into the amount of change in the variogram parameters which can be explained by precipitation.
The relative importance of each precipitation characteristic was also investigated, providing information on which precipitation characteristics are important in explaining the changes in each variogram parameter. The relative importance was obtained by calculating the R 2 contribution averaged over orderings among regressors for each precipitation characteristic using the Lindeman-Merenda-Gold (LMG) method proposed by Linderman et al. (1980), as recommended by Gromping (2006).
Positive autocorrelation would influence the efficiency of the explanatory variables causing an overestimation of the significance. However, analysing the residuals from the MLR between precipitation and river flow (using the Durbin-Watson test for autocorrelation disturbance) showed no significant autocorrelation. Therefore, regressing against several precipitation variables with similar autocorrelation to the variogram parameters (both averaged over 5-year moving windows) was deemed to adequately remove the autocorrelation.

Testing the TSV method using artificially perturbed time series
To demonstrate the suitability of the TSV approach, it was first applied to river flow time series with known artificially perturbed periods. To identify which variogram parameters respond to changes in the river flow time series; a series of artificial changes were imposed onto a 7-year (1987 to 1994) section of the observed 32-year (1980-2012) de-seasonalised river flow time series (Fig. 3): 5-year moving windows starting between 1982 and 1994 (inclusive) will exhibit changes.
The changes were imposed on three rivers; the South Tyne in the north-east of England, the Yscir in Wales and the Tove in eastern England. These three catchments range from a relatively upland catchment with low storage (South Tyne) to a more lowland catchment with higher storage (Tove), although still a catchment with limited groundwater contribution; base-flow index (BFI) values are 0.45, 0.34 and 0.54 with drainage path slope (DPS) values of 138, 107 and 37 m km −1 for the Yscir, South Tyne and Tove, respectively (Marsh and Hannaford, 2008). The perturbations applied represent plausible scenarios of the likely types of change seen in river flow time series due to climate variability, other extrinsic drivers (e.g. land management) or a change in the gauging station.
-Increase in the standard deviation: a random, normally distributed set of numbers with a mean of zero and a standard deviation of 0.5 were added to the standardised river flow time series.
-Increase in variability: the smallest 20 % of values were decreased by 20 % whilst the largest 20 % of values were increased by 20 %.
-Increased dependence: a cosine wave with a wavelength of 365 days and amplitude of 0.5 was added to the standardised river flow time series. This increases the relationship between river flow on successive days.
-Increase in the mean: 1.0 was added to all the standardised river flow time series increasing the mean from 0 to 1.
-Periods of persistence: a 30 day period each December was forced to equal the mean.
Imposing artificial changes onto raw time series was selected as a more challenging test for the variogram change detection method, compared to applying the changes to a randomly generated artificial statistically stationary time series, as it requires the method to be able to detect changes amongst the naturally occurring variability in the time series. For all three catchments, a variogram was calculated for each 5-year overlapping moving window (i.e. 1980-1984, 1981-1985. . . 2008-2012) for the original and each of the artificial time series (Fig. 3). The variation in time of the variogram parameters provides information on whether the enforced changes in the input time series would be detected, and on which variogram parameters are affected by different types of change. Figure 4 shows the outputs of the TSV analysis for the artificially modified time series. The outputs from the three catchments were similar and therefore only the output from the South Tyne is shown, as an example.
The magnitude of change varies depending on the type of perturbation to the flow regime (Fig. 4). Variogram parameters are sensitive to realistic changes to aspects of the flow regime which can cause the parameters to exceed the 5th or 95th percentile threshold. In addition, the individual variogram parameters respond differently to each of the changes: -range: the only artificial perturbation which has a large influence on the range is the dependence. The increase in range is caused by creating dependency between flow on given days which lasts for a longer time. -sill: influenced mainly by the dependence and variability. Adding a wave also increases the difference between the largest and smallest values, hence the total amount of variability (the sill) increases.
-HRASV: mainly influenced by the standard deviation and the variability, both of which influence the variability (short term and long term respectively). In addition the persistence also has a small negative impact as this would reduce the short-term variability.
-3 DASV: influenced by the same artificial perturbation as the HRASV, however, the variability has less of an influence.

Stability analysis
Before identifying the temporal changes, the stability of the variogram parameters was analysed to investigate if certain data points have a large influence on the shape of the variogram and hence the variogram parameters. Figure 5 shows the relationship between the variogram parameters which are calculated using 100 % of the available river flow data and the same parameters calculated using 90 % of the available data. The figure highlights that there is a strong relationship between the points calculated using 90 and 100 % of the data. However, there are points which deviate greatly from the x = y gradient. The red dashed lines in Fig. 5 represent small deviations from the x = y plot which are deemed to be an acceptable amount of variation due to the removal of 10 % of the data. Any catchment which has a point or more outside these lines, for any variogram parameter, was removed. This resulted in three catchments being removed from subsequent analysis (reducing the number of catchments from 94 to 91). As well as the points outside of the red dashed lines, the range has two groups of values that exceed the length of the red dashed lines (catchments with a range of over 170 days). These two groups have large variability in the 10 values containing 90 % of the data. The large variability is probably due to the extrapolation by the model from the calculated semivariance. Due to the fact that all the values are above the 95th threshold (and therefore it is likely that they capture a true change in the range) these values were retained.
To explore the links with drivers more quantitatively, the relationship between precipitation characteristics and variogram parameters in the 5-year moving windows was calculated, with the results summarised for all catchments in Table 3.
The sill has the largest relationship with the winter to summer ratio (negative) followed by the standard deviation (positive). Although these appear contradictory, closer inspection found that the winter value seldom changed whereas the summer value increased (decreasing the winter to summer ratio), increasing the sill. The range is most correlated with the lower percentiles (negative) and the length of wet and dry periods (negative and positive respectively). Similar to the sill, the 3 DASV has the largest correlations with the standard deviation (positive), winter to summer ratio (negative), mean (positive) and 90th percentile (positive). The largest correlations are with the HRASV which is highly correlated with the percentiles (positive), SD (positive) and the mean (positive).
Each variogram characteristic has a different relationship with the precipitation characteristics (Table 3). As expected from the artificial analysis (Fig. 4) the sill, HRASV and 3 DASV are more influenced by precipitation characteristics which affect the short-term or total amount of variability in the time series (e.g. standard deviation and the different percentiles). The range is most influenced by aspects of the precipitation which enhance correlation between the river flow on successive days (e.g. length of wet and dry periods). The relationship between the precipitation characteristics and the range is usually in the opposite direction to the other variogram parameters.
The average relative importance of each indicator in predicting each variogram parameter was calculated using the LMG method. The three most important characteristics for the sill (accounting for over 30 % of the explained variance between them) are the winter to summer ratio, standard deviation and 90th percentile. The three most influential characteristics for the 3 DASV were the same as for the sill. The average length of time below and above 1 mm accounts for over 30 % of the explained variance for the range. For the HRASV, standard deviation, winter to summer ratio and the mean precipitation account for over 30 % of the explained variance. Although these key drivers have been identified, the total amount of variability in the variogram parameters which is explained by precipitation characteristics is mixed and depends on both the variogram parameter and the catchment, as shown by the spread of values of explained variance for individual catchments (Fig. 7). Table 2. Change in the median value of the potential driving characteristics for 1995-2001 and 2004-2012, compared to 1980-1994. The median value (taken from all the 91 catchments) is presented along with the significance level (if significantly different from 1980 to 1994 at or above the 95 % CI).

Discussion
The TSV method provides information about temporal changes in the whole autocorrelation structure of the daily river flow data and shows the relationship between river flow on successive days. Persistent changes in precipitation can cause the river flow regime to change in a way which will alter the autocorrelation structure and be detectable using the TSV method. This is demonstrated by the analysis of the artificially perturbed time series which showed that it is possible to identify plausible and realistic (i.e. likely to be seen in a river flow time series) changes in a river flow time series using the TSV approach. The TSV technique goes beyond monotonic change detection methods (such as the widely used Mann-Kendall test) as it does not require the whole time series (which is driven by multiple non-linear interactions) to alter in a near-linear way for change to be detected. Change in any form (e.g. gradual linear and nonlinear) can be characterised by plotting the variogram parameters over time. This is an advantage over change point analysis which is designed to detect abrupt changes. Another benefit of the TSV method is that it provides more informa- tion about the autocorrelation structure than an AR/ARMA model. Changes throughout different aspects of the river flow regime will be detected as the individual variogram parameters (sill, range, HRASV and 3 DASV) are sensitive to different types of change. Finally, the identified change is in relation to expected flow dynamics which represent the whole time period, enabling anomalous periods at the start and end of the records to be identified. Applied to 91 UK catchments, the TSV method was able to identify clear changes from the normal river flow behaviour. Changes in each variogram parameter (range, sill, HRASV and 3 DASV) characterise different aspects of the river flow regime. The range is dependent on the relationship between the flow on successive days; the value of the sill depends on the overall variability; the 3 DASV is related to the day-to-day variability and the HRASV is a combination of short-term and long-term variability. As this is a new method, the changes in the variogram parameters are discussed below in the context of previous studies, on observed changes in river flow and precipitation. This is in order to corroborate the river flow variations that the variogram parameters are detecting, as well as their meteorological drivers.
The variogram parameters exhibit different changes throughout the record. For the range there is a clear increase in the number of catchments going below the lower threshold (5 % threshold, from the 1000 river flow time series simulations) approximately between 1995 and 2001. Analysis of the perturbed time series shows that a decrease in the range is likely to be caused by a reduction in the dependence between river flow on successive days. This period was exceptionally wet (CEH, 2002) with less seasonality (Table 2). Therefore, catchments would have often been wetter, decreasing the available storage and the lag time between precipitation and river flow and hence increasing the variability in river flow. This also indicates why the number of catchments which exceed the HRASV upper threshold (95 % threshold) increases approximately between 1995 and 2001. The HRASV is influenced by standard deviation and variability in the river flow (Fig. 4), both of which will be influenced by wetter conditions in the catchment.
Post-2004 there is a large increase in the number of catchments which exceed the upper threshold for the sill. This increase is likely caused by the increase in variability of river flow after 2004 (Fig. 4). This time period experienced some of the most unusual hydrological conditions in the UK since records began: among the highest annual precipitation totals on record were recorded in 2008 (CEH, 2009) whereas January-June 2010 was the second driest since 1910. The 2010-2012 drought, one of the most severe droughts for a century  terminated abruptly, leading to widespread flooding due to the wettest April to July in England and Wales for almost 250 years . In addition, the standard deviation in the river flow was significantly larger than for both the 1980-1995 and the 1995-2001 periods. The high correlation between standard deviation and the 3 DASV explains the post-2004 increase in the number of catchments which exceed the upper threshold for the 3 DASV.
Different meteorological characteristics influence each variogram parameter. The sill, HRASV and 3 DASV are largely controlled by precipitation characteristics which influence the total amount and variability of precipitation (mean, standard deviation, 95th percentile). The range is more dependent on the length of wet and dry periods. The precipitation characteristics, on average, explain a large amount of the variability in the variogram parameters ( Fig. 7) (75, 67, 83 and 69 % for the sill, range, HRASV and 3 DASV respectively).
Although, on average, precipitation explains a large proportion of the river flow variability, there are large differences in the amount of explained variability across catchments (Fig. 7). The unexplained proportion could be caused by: (1) land management change or other human disturbances which would alter the precipitation-to-river flow relationship; (2) other meteorological characteristics not included in this paper; (3) catchment characteristics moderating how a river responds to temporal changes in precipitation; (4) unquantified error (e.g. statistical error), including assumptions made when using information theory. With regards to the first of these factors, the analysis was carried out on benchmark catchments with limited abstractions/discharges; however, it is likely that other factors will have a greater role in catchments with less natural regimes. Benchmark catchments generally have relatively stable land cover but land use changes over time cannot be ruled out. Other meteorological characteristics (potential factor number 2) could be influential, for example, temperature, which will influence the amount of snow and evapotranspiration. Snow will increase the lag time between precipitation and river flow. Furthermore if the snow melt is gradual this will act as a store of water, and the gradual release could influence the variogram, mimicking the effect of a groundwater aquifer. Snow can be important in runoff generation in upland areas of the UK, and in more low-lying settings in some winters. However, it is unlikely to make a large difference that would be discerned in the variogram of the majority of UK benchmark catchments. A change in the evapotranspiration losses over time could alter the magnitude of river flow, as well as seasonality. Assessing the role of additional meteorological characteristics is an important avenue of future work for developing the TSV methodology.
It is well documented that catchment characteristics moderate the precipitation-to-river flow relationship (e.g. Sawicz et al., 2011, andLey et al., 2011) and, more specifically, have been shown to exert a strong control over variogram properties (Chiverton et al., 2015). It therefore stands to reason that the catchment characteristics could be enhancing or damping a rivers response to changes in precipitation; influencing the non-linear precipitation to river flow relationship. This would influence the amount of variability which can be explained by multiple linear regression, and possibly explain the wide spread of explained variance between catchments in Fig. 7. The influence of catchment characteristics could explain why several studies (e.g. Hannaford and Buys, 2012;Pilon and Yue, 2002) find regional inconsistencies in observed streamflow trends in catchments with broadly similar meteorological characteristics. Therefore, the influence that catchment characteristics have on moderating how a river responds to temporal changes in precipitation needs to be established. Finally, using other methods to obtain the optimum combination of precipitation parameters (other than IT and AIC) could produce different results.
Overall, the TSV approach has been shown to be a useful tool for characterising temporal variability in river flow series, going beyond standard monotonic trend tests and relating the changes to precipitation characteristics. As the method is able to detect non-linear changes, and there are four variogram parameters which respond in different ways, a more detailed analysis of links with drivers of change can be provided. In this study, this has been done using a suite of meteorological indicators. However, the approach could also be used with other explanatory variables (e.g. land use changes, changes in artificial influences, etc.). In this way, the method could find wider application as a tool for attribution of change using, for example, the multiple working hypothesis approach (e.g. Harrigan et al., 2014).

Conclusions
This paper developed a new method of temporally shifting variograms (TSVs), for detecting temporal changes in daily river flow. The TSV approach can detect periods of change (increases and/or decreases) which result from linear or nonlinear changes. Each variogram parameter is related to a different aspect of the river flow, thus providing detailed information as to how river flow dynamics have changed through time.
There are distinct time periods when there is a large increase in the number of UK benchmark catchments exceeding a threshold (around 1995-2001 for the range and HRASV and post-2004 for all of the variogram parameters). The changes between 1995 and 2001 are attributed to an increase in precipitation; increasing the wetness of the catchment. Increased wetness reduced the amount of short-term (< half the range) variability which is removed by the catchment characteristics. The period after 2004 incorporated some of the most variable precipitation on record, influencing all of the variogram parameters. Meteorological factors explained a large proportion of the variability in the variogram parameters (75, 67, 83 and 69 % for the sill, range HRASV and 3 DASV respectively). The amount of unexplained variability is potentially caused by catchment characteristics moderating how a river responds to temporal changes in atmospheric conditions. This paper has demonstrated that TSV analysis enables changes in river flow dynamics to be characterised. The method will detect a wide range of changes (trends, variations in variability or standard deviation and step changes); the larger the magnitude of the change the less time is needed before the variogram parameters will exceed the thresholds. The principal advantages to the variograms are: the method is not influenced by the start and end points; changes near the start or the end of the record can be identified; nonlinear changes can be detected and the four variogram parameters capture different aspects of the river flow dynamics. Variograms could also be used to identify the impact that catchment characteristics have on moderating how a river responds to temporal changes in precipitation, which could be valuable information for enabling detailed catchment management plans to be drawn up at a local level in a nonstationary environment.