How is Baseflow Index (BFI) impacted by water resource management practices?


Abstract. Water resource management (WRM) practices, such as groundwater and surface water abstractions and effluent discharges, may impact baseflow. Here the CAMELS-GB large-sample hydrology dataset is used to assess the impacts of such practices on Baseflow Index (BFI) using statistical models of 429 catchments from Great Britain. Two complementary modelling schemes, multiple linear regression (LR) and machine learning (random forests, RF), are used to investigate the relationship between BFI and two sets of covariates (natural covariates only and a combined set of natural and WRM covariates). The LR and RF models show good agreement between explanatory covariates. In all models, the extent of fractured aquifers, clay soils, non-aquifers, and crop cover in catchments, catchment topography, and aridity are significant or important natural covariates in explaining BFI. When WRM terms are included, groundwater abstraction is significant or the most important WRM covariate in both modelling schemes, and effluent discharge to rivers is also identified as significant or influential, although natural covariates still provide the main explanatory power of the models. Surface water abstraction is a significant covariate in the LR model but of only minor importance in the RF model. Reservoir storage covariates are not significant or are unimportant in both the LR and RF models for this large-sample analysis. Inclusion of WRM terms improves the performance of some models in specific catchments. The LR models of high BFI catchments with relatively high levels of groundwater abstraction show the greatest improvements, and there is some evidence of improvement in LR models of catchments with moderate to high effluent discharges. However, there is no evidence that the inclusion of the WRM covariates improves the performance of LR models for catchments with high surface water abstraction or that they improve the performance of the RF models.
These observations are discussed within a conceptual framework for baseflow generation that incorporates WRM practices. A wide range of schemes and measures are used to manage water resources in the UK. These include conjunctive-use and low-flow alleviation schemes and hands-off flow measures. Systematic information on such schemes is currently unavailable in CAMELS-GB, and their specific effects on BFI cannot be constrained by the current study. Given the significance or importance of WRM terms in the models, it is recommended that information on WRM, particularly groundwater abstraction, should be included where possible in future large-sample hydrological datasets and in the analysis and prediction of BFI and other measures of baseflow.

Introduction
Baseflow, defined as streamflow fed from the deep subsurface and shallow subsurface storage between precipitation and/or snowmelt events (Tallaksen, 1995;Price, 2011;Zhang et al., 2017;Singh et al., 2019;Gnann et al., 2019), is a hydrological phenomenon that represents a whole catchment response to meteorological and other environmental signals (Bloomfield et al., 2011). It is important because it sustains surface flows, particularly during relatively dry periods and droughts (Smakhtin, 2001;Miller et al., 2016), supports ecological flows and ecosystem functioning (Poff et al., 1997;Boulton, 2003), and is a factor in regulating streamflow quality and temperature (Jordan et al., 1997;Gomez-Velez et al., 2015;Hare et al., 2021). It integrates the outcomes of a wide range of natural and human-influenced surface and subsurface catchment processes (Gnann et al., 2019) that include geomorphological controls related to surface topography (Santhi et al., 2008) and soil processes (Vivoni et al., 2007;Price et al., 2011;Singh et al., 2019;Yao et al., 2021) and (hydro)geological processes that control baseflow (Longobardi and Villani, 2008;Bloomfield et al., 2009;Kuentz et al., 2017;Carlier et al., 2018). Land use and land cover (LULC) change may also have profound effects on baseflow generation (Zhang and Schilling, 2006;Wang et al., 2014), including effects of changing forest cover and agriculture (Juckem et al., 2008;Ahiablame et al., 2017;Zhang et al., 2017) and urbanization (Simmons and Reynolds, 1982;Chang, 2007;Dow, 2007;McGrane, 2016). Through these processes, the dynamics of baseflow generation are modulated by meteorological variability over a range of spatial and temporal scales (Beck et al., 2013;Van Loon and Laaha, 2015;Longobardi and Van Loon, 2018) including large-scale circulation patterns (Cheng et al., 2021).
There is also growing evidence for the potential impact of climate change on baseflow across a variety of climate and catchment settings (Wang et al., 2014;Ficklin et al., 2016;Ahiablame et al., 2017;Zhang et al., 2019), and it has been proposed that this should be viewed in the context of increasing sensitivity of changes in droughts and low flows to wider anthropogenic influences (Van Loon et al., 2016;Sankarasubramanian et al., 2020).
Despite this extensive work on baseflow generation dynamics, Gnann et al. (2019) observed that there is still no general theory to explain variations in baseflow between catchments despite the strong evidence that it is largely controlled by the interaction of climate and landscape processes. They explored the role of climate in baseflow generation using baseflow data from the United States of America (USA) and the United Kingdom (UK) and found that in humid settings baseflow can be highly variable due to variations in catchment storage and wetting potential, whereas in more arid settings baseflow has much lower variability and is primarily controlled by vaporization limits. In a complementary study of 435 catchments across the contiguous US and the UK, Yao et al. (2021) found that soil water storage capacity is an important control on baseflow and that, generally, BFI increases with storage capacity for a given climate condition and decreases with aridity for a given storage capacity.
In addition to climate and catchment controls on baseflow, there is evidence that baseflow may be impacted by water resource management (WRM) practices. Here "WRM practices" is a loosely defined term that encompasses a wide range of activities related to the management of groundwater and surface water resources that are specifically distinct from wider "human influences" or "human activities" (Zhang et al., 2019;Mo et al., 2021) that affect LULC, such as urbanization, deforestation, and land-management practices. Wang and Cai (2009) referred to WRM practice as "direct human interferences". Some examples of WRM practices include abstraction and discharge, changes in conveyance of streams due to changes in channel structure, for example, for damming, flow regulation and flood management, and development of structures for water storage within catchments including dams and artificial wetlands.
Using a baseflow recession analysis, Wittenberg (2003) identified reduced baseflow resulting from abstraction for summer irrigation in a catchment in Turkey but only saw a limited effect of abstractions for agricultural irrigation on baseflow in a catchment in Germany. The latter was attributed to the location of the abstractions within the catchment (abstractions were primarily near the watershed) and to the fact that the abstracted groundwater was not entirely lost to the groundwater balance (with lowered evapotranspiration stress, relative to the Turkish case study, associated with the irrigation contributing to recharge). Using an empirical analysis of baseflow recession, Wang and Cai (2009) modelled the impact of abstraction and effluent returns on streamflow in a catchment in Illinois, USA. They found that the WRM practices significantly altered the recession process and low-flow hydrograph characteristics (compared with land-use change processes that affected both the rising and falling limbs of the hydrograph and peak flows) and showed that effluent returns caused a significant increase in low-flow (Q5) magnitude but decreased low-flow variability. In a statistical analysis of trends in baseflow in a catchment in Florida, USA, Weber and Perry (2006) documented a long-term decline in baseflow and spring flows. They assessed the possible effects of changes in rainfall, LULC, and groundwater abstraction but concluded that the primary cause of decline in baseflow and spring flow was over-abstraction of groundwater. Thomas et al. (2013) emphasized the importance of taking "human interference" into account when estimating the baseflow recession constant after documenting higher baseflow recession constants associated with groundwater withdrawals from catchments in New Jersey, USA. They also noted that the location, size, and degree of confinement of abstractions affected the degree to which streamflow was impacted.
Large abstractions of groundwater close to streams resulted in larger impacts on streamflow than smaller abstractions from more distant locations, and abstractions from unconfined aquifers had larger impacts than from confined aquifers. A number of modelling studies have simulated the impact of abstraction and other WRM practices on baseflow (Kirk and Herbert, 2002;Parkin et al., 2007;Sanz et al., 2011;de Graaf et al., 2014). For example, de Graaf et al. (2014) calibrated the PCR-GLOBWB model with a dynamic allocation scheme to simulate surface water and groundwater abstractions and corresponding feedbacks.
They found that impacts of WRM were experienced during periods of low flows when the contribution of groundwater through baseflow is the largest and that return flows changed the timing and duration of the low-flow periods, causing baseflow to be maintained for longer. In summary, as with natural controls on baseflow (Gnann et al., 2019), there is as yet no general theory to explain the effects of WRM practices on baseflow, and the effect of a given WRM practice on baseflow may be contingent on a range of factors including climate, (hydro)geological setting, location, and timing of the activity.
To date, there have been no large-sample, data-led analyses of the impacts of WRM practices on baseflow. This is despite new opportunities being offered to investigate and quantify catchment processes through open access, large-sample hydrology datasets. Such datasets have been used to provide insights into catchment processes and functioning across multiple climate and catchment settings (Beck et al., 2013;Ochoa-Tocachi et al., 2016;Fouad et al., 2018;Gnann et al., 2019;Dudley et al., 2020). CAMELS-GB, a recently published large-sample hydrology dataset for Great Britain (GB) (Coxon et al., 2020a;b), is unusual in that it contains quantitative information on WRM practices including surface water and groundwater abstractions, discharges, and reservoir numbers and capacities at the catchment scale. The aim of the present study is to use the CAMELS-GB large-sample dataset to identify which, if any, of these WRM activities influence baseflow; to assess the importance of these activities in the context of other factors known to influence baseflow, such as meteorology, catchment hydrogeology, catchment physiography, and LULC (Price, 2011); and to investigate if WRM factors are important in any particular catchment or management settings. More generally, this study also directly addresses the need to improve understanding of the impact of human activities on the water cycle in the UK.
As Price (2011) has noted, there are four broad approaches to quantify baseflow, as follows: low-flow event time series, flow-duration statistics, baseflow recession analysis, and metrics of the proportion of baseflow to total flow, also known as baseflow indices. This study takes the last approach and specifically uses the two measures of Baseflow Index (BFI) reported in CAMELS-GB (Coxon et al., 2020a;b). BFI is the ratio of baseflow volume to total flow volume expressed as a fraction (Nathan and McMahon, 1990) and can be estimated by hydrograph separation using a wide range of tracer-based and non-tracer methods (Eckhardt, 2008;Gonzales et al., 2009;Price et al., 2011). The two measures of BFI in CAMELS-GB both use non-tracer-based methods, specifically a digital filter (Lyne and Hollick, 1979) and a graphical/statistical method (Gustard et al., 1992;Piggott et al., 2005). The former, although it is not based on the physics of discharge processes, produces objective and reproducible estimates of BFI (Cheng et al., 2021), while the latter has been used previously to characterize BFI across the study area (Bloomfield et al., 2009).
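The definition of BFI as a volume ratio, and the idea of a recursive digital filter, can be illustrated with a minimal sketch. The code below is a simplified single forward pass of a Lyne and Hollick type filter and is not the CAMELS-GB or Ladson et al. (2013) implementation; published implementations typically apply several forward and backward passes and treat the ends of the record differently. The filter parameter value (0.925) and the flow series are assumptions chosen for illustration only.

```python
def lyne_hollick_baseflow(flow, alpha=0.925):
    """Single forward pass of a Lyne-Hollick style recursive filter.

    flow  : list of daily flows (any consistent unit)
    alpha : filter parameter (assumed value, not taken from the paper)
    Returns the estimated baseflow series.
    """
    baseflow = [flow[0]]          # assume the first value is all baseflow
    quick_prev = 0.0              # filtered quickflow at the previous step
    for t in range(1, len(flow)):
        quick = alpha * quick_prev + 0.5 * (1.0 + alpha) * (flow[t] - flow[t - 1])
        # Constrain quickflow so that 0 <= baseflow <= total flow.
        quick = min(max(quick, 0.0), flow[t])
        baseflow.append(flow[t] - quick)
        quick_prev = quick
    return baseflow


def baseflow_index(flow, alpha=0.925):
    """BFI = baseflow volume / total flow volume, expressed as a fraction."""
    return sum(lyne_hollick_baseflow(flow, alpha)) / sum(flow)


# Hypothetical flow series: a steady flow and a short storm event.
steady_flow = [2.0] * 10
storm_flow = [1.0, 1.0, 5.0, 3.0, 2.0, 1.5, 1.0, 1.0]

steady_bfi = baseflow_index(steady_flow)   # no quickflow is separated
storm_bfi = baseflow_index(storm_flow)     # part of the peak is quickflow
```

A steady hydrograph is treated entirely as baseflow, whereas the storm peak lowers the index, which is the behaviour the ratio definition implies.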
Two statistical models (multiple linear regression, LR, and machine learning using random forests, RF) are used here to investigate the relationships between the two estimates of BFI and WRM and other catchment covariates in the CAMELS-GB dataset. Although studies of BFI typically consider multiple baseflow filters to reduce uncertainty in estimates of BFI (Chen and Teegavarapu, 2020;Kissel and Schmalz, 2020;Zhang et al., 2020), the present study is designed neither to assess the relative efficacy of the filters used to estimate BFI, nor to compare the respective efficacy of the chosen statistical models in estimating baseflow: this is not a model inter-comparison study (Refsgaard and Knudsen, 1996). Instead, the estimates of BFI and the modelling approaches are designed to provide complementary evidence for the nature and importance (or not) of WRM practices on influencing BFI based on the published CAMELS-GB data.
Study area and data

Study area
This study focuses on 429 catchments across GB (Fig. 1) covering a wide range of climate-landscape-water management features (Fig. 2). Catchments in the north and north-west of the study area tend to have higher mean elevations than those in the south and south-east (Coxon et al., 2020a). Meteorology tends to reflect the broad gradient in catchment physiography, with wet and cooler conditions typically prevalent in the north and west of the study area compared with relatively dry and warmer conditions in the south-east (Fig. 2a). The dominant land cover also reflects the prevailing physiographic and meteorological conditions, with grass cover predominating in the north and west and crop cover in the south and east, with urban land cover dominant in London and the other large cities of central and northern England (Fig. 2b). High-productivity aquifers are found in the south-east and east (Fig. 2c; Bloomfield et al., 2009;Marchant and Bloomfield, 2018), whereas less productive aquifers and non-aquifers are generally more extensive in the west and north-west. Catchments in which clay-dominated soils overlie mudrock and clay bedrock formations and catchments with extensive glacial till deposits are present in central and eastern areas (Fig. 2d) (Bloomfield et al., 2009;Bricker and Bloomfield, 2014).
Groundwater is used throughout England and forms on average about 30 % of the public supply, as well as being used extensively for agricultural irrigation and industrial supplies (Ascott, 2017). For 2017 (the last year of published abstraction data) abstractions from all sources (except tidal) in England totalled 10 395 Mm³ (million cubic metres), with 8350 Mm³ from surface waters and 2044 Mm³ from groundwater. Just over half of all these abstractions were used for public supply (5332 Mm³) (UK Government, 2020). Regionally, groundwater use is more important in southern and eastern England where groundwater abstraction may contribute 100 % of public supply. Consequently, there is a tendency for more extensive surface water abstraction in the north and more groundwater abstraction in the south-east (Fig. 2e and f) (Coxon et al., 2020b). Effluent discharges are generally relatively high in catchments in and near major urban centres such as London, central England, and across parts of the north-west (Fig. 2b and g), while the highest reservoir capacity is generally associated with catchments in northern and western parts of the study region (Fig. 2h).

Data
The data used in this study have been taken from the CAMELS-GB large-sample hydrology dataset for Great Britain (GB) (Coxon et al., 2020a, b), itself part of the wider CAMELS (Catchment Attributes and MEteorology for Large-sample Studies) initiative (Newman et al., 2015;Addor et al., 2017;Alvarez-Garreton et al., 2018;Chagas et al., 2020). CAMELS-GB is unique in that it contains human influence attributes for some catchments, and it is that subset of catchments which is used here. These initially consisted of 442 catchments for which there are "human influence attributes" (Coxon et al., 2020a, Table 2). However, these were further reduced to 429 catchments (Fig. 3) following a consideration of the estimates of BFI that are available for those catchments and the availability of data for the covariates of interest, as described below.
BFI is a hydrological signature (McMillan, 2021) that can be estimated using a wide range of techniques. CAMELS-GB contains two estimates of baseflow. One index, "baseflow_index_ceh" (BFI_CEH) (Fig. 3a), is derived using a method developed by the UK Centre for Ecology & Hydrology and has been used in previous studies of baseflow and flow regimes in Great Britain (Gustard et al., 1992;World Meteorological Organization, 2008). The other, "baseflow_index" (BFI_LH) (Fig. 3b), was estimated by baseflow separation using the Lyne and Hollick digital filter (Lyne and Hollick, 1979) as implemented by Ladson et al. (2013). A comparison of the two CAMELS-GB baseflow indices (Fig. 3c) confirms the common observation that different techniques used for baseflow separation influence the estimated indices (Nathan and McMahon, 1990;Eckhardt, 2008;Beck et al., 2013;Addor et al., 2017). There are often large uncertainties in the underlying streamflow data used to estimate BFI (Coxon et al., 2015), but these are difficult to characterize across large samples of catchments, and uncertainty estimates are not available for all the CAMELS-GB catchments (Coxon et al., 2020b). However, BFI typically has lower uncertainty compared with other hydrological signatures, as it is based on temporal averaging (Westerberg and McMillan, 2015), and typically only small differences between the BFI estimates from the two methods are observed in the present study (Fig. 3).
Given that the true BFI for any given catchment is unknown, catchments for analysis in this study have been selected where there is a reasonable agreement between the two baseflow indices. A total of 10 catchments were removed where there is an absolute difference between BFI_CEH and BFI_LH of greater than 0.14, equivalent to the largest 2.5 % of the absolute differences in the population. A further three catchments were removed due to missing covariate data, leaving the 429 catchments for analysis (Figs. 1 and 3). Coxon et al. (2020b) note that the CAMELS-GB baseflow indices have been estimated for flow time series available during water years from 1 October 1970 to 30 September 2015 but that individual time series lengths and completeness may vary between catchments. On average, flow records for the 429 catchments are 91 % complete, with only 48 catchments < 75 % complete. No sites have been omitted from the analysis based on the length of their flow records. Figure 3c shows that there is a generally good linear agreement between the two estimated BFI indices. However, for BFIs below 0.7, BFI_CEH is systematically lower than BFI_LH, and for BFIs above 0.7, BFI_CEH is systematically higher than BFI_LH. In addition, for sites above a BFI of about 0.7, the correlation between the two indices is reduced.
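The agreement screen between the two indices amounts to a simple threshold on the absolute difference. The sketch below shows the idea on invented records; the catchment identifiers and BFI values are hypothetical and are not taken from CAMELS-GB, and only the 0.14 threshold comes from the text.

```python
# Threshold on |BFI_CEH - BFI_LH| stated in the text.
MAX_ABS_DIFF = 0.14

# Hypothetical records: (catchment identifier, BFI_CEH, BFI_LH).
records = [
    ("A", 0.45, 0.50),
    ("B", 0.90, 0.70),   # indices disagree strongly, so this one is dropped
    ("C", 0.62, 0.60),
]


def screen_catchments(rows, max_abs_diff=MAX_ABS_DIFF):
    """Keep catchments whose two BFI estimates differ by no more than the threshold."""
    return [row for row in rows if abs(row[1] - row[2]) <= max_abs_diff]


retained = screen_catchments(records)
retained_ids = [row[0] for row in retained]
```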
A total of 21 of the CAMELS-GB catchment attributes (Coxon et al., 2020a) related to catchment physiography, climate, hydrogeology, land cover, and soils as well as WRM practices have been selected as covariates for analysis (Table A1). The spatial distribution of selected covariates is provided in Fig. 2 and described in Coxon et al. (2020b). The 21 CAMELS-GB covariates used in this study have been selected to be representative of each of the major components in a new conceptual model of baseflow generation (Fig. 4) and are consistent with the recently proposed, broader perceptual hydrological model for GB. Five WRM covariates from the CAMELS-GB dataset have been selected for analysis: groundwater abstraction (groundwater_abs), surface water abstraction (surfacewater_abs), effluent discharges (discharges) to streams, and the number and capacity of reservoirs within catchments (num_reservoirs and reservoir_cap). Note that the discharge term only accounts for effluent from sewage treatment works and does not provide information on other water returns (Coxon et al., 2020b).
Price (2011) presented a conceptual model that illustrated how components of the terrestrial water cycle and specific catchment processes are related to baseflow based on stores and flows of water in catchments. It did not, however, incorporate WRM concepts and how these might influence or modify baseflow. In addition, it did not include aspects of catchment physiography as it focussed on catchment inputs, storage, and losses. Figure 4 is a revised conceptual diagram (building on Price et al., 2011) indicating conceptual relationships between baseflow, catchment compartments, and processes that lead to baseflow generation, including aspects of WRM. It conceptualizes WRM practices as simple high-level flows between groundwater, streamflow, and components of storage. Some flows may be significant within a given catchment, such as mains leakage (conceptualized in Fig. 4); however, these are outside the current analysis as there is no information for these flows in CAMELS-GB.

Modelling methods
Modelling is used in this study not for predictive purposes but to explore model structures and performance to assess the evidence for the relative importance (or not) of WRM practices in influencing BFI. Two complementary modelling schemes, a multiple linear regression (LR) scheme and a random forest scheme (RF), have been applied to two estimates of BFI (BFI_LH and BFI_CEH) using two sets of covariates (Set A and Set B). Set A consists of the 16 natural covariates, and Set B consists of all 21 CAMELS-GB covariates, i.e. the combined natural and human influence covariates (Table A1). Consequently, eight models (Models 1 to 8) have been developed and evaluated. The LR and RF models are first calibrated for the Set A covariates (Models 1 to 4), then a second separate calibration is undertaken using Set B covariates (Models 5 to 8). The resulting model structures are investigated and their performance in estimating observed BFI compared without and with WRM covariates to understand the influence of WRM covariates on BFI.
The accuracy of the model estimates has been assessed using the root-mean-square error (RMSE) and by calculating Lin's concordance coefficient (Lin, 1989) for the predicted and measured values. Lin's coefficient indicates the degree of similarity between two variables, where

$$\rho_c(x, y) = \frac{2\,\rho(x, y)\,\sqrt{\mathrm{var}(x)\,\mathrm{var}(y)}}{\mathrm{var}(x) + \mathrm{var}(y) + (\mu_x - \mu_y)^2},$$

and where $\rho_c(x, y)$ is Lin's concordance coefficient for variables $x$ and $y$, $\rho(x, y)$ is Pearson's coefficient for the same variables, $\mathrm{var}(x)$ is the variance of $x$, and $\mu_x$ is the mean of $x$. Lin's concordance coefficient can take values between −1 and 1. A value of 1 indicates an exact match between the two variables, and the $(\mu_x - \mu_y)^2$ term means that variables with different mean values have a smaller coefficient value. By contrast, the more standard Pearson correlation coefficient only indicates the explanatory power of a linear relationship between the two variables, so that perfectly correlated variables can have vastly different mean values. Lin's concordance coefficient is calculated both to assess the accuracy of a given model at replicating the training data and in a 10-fold cross-validation procedure to explore the model accuracy at locations that were not used in calibration. If Lin's coefficient is substantially smaller upon cross-validation, then this could be an indication that the model is overfitted.
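The difference between Lin's concordance coefficient and Pearson's coefficient is easiest to see numerically. The following minimal sketch computes RMSE and Lin's coefficient using population (1/n) variances, which is an assumption about the convention used; the observed and predicted values are invented for illustration.

```python
from math import sqrt
from statistics import fmean, pvariance


def rmse(observed, predicted):
    """Root-mean-square error between two equal-length sequences."""
    return sqrt(fmean([(o - p) ** 2 for o, p in zip(observed, predicted)]))


def lin_concordance(x, y):
    """Lin's concordance coefficient.

    Uses 2 * cov(x, y) in the numerator, which equals
    2 * pearson(x, y) * sqrt(var(x) * var(y)).
    """
    mean_x, mean_y = fmean(x), fmean(y)
    cov_xy = fmean([(a - mean_x) * (b - mean_y) for a, b in zip(x, y)])
    return 2.0 * cov_xy / (pvariance(x) + pvariance(y) + (mean_x - mean_y) ** 2)


# Hypothetical observed BFI values and two sets of predictions.
observed = [0.2, 0.4, 0.6, 0.8]
exact = list(observed)                      # perfect predictions
biased = [v + 0.1 for v in observed]        # perfectly correlated but offset

ccc_exact = lin_concordance(observed, exact)
ccc_biased = lin_concordance(observed, biased)
rmse_biased = rmse(observed, biased)
```

The offset predictions have a Pearson coefficient of 1 but a concordance coefficient below 1, because the difference in means is penalized in the denominator.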

Linear regression
Regression is commonly used to model the effect of a given set of covariates on a variable of primary interest (Fahrmeir et al., 2013). Here generalized linear regression (Dobson, 2002) is used to investigate the relationship between BFI_LH and BFI_CEH and the 21 catchment covariates. A logit transformation was applied to the BFI data, as

$$y_i = \log\left(\frac{z_i}{1 - z_i}\right),$$

where $z_i$ is the BFI of catchment $i$. This is to ensure the fitted, back-transformed BFI values are constrained between 0 and 1. The model considered in this paper is a linear mixed model with the following form:

$$Y = \sum_{j=1}^{p} \beta_j X_j + \varepsilon,$$

where $Y = (y_1, \ldots, y_n)^{\mathrm{T}}$ denotes the column vector of transformed BFI values from $n$ catchments, and $X_j = (x_{j1}, \ldots, x_{jn})^{\mathrm{T}}$, $j = 1, \ldots, p$, are the column vectors of the covariates (catchment attributes). The column vector $\varepsilon$ represents the model residuals, which are assumed to follow a normal distribution with covariance matrix $\sigma^2 R$, where $R$ reflects the correlation between transformed BFI values. The linear sums of covariates in a linear mixed model are referred to as the fixed effects and the residual term as the random effects.
Figure 4. Conceptual model of the relationships between the major compartments of the terrestrial water cycle that exert an influence on baseflow. Baseflow and storm flow components are highlighted in blue, driving climatology, catchment characteristics and compartments are shaded in green, and human influences within the conceptual model are shaded in orange and grey (the latter outside the scope of this study). The 21 CAMELS-GB covariates and the two BFI parameters used in this study are listed against their respective compartments within the conceptual framework.
In this paper, the model parameters $\beta = (\beta_1, \ldots, \beta_p)^{\mathrm{T}}$ are estimated using the generalized least-squares estimator (Dobson, 2002):

$$\hat{\beta} = \left(X^{\mathrm{T}} R^{-1} X\right)^{-1} X^{\mathrm{T}} R^{-1} Y,$$

where $X$ is the $n \times p$ matrix whose columns are the covariate vectors $X_j$. These parameter values maximize the likelihood or probability that the data would have arisen from the estimated model. Standard linear regression requires the assumption that the residuals are independent and identically distributed (iid) and that the correlation matrix is equal to the identity matrix, $I$. Such an assumption can be inappropriate for landscape measurements, as they are not selected according to a randomized design and are often correlated in space as a result of the underlying geology and climate, etc. In particular, the BFI measurements made from locations closer to each other are more likely to share some similarity than those a long distance apart. If this correlation is ignored, then the significance of some model terms could be exaggerated.
A further issue is deciding which of the available covariates should be included. If too few covariates are included, then some of the key drivers of BFI variation might be missed, and the predictions that result might be imprecise. If too many covariates are included, then the model might be overfitted. Some of the terms in an overfitted model replicate the random variation of the BFI values within the calibration data rather than generally applicable relationships between BFI and the covariates. Such a model can accurately predict the BFI for the sites used in calibration but performs less well for other data. The addition of a covariate to a model generally increases the maximized likelihood, even in the absence of a true relationship between that covariate and the property of interest. The addition cannot decrease this likelihood because the original model can be achieved if $\beta_{p+1} = 0$. A statistical criterion must be used to decide whether the increase in likelihood upon the addition of a parameter is sufficient to justify the inclusion of that term.
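The logit transformation and the generalized least-squares estimator described above can be sketched together with NumPy. This is an illustration under stated assumptions and is not the R workflow used in the study: the catchment coordinates, the single "aridity" covariate, the coefficient values, and the correlation range parameter are all invented, and the exponential form of R is just one possible choice of correlation matrix.

```python
import numpy as np


def logit(z):
    """Map BFI values in (0, 1) onto the real line."""
    return np.log(z / (1.0 - z))


def inv_logit(y):
    """Back-transform so that fitted values lie between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-y))


def exponential_correlation(coords, phi):
    """R[i, j] = exp(-d_ij / phi), with d_ij the Euclidean distance."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return np.exp(-d / phi)


def gls_estimate(X, y, R):
    """beta_hat = (X' R^-1 X)^-1 X' R^-1 y, using solves instead of inverses."""
    Rinv_X = np.linalg.solve(R, X)
    Rinv_y = np.linalg.solve(R, y)
    return np.linalg.solve(X.T @ Rinv_X, X.T @ Rinv_y)


# Hypothetical catchment centroids (km) and one hypothetical covariate.
coords = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 15.0], [20.0, 20.0], [35.0, 5.0]])
aridity = np.array([0.4, 0.6, 0.5, 0.9, 0.7])
X = np.column_stack([np.ones(len(aridity)), aridity])   # intercept + covariate

# Noise-free synthetic BFI generated from assumed coefficients.
true_beta = np.array([1.0, -2.0])
bfi = inv_logit(X @ true_beta)

R = exponential_correlation(coords, phi=25.0)            # assumed range parameter
beta_hat = gls_estimate(X, logit(bfi), R)
fitted_bfi = inv_logit(X @ beta_hat)
```

Because the synthetic data contain no noise, the estimator recovers the assumed coefficients for any valid R; with real data the choice of R changes both the estimates and their standard errors. Setting R to the identity matrix reduces the estimator to ordinary least squares.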
The modelling procedure consists of three steps. In the first step, given the candidate covariates, variable selection is carried out using the stepwise selection routine based on the Akaike information criterion (AIC; Akaike, 1973). The AIC is twice the negative log-likelihood of the model plus 2 times the number of model parameters:

$$\mathrm{AIC} = -2\,\ell + 2k,$$

where $\ell$ is the maximized log-likelihood of the model and $k$ is the number of model parameters. The model with the lowest AIC is considered to be the best compromise between accuracy and complexity. The forwards selection routine starts with a model containing no covariates. Each candidate covariate is considered in turn, and the AIC that results from its addition to the model is recorded. The covariate which leads to the largest decrease in AIC is added to the model. The iterative procedure continues until none of the remaining covariates lead to a decrease in AIC. This procedure is initially conducted assuming independent residuals (i.e. $R = I$) and is implemented using the "step" function from the R package "stats". In the second step, spatial correlation is assessed by calculating empirical variograms (Cressie, 1993) of the model residuals using the "variogram" function from the R package "gstat". The variogram indicates how the expected squared difference between a pair of residuals varies according to their distance apart. Finally, a model including spatial correlation in the residuals is estimated when inspection of the variogram indicates that this is necessary. Specifically, the spatial correlation is reflected by the non-zero off-diagonal elements in the correlation matrix, $R$, which correspond to the values from an exponential correlation function (i.e. $r(d_{ij}) = \exp(-d_{ij}/\phi)$, where $d_{ij}$ is the Euclidean distance between two catchments $i$ and $j$, and $\phi$ is an estimated model parameter). The model with spatial correlation can be estimated by residual maximum likelihood (REML; Lark et al., 2006) using the "gls" function from the R package "nlme".
The statistical significance of each covariate included in the model (i.e. whether the corresponding regression coefficient is significantly different to zero) is recorded for p values of 0.1, 0.05, and 0.001.
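The second and third steps of the modelling procedure rely on two simple quantities, which can be sketched as follows. This is an illustrative Python sketch, not the "gstat" or "nlme" implementation; the function names and the binning scheme are assumptions, and the variogram is returned as a semivariance (half the mean squared difference), which is the usual convention.

```python
import math


def exponential_correlation(coords, phi):
    """Correlation matrix with entries r(d_ij) = exp(-d_ij / phi) for Euclidean distances d_ij."""
    n = len(coords)
    return [
        [math.exp(-math.dist(coords[i], coords[j]) / phi) for j in range(n)]
        for i in range(n)
    ]


def empirical_variogram(coords, residuals, bin_width, n_bins):
    """Return (lag centre, semivariance) per distance bin; semivariance is None for empty bins."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            b = int(d // bin_width)  # bin b covers distances in [b * width, (b + 1) * width)
            if b < n_bins:
                sums[b] += 0.5 * (residuals[i] - residuals[j]) ** 2
                counts[b] += 1
    return [
        ((b + 0.5) * bin_width, sums[b] / counts[b] if counts[b] else None)
        for b in range(n_bins)
    ]
```

A semivariance that rises with lag distance before levelling off suggests spatially correlated residuals, in which case a correlation matrix of the form returned by `exponential_correlation` would replace the identity matrix.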

Machine learning
LR models require assumptions about the nature of baseflow variation that can restrict the patterns of variation which the model can represent. In the past few decades, machine learning (ML) methodologies have become increasingly popular for representing complex environmental variation (e.g. Hengl et al., 2018; Lange and Sippel, 2020; Nearing et al., 2020). ML algorithms lead to considerably more flexible relationships between environmental variables. For example, regression trees recursively partition observation locations according to a series of binary tests on their covariate values. Each location enters the tree at the initial decision node and then follows one of two branches according to the result of the initial test. Each branch leads to a network of further decision nodes and tests until the location is allocated to a terminal node. The predicted value of the environmental variable at an unobserved location is equal to the average of the training data that are allocated to the same terminal node. The tests at each node are optimized so that the total squared error for a tree of a specified degree of complexity is minimized. Regression trees can replicate complex non-linear relationships that include interactions between different covariates, but they are prone to overfitting. A regression tree can perfectly predict the variable of interest for the training data if the number of terminal nodes is equal to the number of training observations, but it cannot be expected to perform as well when predictions are made at other locations. Overfitting can be reduced by introducing stopping criteria to the trees (e.g. each terminal node must contain a specified proportion of the training data) or by using cross-validation to decide whether a particular decision node should be included in the tree. Overfitting might be further reduced by combining an ensemble of regression trees to form a random forest (Breiman, 2001).
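The optimization of a single decision node can be illustrated with a short Python sketch. It assumes one covariate and a test of the form x <= threshold; `best_split` and `predict_stump` are hypothetical names introduced for illustration, and a full regression tree would apply the same search recursively to each branch.

```python
def sum_squared_error(values):
    """Sum of squared deviations from the mean (zero for an empty node)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y):
    """Return (threshold, total squared error) of the best binary test x <= threshold."""
    best = None
    levels = sorted(set(x))
    # Candidate thresholds lie midway between consecutive distinct covariate values.
    for low, high in zip(levels, levels[1:]):
        threshold = (low + high) / 2.0
        left = [yi for xi, yi in zip(x, y) if xi <= threshold]
        right = [yi for xi, yi in zip(x, y) if xi > threshold]
        total = sum_squared_error(left) + sum_squared_error(right)
        if best is None or total < best[1]:
            best = (threshold, total)
    return best


def predict_stump(x, y, threshold, x_new):
    """Predict at x_new as the mean of the training data in the same terminal node."""
    node = [yi for xi, yi in zip(x, y) if (xi <= threshold) == (x_new <= threshold)]
    return sum(node) / len(node)
```

With as many terminal nodes as training observations, each node mean equals a single observation, which is the overfitting case described above.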
The trees within the ensemble differ because they are estimated for a different bootstrap sample of the available data, and a different subset of the candidate covariates is considered at each decision node. The prediction of the variable of interest at a particular location is equal to the average prediction across all the trees. Addor et al. (2018) found that the inclusion of 500 trees in a random forest considerably stabilized predictions and smoothed relationships between their covariates and BFI measurements.
The random forest interprets the available data as if they were a random sample of the population of interest and does not account for spatial correlation amongst the observations. Also, the relationships implied by a random forest model cannot be stated in a simple parametric form such as Eq. (1), meaning that it can be a challenge to determine the drivers of variation. It is possible to assess the importance of each covariate by shuffling the values of that covariate amongst the observation locations and calculating the reduction in prediction accuracy. However, Schmidt et al. (2020) and Wadoux et al. (2020) advise caution when inferring causal relationships from random forest models. Wadoux et al. (2020) demonstrate that photographs of soil scientists projected across their study area can be utilized by a random forest to accurately map the soil carbon content. They suggest that knowledge discovery from ML models requires more than the recognition of patterns and successful prediction. Instead they recommend the preselection of relevant environmental covariates and the posterior interpretation and evaluation of the recognized patterns: this is the approach taken here with the selection of 21 covariates representative of the conceptual framework being analysed (Fig. 4).
Random forests are calibrated using the MATLAB "TreeBagger" function. Each forest contains 500 trees (consistent with Addor et al., 2018), the with-replacement bootstrap sample for each tree is of the same size as the set of available data, and one-third of the covariates are considered at each decision node. The TreeBagger function defines the importance of a covariate in a random forest as the increase in the mean squared error of the predictions upon shuffling of the covariate values, averaged over all trees in the ensemble and divided by the standard deviation of this increase taken over the trees.
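The importance measure described above can be sketched in Python as follows. This is not the MATLAB implementation: `trees` is a hypothetical list of prediction functions standing in for the fitted ensemble, and for simplicity the covariate is shuffled across the whole data set rather than across any out-of-bag subset.

```python
import random
import statistics


def mean_squared_error(predict, rows, y):
    """Mean squared error of one prediction function over all observations."""
    return sum((predict(row) - target) ** 2 for row, target in zip(rows, y)) / len(y)


def permutation_importance(trees, rows, y, column, seed=0):
    """Return (mean increase in MSE, increase standardized by its spread over trees)."""
    rng = random.Random(seed)
    shuffled = [row[column] for row in rows]
    rng.shuffle(shuffled)  # break the link between this covariate and the response
    permuted = [row[:column] + [value] + row[column + 1:] for row, value in zip(rows, shuffled)]
    # One increase in error per tree in the ensemble.
    deltas = [
        mean_squared_error(tree, permuted, y) - mean_squared_error(tree, rows, y)
        for tree in trees
    ]
    mean_delta = statistics.mean(deltas)
    spread = statistics.pstdev(deltas)
    standardized = mean_delta / spread if spread > 0 else None
    return mean_delta, standardized
```

Each row is a list of covariate values. A covariate that no tree uses gives a mean increase of zero, whereas shuffling an influential covariate degrades the predictions and gives a positive value.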

Linear regression model structures
Regression models were developed for both BFI_LH and BFI_CEH, with covariates from Set A (Models 1 and 2) and from Set B (Models 5 and 6). For all four models the variograms of the residuals indicated substantial spatial correlation. Therefore, the models were re-estimated by REML and included spatial correlation parameterized by an exponential function. Note that although the inclusion of the residual correlation structure does not alter the signs of the estimated coefficients, the significance of the model covariates changed. Some covariates were no longer significant after accounting for the spatial correlation. This could imply that part of the variation in BFI that was previously explained by certain covariates in the iid model may have been a result of spatial correlation. The full LR models are listed in Table A2, and the distribution of residuals for the LR models is illustrated in Fig. A1. Figure 5 shows the covariates identified as significant as well as the sign of the covariates. In this analysis, topography ("dpsbar"), climate ("aridity"), and the spatial coverage of fractured aquifers ("frac_high_perc"), of crops ("crop_perc"), and of clay soils ("clay_perc") are highly significant in all four LR models, and the spatial coverage of areas with no active groundwater systems ("no_gw_perc") is also a significant covariate in all four models, although at differing levels of significance (Fig. 5). In the LR models using Set B (Models 5 and 6), surface water and groundwater abstractions and effluent discharges are all highly significant in explaining the variations in BFI_LH and BFI_CEH, although the number ("num_reservoirs") and capacity ("reservoir_cap") of reservoirs are not significant covariates.
Urban land cover ("urban_perc"), previously noted as potentially influencing BFI in the Thames Basin in southern England (Bloomfield et al., 2009), is not a significant covariate in the LR models using covariate Set A once spatial correlation in the residuals has been accounted for and is not significant at all when WRM covariates are included in the LR models.
In the LR models, the signs of the significant natural covariates in Fig. 5 (Models 1 and 2) are consistent with current process-based understanding of the generation of baseflow (Gnann et al., 2019; Yao et al., 2021) as represented in the revised conceptual model (Fig. 4) and with previous regression models of BFI in the study area (Bloomfield et al., 2009). For example, there is a significant inverse relationship between BFI and the fraction of clay soils within catchments, the fraction of catchments underlain by rocks with essentially no groundwater, and the aridity of catchments. Conversely, all LR models indicate a significant positive correlation between BFI and the fraction of catchments underlain by fractured aquifers.
In all four LR models, Lin's concordance coefficients between the fixed-effect predictions and the observed BFI are similar upon training and validation, indicating that the models are not overfitted (Table 1). The coefficients for the models using Set A (Models 1 and 2) to predict BFI_LH and BFI_CEH are 0.75 and 0.81 respectively. There are moderate negative correlations between the residuals from these models and the surface water and groundwater abstractions and effluent discharges from Set B covariates (Table 2). There are negligible correlations between the residuals and the number and capacity of reservoirs covariates. When the WRM covariates are added to the model (Models 5 and 6), Lin's concordance coefficients increase to 0.82 and 0.85 for BFI_LH and BFI_CEH respectively (Table 1).
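For readers unfamiliar with Lin's concordance coefficient, a minimal Python sketch of its standard definition is given below. The function name is hypothetical, and population (1/n) moments are assumed; the coefficient equals 1 only when predictions and observations agree exactly, and it penalizes both scatter and systematic bias.

```python
def lins_concordance(observed, predicted):
    """Lin's concordance correlation coefficient between two equal-length sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted)) / n
    # The squared difference in means penalizes systematic over- or underestimation.
    return 2.0 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)
```

Unlike the Pearson correlation, a prediction that is perfectly correlated with the observations but offset by a constant scores below 1.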
In summary, when spatial correlation effects are taken into account, the LR models do not appear to be overfitted, show a consistent though moderate improvement in explanatory power with the addition of the WRM covariates, and indicate that groundwater and surface water abstraction and effluent discharges are all significant in explaining the variations in both the estimates of BFI.

Machine learning model structures
The relative importance of the covariates with respect to estimates of BFI are listed in Table 3 and illustrated in Fig. 5 for the RF Set A models (Models 3 and 4) and Set B models (Models 7 and 8). Lin's concordance coefficients on training data are larger for the RF predictions than for the LR models (Table 1). However, upon cross-validation, the RF coefficients decrease and are comparable to the LR model values. This could be an indication of overfitted RFs, perhaps because the spatial correlation previously identified amongst the data (see LR results above) has not been accounted for in the RF models. The most important covariates in the RF models using Set A covariates (Models 3 and 4) are consistent for both BFI_LH and BFI_CEH and are, in descending order of importance, the fraction of catchments underlain by fractured aquifers (frac_high_perc), clay soils (clay_perc), extent of catchments underlain by rocks with essentially no groundwater (no_gw_perc), and crop and grass coverage (crop_perc, "grass_perc") ( Table 3 and Fig. 5).
The residuals from the RF models are moderately and negatively correlated with the surface water and groundwater abstraction covariates (Table 2). The groundwater abstraction covariate has high importance in both RF models using Set B covariates (Models 7 and 8; Table 3 and Fig. 5). The discharge covariate has moderate importance in the RF models, but the relative importance of the surface water abstraction covariate and of the covariates for the number of reservoirs and for their total capacity is low (Table 3 and Fig. 5).
In summary, RF models show that the majority of the power to explain variations in BFI is due to the natural covariates and when WRM covariates are included in the models, groundwater abstraction is the most important and effluent discharges of moderate importance in explaining both estimates of BFI.

Consistency between model structures
The results of the models are subject to standard caveats for such types of analysis. Inclusion of spatial correlation in the LR models was necessary and led to some otherwise significant covariates being removed, and the LR models were unable to represent non-linear relationships between the covariates. The RF models did not take into account spatial correlation identified in the LR analysis, and there was some evidence of overfitting of the RF models, but they are able to represent any non-linearities that are present between the covariates that could not be included in the LR models. Notwithstanding these observations, the two contrasting modelling approaches, one relatively simple and tractable (LR modelling) and the other considerably more flexible but potentially harder to interpret (RF modelling), have resulted in remarkably similar model structures, with high levels of consistency between both natural and WRM covariates being identified as either significant (LR models) or important (RF models). The structures of the LR and RF models (Fig. 5) are broadly insensitive to the BFI being modelled. Although this is reasonable given the correlation between BFI_LH and BFI_CEH (Fig. 3), this observation supports the inference that the models are robust. Importantly for the purposes of the present study, significant covariates in the LR models and covariates with relatively large importance in the RF models are consistent, regardless of whether the models are developed using BFI_LH or BFI_CEH (Fig. 5).
Figure 5. Signs and significance levels of the covariates in the LR models and the relative importance of covariates in the RF models. The signs of the significant covariates in the LR models are indicated using colour (pink for positive, blue for negative), and the corresponding significance levels of the covariates are indicated on the x axis with asterisks (* for significance level between 0.05 and 0.1, ** for significance level between 0.01 and 0.05, *** for significance level below 0.001). Some covariates were only significant prior to accounting for the spatial correlations. These are marked with asterisks only in the figure at their respective level of significance. Table A1 gives full details of the regression coefficients. Relative RF importance ranges from zero to 2. Table 3 gives the scores for the relative importance of covariates in the four RF models.
There is a high level of agreement between the two modelling approaches regarding the significance or importance of the natural covariates (Models 1 to 4). This is consistent with Bloomfield et al. (2009), where the percentage coverage of fractured aquifers in the Thames catchment in southern GB was found to be an important term in LR models of BFI. In the present study, in Models 1 to 4 the catchment fraction underlain by fractured aquifers is either a significant covariate or the covariate with the largest importance (Fig. 5), and the catchment fractions of clay soils and of rocks with essentially no groundwater, together with crop coverage, are all significant in the LR models or have large importance in the RF models (Fig. 5). The two other catchment covariates identified as significant in the LR models (topography and aridity) also have moderate importance in the RF models.
The same natural covariates that are identified as significant or of high importance in the LR and RF models in Set A (Models 1 to 4) are also significant or important in models using the Set B covariates (Models 5 to 8) (Fig. 5), and the majority of the variation in BFI is described by the natural covariates (Table A2). From these observations, it is taken that WRM practices, rather than being the principal explanatory factor of variance in BFI, act to modify BFI controlled primarily by natural catchment processes. There are also similarities in the significance or importance of WRM covariates between the Set B models (Models 5 to 8). In both cases groundwater abstraction is significant or important, effluent discharges are significant or of moderate importance, and both reservoir numbers and capacities are either not significant or are of low importance. There is, however, a notable dissimilarity between the model structures with regard to surface water abstraction: it is a significant covariate in the LR models (Fig. 5; see Models 5 and 6) but is not important in the RF models (Table 3 and Fig. 5; see Models 7 and 8).

Evidence for the impact of water resources management practices
The observations relating to the effect of WRM on BFI have been investigated further by considering the extent to which particular catchment context and management settings influence the respective model performance. Figure 6 shows that, particularly for a number of relatively high BFI catchments in central England and SE England to the north of London (Fig. 6a), the LR model of BFI_LH using only natural covariates appears to underestimate BFI. Similar observations can be made with respect to estimates of BFI_CEH (Fig. A2a), with the additional observation that there are a few catchments in eastern England where the model appears to overestimate BFI. Inclusion of WRM covariates leads to some improvements in LR model estimates of BFI, with the largest improvements being in the high BFI catchments (Figs. 7a and A3a). These improvements are particularly seen in the relatively high BFI catchments immediately to the north of London (Fig. 6b). Note, however, that addition of WRM covariates to the models does not appear to improve the estimates of BFI_CEH in the catchments in eastern England, where the model still appears to overestimate BFI (Fig. A2b).
To explore further which WRM covariates (groundwater abstraction, surface water abstraction, and effluent discharges) may be contributing to the improvement of the LR models, the distribution of differences between model estimates and observed BFI as a function of the magnitude of the three WRM covariates has been plotted for BFI_LH (Fig. 8) and for BFI_CEH (Fig. A4). Figure 8 shows that for LR models using natural covariates Set A (Model 1), underestimation of BFI is greater in catchments with higher levels of groundwater abstraction and, to a lesser extent, with higher effluent discharges, whereas there is no apparent systematic association between under- or overestimation of BFI_LH and levels of surface water abstraction. When the WRM covariates are included in the models (Set B, Model 5), estimates of BFI_LH are noticeably improved in catchments with high levels of groundwater abstraction and, to a lesser extent, in those with moderate to high effluent discharges. Similar patterns are seen for models of BFI_CEH (Fig. A4). From this it is inferred that most of the improvement in the LR model performance when WRM covariates are included in the models is due to the groundwater abstraction covariate and, to a lesser extent, to the discharge covariate. Inclusion of the surface water abstraction covariate appears to have a negligible influence on estimates of BFI using LR models.
Compared with the LR models, differences between estimates of BFI from the RF models and observed values of BFI_LH and BFI_CEH using Set A covariates (Models 3 and 4) are small, and there are no clear regional patterns in model performance across the study area (Figs. 6 and A2). Figure 8 shows that RF models of BFI_LH using Set A (Model 3) covariates underestimate BFI in catchments with the highest levels of groundwater abstraction, but there is no clear association between the performance of these models and levels of surface water abstraction or effluent discharges. Inclusion of WRM covariates in the RF model of BFI_LH (Set B, Model 7) does not appear to improve the model (Figs. 7 and A3) or change these relationships: BFI is still underestimated in catchments with the highest levels of groundwater abstraction, and there is still no clear association between model performance and levels of surface water abstraction or effluent discharges. Similar relationships also hold for the RF models of BFI_CEH (Fig. A4). There is no noticeable improvement in the performance of the RF models with the inclusion of WRM covariates.
Figure 6. Maps of difference between modelled and observed BFI_LH (a-d) and corresponding scatter plots of BFI_LH against fitted BFI (e-h) for covariate Sets A and B for LR and RF models (Models 1, 5, 3, and 7 respectively).
Figure 8. Comparison of observed and modelled BFI_LH for covariate Sets A and B, for LR and RF models and as a function of different human management categories.

Impacts of WRM practices on BFI
Both modelling approaches are broadly consistent in identifying the most influential WRM covariates, namely the importance of groundwater abstraction, the modest effect of effluent discharges to streams, and the unimportance of reservoirs in influencing BFI, while surface water abstraction was identified as significant in the LR model but unimportant in the RF model (Fig. 5). In addition, the LR models identified positive correlations between BFI and groundwater abstraction, surface water abstraction, and effluent discharges (Fig. 5), and the influence of groundwater abstraction on BFI increases with increased abstraction (Figs. 7 and 8). It is evident from previous studies (Wittenberg, 2003; Weber and Perry, 2006; Wang and Cai, 2009; Thomas et al., 2013) that there is no universal relationship between WRM practices and baseflow and that the influence of WRM practices on baseflow is sensitive to climate, the location of abstraction in a catchment, and the details of abstraction. In the context of the present study, the relationship between WRM practices and BFI is only partly explained in terms of the conceptual model in Fig. 4.
Assuming the principal uses for abstracted groundwater in the UK are for public supply (UK Government, 2020) where losses to evaporation are limited, abstracted groundwater from up-catchment sites should have a broadly neutral effect on baseflow. In contrast, groundwater abstracted from down-catchment sites or in the immediate vicinity of streams may be expected to reduce baseflow. However, neither of these simple conceptualizations of groundwater abstraction explains the positive correlation between groundwater abstraction and increased baseflow in the CAMELS-GB data (Figs. 5, 7, and 8). Water resources in England have been well-regulated within the context of the European Water Framework Directive and Daughter Directives (European Commission, 2000), and a wide range of sophisticated schemes and measures are used to manage low flow and drought, including conjunctive-use schemes, low-flow alleviation schemes, and hands-off flow measures (Clayton et al., 2008; Shepley et al., 2009; Agnew et al., 2000; Hutchinson et al., 2012; Wendt et al., 2020, 2021). Conjunctive-use schemes use combined management of groundwater and surface water abstractions to maintain ecological flows, while low-flow alleviation schemes and hands-off flow measures are used in England to constrain the amount of water that is abstracted from groundwater and rivers, with abstractions being reduced or stopped at a given low-flow trigger level. Unfortunately, the CAMELS-GB data do not capture the details of any of these schemes or measures, and the conceptualization of baseflow generation in Fig. 4 does not capture the temporally and spatially linked changes in flows associated with these schemes and measures. In addition, although the analysis presented here uses BFI data for the period 1970 to 2015, the schemes and measures have evolved significantly over this period and so are both temporally and spatially variable.
Consequently, although the cumulative, spatio-temporally varying effects of these schemes and measures may influence the relationship between WRM terms in the models, because there is no information on the dynamic management of water resources in the CAMELS-GB data in response to hydro-meteorological events (beyond the average terms used in the study; Table A1), the effects of the schemes and measures on BFI cannot be constrained by the present study. The positive correlation between effluent discharges and BFI is consistent with the conceptualization of baseflow generation in Fig. 4, while the lack of any significant or important correlation between the terms associated with reservoirs and BFI (Fig. 5) is consistent with the conceptualization of these as stores of water that do not contribute to baseflow (Fig. 4).

Impacts of climate and landscape characteristics on BFI
Both modelling approaches point to the same natural covariates (Models 1 to 4) contributing to the majority of variation in BFI (Fig. 5). These include a climate covariate (aridity), a number of catchment characteristics including topography (catchment mean drainage path slope, dpsbar), fractional area of highly productive fractured aquifer (frac_high_perc), non-aquifer (no_gw), and the clay fraction in soils (clay_perc), and a land cover characteristic (fractional area of crop cover, crop_perc). Qualitatively there is consistency between these covariates and similar covariates identified in previous studies. For example, Mazvimavi et al. (2005) also found slope to be a significant term in a regression model of BFI for 52 basins in Zimbabwe, and Addor et al. (2018) found slope to be an important covariate in an analysis of the CAMELS data for the USA. Note the observation in Table A1 that when topographic relief appears to be more important with respect to mean residence and transit times, catchment area appears less important. This is consistent with the results in both Fig. 5 and Addor et al. (2018). Beck et al. (2013) demonstrated that PET (a climate covariate related to aridity) was a significant covariate in a regression model of BFI based on 3394 global catchments, consistent with the results in Fig. 5. Bloomfield et al. (2009) previously identified the importance of the fractional area of high-productivity fractured aquifers and non-aquifers in controlling BFI in the Thames Basin, a basin within the current study area, again consistent with the results in Fig. 5. Similarly, Addor et al. (2018) and Huang et al. (2021) both found clay fraction in soils to be important in predicting BFI when ML techniques were applied to the CAMELS data for the USA. However, there are challenges in making direct comparisons between different models of BFI. Firstly, there is no commonly accepted approach to defining covariates used in such models. 
Although many of the climate and topographic catchment characteristics may have common definitions, other important or significant catchment factors, such as soil and aquifer characteristics, may be quantified quite differently between studies. The CAMELS family of hydrological large-sample datasets seeks to address the issue of consistency between hydrological datasets by attempting to publish hydrological data in standardized formats. However, even between the different national CAMELS datasets, there are differences in how (hydro)geological attributes are characterized (Addor et al., 2017; Alvarez-Garreton et al., 2018; Chagas et al., 2020; Coxon et al., 2020a, b). A second challenge when attempting to compare between studies of the natural controls on BFI is that studies typically investigate different combinations of covariates. Regardless of the modelling approach used, for example, stepwise multiple LR (e.g. Mazvimavi et al., 2005; Bloomfield et al., 2009; Zhang et al., 2013; Aboelnour et al., 2021) or ML models (Mazvimavi et al., 2005; Addor et al., 2018; Huang et al., 2021), the resulting significant or important covariates reflect the composition of the original pool of covariates under consideration.

Implications for future research
Several implications arise from this study. Although the dominant controls on baseflow across the study area are climate and catchment covariates, there is evidence that WRM practices, particularly groundwater abstraction, influence baseflow, but the manner in which they affect baseflow is inferred to be a function of the specific climate and catchment settings and WRM practices. Consequently, as this analysis and the CAMELS-GB data reflect the dominant WRM practices for GB, it is recommended that the present study should be extended to test additional WRM attributes and the applicability of the findings in other settings and WRM regimes. For example, CAMELS-GB does not explicitly include information about WRM practices associated with hydropower schemes or seasonal changes in abstraction (e.g. for irrigation), so the effect of such WRM practices on BFI has not been assessed. In addition, CAMELS-GB does not include any information on within- and between-catchment water transfers (note the absence of these WRM terms from the conceptual model; Fig. 4). The approach to assessing the effect of WRM practices on BFI could also be applied and tested for relevance in other climate settings such as semi-arid environments (Mwakalila et al., 2002) or where snowmelt is an important component of baseflow generation (Miller et al., 2014; Barnhart et al., 2016; Huang et al., 2021) once systematic information on WRM practices is available in those settings.
More broadly, it is important to make data related to WRM practices much more widely available and for those data to be included in future large-catchment datasets. It is already challenging to develop common approaches to characterize some important catchment covariates related to soils and (hydro)geology for inclusion in large-catchment datasets. It is likely to be even more difficult to provide a consistent approach to capturing WRM practice data. However, a starting point would be to systematically conceptualize the major WRM practices across a wide range of regulatory (unregulated to highly regulated), catchment, and climatic settings that may influence baseflow and other hydrological signatures (McMillan, 2021) in order to establish broad classes of WRM practices against which data can be reported.
Finally, there is an active debate on the comparative merits of process-based hydrological modelling and ML in hydrological forecasting. Specifically, questions have been asked related to the extent to which hydrological processes and our understanding of the uniqueness of place, as encapsulated in our conceptual models of the terrestrial water cycle, have a role in hydrological prediction in the "age of machine learning" (Bevan, 2020; Nearing et al., 2020). For example, in a recent comparative study of the predictive accuracy of ML and LR models of flooding events in Germany, Schmidt et al. (2020) demonstrated that although ML methods had higher predictive accuracy than the LR models, they were still shown to be susceptible to the problem of equifinality and that this severely restricted their potential for inference. Schmidt et al. (2020) concluded with the observation that multiple algorithms and multiple methods should ideally be employed within a framework of model cross-validation when using ML for inference. Although the purpose of the present modelling was not to develop models capable of predicting BFI, it is interesting to note that there have been clear benefits in applying both simple statistical models (LR models) and more flexible ML approaches (RF models) to the same parameter space to explore common model structures and covariates of interest, and the results have provided evidence to extend current process understanding of baseflow beyond individual LR (Bloomfield et al., 2009; Carlier et al., 2018; Zhang et al., 2020) and RF (Mazvimavi et al., 2005; Addor et al., 2018; Huang et al., 2021) studies. Now that the correlations between WRM covariates and BFI have been identified, future predictive models of BFI that take account of WRM practices could be developed using a refinement of the conceptual model (Fig. 4) to constrain a combination of multiple targeted statistical (LR) and multiple knowledge-guided ML models (Shen et al., 2021) deployed with appropriate cross-validation schemes.

Conclusions
Variation in BFI is predominantly explained by natural (climatic and catchment) characteristics, with the most important being the extent of high-productivity fractured aquifers within catchments. This latter observation is consistent with previous analyses of BFI within the study area. Although not the major control on variation in BFI, there is evidence that WRM practices systematically modify BFI in the study area.
Groundwater abstraction is the most influential of these practices, with a positive correlation between abstraction and baseflow, and this is consistent with the observation that the effect of groundwater abstraction on BFI is most evident in groundwater-dominated catchments where there are the highest levels of abstraction. However, a variety of schemes and measures are used to manage water resources in the UK, and systematic information on such schemes is currently lacking in the CAMELS-GB large sample dataset, so their specific effects on BFI cannot be constrained by the current study. Information regarding WRM practices, their temporally and spatially linked associations, and changes in flows associated with these schemes and measures should be incorporated in future conceptual models of BFI.
Large-sample datasets are increasingly being used to understand and predict the functioning of hydrological systems at scales above the individual catchment . Given the importance of understanding the effects of WRM practices on baseflow and a range of other hydrological signatures, there is a need to incorporate information about such practices in large-sample datasets. If such datasets are to be comparable, there is also the need to systematize how WRM practices, in all their diversity, are described and recorded.   (Morris and Flavin, 1990). variability in low flows . However, it less important with respect to mean residence and transit times, where topographic relief appears to be more important (McGlynn et al., 2003;Asano and Uchida, 2012;Muñoz-Villers et al., 2016).
dpsbar: Catchment mean drainage path slope (m km−1). Mean drainage path slope (Bayliss, 1999) is an index of catchment steepness and is estimated as the mean of all inter-nodal slopes from UKCEH's Integrated Hydrological DTM for a given catchment (Morris and Flavin, 1990).
Climate indices
aridity: Aridity (-). Aridity in CAMELS-GB, as with the other CAMELS datasets, is calculated as the ratio of mean daily potential evapotranspiration to mean daily precipitation (Addor et al., 2017; Coxon et al., 2020b). In the present study it has been reformulated as usually estimated (Joint Research Centre, 2019). The primary input to the catchment water balance and hence to baseflow generation is precipitation minus evapotranspiration (Price, 2011, Fig. 1).
frac_snow: Fraction of precipitation falling as snow (for days colder than 0 °C) was estimated by Coxon et al. (2020b). Barnhart et al. (2016) demonstrated a strong correlation between snowmelt rate and baseflow efficiency for catchments from western USA.
Hydrogeology
inter_high_perc: Percentage of catchment designated as [...] As Price (2011) [...] including two high-productivity and two low-productivity classes, on BFI. [...] (Hiederer, 2013).
[...] influence of catchment attributes on a variety of hydraulic signatures including BFI_LH. Soil clay fraction was the most negatively correlated attribute with BFI_LH (Addor et al., 2018, Fig. 4).
Water resource management
surfacewater_abs: Mean surface water abstraction (mm d−1). Mean surface water and groundwater abstraction and discharge data were estimated by Coxon et al. (2020a) based on monthly actual abstractions and returns for the period January 1999-December 2014. Wittenberg (2003), Wang and Cai (2009), Weber and Perry (2006), and Thomas et al. (2013) have all previously identified changes in features of baseflow in catchments subject to groundwater abstraction or due to return flows.
groundwater_abs: Mean groundwater abstraction (mm d−1).
discharges: Mean discharges (mm d−1). Discharge data consist of daily discharges into water courses from water companies and other discharge permit holders who reported to the Environment Agency from 1 January 2005 to 31 December 2015.
num_reservoirs: Number of reservoirs in the catchment (-). Reservoir attributes were taken from an open-source UK reservoir inventory (Durant and Counsell, 2018).
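The two climate covariates described above, aridity and frac_snow, are simple ratios of daily series, and a minimal sketch may help make the definitions concrete. It follows the CAMELS-style wording in the covariate descriptions (mean daily potential evapotranspiration over mean daily precipitation, and the share of precipitation falling on days colder than 0 °C). It does not reproduce the reformulated aridity index used in the present study. The function names and the five-day input series are hypothetical, chosen only for illustration.

```python
def aridity_camels(pet_mm_d, precip_mm_d):
    """CAMELS-style aridity: mean daily PET divided by mean daily precipitation.

    Both inputs are daily series in mm d-1; the result is dimensionless.
    """
    mean_pet = sum(pet_mm_d) / len(pet_mm_d)
    mean_precip = sum(precip_mm_d) / len(precip_mm_d)
    if mean_precip == 0:
        raise ValueError("mean precipitation is zero; aridity is undefined")
    return mean_pet / mean_precip


def frac_snow(precip_mm_d, temp_c, threshold_c=0.0):
    """Fraction of total precipitation falling on days colder than threshold_c."""
    if len(precip_mm_d) != len(temp_c):
        raise ValueError("precipitation and temperature series differ in length")
    total = sum(precip_mm_d)
    if total == 0:
        raise ValueError("total precipitation is zero; fraction is undefined")
    cold_day_precip = sum(p for p, t in zip(precip_mm_d, temp_c) if t < threshold_c)
    return cold_day_precip / total


# Hypothetical five-day series (illustrative values, not CAMELS-GB data).
precip = [2.0, 0.0, 4.0, 1.0, 3.0]   # mm d-1
pet = [1.0, 1.5, 0.5, 1.0, 1.0]      # mm d-1
temp = [-1.0, 3.0, 2.0, -0.5, 5.0]   # degrees C

print(aridity_camels(pet, precip))   # mean PET 1.0 / mean P 2.0
print(frac_snow(precip, temp))       # (2.0 + 1.0) / 10.0
```

With these inputs the aridity ratio is 0.5 and the snow fraction is 0.3. Under this formulation, values of the ratio above 1 indicate that potential evapotranspiration exceeds precipitation on average.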
Author contributions. JPB designed the study and undertook the literature review. MG and BPM performed the modelling, and all authors contributed to the analysis of the results. JPB prepared the manuscript, and GC prepared the figures (except Figs. 4 and 5, which were produced by JPB). GC, MG, BPM, and NA reviewed and edited the manuscript.
Competing interests. The contact author has declared that neither they nor their co-authors have any competing interests.