Hillslope characteristics as controls of subsurface flow variability

Introduction Conclusions References


Introduction
Hydrological dynamics of hillslopes, particularly shallow subsurface flow (SSF), are highly complex and variable in space and time (Bachmair et al., 2012).Nonetheless, a solid understanding of hydrological dynamics and soilvegetation-atmosphere interactions at the hillslope scale is necessary for predictions of ungauged hillslopes and catchments dominated by subsurface flow.In recent years a common theme has evolved focusing on the drivers or organizing principles of spatio-temporal variability of hydrologic fluxes at a range of spatial scales (McDonnell et al., 2007;Wagener et al., 2007;Sivapalan, 2005).At the hillslope scale, a number of studies addressed the issue of systematically investigating dominant controls of subsurface flow generation and their interrelations (Hopp andMcDonnell, 2009, 2011;Coenders-Gerrits et al., 2012;Keim et al., 2006;Tromp-van Meerveld and McDonnell, 2006b).
The dominant controls of hillslope response to rainfall or snowmelt can be classified as static factors, at least in the sense of shorter time scales such as years (parent material, surface and bedrock topography), and dynamic factors (e.g.soil moisture, vegetation).Surface and bedrock topography clearly exert key controls, since they directly govern flow processes and also the evolution of dynamic factors such as soil properties, soil moisture, microclimate, and thus vegetation patterns.A number of studies have assessed the effect of surface and bedrock terrain attributes on runoff generation (e.g.Freer et al., 2002;Troch et al., 2003;Aryal et al., 2002;Bogaart and Troch, 2006;Fujimoto et al., 2008;Berne et al., 2005).Bedrock microtopography (0.1-10 m) was found to create a threshold-like SSF response (bedrock depressions have to be filled up for hillslope response, "fill-and-spillhypothesis"; Tromp-van Meerveld and McDonnell, 2006a;Tromp-van Meerveld and Weiler, 2008).Bedrock permeability governs bedrock infiltration and exfiltration and thus influences SSF at the event scale and the long-term water balance of the hillslope and catchment (Tromp-van Meerveld et al., 2007;Kosugi et al., 2006;Graham et al., 2010).
The influence of vegetation on hydrological dynamics is mostly assessed either by field experiments at the plot scale focusing on tree/stand structure and their effects on precipitation redistribution, or at the hillslope scale via virtual experiments simulating throughfall/stemflow variability and their effects on soil moisture and SSF (e.g.Keim et al., 2006;Coenders-Gerrits et al., 2012;Hopp and McDonnell, 2011).Virtual experiments are numerical experiments with a model driven by collective field intelligence (Weiler and McDonnell, 2004).Hillslope-scale field experiments investigating vegetation effects are rare.Exceptions are recent sprinkler studies by Nordmann et al. (2009) and Jost et al. (2012), who showed the effect of different tree species on SSF generation and various interactions of vegetation with static factors, e.g.how roots channel water depending on subsurface conditions.
Numerous plot-scale experiments showed the prominent effect of vegetation on precipitation redistribution and smallscale variability of hydrologic fluxes.For instance, Liang et al. (2007) measured a rapid increase in water content and pore water pressure at the downslope side of a tree trunk growing on a steep hillslope, which they attributed to locally concentrated stemflow input.The effect of funnelled rainwater input close to stems was also observed in other studies (e.g.Chang and Matzner, 2000;Staelens et al., 2008;Herwitz, 1986).Moreover, throughfall patterns were found to create distinct patterns of wet and dry spots that persist over time (e.g.Guswa and Spence, 2011;Zimmermann et al., 2009;Keim et al., 2005;Staelens et al., 2006;Gerrits et al., 2010).Whether such small-scale vegetation effects influence SSF at the hillslope scale yet needs to be assessed in the field.Virtual experiments provide valuable insights, yet they always depend on the choice of the perceptual and numerical model.
To highlight the effect of different hillslope controls on SSF generation, we monitored shallow water table dynamics of three adjacent, relatively large-scale (33 × 75 m) hillslopes, which are side-slopes of a small v-shaped, zero-order catchment.The hillslopes are similar in terms of topography and parent material, but differ in vegetation cover (grassland, coniferous forest, and mixed forest).The aim of this study is to explore the relationship between variability of SSF, monitored as shallow water table dynamics, and various hillslope characteristics, representing both static and dynamic controls.The following research questions will be explored: 1. Can the spatial variability of SSF dynamics be explained by measurable hillslope characteristics?In other words, is there a direct link between patterns of water table fluctuations and hillslope characteristics?
A number of studies have shown the influence of vegetation on soil moisture patterns and runoff generation processes, e.g. through spatially variable rainfall input due to throughfall and stemflow, transpiration, and different root systems.We thus hypothesize that vegetation is an important driver of spatially variable SSF dynamics at our study site, especially given the small differences in topography (predominantly planar hillslopes).
2. Are there differences in explainability of water

Site description
To answer our research questions, shallow water table dynamics of three adjacent hillslopes were monitored with high spatial and temporal resolution (see Fig. 1).represent NW-facing side slopes of a small v-shaped, zeroorder catchment at the foot of the Black Forest in southwestern Germany (location of catchment outlet: 47.9570 • N, 7.8378 • E; catchment area: 0.21 km 2 ; catchment elevation range: 340-585 m a.s.l.).Due to their spatial vicinity, the hillslopes have similar planar topography, parent material, aspect, and steep slope (Fig. 1).Geology is crystalline bedrock overlain by periglacial drift cover (Geologisches Landesamt Baden-Württemberg, 1996).Cambisols have developed in the periglacial drift cover; soil texture is sandy loam (WaBoA, 2007).The main difference between hillslopes is vegetation cover (grassland, coniferous forest, and mixed forest).Due to the periglacial drift cover, there is no clear soilbedrock interface at the study site.Periglacial drift cover evolved from a combination of solifluction, cryoturbation and aeolian processes (Arno et al., 1998;Völkel et al., 2001).Stratigraphically, periglacial drift cover is comprised of three lithologic units: basal layer, intermediate layer, and upper layer.The basal layer is compacted; it shows high bulk density and slope-parallel alignment of clasts.In the intermediate layer the coarse fraction displays finer-sized clasts of varying orientation.The upper layer is the finest textured lithologic unit and has lower bulk density (Arno et al., 1998;Völkel et al., 2001).Generally, lateral subsurface flow in periglacial drift cover was observed to occur at the interface of the basal layer and intermediate layer, but also at the interface of the intermediate and upper layer (Arno et al., 1998;Chifflard et al., 2008;Nordmann et al., 2009).
The climate is warm temperate ("Cfb" Koeppen classification).Mean annual precipitation and temperature are 970 mm and 11 • C, respectively (time period 2007-2011; data from nearby WBI weather station by state-run viniculture institute Freiburg).Mean monthly rainfall and evaporation is highest in the summer months May until July, when convective storms dominate (Morhard, unpublished data).See also Fig. 2 for an overview on rainfall characteristics for different seasons.The small creek at the foot of the hillslopes carries water at all times, even under dry summer conditions (baseflow then smaller than 1 l s −1 ).
At the mixed forest hillslope, a mix of European beech (Fagus sylvatica) and fir (Abies alba) dominates, with some sycamore maple (Acer pseudoplatanus), ash (Fraxinus excelsior), and spruce (Picea abies) in between (see Fig. 1).There is little understory vegetation and no deadwood on the forest floor, which is covered by a thick layer of beech leaves.At the coniferous hillslope, primarily spruce and fir is found, interspersed with some deciduous trees at the lower part of the hillslope (sycamore maple and ash).The surface is covered by a needle layer, lots of deadwood, and understory vegetation.Tree age at both forested hillslopes is between 70 and 100 yr.The grassland hillslope, which has been under this vegetation cover for 200-300 yr, is used for sporadic sheep grazing in the summer.

Field methods
To monitor the internal hillslope response to rainfall or snowmelt with high spatial and temporal resolution, we installed 90 wells distributed over the three hillslopes.At each hillslope 30 wells were drilled; organized in three transects of 10 wells each; and laid out perpendicular to the slope gradient along the contour in the lower, middle, and upper part of the hillslope (see Fig. 1).The distance between wells per transect was 3 m.The distance between transects was 30 m, except for the upper transect of the mixed forest hillslope, which was only spaced 15 m apart due to an old forest road cutting the hillslope.The wells were drilled with a hand-held, gasoline-powered breaker (Cobra Standard).The depth of each well depended on below-ground conditions.We aimed to drill to a maximum depth of 2 m.However, many wells were shallower due to resistance in the periglacial drift cover or the bedrock.The majority of wells ended in dense layers of periglacial drift cover, since the actual bedrock was mostly located far beneath the drift cover.A PVC pipe (4 cm diameter) perforated over the entire length was placed into each well.The perforated PVC pipe had been wrapped in geotextile prior to field installation to prevent fine material transport into the pipe.The geotextile also covered the lower end of the PVC pipe.Since the PVC pipe was not closed with a cap, vertical drainage was possible.After inserting the PVC pipe into the well, bentonite clay pellets were pressed around the PVC casing at the soil surface to seal the well against preferential flow along the pipe.
Each well was equipped with an Odyssey Capacitance Water Level Recorder (Data Flow Systems, New Zealand).The measuring interval was set to 2 min but the loggers ran in compressed logging mode (recording only if water level change > ±5 mm).After downloading the data roughly every 3 months, the capacitance probes were pulled out, cleaned, and the water table height manually measured with an electronic contact gauge to validate data.Manufacturer's specifications of probe accuracy are 5 mm.However, the validation yielded unsatisfactory results for wells with low water table (water table within the lower 13 cm of a well).Since there was no continuous bias for wells with low water table, water table dynamics in the lower 13 cm of each well were omitted from the analysis to exclude uncertain data (this included the length of the 7.5 cm brass counter weight at the lower end of the Odyssey probe).
To monitor rainfall input and further meteorological variables, a Davis weather station Vantage Pro 2 was installed (10 min measurement interval).To gain insight into the spatial variability of rainfall and throughfall, 66 rainfall totalizators were positioned (one totalizator per well at both forested hillslopes, two totalizators per transect at the grassland hillslope).The totalizators were mounted to the upper end of the well casings.
For determining soil hydraulic properties, one slug test per well was carried out in December 2010.Prior to slug injection, the Odyssey capacitance probe was pulled out of the well and replaced by a pressure transducer (Diver Groundwater Datalogger, Schlumberger Water Services) due to higher accuracy and smaller logging interval of the pressure transducer (1 s interval).For a later correction of hydrostatic head with barometric head, a Baro-Diver (Schlumberger Water Services) was placed next to the well at the soil surface.A small-diameter PVC tube with attached packer was inserted into the well down to a depth of 50 cm above well bottom, where the packer was inflated.The slug tests thus only represent soil properties of the lower 50 cm of the well, which represents the zone of perched water table development for most wells.Water was injected into the PVC tube via a funnel until the tube was completely filled up.There were some cases where it was not possible to fill up the PVC tube because of very rapid drainage.
At both forested hillslopes the location of each tree, type of tree, and stem diameter at breast height were mapped.A grid out of barrier tape had been laid out on the hillslopes prior to mapping (3 m × 7.5 m grid size).The mapped area covers 36 m × 75 m at the coniferous forest hillslope and 36 m × 60 m at the mixed forest hillslope (3 m buffer at both sides of the horizontal transects, and 7.5 m buffer up-and downslope of the upper and lower transect, respectively).In addition, hemispherical pictures of canopy coverage were taken above each well with a digital camera and a fisheye lens.

Determination of water table response variables
Water table response data is available for the time period April 2010 until September 2011 (18 months).Different water table response variables were determined per well for three different temporal scales: -Entire time series (18 months).
-Event scale: five selected events covering a range of total rainfall, mean and maximum rainfall intensity, and antecedent wetness conditions (see Table 2).
For each temporal scale the following response variables were calculated: -WT NORM : mean water table normalized to well depth.
- -Seasonal scale: if there were small data gaps we used the remaining time period for calculating well response variables.For comparability, all response variables were normalized to the length of the time series.Wells with larger data gaps were treated as wells with missing values during this season.
-Event scale: wells with missing data were omitted.
-Entire time series: response variables were calculated from available data and normalized to the length of the time series.We tested the representativity of variables calculated from time series with partially missing data by applying the observed data gaps to time series with complete data.Since the representativity was reasonable to high, each well response variable is considered representative for the entire time series despite missing values.

Determination of hillslope characteristics
The following hillslope characteristics per well were determined in the field or with GIS analysis: upslope contributing area (UA), slope (SLOPE), plan curvature (PLAN CURV), profile curvature (PROF CURV), aspect (ASP), well depth (WDEPTH), hydraulic conductivity from slug tests according to HVORSLEV method (K), slope of the fast part of the recession after slug injection (SLUG HIGH), slope of the slow part of the recession after slug injection (SLUG LOW), number of trees in vicinity (N TREES), stemflow index (STEMF), percent throughfall (THROUGHF), and canopy coverage (CANOPY COV).In addition, we created two categorical variables representing land use classification (LULC: 1 = forest, 2 = grassland), and well location on hillslope (TRANSECT: 1 = lower transect well, 2 = middle transect well, 3 = upper transect well).Topography variables (upslope contributing area, slope, plan curvature, profile curvature, and aspect) were determined with a 1 m LiDAR DEM.All calculations were carried out with SAGA GIS: preprocessing of the DEM (sink removal, smoothing with simple filter; 3 m radius), standard terrain analysis (slope, aspect, plan curvature, profile curvature), and upslope contributing area (interactive module, multiple flow direction).The exact location of wells was determined by surveying with a laser theodolite.
Soil properties were either directly measured in the field (WDEPTH) or calculated from slug tests (K, SLUG HIGH, and SLUG LOW).Slug test data were preprocessed by correcting hydrostatic head with barometric head.Water table height prior to slug injection was determined and subtracted from measured water table height.Many wells, however, were dry prior to slug injection.The time series were then manually trimmed so that the start represents the highest water table followed by a constant drop (e.g. if there was fluctuation around a plateau, the last value prior to a constant decline was defined as start value).Determining hydraulic conductivity from the preprocessed time series was challenging since common methods, e.g.Bouwer and Rice (1976), Hvorslev (1951), assume a confined or unconfined aquifer.As already stated, unsaturated conditions were found in many wells.We chose to analyze the data according to the Hvorslev method (Hvorslev, 1951).We emphasize that the calculated values should not be regarded as physically true saturated hydraulic conductivity.We solely use K-values as a tool for intercomparing well behaviour.
Due to the limited applicability of common methods for determining hydraulic conductivity in our case, we calculated two additional variables from the slug test curves: SLUG HIGH and SLUG LOW, which represent the mean slope of the fast and slow part of the recession after slug injection, respectively.The calculations were carried out with IDL.First, the time series was smoothed (boxcar average) and the slope of the recession curve was calculated.A threshold of 0.5 cm s −1 was used to separate the fast part of the recession from the slow part.This threshold fitted best visual observations and resulted in two parts of the time series with distinct slopes.The mean slope per recession part was calculated after a buffer had been applied to cut data around the breakpoint.For wells that only showed a fast part of recession (fast decline to 0 cm, hence no slow part of recession), SLUG LOW was set to an arbitrary value of 10 cm s −1 indicating a large hydraulic conductivity.For wells with extremely rapid drainage, where the lower 50 cm of the well could not be filled up and SLUG HIGH could not be determined, SLUG HIGH was set to an arbitrary value of 150 cm s −1 indicating a large hydraulic conductivity.
The amount of trees in the vicinity of the wells and the stemflow index were derived from tree mapping and GIS analysis.The variable N TREES is the number of trees located within a 3 m radius of a well.Since we did not measure stemflow, we calculated a stemflow index based on amount of upslope trees, distance from well, type of tree, and DBH; first, all trees (i) within 3 m distance and within the upslope quadrant or (ii) within 1m distance and within the upslope semi-circle were identified.Trees meeting neither one of these conditions (e.g.located downslope or > 1 m to the side of a well) were not considered as stemflow contributing.Next, the stemflow index per tree was calculated by adding up normalized tree distance (normalized to maximum distance of 3 m), normalized DBH (normalized to maximum DBH), and a factor representing type of tree (rough-barked spruce/fir: 0.5, smooth-barked beech/ash tree: 1).These factors were set according to literature values of stemflow for different forest types (Levia et al., 2011).The maximum stemflow index per tree is 3; the overall stemflow index is the sum of stemflow indices for all trees contributing stemflow to one well.Self-evidently the derived stemflow index is based on subjective decisions and does not represent actual stemflow volumes.Nevertheless, we assume this index suitably characterizes the proneness of a well to receive stemflow.
The variable THROUGHF represents the mean percentage of maximum throughfall/rainfall of eight events per well during the leaf season in 2010/2011.Maximum throughfall was usually higher than open area rainfall, which results in values smaller than 100 % at the grassland hillslope.The eight events cover a range of rainfall characteristics (total rainfall, mean and maximum rainfall intensity).Standard deviation of mean throughfall is < 15 % for 82 % of the wells; the rest of the wells have a standard deviation between 15 % and 28.5 %.Canopy coverage was derived from digital fisheye photographs using the Gap Light Analyzer (GLA) imaging software (Frazer et al., 1999).We determined the percentage of canopy coverage per well for three different zenith angles (4.5 • , 9 • , and 18 • ).Since the canopy coverage for different zenith angles was highly correlated, we only considered the canopy coverage for 9 • zenith angle for further analysis (4.5 • /9 • : r = 0.96, 9 • /18 • : r = 0.82, 4.5 • /18 • : r = 0.93).

Correlation analysis
Since the response variables and some of the predictor variables are not normally distributed, we applied nonparametric statistical analyses.Spearman rank correlation coefficients were computed between different predictor variables.To assess the influence of different predictor variables on the water table response, rank partial correlation coefficients (Spearman) were calculated.Partial correlation is the correlation between a predictor variable and a response variable when the effects of all other predictor variables involved are removed.The hypothesis of no partial correlation was tested against the alternative that the partial correlation is significantly different from zero.Rank correltion, rank partial correlation, and statistical tests were carried out with MAT-LAB.

Random forests
In addition to correlation analysis, a random forest (RF) approach was chosen for revealing the effect of different hillslope characteristics on the observed water table response variables.A random forest represents a machine learning algorithm, where a large number of classification or regression trees (CARTs) are grown on a bootstrapped subsample of the data.The method developed by Breiman (2001) is a non-parametric multivariate technique.Generally, classification or regression trees explain the variation of a response variable by recursively splitting the data into more homogeneous groups (nodes) based on combinations of explanatory variables (Breiman et al., 1984;De'Ath, 2002;Strobl et al., 2009).The advantage of CART is that no assumptions are made about the distribution of the data and both continuous and/or categorical variables can be used.In addition, CART can handle interactions and nonlinearities among variables.A disadvantage of CART is tree instability, since small changes in input data can produce highly divergent trees (Prasad et al., 2006).Thus, an ensemble of trees, a "random forest", was shown to yield better predictions than a single tree (Strobl et al., 2008).
The random forest algorithm is described in detail by Breiman (2001).For our analysis we used the "randomForest" package by Liaw and Wiener (2002) implemented in the R environment, where also details about the program can be found.The steps during the RF analysis are (1) n tree bootstrap samples are drawn from the original dataset, each sample containing about two-thirds of the data (n tree is a userspecified parameter); (2) for each bootstrap sample, an unpruned tree is grown.In regular CARTs the data is split using the best split among all predictor variables.For random forest construction a subset of predictor variables is randomly chosen (amount of randomly selected predictor variables (m try ) specified by user).The best split among those variables is determined.New data are predicted by aggregating the predictions of all trees (i.e., majority votes for classification, average for regression) (Liaw and Wiener, 2002) .(3) At each bootstrap iteration, the data not included in building the tree model ("out-of-bag" data after Breiman, 2001) are predicted using the tree grown with the bootstrap sample.The "out-ofbag" predictions are aggregated and the error rate calculated.(4) The variable importance is determined for each predictor by assessing how much the prediction error increases when the "out-of-bag" data for that predictor are permuted while all others are left unchanged (Liaw and Wiener, 2002).
The outcomes of the random forest regression analysis that we use for interpretation are model performance (percentage of explained variance) and the variable importance measure % IncMSE (percent increase in mean square error) (Liaw and Wiener, 2002).The variable importance measure indicates how much worse the prediction would be if the data for that predictor were permuted randomly (Prasad et al., 2006).Large values thus represent an important variable.Strobl et al. (2009) state that "the absolute values of the importance scores should not be interpreted or compared over different studies".Instead, variable importance should be regarded as relative ranking (Strobl et al., 2009).
We set n tree to 3000, since this resulted in a rather stable prediction error (default = 500); m try was left at the default value (5).According to Díaz-Uriarte and De Andres ( 2006), changes in these parameters have in most cases negligible effects, suggesting the default values are a good option.Due to the random nature of the RF method, the model results slightly vary from run to run (e.g. standard deviation of explained variance 0.3 %-0.9 % for the response variable WT NORM at the seasonal scale).Some authors average variable importance measures over several runs, e.g.Grimm et al. (2008) or Loos and Elsenbeer (2011).We conducted several runs per RF and chose the variable importance for the run with the highest percentage of explained variance.We found that the top-scoring variables were unaffected; variables with similar variable importance values appeared in slightly different order per run.We thus only give weight to the highest scoring variables and interpret the rank of variables rather as a range of appearance (e.g.top-scoring, medium and lower range of variable importance).

Input variables
The water table response variables WT NORM and INDEX ACTI show high spatial variability (see Fig. 3 for WT NORM during the entire time and at the seasonal scale, INDEX ACTI and event scale not shown).Notable is withintransect variability, especially at the event scale, where adjacent wells exhibit contrasting water table dynamics (dry well/weak response vs. strong response within 3 m distance).There is also variation in spatial patterns over time, which is more pronounced at the event scale than at the seasonal scale.Moreover, there are differences among hillslopes: the mean of WT NORM and the mean of INDEX ACTI per hillslope ranks according to coniferous forest larger mixed forest larger grassland.The predictor variables also show high spatial variability (see the Supplement).Correlation between predictors is mostly low (Table 3).Exceptions are a moderate to high correlation between different pairs of vegetation predictors (NTREES, TROUGHF, CANOPY COV, STEMF, and LULC), and a moderate correlation between the categorical variable transect and some topography variables (UA, SLOPE, PROF CURV).

Partial correlation
The partial correlation coefficients depict the relationship between a predictor variable and a response variable when the effects of all other predictor variables are removed.As presented in Table 4, about half of the predictors show no significant correlation with the response variable WT NORM .For the predictor variables that are correlated with WT NORM , the strength of the relationship is generally low to moderate.Influential predictors are some topographic variables (slope, transect, aspect in some cases), soil properties (SLUG LOW, well depth in some cases, K for very few cases), and LULC.The spatially variable vegetation predictors throughfall, canopy coverage, stemflow index and amount of trees are not correlated with the water table response WT NORM , except for few cases with very low partial correlation (N TREES, CANOPY COV: r = −0.2 for few temporal scales).For the response variable INDEX ACTI the same tendency is found.
Table 4 also clearly reveals distinct differences among temporal scales.First, the correlation between different predictors and water table dynamics at the event scale tends to be weaker and is more often insignificant than at the seasonal scale and for the entire period.Second, there seems to be a pattern between strength of relationship and season: during summer seasons and summer events correlation coefficients are often slightly lower than for fall/winter seasons and late spring/winter events (e.g. 3 2010/3 2011 vs. 4 2010/1 2011, event on 12 July 2010/17 June 2011 vs. 12 May 2010/15 November 2010).To assess whether meteorological conditions potentially govern the observed differences in strength of partial correlation among seasons, we calculated linear correlation coefficients between partial r of each influential predictor and rainfall characteristics and antecedent wetness (see Table 5).We found a significant medium to high anti-correlation between mean/maximum rainfall intensity and strength of partial correlation for most predictors.The correlation with antecedent wetness is also medium to high for most predictors, yet only significant for the predictor LULC (small n due to event scale only).This indicates that the mapped predictors explain the observed water table response for time periods with high rainfall intensity and low AWI (antecedent wetness index) to a smaller degree.
When separately looking at both forested hillslopes vs. the grassland hillslope, distinct differences emerge (not shown; same trend for WT NORM and INDEX ACTI ).For the forest hillslopes, influential predictors and strength of partial correlation stays more or less the same as for the base case (=considering all 90 wells) (well depth, SLUG LOW, and K  slightly higher partial correlation coefficients; THROUGHF, CANOPY COV, STEMF, N TREES also higher values but still insignificant).Opposed to that, the grassland hillslope distinctly differs from the base case: there are hardly any significant partial correlations except for slope (similar to base case) and profile curvature (insignificant for base case).

Model performance
Model performance of different random forest models was low to moderate with an explained variance ranging from negative values to a maximum of 50 % (see Fig. 4 top graph).
The response variable INDEX ACTI exceeds WT NORM in most cases, except at the event scale.
Remarkable are the strong differences among temporal scales: the explained variance of the random forest models predicting WT NORM and INDEX ACTI lies above 20 % considering the entire time and most seasons.At the event scale, in contrast, the explained variance is always below 20 % (for most cases below 10 %) except for the event on 12 May 2010.A closer look at the differences among model performance reveals that models predicting the water table response for summer seasons and summer events tend to have a lower percentage of explained variance, e.g. 2 2010 (April-June) and 3 2010 (July-September) vs. 4 2010 (October-December) and 1 2011 (January-March), and the events on 12 July, 12 September, and 17 June (summer) vs. 12 May and 15 November (spring/winter).Similar to the results of the partial correlation analysis, we found a moderate negative correlation between RF model performance and mean rainfall intensity for WT NORM and INDEX ACTI (see Table 6).As for the partial correlation analysis, antecedent wetness is positively correlated with RF model performance at the event scale; yet, the correlation is not statistically significant.
Table 5. Correlation coefficients highlighting the effect of rainfall characteristics and antecedent wetness on the strength of partial correlation coefficients.Displayed are linear correlation coefficients between partial r of the response variable W TNORM and different predictors and different rainfall characteristics and antecedent wetness.For rank partial correlation coefficients see Table 4; for rainfall characteristics and antecedent wetness per temporal scale see Fig. 2 and Table 2.Only predictors showing a significant partial r for most temporal scales were chosen for correlation (SLOPE, SLUG LOW, WDEPTH, LULC, TRANS).Correlation with total rainfall n = 12 (all temporal scales), correlation with rainfall intensity n = 11 (seasonal and event scale), correlation with AWI n = 5 (event scale only).Regarding the differences among land use type, the RF models predicting the water table response WT NORM for the grassland hillslope (n = 30) show a distinctly lower percentage of explained variance than for the forested hillslopes (see Fig. 4 bottom graph).The RF models of the forested hillslopes (n = 60) perform slightly lower than when considering all wells (n = 90).The models for the response variable INDEX ACTI are comparable, yet there is more variability (few cases with model performance grassland hillslope > forested hillslopes and forested hillslopes > considering all wells).Since most models at the grassland hillslope perform very poorly overall (< 10 % explained variance), the chosen predictors cannot adequately explain the water table response at the grassland hillslope.

Variable importance
Predictor variables can be considered informative and important for a RF model if their variable importance value is above the absolute value of the lowest negative-scoring variable (Strobl et al., 2009).Thus, only variables meeting this condition are displayed in the variable importance plots.As can be seen in Fig. 5, the variable importance varies among models, explaining the water table response variable WT NORM for different temporal scales.Despite this complexity a few definite trends can be identified.The predictors SLUG LOW and PROF CURV are the most often occurring top-scoring variables and consequently have the highest single explanatory power.The percent increase in mean square error for SLUG LOW and PROF CURV exceeds other variable importance values by far for the entire period, 4 2010, 1 2011, 2 2011, and 3 2011.Predominantly topography predictors (slope, plan curvature, upslope contributing area, transect, and aspect) are found in the upper third of the variable importance ranking, following SLUG LOW and PROF CURV.Well depth shows an intermediate-ranking and high variability; other soil properties (K, SLUG HIGH) show lower importance, if any.Vegetation predictors (throughfall, canopy cover, stemflow, amount of trees, and LULC) primarily occur in the lower Fig. 5. Variable importance plots of random forest models predicting the water table response variable WT NORM for different temporal scales (entire time, seasonal scale, and event scale).The x-axis displays the percentage of increase in mean square error (MSE) when the data for that predictor are permuted randomly while all others are held constant.The number in the lower right corner of each variable importance plot is the percentage of explained variance for that random forest model.third of the variable importance ranking.Solely canopy cover and throughfall appear on higher ranks for some RF models; interestingly, these models all represent summer periods (seasons 3 2010, 2 2011, 3 2011, events on 12 May, 12 July, 12 September, and 17 June).Nevertheless, one has to keep in mind that for many of these summer period models the overall explanatory power is low, especially at the event scale.
RF models explaining the water table response variable INDEX ACTI follow a similar trend as for WT NORM (Fig. 6).SLUG LOW is the highest-scoring predictor in most cases.Profile curvature is high-ranking as well but less important than for WT NORM .Instead, well depth plays a more important role.The general trend of topography (PROF CURV, SLOPE, UA, PLAN CURV, TRANS) and soil properties (SLUG LOW, WDEPTH) occupying higher ranks, while most vegetation parameters score lower (LULC, STEMF, N TREES) stays the same.However, the vegetation predictors throughfall and stemflow are notable: THROUGHF ranks second and third for 3 2011 (26 % explained variance) and 2 2011 (32 % explained variance), respectively.CANOPY COV continuously occupies middle ranks.
In terms of the effect of different temporal scales, variable importance is more variable and trends are less clear at the event scale.In addition, the vegetation predictors throughfall and canopy cover are more important during summer periods.Regarding the effect of land use type, RF models explaining the water table response for the forested hillslopes differ little from RF models including all wells (WT NORM and INDEX ACTI ).SLUG LOW, PROF CURV and WDEPTH score highest.Noticeable is that among these highest-ranking predictors well depth plays a more important role than for models considering all wells.Vegetation predictors follow the same pattern as for RF models of all wells.In contrast, the importance of predictor variables for RF models predicting the water table response at the grassland hillslope is distinctly different: profile curvature, throughfall, and transect are top-scoring with interchanging order while Fig. 6.Variable importance plots of random forest models predicting the water table response variable INDEX ACTI for different temporal scales (entire time, seasonal scale, and event scale).The x-axis displays the percentage of increase in mean square error (MSE) when the data for that predictor are permuted randomly while all others are held constant.The numbers in the lower right corner of each variable importance plot is the percentage of explained variance for that random forest model.soil properties (SLUG LOW, WDEPTH) do not appear at all.Interesting is that THROUGHF, which represents open area rainfall at the grassland hillslope, only marginally differs among transects (lower transect: 90.1 %, middle transect: 89.8 %, upper transect 86.2 %; values per transect represent the mean of two rainfall totalizators over 8 events).Open area rainfall and also profile curvature (lower transect slightly concave, middle and upper transect more planar/convex) thus represent similar information as the predictor transect.This fits well with the observation that mostly the lower transect wells show a water table response at the grassland hillslope.Nonetheless, variable importance needs to be interpreted against the background that RF models of the grassland hillslope generally have a very poor explanatory power.

Can the spatial variability of SSF dynamics be explained by measurable hillslope characteristics?
The partial correlation analysis and the random forest approach revealed similar results: 1.The observed spatio-temporal variability of water table response results from a complex set of interactions among hillslope characteristics.Soil properties and topography show the highest single explanatory power.
2. Vegetation predictors surprisingly played a minor role, if any.Throughfall and canopy cover were the most important variables among vegetation predictors.
3. The mapped hillslope characteristics explain the observed spatial variability of water Despite some differences in outcome of both analyses, the observed similar results of both methods independently validate these conclusions.The high explanatory power of topography is interesting given the predominantly planar side slopes.Clearly, hillslope topography has been identified as a dominant control (e.g.Fujimoto et al., 2008;Hopp and Mc-Donnell, 2009;Berne et al., 2005;Bogaart and Troch, 2006); however, (micro-) topography differences are rather small at the studied hillslopes relative to differences within a catchment.The RF models identified profile curvature as a highly important variable.This predictor did not show any significant partial correlation with water table response.Instead, slope explained the most among topography variables.We assured the meaningfulness of different predictors during ensemble tree construction by single regression trees and partial dependence plots of ensemble tree models, which illustrate the relationship between the response variable and a given predictor after averaging out the effects of the other predictor variables (Liaw and Wiener, 2002).Important topography predictors all showed meaningful relationships, e.g.concave profile curvature or high upslope contributing area resulting in strong water table response.Soil properties (hydraulic conductivity and well depth) as influential drivers are not surprising.Interesting, however, is that SLUG LOW was identified as most important variable among variables calculated from slug tests, and also showed highest partial correlation among them (K after Hvorslev method, SLUG LOW, SLUG HIGH).Consequently, SLUG LOW seems to best represent actual hydraulic conductivity.It is possible that SLUG HIGH (fast part of recession, duration sometimes only for a few time steps) does not stand for true physical soil properties but rather for initial effects after slug injection.
For us it is surprising the subordinate effect of vegetation.Our hypothesis that vegetation exerts a major control on spatially variable SSF dynamics at our study site has to be rejected.This is remarkable given the many studies reporting on the effect of precipitation redistribution (stemflow, throughfall) and rooting patterns on hydrologic fluxes (e.g.Nordmann et al., 2009;Nikodem et al., 2010;Chang and Matzner, 2000;Jost et al., 2004Jost et al., , 2012;;Gerrits, 2010;Sansoulet et al., 2008;Liang et al., 2007Liang et al., , 2011;;Keim et al., 2006).In our study no significant partial correlation between stemflow index and different water table response variables could be identified and the variable importance during tree model construction was marginal.The predictor number of trees in vicinity of a well showed slightly higher variable importance and there were few cases with low but significant partial correlation.This variable may act as proxy for some vegetation related parameter, such as root density or root water uptake, but may also simply mimic canopy coverage and throughfall.THROUGHF and CANOPY COV showed the highest influence among vegetation predictors during tree model construction, yet still played a secondary control overall.Noteworthy is the relative stronger influence during summer seasons and summer events.Our experimental findings cannot confirm the clear effect of interception on subsurface saturation patterns as shown by a virtual experiment by Keim et al. (2006).Another virtual experiment, however, also found only a minor effect of fine-scale throughfall patterns on the hillslope hydrological response (Hopp and McDonnell, 2011).
The question is why vegetation predictors at our hillslopes exert a generally weak control on spatial variability of shallow water table dynamics, even though other studies proved the importance of vegetation on hydrologic fluxes.One likely reason may be comparably thick soils and hence a deep unsaturated zone, which mask or blur the effect of spatially variable input due to throughfall and stemflow.It would be interesting to assess whether the same analysis for hillslopes with shallow soil provides a different picture regarding the effect of vegetation.Other controls, e.g.highly transmissive soil, as observed for some wells, may also override effects of precipitation redistribution.Furthermore it is possible that our calculated stemflow index is not representative for actual stemflow volumes, but quantitative information was not available.
Another important outcome of our analyses is that the predictors representing many facets of the hillslope configuration do not sufficiently explain the spatial variability of water table response.The explained variance of different ensemble tree models rarely exceeded 30 percent and was often lower.Hence, there must be very important additional drivers not captured by our measurements.Bedrock topography is not too well represented by out predictors except for well depth.However, bedrock upslope contributing area and micro-relief was found to strongly govern the hillslope hydrologic response (e.g.Tromp-van Meerveld and McDonnell, 2006a;Graham et al., 2010;Freer et al., 2002).Nevertheless, highresolution information about subsurface topography is rare and highly elaborate to determine.The periglacial deposits at our study site resulting in no clear soil-bedrock interface add even more complexity.Experimental studies at sites with similar subsurface structure showed subsurface flow occurrence in different layers of periglacial deposits, and different behaviour of impeding layers depending on antecedent wetness (Chifflard et al., 2008;Nordmann et al., 2009).Additional drivers may be preferential pathways (macropores, roots, soil pipes), as numerously observed (e.g.Uchida et al., 2002;Weiler and Flühler, 2004;Bachmair et al., 2009;Anderson et al., 2009;Noguchi et al., 1999;Kienzler and Naef, 2008).Deriving quantitative data representing this variable at high spatial resolution for a large hillslope seems impossible with current methods (excavations, electrical resistivity tomography).Further controls could also be spatially variable surface conditions and thus overland flow due to litter, soil compaction, and hydrophobicity.S. Bachmair and M. Weiler: Hillslope characteristics as controls of subsurface flow variability 4.2 Are there differences in explainability of water table dynamics among temporal scales and seasons/events?
Both analyses exposed that the water table response at the event scale is less explainable: partial correlation between influential predictors and water table response and explained variance of ensemble tree models was mostly lower than at the seasonal scale and for the entire time.Furthermore, trends of predictor variable importance were more variable at the event scale.Short-term variability of water table dynamics is thus less explainable than longer-term variability.
A closer look at the differences among seasons and events revealed that there is a tendency of lower explainability for summer seasons and summer events (by degree of explainability we mean strength of partial correlation between water table response and influential predictors, and degree of ensemble tree model performance).Interestingly, we found a negative correlation between explainability of water table response and rainfall intensity.We also found a moderate relationship between explainability of water table response and AWI; nevertheless, due to the small sample size (AWI only available at event scale) this should be interpreted cautiously.
These findings indicate that the mapped predictors explain the observed water table response for time periods with high rainfall intensity (and low AWI) to a lesser degree.We reason that during such periods additional drivers of SSF dynamics play a more important role, e.g.stronger hydrophobicity and thus bypass-flow, exploitation of preferential pathways initiated by high rainfall intensity, etc.
The assumption about rainfall intensity and antecedent wetness inducing different infiltration and saturation patterns coincides with findings by Kienzler and Naef (2008).Sprinkling experiments at several hillslopes revealed differences in SSF response, which they attribute to "indirect feeding" of preferential pathways via saturated areas, or "direct feeding" via rainfall (Kienzler and Naef, 2008).We suggest that direct feeding of preferential pathways, evoked by high rainfall intensity, plays a bigger role during summer seasons and summer events, thereby reducing the correlation between water table response and measured hillslope characteristics.It would be interesting to assess the relationship between saturation patterns and subsurface flow volume.If preferential pathways were an important control in summer, saturation patterns and subsurface flow volume would be lowly correlated, since preferential flow takes places under unsaturated conditions.A recent review highlights that preferential flow does not require matrix saturation or saturation of the conduits (Nimmo, 2012).Following the concept by Kienzler and Naef (2008), we would expect indirect feeding of preferential pathways via saturated areas during seasons with low rainfall intensity and high antecedent wetness, and therefore a higher correlation between saturation patterns and subsurface flow volume, and saturation patterns and measured hillslope characteristics.
The observation that throughfall and canopy cover had a greater influence on random forest models during summer periods also highlights the seasonality effect.More pronounced throughfall patterns in summer would explain the higher importance.There is no consensus whether increasing rainfall intensity de-or increases spatial variability of throughfall, yet higher total rainfall reduces spatial variability (Levia and Frost, 2006).The differences in importance among seasons likely go back to foliation vs. defoliation of deciduous trees.Staelens et al. (2006), for instance, found a significantly higher spatial variability of throughfall during the leafed season than during the leafless.

Are there differences in explainability of water table dynamics regarding land use?
When we split the water table response into forest hillslope response vs. grassland, distinct differences emerged.There were minor differences in explainability of water table dynamics between the base case (all wells) and forest only, whereas the grassland hillslope response was basically not explainable with the same predictors (hardly any significant partial correlation and RF model performance very poor).
There are two possible explanations why shallow water table dynamics at the grassland hillslope are not attributable to hillslope properties: (1) either the water table response at the grassland hillslope is inherently different due to other runoff generation mechanisms for this type of land use, e.g. higher importance of preferential flow, different infiltration processes, etc.; or (2) the grassland hillslope shows distinct differences in static drivers, i.e. drivers not influenced by land use, such as (bedrock-) topography.In this case, differences are coincidental and not physically attributable to vegetation cover effects.
To validate these possible explanations we carried out non-parametric statistical tests to assess whether topography and soil properties of the three hillslopes significantly differ among hillslopes (Kruskal-Wallis test comparing all hillslopes; Wilcoxon rank-sum test for pairwise tests).We found that upslope contributing area and well depth are significantly different among all hillslopes, and between each pair of hillslopes (p < 0.05).Plan curvature significantly differs between grassland vs. mixed forest and grassland vs. coniferous forest (p < 0.05), yet not between both forest hillslopes.Consequently, smaller UA, slightly convex plan curvature, and deeper soil may induce a different SSF response at the grassland hillslope.One could hypothesize there is a threshold concerning input volume until the SSF response follows the pattern of the forested hillslopes (higher storage capacity of the grassland hillslope and lower UA).On the other hand, one could argue the grassland hillslope is more homogeneous due to the homogeneous vegetation cover, whereas trees enforce small-scale spatial variability of rainfall input (and other micro-meteorological parameters), soil properties and soil moisture, and thus hydrologic fluxes.At the grassland hillslope there is a distinct pattern of stronger and more continuous response at the lower transect, while the upslope transects show little to no SSF response.In the forest there is more spatial variability within and among transects, and upper transect wells exhibit strong water table fluctuations (Bachmair et al., 2012).It is hence possible that there are inherent differences in spatial variability of hillslope properties due to vegetation cover, which lead to a higher explanatory power of water table variability for the forested hillslopes vs. the grassland hillslope.

Conclusions
Partial correlation analysis and a random forest approach elucidated the relationship between variability of shallow water table response and different hillslope characteristics.The following key points were uncovered: -Complex interplay of predictors: the observed variability of water table response results from complex interactions among hillslope characteristics.Soil properties and topography, despite predominantly planar hillslopes, showed the highest single explanatory power.
-Minor role of vegetation: at our study site vegetation predictors played a minor role, if any.Solely throughfall and canopy cover exerted a slightly stronger control, especially in summer.
-Low overall explanatory power: the examined hillslope characteristics do not sufficiently explain the water table response.Hence, there must be additional very important drivers not represented by our measurements of the hillslope configuration (e.g.bedrock topography, preferential pathways, hydrophobicity).
-Differences among temporal scales: short-term variability was less explainable than longer-term variability.Additionally, the degree of explainability among seasons and events was correlated with rainfall intensity: the mapped predictors explain the water table response for time periods with high rainfall intensity to a lesser degree.
-Grassland vs. forest: explainability of SSF variability at only the forested hillslopes was similar to the base case (all wells).In contrast, the SSF response at only the grassland hillslope was hardly explainable with the same predictors.
The fact that detailed information on the hillslope configuration did not help to sufficiently explain the SSF response raises the question about the validity of physically-based process or numerical models designed to predict subsurface flow at the observed scale.A better characterization of measurable model parameters at this spatial scale will be hard to derive.
One possible reason for the mediocre overall explainability is that additional drivers such as preferential pathways, especially during high-intensity rainfall, could not be derived.
Further research should therefore focus on how to represent spatially and temporarily variable preferential pathways in models and how to parameterize them.Ecohydrological feedbacks also need to be further assessed.Are the minor effects of vegetation different at other hillslopes?Furthermore, we need to assess the relationship between saturation patterns and subsurface flow volume, and whether this relationship varies temporally due to connectivity of saturated zones or onset of preferential pathways.

Fig. 1 .
Fig. 1.Overview of catchment and location of hillslopes (top graph), location of wells within hillslopes (middle graph), and location of mapped trees at the forested hillslopes (bottom graph, DBH= stem diameter at breast height).Catchment area: 0.21 km 2 ; elevation range: 340-585 m a.s.l.; each hillslope (grassland, coniferous forest, and mixed forest hillslope) is equipped with 30 wells.

Fig. 4 .
Fig. 4. Explained variance of random forest models for different temporal scales: entire time series, seasonal scale (highlighted in grey), and event scale.Top graph: explained variance of random forest models predicting the water table response variables WT NORM and INDEX ACTI .Bottom graph: explained variance of random forest models predicting the water table response variable WT NORM for all wells (n = 90), only the grassland hillslope wells (n = 30), and only the forest hillslope wells (n = 60).The y-axis is scaled to a minimum of 0, missing bars represent 0 % explained variance or negative values.

Table 1 .
List of acronyms.For details about the determination of the response and predictor variables see methods section.
The hillslopes Hydrol.Earth Syst.Sci., 16, 3699-3715, 2012 www.hydrol-earth-syst-sci.net/16/3699/2012/ INDEX ACTI : percentage of time during which a well is activated; as activated we define a water table > 15 cm from the bottom of the well.WT NORM and INDEX ACTI range from 0-1.Several other water table response variables had been computed (e.g.mean water table height, absolute water table rise, coefficient of variation, percentage of time during which at least 10 %, 30 % or 50 % of the well was saturated); however, since they were either meaningless (coefficient of variation) or highly correlated with WT NORM and INDEX ACTI , we omitted them from further analyses.Note that the selected response variables are highly correlated as well (WT NORM versus INDEX ACTI : r = 0.74-0.86 for all temporal scales).The data analysis was conducted with IDL (Interactive Data Language, ITT VIS).Handling of missing data during the determination of water table response variables:

Table 2 .
Rainfall characteristics and antecedent wetness conditions (discharge before event, weighted rainfall last 14 days) per event.
* Missing data supplemented with data from nearby WBI weather station (state-run viticulture institute Freiburg).

Table 4 .
Rank partial correlation coefficients (Spearman) between response variable WT NORM and different predictors.

Table 6 .
Correlation coefficients highlighting the effect of rainfall characteristics and antecedent wetness on RF model performance.Displayed are linear correlation coefficients between RF model performance for the response variables WT NORM and INDEX ACTI and different rainfall characteristics and antecedent wetness.For model performance per temporal scale see Fig. 4, top graph; for rainfall characteristics and antecedent wetness per temporal scale see Fig. 2 and Table 2. Correlation with total rainfall n = 12 (all temporal scales), correlation with rainfall intensity n = 11 (seasonal and event scale), correlation with AWI n = 5 (event scale only).
describe the response there must be additional important drivers not captured by our measurements.