Stable water isotope tracing through hydrological models for 1 disentangling runoff generation processes at the hillslope scale 2 3

12 Hillslopes are the dominant landscape components where incoming precipitation is 13 transferred to become groundwater, streamflow or atmospheric water vapor. However, 14 directly observing flux partitioning in the soil is almost impossible. Hydrological hillslope 15 models are therefore being used to investigate the processes involved. Here we report on a 16 modeling experiment using the Catchment Modeling Framework (CMF) where measured 17 stable water isotopes in vertical soil profiles along a tropical mountainous grassland hillslope 18 transect are traced through the model to resolve potential mixing processes. CMF simulates 19 advective transport of stable water isotopes O and H based on the Richards equation within 20 a fully distributed 2-D representation of the hillslope. The model successfully replicates the 21 observed temporal pattern of soil water isotope profiles (R2 0.84 and NSE 0.42). Predicted 22 flows are in good agreement with previous studies. We highlight the importance of 23 groundwater recharge and shallow lateral subsurface flow, accounting for 50% and 16% of 24 the total flow leaving the system, respectively. Surface runoff is negligible despite the steep 25 slopes in the Ecuadorian study region. 26


Introduction
Delineating flow path in a hillslope is still a challenging task (Bronstert, 1999;McDonnell et al., 2007;Tetzlaff et al., 2008;Beven and Germann, 2013).However, a more complete understanding of the partitioning of incoming water to surface runoff, lateral subsurface flow components or percolation allows better understanding, for example, the impact of climate and land use change on hydrological processes.Models are often used to test different rainfall-runoff generation processes and the mixing of water in the soil (e.g., Kirkby, 1988;Weiler and McDonnell, 2004).Due to the prevailing measurement techniques and therefore the available data sets, it has become common practice to base the validation of modeled hillslope flow processes on quantitative data on storage change.In the simplest case, system-wide storage changes are monitored by discharge and groundwater level measurements or, on more intensively instrumented hillslopes, the storage change of individual soil compartments is monitored by soil moisture sensors.In the typical 2-D flow regime of a slope, such models bear the necessity to account not only for the vertical but also for the lateral movements of water within the soil (Bronstert, 1999).Quantitative data on storage change in this regard are only suitable to account for the actual change in soil water volume, but not to assess the source or flow direction.Knowing tracer compositions of relevant hydrological components along a hillslope allows one to account for mixing processes and thereby to delineate the actual source of the incoming water.Over the years a number of artificial, e.g., fluorescence tracers like uranine, and natural tracers, e.g., chloride or stable water isotopes, have emerged.While the application of the artificial tracers is rather limited in space and time (Leibundgut et al., 2011), the latter ones can be used over a wide range of scales (Barthold et al., 2011;Genereux and Hooper, 1999;Leibundgut et al., 2011;Muñoz-Villers and McDonnell, 2012;Soulsby et al., 2003).Stable water isotopes such as oxygen-18 ( 18 O) and hydrogen-2 ( 2 H) are integral parts of water molecules and consequently ideal tracers Published by Copernicus Publications on behalf of the European Geosciences Union.D. Windhorst et al.: Stable water isotope tracing through hydrological models of water.Over the last decades isotope tracer studies have proven to provide reliable results on varying scales (chamber, plot, hillslope to catchment scale) and surface types (open water, bare soils, vegetated areas) to delineate or describe flow processes under field experimental or laboratory conditions (Garvelmann et al., 2012;Hsieh et al., 1998;Sklash et al., 1976;Vogel et al., 2010;Zimmermann et al., 1968).
Although the first 1-D process-orientated models to describe the dynamics of stable water isotope profiles for open water bodies (Craig and Gordon, 1965) were developed as early as in the mid-1960s, and a bit later for soils (Zimmermann et al., 1968), fully distributed 2-D to 3-D hydrological tracer models benefitting from the additional information to be gained by stable water isotopes are still in their early development stages (Davies et al., 2013) or use strong simplifications of the flow processes (e.g., TAC D using a kinematic wave approach; Uhlenbrook et al., 2004).This can be attributed to the high number of interwoven processes affecting the soil water isotope fluxes not only in the soil's liquid phase but also in its vapor phase.The more process-based 1-D models (Braud et al., 2005;Haverd and Cuntz, 2010) therefore simultaneously solve the heat balance and the mass balance simultaneously for the liquid and the vapor phase and are thereby describing the -convection and molecular diffusion in the liquid and vapor phase, -equilibrium fractionation between liquid and vapor phase, -fractionation due to evaporation, -non-fractionated flux due to percolation and transpiration.
To obtain and compute the data required to apply these kind of models beyond the plot scale is still challenging.However, due to emerging measuring techniques the availability of sufficient data is currently becoming more realistic.Increasing computational power and especially the cavity ring-down spectroscopy (CRDS) -a precise and cost-effective method to analyze the signature of stable water isotopes (Wheeler et al., 1998) -promise progress.Hence, it is tempting to investigate the suitability of isotope tracers to delineate hydrological flow paths using a more physical modeling approach.Recent research in this direction includes the work of McMillan et al. (2012) and Hrachowitz et al. (2013) using chloride as a tracer to study the fate of water in catchments in the Scottish Highlands.Even though some processes affecting the soil water isotope transport are still represented in a simplified manner or, due to their limited effect/importance of the respective process within the given study site, could be omitted, this approach allows us to determine the potential of soil water isotope modeling in catchment hydrology and highlights the future need for research.
This study is conducted in a 75 km 2 montane rain forest catchment in south Ecuador, the upper part of the Rio San Francisco, which has been under investigation since 2007 (Bogner et al., 2014;Boy et al., 2008;Bücker et al., 2011;Crespo et al., 2012;Fleischbein et al., 2006;Goller et al., 2005;Timbe et al., 2014;Windhorst et al., 2013b).The findings of those studies (briefly synthesized in Sect.2.3) will (a) ease the setup of chosen model, (b) let us define suitable boundary conditions for the chosen modeling approach and (c) serve as a reference for the delineated flow bath.The additional information from previous studies conducted in the study area will therefore highlight the potential of this new model approach to delineate hydrological flow paths under natural conditions and support our preliminary hydrological process understanding retrieved from more classical methods conducted in the past.
Within this catchment we selected a hillslope with a distinct drainage area and nearly homogenous land use and established an experimental sampling scheme to monitor the isotopic signatures of the soil water of three soil profiles using passive capillary fiberglass wick samplers (PCaps).Based on the proposed modeling approach a 2-D virtual hillslope representation of this hillslope was then implemented using the Catchment Modeling Framework (CMF; Kraft et al., 2011).Due to the necessity to mix the flows in accordance with the observed soil water isotope signatures, we are confident that the degree of certainty for the modeled flow path will be higher than for conventional modeling approaches relying solely on quantitative information to evaluate the modeled data.Replacing the calibration target bears now the necessity to mix the right amount and signature of any given flow component, whereas the quantitative change only relies on the actual amount of water leaving or entering any given compartment.We will quantify the following flow components to disentangle the runoff generation processes: surface runoff, lateral subsurface flow in the vadose zone and percolation to groundwater.The lateral subsurface flow will be further subdivided into near-surface lateral flow and deep lateral flow.
To validate the chosen modeling approach and assess our process understanding, we tested the following hypotheses: 1.Under the given environmental conditions -high precipitation and humidity - (Bendix et al., 2008) and full vegetation cover (Dohnal et al., 2012;Vogel et al., 2010) only non-fractionating and advective water transport of isotopes is relevant.
2. Gaseous advection and diffusive process in the gaseous as well as the liquid phase and the enrichment due to evaporation are negligible; hence the stable water isotopes behave like a conservative tracer.
3. Large shares of the soil water percolate to deeper horizons, thereby creating long mean transit times (MTT) (Crespo et al., 2012;Timbe et al., 2014).4. Due to the high saturated conductivities of the top soil layers, the occurrence of Hortonian overland flow is unlikely to have an important contribution to the observed flows (Crespo et al., 2012).
5. Fast near-surface lateral flow contributes essentially to downhill water flows and plays a relevant role to understand the overall hydrological system (Bücker et al., 2010).

Study area
The hillslope under investigation is located within the catchment of the Rio San Francisco in south Ecuador (3 • 58 30 S, 79 • 4 25 W) at the eastern outskirts of the Andes and encompasses an area of 75 km 2 .Close to the continental divide the landscape generally follows a continuous eastward decline towards the lowlands of the Amazon basin (Fig. 1b).Due to the high altitudes (1720-3155 m a.s.l.), the deeply incised valleys (slopes are on average 25-40  ) with the main rainy season in the austral winter (Bendix et al., 2008).A comprehensive description of the soils, climate, geology and land use has been presented by Beck et al. (2008), Bendix et al. (2008) and Huwe et al. (2008).

Experimental hillslope
To test our understanding of hydrological processes within the study area, we chose a hillslope with a nearly homogenous land use (Fig. 1).It is located on an extensive pasture site with low-intensity grazing by cows and dominated by Setaria sphacelata.Setaria sphacelata is an introduced tropical C4 grass species that forms a dense tussock grassland with a thick surface root mat (Rhoades et al., 2000).This grass is accustomed to high annual rainfall intensities (> 750 mm a −1 ), has a low drought resistance and tolerates water logging to a greater extent than other tropical grass types (Colman and Wilson, 1960;Hacker and Jones, 1969).The hillslope has a drainage area of 0.025 km 2 , a hypothetical length of the subsurface flow of 451 m and an elevation gradient of 157 m with an average slope of 19.2 • .The soil catena of the slope was recorded by Pürckhauer sampling and soil pits.To investigate the passage of water through the hillslope, a series of three wick samplers has been installed along the line of subsurface flow.
Climate forcing data with an hourly resolution of precipitation, air temperature, irradiation, wind speed and relative humidity were collected by the nearby (400 m) climate station "ECSF" at similar elevation.Isotopic forcing data were collected manually for every rainfall event from October 2010 until December 2012 using a Ø25 cm funnel located in close proximity to the chosen hillslope at 1900 m a.s.l.(Timbe et al., 2014).To prevent any isotopic fractionation after the end of a single rainfall event (defined as a period of 30 min without further rainfall), all samples were directly sealed with a lid and stored within a week in 2 mL amber glass bottles for subsequent analysis of the isotopic signature as described in Sect.2.4.1 (all samples < 2 mL were discarded).

Current process understanding at the catchment scale
The catchment of the Rio San Francisco has been under investigation since 2007 (Bücker et al., 2011;Crespo et al., 2012;Timbe et al., 2014;Windhorst et al., 2013b) and was complemented by a number of studies on forested microcatchments (≈ 0.1 km 2 ) within this catchment (Bogner et al., 2014;Boy et al., 2008;Fleischbein et al., 2006;Goller et al., 2005).Studies on both scales identify the similar hydrological processes as being active within the study area.
Studies on the micro-scale (Boy et al., 2008;Goller et al., 2005), supported by solute data and end member mixing analysis at the meso-scale (Bücker et al., 2011;Crespo et al., 2012), showed that fast "organic horizon flow" in forested catchments dominates during discharge events if the mineral soils are water saturated prior to the rainfall.Due to an abrupt change in saturated hydraulic conductivity (K sat ) between the organic (38.9 m d −1 ) and the near-surface mineral layer (0.15 m d −1 ), this organic horizon flow can contribute up to 78 % of the total discharge during storm events (Fleischbein et al., 2006;Goller et al., 2005).However, the overall importance of this organic horizon flow is still disputable because the rainfall intensity rarely gets close to such a high saturated hydraulic conductivity.In 95 % of the measured rainfall events between June 2010 and October 2012 the intensity was below 0.1 m d −1 (≈ 4.1 mm h −1 ) and was therefore 15 times lower than the saturated hydraulic conductivity of the mineral soil layer below the organic layer under forest vegetation and around 30 times lower than the saturated hydraulic conductivity of the top soil under pasture vegetation (Zimmermann and Elsenbeer, 2008;Crespo et al., 2012).The same conclusion holds true for the occurrence of surface runoff due to infiltration access on pasture (lacking a significant organic layer).Solely based on rainfall intensities, surface runoff is therefore relatively unlikely to contribute to a larger extent in rainfall-runoff generation.The reported K sat values are based on measurements of 250 cm 3 undisturbed soil core samples vertically extracted from the center of each respective layer.Due to the chosen sampling method and the limited size of the soil cores, the effective saturated hydraulic conductivity will be even higher and can vary for the horizontal flow component.When and to which extent a subsurface saturated prior to the rainfall event would still trigger surface runoff on pastures therefore remains to be investigated.Bücker et al. (2010) and Timbe et al. (2014) were able to show that base flow, on the other hand, has a rather large influence on the annual discharge volume across different land use types, accounting for > 70 and > 85 %, respectively.These findings are also supported by the long MTT of the base flow for different sub-catchments of the Rio San Francisco in comparison to the fast runoff reaction times, varying according to Timbe et al. (2014) between 2.1 and 3.9 years.Accordingly, the current findings confirm that the base flow -originating from deeper mineral soil and bedrock layersis dominating the overall hydrological system in the study area (Crespo et al., 2012;Goller et al., 2005).Apart from this dominating source of base flow, Bücker et al. (2010) identified near-surface lateral flow as a second component to be relevant for the generation of base flow for pasture sites.

Passive capillary fiberglass wick samplers (PCaps)
We installed passive capillary fiberglass wick samplers (wick samplers for short; designed according to Mertens et al., 2007) as soil water collectors at three locations along an altitudinal transects under pasture vegetation at three soil depths.PCaps maintain a fixed tension based on the type and length of wick (Mertens et al., 2007), require low maintenance and are most suitable to sample mobile soil water without altering its isotopic signature (Frisbee et al., 2010;Landon et al., 1999).We used woven and braided 3/8 in.fiberglass wicks (Amatex Co. Norristown, PA, US).Half (0.75 m) of the 1.5 m wick was unraveled and placed over a 0.30 × 0.30 × 0.01 m square plastic plate, covered with fine-grained parent soil material and then put in contact with the undisturbed soil.
Every collector was designed to sample water from three different soil depths (0.10, 0.25 and 0.40 m) with the same suction, all having the same sampling area of 0.09 m 2 , wick type, hydraulic head of 0.3 m (vertical distance) and total wick length of 0.75 m.To simplify the collection of soil water, the wick samplers drained into bottles placed inside a centralized tube with an inner diameter of 0.4 m and a depth of 1.0 m.To avoid any unnecessary alterations of the natural flow above the extraction area of the wick sampler, the centralized tube was placed downhill and the plates were evenly spread uphill around the tube.A flexible silicon tube with a wall thickness of 5 mm was used to house the wick and   2012) and Zimmermann and Elsenbeer (2008).
to connect it to the 2 L sampling bottles storing the collected soil water.The silicon tube prevents evaporation and contamination of water flowing through the wick.Weekly bulk samples were collected over the period from October 2010 until December 2012 when the sample volume exceeded 2 mL.Soil water and the previously mentioned precipitation samples are analyzed using a CRDS with a precision of 0.1 ‰ for 18 O and 0.5 for 2 H (Picarro L1102-i, CA, US).

Soil survey
The basic soil and soil hydraulic properties for each distinct soil layer along the hillslope were investigated up to a depth of 2 m.Pürckhauer sampling for soil texture and succession of soil horizons was done every 25 m, while every 100 m soil pits were dug for sampling soil texture, soil water retention curves (pF curves), porosity and succession of soil horizons.The results were grouped into eight classes (Table 1) and assigned to the modeling mesh as shown in Fig. 2. Retention curves (pF curves) were represented by the van Genuchten-Mualem function using the parameters α and n.
All soils developed from the same parent material (clay schist) and are classified as Haplic Cambisol with varying soil thickness.Soil thickness generally increased downhill, varying between 0.8 and 1.8 m in depressions.Clay illuviation was more pronounced in the upper part of the hillslope (higher gradient in clay content), indicating lower conductivities in deeper soil layers.

The Catchment Modeling Framework (CMF)
The Catchment Modeling Framework developed by Kraft et al. (2011) is a modular hydrological model based on the concept of finite volume method introduced by Qu and Duffy (2007).Within CMF those finite volumes (e.g., soil water storages, streams) are linked by a series of flow-accounting equations (e.g., Richards or Darcy equation) to a one-to three-dimensional representation of the real-world hydrological system.The flexible setup of CMF and the variety of available flow-accounting equations allows customizing the setup as required in the presented study.In addition to the water fluxes, the advective movement of tracers within a given system can be accounted for by CMF, making this modeling framework especially suitable to be used in our tracer study (Kraft et al., 2010).Starting with Beven and Germann (1982) scientists over the last decades have frequently argued that the Richards equation along with similar flow-accounting equation assuming a time invariant and well-mixed homogenous flow of water through the soil pore space, similar to those currently implemented in CMF, are not suitable to account for preferential flow relevant for modeling tracer transport (Brooks et al., 2010;Germann et al., 2007;Hrachowitz et al., 2013;Stumpp and Maloszewski, 2010).Being developed for the quantitative representation of soil water flow, these equations cannot distinguish between water stored in different soil compartments (namely the soil matrix and macro-pores) and only artificially try to represent macropore flow, e.g., by favoring high saturated conductivity values or misshaped conductivity curves controlling the flow of water between soil compartments.Even though the capabilities of CMF to account for preferential flow are still in the development phase (e.g., by following the dualpermeability approach in the future) and are not accounted for in the presented setup, our setup will once more highlight potential drawbacks of the modeling approaches relying on the Richards equation while modeling tracer transport at the hillslope scale.

Setup of CMF
To govern the water fluxes within our system, we used the following flow-accounting equations: Manning equation for surface water flow; Richards equation for a full 2-D representation of the subsurface flow; Shuttleworth-Wallace modification (Shuttleworth and Wallace, 1985) of the Penman-Monteith method to control evaporation and transpiration; and constant Dirichlet boundary conditions representing the groundwater table and the outlet of the system as a rectangular ditch with a depth of 1.5 m.The lower boundary condition is only applicable if groundwater table is > 2 m below ground.Preliminary testing revealed that a discretization based on a constant vertical shift (5 m) and alternating cell width increasing width depth (ranging from 1.25 to 83.75 cm) yielded the optimum model performance with regard to computing time and model quality.Based on 5 m contour lines (derived by local lidar measurements with a raster resolution of 1 m; using the Spatial Analyst package of Ar-cGis 10.1 from ESRI) this hillslope was further separated into 32 cells ranging in size from 16.6 to 2921.6 m 2 (Fig. 1a).
To account for small-scale dynamics in the mixing process of stable water isotopes and to be able to run the model with a satisfactory speed, two different horizontal resolutions were used to discretize each layer with depth.Layers encompassing wick samplers and their upslope neighbor were run with a finer resolution of at least 26 virtual soil layers increasing in thickness width depth (1 × 1.25, 13 × 2.5, 7 × 5 and 5 × 10-50 cm).All other cells were calculated with coarser resolution of at least 14 virtual soil layers (1 × 1.25, 1 × 2.5, 6 × 5, 3 × 10 cm and 3 × 15-83.75cm).When the delineated soil type changed within a soil layer, it was further subdivided according to Fig. 2.

Evapotranspiration
Soil evaporation, evaporation of intercepted water and plant transpiration are calculated separately using the sparse canopy evapotranspiration method by Shuttleworth and Wallace (1985), in its modification by Federer et al. (2003) and Kraft et al. (2011).This approach requires the following parameterizations: soil-surface-wetness-dependent resistance to extract water from the soil (r ss ); the plant-typedependent bulk stomatal resistance to extract water from the leaves (r sc ); and the aerodynamic resistance parameters (r aa , r as , and r ac ) for sparse crops as described by Shuttleworth and Gurney (1990) and Federer et al. (2003), whereby r ac (resistance canopy atmosphere) restricts the vapor movement between the leaves and the zero plane displacement height and r as (resistance soil atmosphere) restricts the vapor movement between the soil surface and the zero plane displacement height, which is the height of the mean canopy flow (Shuttleworth and Wallace, 1985;Thom, 1972).The aerodynamic resistance parameter r aa refers to the resistance to move vapor between the zero plane displacement height and the reference height at which the available measurements were made.The necessary assumptions to parameterize the plant (Setaria sphacelata) and soil-dependent parameters of the Shuttleworth-Wallace equation using the assumptions made by Federer et al. (2003) and Kraft et al. (2011) are listed in Table 2.
Furthermore, soil water extraction by evaporation only affects the top soil layer, and soil water extraction by transpiration is directly controlled by root distribution at a certain soil depth.In accordance with field observations, we assumed an exponential decay of root mass with depth, whereby 90 % of the total root mass is concentrated in the top 0.20 m.

Calibration and validation
For calibration and validation purposes, we compared measured and modeled stable water isotope signatures of 2 H and 18 O of the soil water at each depths of the each wick sampler along the modeled hillslope.Hourly values of the modeled isotopic soil water signature were aggregated to represent the mean isotopic composition in between measurements (≈ 7 days) and are reported in per mill relative to the Vienna Standard Mean Ocean Water (VSMOW) (Craig, 1961).
Literature and measured values for soil and plant parameters (Tables 1 and 2) were used to derive the initial values for the calibration process.The initial states for calibration were retrieved by artificially running the model with those initial values for the first 2 years of the available data set (Table 3).The results of this pre-calibration run were used as a starting point for all following calibration runs.A warm-up period of 4 months (1 July-31 October 2010) preceded the calibration period (1 November 2010-31 October 2011) to adjust the model to the new parameter set.To simulate a wide range of possible flow conditions and limit the degrees of freedom for the possible model realizations, we selected K sat and porosity for calibration, while the van Genuchten-Mualem parameters remained constant since measured pF curves were available.Even though not all sensitive parameters of the Richards equation controlling the flow regime were accounted for during the calibration process, we assume that the measured van Genuchten-Mualem parameters alpha and n are in the good agreement with the actual flow characteristics of the soils.As is typical for the application of the van Genuchten-Mualem approach, the tortuosity/connectivity coefficient remained constant throughout all model runs with a value of 0.5.Beside the four soil parameters shown in Table 1 and the upper and lower boundary conditions, only the nine parameters of the Shuttleworth-Wallace equation (Table 2) had to be set prior to each model run.To further control the unknown lower boundary condition and complement the calibration process, the suction induced by groundwater depth was changed for each calibration run.
To increase the efficiency of the calibration runs and evenly explore the given parameter space, we used the Latin Hypercube method presented by McKay et al. (1979).The parameter range of each variable was therefore subdivided into 10 strata and sampled once using uniform distribution.All strata are then randomly matched to get the final parameter sets.A total of 10 5 parameter sets were generated for calibration with varying values for K sat and porosity for all eight soil types as well as different groundwater depths.An initial trial using 10 4 parameter sets was used to narrow down the parameter range as specified in and porosity for all eight soil types and to 0 to 100 m for the applicable groundwater depths.The performance of each parameter set was evaluated based on the goodness-of-fit criteria Nash-Sutcliffe efficiency (NSE) and the coefficient of determination (R 2 ).In addition, the bias was calculated as an indicator for any systematic or structural deviation of the model.
After the calibration the best-performing ("behavioral") models according to a NSE > 0.15, an overall bias < ±20.0 ‰ δ 2 H and a coefficient of determination R 2 > 0.65 were used for the validation period (Table 3) using the final states of the calibration period as initial values.

Model performance
In order to quantify the flow processes, we first validated the overall suitability of the chosen model approach and the performance of the parameter sets.The parameter sets best representing the isotope dynamics of δ 2 H (as previously defined as best-performing (behavioral) parameter sets; same accounts for δ 18 O; results are not shown) during the calibration period explained the observed variation to an even higher  The linear correlation between modeled and observed isotope dynamics of δ 2 H, for the best-performing parameter sets, were equally good during the calibration and validation period (R 2 ≈ 0.66) (Table 5).The goodness-of-fit criteria for the single best performing parameter set ("best model fit") show an R 2 of 0.84 and a NSE of 0.42.
Figure 3 depicts the measured and modeled temporal development of the soil water isotope profile along the studied hillslope as well as the δ 2 H signature and amount of the incoming rainfall used to drive the model.The measured temporal delay of the incoming signal with depth and the general seasonal pattern of the δ 2 H signal are captured by the model (Fig. 3).
The bias was negative throughout all model realizations during calibration and validation (−15.90 (±0.11SD) ‰ δ 2 H and −16.93 (±0.34 SD) ‰ δ 2 H, respectively; see Table 5).Even though the high bias indicates a structural insufficiency of the model, we are confident that this can be mostly attributed to the discrimination of evaporation processes at the soil-atmosphere interface and on the canopy.
Our first hypothesis -that evaporation in general plays only a minor role for the soil water isotope cycle under full vegetation -therefore needs to be reconsidered.Even though hypothesis I has previously been frequently used as an untested assumption for various models (e.g., Vogel et al., 2010;Dohnal et al., 2012), it is rarely scrutinized under natural conditions.A complete rejection of this hypothesis could therefore affect the interpretations in those studies and limit their applicability.However, further studies are needed to support these findings and before finally rejecting this hypothesis.The lateral mixing processes may be obscuring the observed near-surface enrichment, and the effect of preferential flow currently not fully accounted for could further hinder the full interpretation of these findings.It still holds true that -the quantitative loss due to surface evaporation in areas with a high leaf area index is more or less insignificant (accounting for 38 mm a −1 out of 1896 mm a −1 ; ≈ 2 %; Fig. 5); -the isotopic enrichment due to evaporation for vegetated areas is considerably lower than for non-vegetated areas, as previously shown by Dubbert et al. (2013); -high rainfall intensity constrains any near-surface isotopic enrichment related to evaporation (Hsieh et al., 1998).
However, our results indicate that the contribution of potential canopy evaporation (accounting for 344 mm a −1 out of 1896 mm a −1 ; ≈ 18 %; Fig. 5) to enrich the canopy storage and thereby potential throughfall (discriminating 18 O and 2 H resulting in more positive isotope signatures) still could partially explain the observed bias.
Nevertheless we presume that fog drip, created by sieving passing clouds or radiation fog frequently occurring in the study area (Bendix et al., 2008), explains the majority of the observed bias.Depending on the climatic processes, generating the fog drip is typically isotopically enriched compared to rainfall, due to different condensation temperatures (Scholl et al., 2009).To get an impression of the magnitude of the possible bias due to throughfall and fog drip compared to direct rainfall, we compare the observed bias with a study presented by Liu et al. (2007) conducted in a tropical seasonal rain forest in China.They observed an average enrichment of +5.5 ‰ δ 2 H for throughfall and +45.3 ‰ δ 2 H for fog drip compared to rainfall.Even though the observed enrichment of fog drip and throughfall by Liu et al. (2007) may not be as pronounced within our study area (Goller et al., 2005), the general tendency could explain the modeled bias.According to Bendix et al. (2008) fog and cloud water deposition within our study area contributes 121 to 210 mm a −1 at the respective elevation.Assessing the actual amount fog drip for grass species like Setaria sphacelata under natural conditions is challenging and has so far not been accounted for.
If further discrimination below the surface were to substantially alter the isotope signature, the bias would change continuously with depth.Any subsurface flow reaching wick samplers at lower elevations would then further increase the bias.However, the negative bias of −16.19 (±2.80 SD) ‰ δ 2 H in all monitored top wick samplers during validation accounts for most of the observed bias in the two deeper wick samplers amounting to −17.32 (±2.47 SD) ‰ δ 2 H. Thus we conclude that the bias is mainly a result of constrains related to modeling surface processes, rather than subsurface ones.
Figure 4 shows the behavior of the chosen parameter sets for saturated hydraulic conductivity and groundwater depth during calibration and validation.The parameter space allows us to assess the range of suitable parameters and their sensitivity over a given parameter range.During calibration the given parameter space could not be constrained to more precise values for all parameters, which in this case should show a lower SD (Table 6) and narrower box plots (Fig. 4).Especially the K sat values of the soil layers A1, A3 and B1-B3; the porosity for all soil layers (not included in Fig. 4); and the groundwater depth depict a low sensitive over the entire calibration range (indicated by a high SD, wide box plot and evenly scattered points; Table 6 and Fig. 4).In particular the low sensitivity of the model towards groundwater depth seems surprising, but it can be explained by the potentially low saturated hydraulic conductivities of the lower soil layers C1 and C2 limiting the percolation into the lower soil layers outside of the modeling domain.Even an extreme hydraulic potential, induced by a deep groundwater body, can be limited by a low hydraulic conductivity.Nonetheless it is noteworthy that no model run without an active groundwater body as a lower boundary condition (groundwater depth < 2 m) results in a model performance with NSE > 0 (Fig. 4).With a groundwater depth above 2 m the boundary condition would serve as a source of water with an undefined isotopic signal and prevent any percolation of water into deeper soil layers outside of the modeling domain.The results are therefore in alignment with the topography of the system, indicating an active groundwater body deeper than 2 m, and support our second hypothesis, which we will further discuss in Sect.3.2.We identified several parameter combinations showing the same model performance, known as equifinality according to Beven and Freer (2001).The observed equifinality can partially be explained by counteracting effects of a decreasing K sat and an increasing pore space, or by the water flow being restrained due to lower hydraulic conductivities at adjoining soil layers.Especially for deeper soil layers the interaction between surrounding layers makes it especially difficult to further constrain the given parameter range.Even though the parameter ranges for all behavioral model realizations are not so well confined, the small confidence intervals indicate a certain degree of robustness towards the predicted flows (Fig. 3).Additional soil moisture  measurements complementing the current setup in the future will allow us to put further confidence in this new approach and the drawn conclusions and allow us to directly compare different calibration targets (i.e., soil moisture vs. soil water isotopic signature).
Initial K sat values based on literature values (see Table 1) deviate to a large extent from those derived through the calibration process.This is attributable to the occurrence of preferential flow within the macro-pores (Bronstert and Plate, 1997) and the sampling method (PCaps) used to extract the soil water stored in the soil with a matrix potential up to 30 hPa (Landon et al., 1999).It becomes apparent that the mixing processes (based on dispersion and molecular diffusion) are not sufficient to equilibrate the isotope signature over the entire pore space (Landon et al., 1999;Šimůnek et al., 2003) and that the flow through the pore space is not homogenous.Thus the isotopic signature between the sampled pore media and the total modeled pore space differs (Brooks et al., 2010;Hrachowitz et al., 2013;McDonnell and Beven, 2014;McGlynn et al., 2002).The model tries to account for these effects by favoring high K sat values during calibration (McDonnell and Beven, 2014;McGlynn et al., 2002).
Modeling soil water movement under such conditions should therefore be used with caution for models based on the Darcy-Richards equation, which assumes instantaneously homogeneous mixed solutions and uniform flow.In line with the argumentation started by Beven and Germann (1982) and refreshed in their recent paper (Beven and Germann, 2013), we therefore stress the importance of accounting for preferential flow processes and overcoming the limitation of the Darcy-Richards equation limiting the explanatory power of hydrological models predicting water flow and solute/isotope transport in particular.Like Gerke (2006) and Šimůnek and van Genuchten (2008), among others, we therefore seek to implement a dualpermeability approach accounting for different flow patterns within the soil pore space (Gerke, 2006;Jarvis, 2007;Šimůnek and van Genuchten, 2008;Vogel et al., 2000Vogel et al., , 2006Vogel et al., , 2010)).In the style of existing 1-D models for soil water isotope transport presented by Braud et al. (2005) and Haverd and Cuntz (2010), the inter-soil mixing processes by dispersion and molecular diffusion between different soil pore space compartments shall be accounted for in the future.Based on the presented findings this can now be extended towards the development and application of soil water isotope models under natural conditions.To conclude, the results highlight the general suitability of high-resolution soil water isotope profiles to improve our understanding of subsurface water flux separation implemented in current hillslope model applications and to predict subsurface soil water movement.

Modeled water fluxes
Acknowledging the general suitability of the model to delineate the prevailing flow patterns, we will now compare those to the current hydrological process understanding presented in the Introduction.Figure 5 depicts the water balance of the modeled hillslope based on all behavioral model realizations, separating the amount of incoming precipitation into the main flow components: surface runoff and subsurface flow directly entering the stream, percolation to groundwater and evapotranspiration.
Evapotranspiration is further subdivided into transpiration and evaporation from the soil surface and the canopy, whereby evaporation from the canopy is designated as interception losses.Due to the small confidence intervals of the behavioral model runs (see Fig. 3) the standard deviations of the model's flow components are relatively small (see Fig. 5; standard deviation and mean value were computed without weighting the likelihood value).
The observed order of magnitude for evapotranspiration is in good agreement with previous values of 945 and 876 mm a −1 reported for tropical grasslands by Windhorst et al. (2013a) and Oke (1987), respectively.As previously mentioned, the evaporation of 382 mm a −1 is dominated by interception losses accounting for 344 mm a −1 .Overall, these results support hypothesis II, which stated that a large share of the incoming precipitation is routed through the deeper soil layer and/or the groundwater body (here 49.7 % or 942 mm a −1 ) before it enters the stream.This also explains the long mean transit time of water of around 1.0 to 3.9 years (Crespo et al., 2012;Timbe et al., 2014)  current process understanding and hypothesis III, we can further show that the occurrence of surface runoff (33 mm a −1 ) due to Hortonian overland flow is less important.For the graphical representation the surface runoff has therefore been combined with subsurface flow (2 mm a −1 ) to "surface runoff and subsurface flow", accounting in total for 35 mm a −1 (see Fig. 5).A more heterogeneous picture can be depicted if we take a closer look at the flow processes along the studied hillslope and its soil profiles (Fig. 6).
Vertical fluxes still dominate the flow of water (Fig. 6b), but the near-surface lateral flow components predicted by Bücker et al. (2010) become more evident (Fig. 6a).Explained by the high saturated hydraulic conductivities in the top soil layers (Table 6 and Fig. 4) up to 7.3 × 10 3 m 3 a −1 are transported laterally between cells in the top soil layer, referring to 15.6 % of the total flow leaving the system per year.According to the model results, deep lateral flow is minimal, accounting only for < 0.1 % of the total flow.It only occurs on top of the deeper soil horizons with low K sat values.For all behavioral model realizations the groundwater level was > 2 m, thereby limiting the direct contribution of subsurface flow (2 mm a −1 ) to the tributary, which had a hydraulic potential of only 1.5 m.Over the entire hillslope the importance of overland flow remains below 3 % (≈ 50 mm a −1 ), of which a part is re-infiltrating, summing up to total overland flow losses of around 2 % at the hillslope scale (35 mm a −1 , Fig. 5).These results demonstrate the importance of nearsurface lateral flow and hence support hypothesis IV.

Conclusions
These data and findings support and complement the existing process understanding mainly gained by Goller et al. (2005), Fleischbein et al. (2006), Boy et al. (2008), Bücker et al. (2010), Crespo et al. (2012) and Timbe et al. (2014) to a large extent.Moreover, it was possible to quantify for the first time the relevance of near-surface lateral flow generation.The observed dominance of vertical percolation into the groundwater body, and thereby the importance of preferential flow seems to be quite common for humid tropical montane regions and has recently been reported by Muñoz-Villers and McDonnell (2012) in a similar environment.
Being aware of the rapid rainfall-runoff response of streams within the catchment of the Rio San Francisco, it has been questioned whether and how the system can store water for several years and still release it within minutes.Throughout the last decades several studies have observed similar hydrological behavior especially for steep humid montane regions (e.g., McDonnell, 1990;Muñoz-Villers and McDonnell, 2012), and concepts have been developed to explain this behavior: e.g., piston flow (McDonnell, 1990), kinematic waves (Lighthill and Whitham, 1955) and transmissivity feedback (Kendall et al., 1999).Due to the limited depth of observations (max depth 0.4 m) and the low overall influence of the lateral flows, a more exact evaluation of the fate of the percolated water is still not possible.However, we are confident that in combination with a suitable concept to account for the rapid mobilization of the percolated water into a tributary and experimental findings, further confining possible model realizations, an improved version of the current approach could further close the gap in our current process understanding.
Over decades hydrological models which are based on the Richards or Darcy equation (like the one we used) have been tuned to predict quantitative flow processes and mostly been validated using soil moisture data suitable to account for overall storage changes.Our results imply that doing this considerably well does not necessarily mean that the models actually transport the right water at the right time.Using tracer data to validate models as we did entails that those models now not only have to transport the correct amount but additionally the right water.Consequently, the relevance of the correct representation of uneven preferential flow through pipes or macropores, which is misleadingly compensated by high conductivities over the entire pore space within models based on the Richards or Darcy equation, becomes immense.Distinguishing between water flowing in different compartments (e.g., pipes, cracks and macro-pores) of the soil is a key task to get a closer and more precise representation of the natural flow processes.Even though the chosen modeling structure currently lacks a sufficient robustness to be widely applicable, it highlights the potential and future research directions for soil water isotope modeling.

Figure 1 .
Figure 1.(a) Outline of the modeled hillslope and its virtual discretization into cells.(b) Location of the study area within Ecuador.(c) Photograph showing the location of the wick samplers (P represents pasture, B represents bajo/lower level, M represents medio/middle level, A represents alto/top level sampler).
during the last decades and are now partially used for extensive pasture (Setaria sphacelata Schumach.);reforestation sites (Pinus patula) are covered by shrubs or invasive weeds (especially tropical bracken fern; Pteridium aquilinum L.).The climate exhibits a strong altitudinal gradient, creating relatively low temperatures and high rainfall amounts (15.3 • C and 2000 mm a −1 at 1960 m a.s.l. to 9.5 • C and > 6000 mm a −1 at 3180 m a.s.l.

Figure 2 .
Figure 2. Elevation profile (top black line, left ordinate), succession of soil layer types (color plate) and soil depths assigned to the modeling grid (right ordinate).

Figure 4 .
Figure 4. Dotty plots of NSE values (> 0.0) during calibration (blue) and for behavioral model runs (NSE > 0.15, bias < ±20.0 ‰ δ 2 H and R 2 > 0.65) during calibration (orange) and validation (red) for saturated hydraulic conductivity (K sat ) for all soil types and groundwater depth.Box plots show the unweighted parameter distribution of all behavioral model runs (NSE > 0.15, bias < ±20.0 ‰ δ 2 H and R 2 > 0.65).Results for soil porosity look similar to those of the groundwater and are therefore not shown.

Figure 5 .
Figure 5. Mean annual flows and standard derivation (SD) of the main flow components at a hillslope scale of all behavioral model runs from 2010 to 2012.
in comparison to the fast runoff reaction time.Well in agreement with our D. Windhorst et al.: Stable water isotope tracing through hydrological models

Figure 6 .
Figure 6.(a) Lateral and (b) vertical fluxes for the best modeled fit.Arrows indicate the amount of surface runoff and direct contribution to the outlet through subsurface flow.The maximum flow between storage compartments is 7.3 × 10 3 m 3 a −1 , and the total observed flow leaving as well as entering the system accumulates to 37 × 10 3 m 3 a −1 .

Table 1 .
Soil physical parameters.K sat values are based on values taken within the proximity of the hillslope under similar land use by Crespo et al. ( *

Table 4 .
Table 4 for K sat Soil parameter ranges for the Monte Carlo simulations (assuming uniform distribution for each parameter).

Table 5 .
Model performance during calibration and validation for all behavioral model runs (based on all calibration runs with NSE > 0.15, bias < ±20.0 ‰ δ 2 H and R 2 > 0.65).Best modeled fit based on NSE.

Table 6 .
Parameter ranges used for validation (all calibration runs with NSE > 0.15, bias < ±20.0 ‰ δ 2 H and R 2 > 0.65) and parameter set for the best modeled fit based on NSE.