Tandem use of transit time distribution and fraction of young water reveals the dynamic flow paths supporting streamflow at a mountain headwater catchment

Abstract. Current understanding of the dynamic flow paths and subsurface water storages that support streamflow in mountain catchments is inhibited by the lack of long-term hydrologic data and the frequent use of single age tracers that are not applicable to older groundwater reservoirs. To address this, the current study used multiple metrics and tracers to characterize the transient nature of flow paths with respect to change in catchment storage at Marshall Gulch, a sub-humid headwater catchment in the Santa Catalina Mountains, Arizona, USA. The fraction of streamflow that was untraceable using stable water isotope tracers was also estimated. A Gamma-type transit time distribution (TTD) was appropriate for deep groundwater analysis, but there were errors in the TTD shape parameters arising from the short record length of 3H in deep groundwater and stream water, and inconsistent seasonal cyclicity of the precipitation 3H time series data. Overall, the mean transit time calculated from 3H data was more than two decades greater than the mean transit time based on δ18O at the same site. The fraction of young water (Fyw) in shallow groundwater was estimated from δ18O time series data using weighted wavelet transform (WWT), iteratively re-weighted least squares (IRLS), and TTD-based methods. Estimates of Fyw depended on sampling frequency, the method of estimation, bedrock geology, hydroclimate, and factors affecting streamflow generation processes. The coupled use of Fyw and discharge sensitivity indicated highly dynamic flow paths that reorganized with changes in shallow catchment storage. The utility of 3H for determining Fyw in deeper groundwater was limited by data quality. Given that Fyw, discharge sensitivity, and mean transit time all yield unique information, this work demonstrates how co-application of multiple methods can yield a more complete understanding of the transient flow paths and observable storage volumes that contribute to streamflow in mountain headwater catchments.


Sustainable Development Goal (SDG) #15 of the 2030 Agenda for Sustainable Development (United Nations, 2015) specifically lists this shortcoming as a major hurdle to attaining sustainable development (United Nations, 2021; Creed and Noordwijk, 2018). Therefore, additional research is needed to thoroughly characterize dynamic relationships between storage, flow paths, and streamflow in the mountains.
Dual tracer approaches using stable water isotopes and tritium (3H) are recommended to determine the contribution of deep groundwater to streamflow in mountain sites characterized by fractured bedrock aquifers (Stewart et al., 2010; Stewart et al., 2012). This recommendation originates from previous work that shows how mean transit time (mTT) estimates based on stable water isotopes alone may be underestimated because the tails of the transit time distributions (TTDs) that correspond to longer transit times can become truncated (Stewart et al., 2010; Stewart et al., 2012; Frisbee et al., 2013; DeWalle et al., 1997), and additionally takes into consideration that certain model performance criteria (e.g., Nash-Sutcliffe Efficiency) are insensitive to longer transit times (Seeger and Weiler, 2014). Underestimated transit times can have cascading impacts on subsurface weathering rates, leading to incorrect understanding of stream water chemistry (Frisbee et al., 2013; Clow et al., 2018). As a result, the current study leverages the use of 3H as a tracer with longer-period variations as a means to more completely characterize deeper flow paths that contribute to streamflow.
https://doi.org/10.5194/hess-2021-355 Preprint. Discussion started: 8 July 2021. © Author(s) 2021. CC BY 4.0 License.
Using a virtual or synthetic experimental model setup and stable water isotope data, Kirchner (2016a, 2016b) identified significant aggregation errors in mTT estimates for heterogeneous catchments. As an alternative, Kirchner (2016a, 2016b) proposed the fraction of young water metric (Fyw), i.e., the fraction of water that has resided in a catchment for less than a threshold period of time, that is largely insensitive to spatial and temporal aggregation errors when evaluated for annual tracer cycles in inflow and outflow. Using a similar virtual experimental model setup but with 3H as a tracer, Stewart et al. (2017) also noted spatial aggregation errors in mTT but not Fyw. Together, these results suggest that catchment storage estimates based on Fyw should be robust. Importantly, these and other Fyw-based studies (Jasechko et al., 2016; Table S6; Table 2) considered annual or seasonal cycles in stable water isotope data or only one period when using only tritium (Stewart et al., 2017) or both tracers (Rodriguez et al., 2021); the authors know of no previous studies that have considered both 3H and δ18O tracers and multiple periods.
When Fyw is related to the discharge flux, it contributes to a more thorough understanding of groundwater flow path dynamics and thus water quantity and quality in headwater catchments. Accordingly, the resulting discharge sensitivity of Fyw can be considered the "diagnostic finger print" for streamflow generation processes. In this way, von Freyberg et al. (2018) distinguished between sites in terms of dominant flowpaths and flowpath changes during large and small storms, and suggested evaluation of their framework in different climatic and geological settings. Gallart et al. (2020b) applied an alternative formulation for discharge sensitivity to a sub-humid Mediterranean field site, again concluding that flow paths responded dynamically to precipitation flux. Similar conclusions resulted from studies of alluvial and lower-elevation mountain-block aquifers in Arizona, USA (Eastoe and Wright, 2019; Eastoe and Towne, 2018) where recharge only occurred during the wettest ~30% of months. In this study, Fyw is considered along with discharge sensitivity to better understand the degree to which catchment storage and flow paths are interrelated.
The current study addresses the following research questions at a high-elevation, sub-humid mountain site: (i) What is the appropriate TTD type and mTT for the deep groundwater system that supports streamflow? (ii) What are the Fyw and resulting Fyw-based catchment storage estimates calculated from age tracers applicable to younger and older groundwater, i.e., stable water isotope and 3H time series data, respectively? (iii) What is the discharge sensitivity of Fyw as determined by stable water isotope tracers? Following a description of the field site and data, we describe theoretical models and estimation methods for 3H-based TTD and mTT estimation and Fyw (using both stable water isotopes and 3H) and its discharge sensitivity (stable water isotopes only). We discuss potential reasons for bias in the Fyw literature toward both stable isotope tracers and annual cyclic variations and broaden the Fyw approach by using both stable water isotope and 3H tracers simultaneously. The present study complements Dwivedi et al. (2021), who estimated TTD and mTT for shallow groundwater storages at the same study site using gamma distributions fitted to stable water isotope time series data in precipitation and streamflow.

[...] (Dickinson et al., 2002). The prevailing soil type is sandy loam (Holleran, 2013b) with soil depth varying from 0 m to 1.5 m (Pelletier and Rasmussen, 2009). Soils overlying micaceous schist are generally deeper and have a higher clay content than soils overlying granite (Heidbüchel et al., 2013; Holleran, 2013a). Based on a 30-year (1981-2010) record, the long-term average annual precipitation at MGC is 920 mm (PRISM Climate Group, 2018). The catchment received an average of 654 mm (±158 mm) of precipitation per year from water year (WY) 2008 through WY 2017; the mean annual streamflow for the same period was 247 mm (±138 mm). WY n is defined here as the period from July 1 of year n-1 through June 30 of year n.
Instrumentation relevant to this study within and around the field site is shown in Figure 1.
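As an illustration of the water-year convention defined above, the following minimal Python sketch maps a calendar date to its WY. The function name `water_year` is hypothetical and is not taken from the original study.

```python
from datetime import date


def water_year(d: date) -> int:
    """Return the water year (WY) of a date.

    WY n runs from July 1 of year n-1 through June 30 of year n,
    following the definition given in the site description.
    """
    # Dates from July onward belong to the next calendar year's WY.
    return d.year + 1 if d.month >= 7 else d.year


if __name__ == "__main__":
    print(water_year(date(2007, 7, 1)))   # first day of WY 2008
    print(water_year(date(2008, 6, 30)))  # last day of WY 2008
```

Under this convention, the WY 2008 through WY 2017 study period spans July 1, 2007 through June 30, 2017.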

Hydrologic fluxes
The MGC-scale daily precipitation (P) and streamflow (Q) data were calculated between WY 2008 and 2017 (Figure 2A; Dwivedi et al., 2019a; Dwivedi et al., 2020) [...] intervals at eight measurement sites equipped with tipping bucket precipitation gages at seven locations and a heated precipitation gage at the remaining site (Figure 1). From the precipitation time series, Thiessen polygon-derived weights were used to estimate daily catchment-scale mean precipitation (Dwivedi et al., 2019a). Streamflow was measured at 30-minute intervals at the MG-Weir site (Figure 1) using a pressure transducer (U20-001-01; Onset) with maximum error of 0.62 kPa and accuracy of 0.02 kPa and a previously derived stage-discharge relationship (Heidbüchel et al., 2012).
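The Thiessen-weighted catchment mean described above amounts to an area-weighted average of the gage records. A minimal sketch follows. The weights in the usage example are hypothetical area fractions, and renormalizing the weights when a gage has no record is an assumption, since the text does not say how gaps were handled.

```python
def catchment_mean_precip(gauge_mm, weights):
    """Area-weighted mean precipitation (mm) from gage totals.

    gauge_mm : daily totals per gage; None marks a missing record.
    weights  : Thiessen polygon area fractions, one per gage.
    Weights are renormalized over the gages that reported (assumption).
    """
    pairs = [(p, w) for p, w in zip(gauge_mm, weights) if p is not None]
    total_weight = sum(w for _, w in pairs)
    if total_weight == 0:
        raise ValueError("no valid gage records for this day")
    return sum(p * w for p, w in pairs) / total_weight


if __name__ == "__main__":
    # Hypothetical two-gage example with area fractions 0.25 and 0.75.
    print(catchment_mean_precip([10.0, 20.0], [0.25, 0.75]))
```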

Precipitation
The precipitation bulk samples at MGC were collected using bulk samplers at the Schist, Fern Valley, and Granite stations and using ISCO autosamplers at the MG-Weir and Mt. Lemmon stations (Figure 1). At the Fern Valley, Granite, and Schist stations, two collectors were installed at each station and samples were collected every 5 to 7 days (Heidbüchel et al., 2012; Lyon et al., 2009). At the Mt. Lemmon and MG-Weir stations, daily bulk precipitation samples were collected. At the Mt. Lemmon station, sampling mainly focused on summer monsoons (Heidbüchel et al., 2012).

Streamflow
Stream water samples were collected using an autosampler installed at the MG-Weir site prior to 2012 and by grab sampling after 2012 (Figure 2B). While the stream water autosampler collected daily samples, sub-daily samples were also collected on the rising and falling limbs of the hydrograph during large runoff events (Heidbüchel et al., 2012). In the current study, sub-daily samples are volume-weighted to daily resolution (Dwivedi et al., 2021).

Tritium in precipitation, streamflow and deep groundwater
We calculated the amount-weighted time series of 3H in Tucson precipitation since 1992 following (i) Eastoe et al. [...] right inset). Additionally, the five discharge measurements from Pigeon Spring (Figure 1) were considered representative of deep groundwater (Dwivedi et al., 2019b). Data are grouped into half-yearly brackets using the following criteria: (i) sampling months 6 to 10 of year n, and (ii) sampling months 11 of year n-1 to 5 of year n. For groups with three or more measurements, the data are expressed as a mean ± 1σ (Fig. 3 inset).

Stable water isotope-based TTD and mTT estimates
Dwivedi et al. (2021) proposed an improved practical approach for estimating the TTD of stream water using long-term measurements of hydrologic fluxes and δ18O in precipitation and streamflow. Evaluating multiple TTD types and using the weighted wavelet spectral analysis method of Kirchner and Neal (2013), they determined that a combined Piston Flow and Gamma TTD was applicable for periods of up to one month, and that a Gamma TTD was applicable thereafter. The resulting Gamma TTD shape parameter (α) was 0.42±0.001 (dimensionless) and the mTT was 0.82±0.03 years.
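For reference, the reported Gamma TTD parameters can be turned into a cumulative age distribution. The sketch below is illustrative only: it assumes a pure Gamma TTD (ignoring the piston-flow component at periods below one month), takes the scale parameter as β = mTT/α, and evaluates the regularized lower incomplete Gamma function with a simple series instead of a library routine. The function name `gamma_cdf` is hypothetical.

```python
import math


def gamma_cdf(tau, alpha, beta):
    """Fraction of water younger than tau (years) for a Gamma TTD.

    alpha : shape parameter (dimensionless)
    beta  : scale parameter (years), here mTT / alpha
    Uses the power series of the lower incomplete Gamma function.
    """
    if tau <= 0.0:
        return 0.0
    x = tau / beta
    term = total = 1.0 / alpha
    n = 1
    while term > 1e-16 * total:
        term *= x / (alpha + n)
        total += term
        n += 1
    return total * math.exp(alpha * math.log(x) - x - math.lgamma(alpha))


if __name__ == "__main__":
    alpha, mtt = 0.42, 0.82      # values reported by Dwivedi et al. (2021)
    beta = mtt / alpha           # scale parameter in years
    print(gamma_cdf(1.0 / 12.0, alpha, beta))  # share younger than ~1 month
```

With a shape parameter below 1, the distribution places substantial weight on very short transit times while retaining a long tail, which is why the cumulative fraction rises steeply near zero age.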

Tritium-based TTD and mTT estimation
Previous work determined that deep groundwater at MGC was recharged at a time scale of 0.5 years or less on the basis of end-member mixing analysis (Ajami et al., 2011; Dwivedi et al., 2019b). Here, we use time series 3H data to expand on these results. Given that these data are only available at seasonal [...]

$$C_Q(t) = \int_0^{\infty} C_{Recharge}(t-\tau)\, h(\tau)\, e^{-\ln(2)\,\tau / t_{1/2}}\, d\tau \qquad (1)$$

where CRecharge(t) and CQ(t) are 3H concentration in recharge and stream water at time t, τ is the transit time in years, h(τ) is the TTD, t1/2 is the 3H half-life, and "e" is the exponential function. Since the decay in 3H concentration during recharge is insignificant in relation to the precision of the analysis (± 0.5 TU), the precipitation 3H concentration, i.e., the input function in Figure 3, is used as CRecharge(t). Following Maloszewski and Zuber (1993), only TTD types that require at most two fitting parameters were evaluated. Thus, parallel exponential models (Seeger and Weiler, 2014; Hrachowitz et al., 2009) and exponential piston flow models (Georgek et al., 2017) are excluded here. The specific TTD types evaluated by the current study were Piston Flow (PF), Exponential (Exp), Gamma (Gam), Fixed path one-dimensional advection dispersion (ADE-1x), and Multiple path one-dimensional advection dispersion (ADE-nx) (Dwivedi et al., 2021).
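A discrete version of the convolution of the recharge input with a TTD and radioactive decay can help make the lumped-parameter calculation concrete. The following sketch is not the authors' implementation: it assumes a uniformly sampled recharge series, a Gamma TTD, a midpoint-rule approximation of the integral, and a 3H half-life of 12.32 years (a commonly cited value that is not stated in the text). All function and variable names are hypothetical.

```python
import math

HALF_LIFE_3H = 12.32  # years; assumed literature value, not from the source


def gamma_ttd(tau, alpha, mtt):
    """Gamma transit time density h(tau) with shape alpha and mean mtt."""
    beta = mtt / alpha  # scale parameter
    log_h = ((alpha - 1.0) * math.log(tau) - tau / beta
             - alpha * math.log(beta) - math.lgamma(alpha))
    return math.exp(log_h)


def stream_tritium(recharge, dt, alpha, mtt, half_life=HALF_LIFE_3H):
    """Output concentration at the end of the record.

    recharge : input concentrations (TU) at uniform step dt (years),
               ordered oldest first, so recharge[-1] is the latest step.
    The sum is truncated at the start of the record (assumption).
    """
    decay = math.log(2.0) / half_life
    total = 0.0
    for k in range(len(recharge)):
        tau = (k + 0.5) * dt  # midpoint of the k-th age bin
        weight = gamma_ttd(tau, alpha, mtt) * math.exp(-decay * tau) * dt
        total += recharge[-1 - k] * weight
    return total


if __name__ == "__main__":
    # Hypothetical constant input of 5 TU over 400 years, exponential TTD.
    print(stream_tritium([5.0] * 8000, 0.05, alpha=1.0, mtt=5.0))
```

For a constant input C0 and an exponential TTD (α = 1), the integral has the closed form C0 / (1 + mTT ln 2 / t1/2), which provides a convenient check on the discretization. Shape parameters below 1 make the density singular at zero age, so a finer step or an analytic treatment of the first bin would be needed in that case.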

Optimization of model parameters using the Downhill simplex method in conjunction with a performance criterion
The Downhill Simplex method (Nelder and Mead, 1965; Gupta, 2016) was used to evaluate the performance of each TTD (Dwivedi et al., 2021). The modified Kling-Gupta efficiency or KGE' (Gupta et al., 2009; Kling et al., 2012) was used as the model performance criterion: [...]

In Equation (2) [...] Average catchment-scale Péclet number (Pe): 0.1 to 100 (Kirchner et al., 2001; Kirchner and Neal, 2013).
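Because the performance criterion drives the parameter optimization, a sketch of the modified Kling-Gupta efficiency in its commonly published form (Kling et al., 2012) may be useful. This is an assumption about the form used, and the study's exact variant may differ. The bias ratio and variability ratio below (named `beta_ratio` and `gamma_ratio`) are unrelated to the Gamma TTD parameters, and the function name `kge_prime` is hypothetical.

```python
import math


def kge_prime(sim, obs):
    """Return (KGE', distance), where distance = 1 - KGE'.

    r           : linear correlation between simulated and observed
    beta_ratio  : ratio of means (bias)
    gamma_ratio : ratio of coefficients of variation (variability)
    Assumes non-constant series with non-zero means.
    """
    n = len(obs)
    mu_s, mu_o = sum(sim) / n, sum(obs) / n
    sd_s = math.sqrt(sum((s - mu_s) ** 2 for s in sim) / n)
    sd_o = math.sqrt(sum((o - mu_o) ** 2 for o in obs) / n)
    cov = sum((s - mu_s) * (o - mu_o) for s, o in zip(sim, obs)) / n
    r = cov / (sd_s * sd_o)
    beta_ratio = mu_s / mu_o
    gamma_ratio = (sd_s / mu_s) / (sd_o / mu_o)
    distance = math.sqrt((r - 1.0) ** 2 + (beta_ratio - 1.0) ** 2
                         + (gamma_ratio - 1.0) ** 2)
    return 1.0 - distance, distance


if __name__ == "__main__":
    print(kge_prime([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
```

The distance term ranges from 0 for a perfect fit upward without bound, so minimizing it with a downhill simplex search is equivalent to maximizing KGE'.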

Mathematical development of the flux-weighted (F*yw) and unweighted (Fyw) fraction of young water
The fraction of young water (Fyw) can be estimated from the amplitude ratio of tracer concentrations in outflow and inflow for any tracer (Kirchner, 2016b; von Freyberg et al., 2018). Thus, if the amplitudes of the tracer concentrations in outflow and inflow for any period λ are AQ(λ) and AP(λ), respectively, then:

$$F_{yw} = \frac{A_Q(\lambda)}{A_P(\lambda)} \qquad (3)$$

This method is preferred when sufficient long-term tracer data are available and can be applied without a priori knowledge of the TTD type (Kirchner, 2016b). If the TTD type and its parameters are known, then Fyw can also be estimated (Kirchner, 2016b; von Freyberg et al., 2018; Stewart et al., 2017):

$$F_{yw} = \int_0^{T_{yw}} h(\tau)\, d\tau = \int_0^{T_{yw}} \frac{\tau^{\alpha-1}}{\beta^{\alpha}\,\Gamma(\alpha)}\, e^{-\tau/\beta}\, d\tau \qquad (4)$$

where Tyw is the threshold age for the young water, defined as the upper limit in Equation (4) for which both Equations (3) and (4) provide the equivalent value of Fyw; the second equality applies to a Gamma TTD with shape parameter α and scale parameter β. Note that Tyw depends not only on the periods of sinusoidal cycles in tracer concentrations, but also on the TTD parameters. However, the mathematics in both the flux-weighted and unweighted cases generally remain the same, and we provide only the mathematical derivation for Fyw below, unless otherwise noted. It is assumed that the precipitation tracer flux can be represented by Equation (5), which denotes an input function of sinusoidal type with period λ:

$$P_C(t) = A_P(\lambda)\, \sin\!\left(\frac{2\pi t}{\lambda} - \phi_P(\lambda)\right) + K_P(\lambda) \qquad (5)$$

where AP(λ), ϕP(λ), and KP(λ) are the amplitude, phase angle and a constant for a given period. Similarly, the stream water tracer flux can be represented by Equation (6), which denotes an output function of tracer fluxes with period λ:

$$Q_C(t) = A_Q(\lambda)\, \sin\!\left(\frac{2\pi t}{\lambda} - \phi_Q(\lambda)\right) + K_Q(\lambda) \qquad (6)$$

in which AQ(λ), ϕQ(λ), and KQ(λ) are analogous to the variables of Equation (5). Note that AQ is also used to refer to the 3H tracer cycle in deep groundwater. The input and output tracer fluxes are related through the convolution integral:

$$Q_C(t) = \int_0^{\infty} P_C(t-\tau)\, h(\tau)\, d\tau \qquad (7)$$

where PC(t) and QC(t) are the transient tracer fluxes in precipitation and stream water, respectively, and h(τ) is the transit time distribution or TTD (e.g., Equation 4 for the Gamma TTD). Substitution of Equation (5) into Equation (7) yields:

$$Q_C(t) = \int_0^{\infty} \left[ A_P(\lambda)\, \sin\!\left(\frac{2\pi (t-\tau)}{\lambda} - \phi_P(\lambda)\right) + K_P(\lambda) \right] h(\tau)\, d\tau \qquad (8)$$

Kirchner (2016b) has shown that the amplitude damping for the outflow relative to inflow tracer signals can be estimated from the Fourier transform of h(τ). For example, for a Gamma TTD with shape parameter α and scale parameter β (= mTT/α), its power spectrum can be expressed as the following (Bain, 1982):

$$|H(f)|^2 = \left[ 1 + (2\pi f \beta)^2 \right]^{-\alpha}, \qquad f = 1/\lambda$$

Thus, the amplitude damping for any period can be expressed as:

$$\frac{A_Q(\lambda)}{A_P(\lambda)} = \left[ 1 + \left(\frac{2\pi\beta}{\lambda}\right)^2 \right]^{-\alpha/2} \qquad (11)$$

Equating Equations (3) and (4) then gives the threshold age for young water:

$$T_{yw} = \beta\, \Gamma^{-1}\!\left(\frac{A_Q(\lambda)}{A_P(\lambda)},\ \alpha\right) \qquad (12)$$

where Γ⁻¹ is the standard MATLAB® inverse Gamma function. In section S1, Equation 11 is used to compare the output:input amplitude ratios for various periods with respect to the Exponential TTD, which is a special case of the Gamma TTD. In section S2, the formulation of Tyw (Equation 12) [...]

Multiple methods were utilized to compare F*yw and T*yw estimates from long-term stable water isotope data during the same period. In all cases, estimates of F*yw were obtained using tracer flux data, i.e., as a product of tracer concentration and hydrologic flux. Daily precipitation and streamflow were used to calculate the fraction of precipitation contributing to streamflow where daily precipitation was aggregated to time steps corresponding to the availability of stable water isotope data in precipitation. The iteratively re-weighted least square (IRLS) method was used to estimate F*yw by fitting sinusoidal functions with periods ranging from 2 days to 5 years to tracer flux data (Kirchner, 2016b; von Freyberg et al., 2018).
Additional estimates of F*yw were obtained using the weighted wavelet transform (WWT) method (Kirchner and Neal, 2013) and the TTD method (Equation 11), also with periods of 2 days to 5 years. With respect to δ18O, method- and period-based F*yw estimates were coupled to previously established TTD parameters (Dwivedi et al., 2021), and multi-year average values of F*yw were used to compare between different methods (Stockinger et al., 2019; Gallart et al., 2020a). The T*yw was estimated as a function of period using F*yw estimates obtained from application of each of the three methods.
The temporal variability of δ18O in precipitation was addressed by means of uncertainty analyses of F*yw and T*yw. For F*yw, the temporal variability of δ18O was expressed as three statistics: daily mean, mean + 1 standard deviation (σ), and mean - 1σ calculated for both precipitation (P) and stream water (Q); consideration of all pair combinations resulted in a total of nine scenarios. For each period, the minimum, mean (referred to as the ensemble mean below), maximum, and 1σ of the F*yw results were computed for all nine scenarios. The T*yw process was similar but included additional uncertainty associated with the Gamma TTD parameter such that there were 27 total scenarios for each period.
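The scenario construction described above can be sketched as a Cartesian product of perturbation levels. The sketch assumes that the Gamma TTD parameter is perturbed over the same three levels as the isotope inputs, which yields the stated 27 combinations, and that the reported 1σ is a sample standard deviation. Both are assumptions, and all names are hypothetical.

```python
from itertools import product
from statistics import mean, stdev

# Perturbation levels in multiples of sigma: mean - 1 sigma, mean, mean + 1 sigma.
LEVELS = (-1, 0, 1)


def build_scenarios(n_inputs):
    """All combinations of perturbation levels for n_inputs uncertain inputs."""
    return list(product(LEVELS, repeat=n_inputs))


def perturbed(mean_value, sigma, level):
    """Input value for one scenario level."""
    return mean_value + level * sigma


def summarize(results):
    """Ensemble statistics over the scenario results for one period."""
    return {
        "min": min(results),
        "mean": mean(results),   # the ensemble mean
        "max": max(results),
        "sd": stdev(results),    # sample standard deviation (assumption)
    }


if __name__ == "__main__":
    print(len(build_scenarios(2)))  # P and Q isotope statistics
    print(len(build_scenarios(3)))  # P, Q, and a Gamma TTD parameter
```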

Estimation of Fyw and Tyw using 3H
The 3H [...]

[...] It is important to note here that what has been termed "discharge sensitivity" of Fyw in previous studies is in fact the response of discharge to Fyw. Nonetheless, if the tracer cycle in precipitation has amplitude AP, then Fyw can be expressed as:

$$F_{yw} = \frac{A_Q}{A_P} \qquad (13)$$

Using the expression for tracer cycle amplitude (Equation 13) and an equation similar to Equation (6) above for the sinusoidal tracer concentration cycle, the unweighted tracer cycle in stream water can be expressed as:

$$C_Q(t) = F_{yw}\, A_P\, \sin\!\left(\frac{2\pi t}{\lambda} - \phi_Q\right) + K_Q \qquad (14)$$

By fitting CQ(t) from Equation (14) to the observed tracer time series (e.g., blue points in Figure [...]

Nonetheless, using Equation (15), the tracer concentration in stream water can be expressed as: [...]

Thus, the IRLS method can be used for estimating Fyw discharge sensitivity and its associated uncertainty.
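The sinusoid fitting that underlies the IRLS estimates can be sketched as follows. This is a simplified illustration and not the procedure of Kirchner (2016b) or von Freyberg et al. (2018): it assumes a single fixed period, a Cauchy-type weight function scaled by the median absolute residual, and a fixed number of iterations. All function names are hypothetical.

```python
import math


def solve3(m, v):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for i in range(3):
        p = max(range(i, 3), key=lambda r: abs(a[r][i]))
        a[i], a[p] = a[p], a[i]
        for r in range(i + 1, 3):
            f = a[r][i] / a[i][i]
            for c in range(i, 4):
                a[r][c] -= f * a[i][c]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        x[i] = (a[i][3] - sum(a[i][c] * x[c] for c in range(i + 1, 3))) / a[i][i]
    return x


def fit_sinusoid_irls(t, c, period=1.0, n_iter=20):
    """Fit c(t) = a cos(wt) + b sin(wt) + k; return (amplitude, a, b, k)."""
    omega = 2.0 * math.pi / period
    rows = [(math.cos(omega * ti), math.sin(omega * ti), 1.0) for ti in t]
    w = [1.0] * len(t)
    for _ in range(n_iter):
        # Weighted normal equations for the three coefficients.
        m = [[sum(wi * x[r] * x[s] for wi, x in zip(w, rows)) for s in range(3)]
             for r in range(3)]
        v = [sum(wi * x[r] * ci for wi, x, ci in zip(w, rows, c)) for r in range(3)]
        a, b, k = solve3(m, v)
        resid = [ci - (a * x[0] + b * x[1] + k) for x, ci in zip(rows, c)]
        # Median absolute residual as a robust scale (assumed weighting rule).
        scale = sorted(abs(r) for r in resid)[len(resid) // 2] or 1e-12
        w = [1.0 / (1.0 + (r / (3.0 * scale)) ** 2) for r in resid]
    return math.hypot(a, b), a, b, k


def young_water_fraction(amp_q, amp_p):
    """Fyw as the ratio of streamflow to precipitation cycle amplitudes."""
    return amp_q / amp_p
```

Fitting the same model to precipitation and stream water series gives AP and AQ, whose ratio is the amplitude-based Fyw. The re-weighting step reduces the influence of outlying samples, which ordinary least squares would otherwise allow to distort the fitted amplitude.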
[...] In contrast, three separate Gam TTD runs yielded similar mTTs (mTT ~ 26 years) and α parameters (5.23). Overall, the PF TTD mTT varied between 4 and 33 years with a coefficient of variation of 0.57 (Table 1), and the Gam TTD mTT and α parameters varied between 26 and 30 years (mean mTT = 27 yrs; coefficient of variation = 0.05) and 2.17 to 14.58 (unitless) (mean α = 6.53; coefficient of variation = 0.64), respectively.

Considering F*yw variability due to δ18O variability in precipitation and stream water, the IRLS method yielded more variable results than the WWT or TTD-based methods, particularly for periods less than one year (Figure 6A). Note that both the IRLS and WWT methods are based on sinusoidal curve fitting to the observed tracer data, but the F*yw estimates resultant from the WWT method are less scattered because it involves spectral smoothing of data noise (Dwivedi et al., 2020; Kirchner and Neal, 2013). For a period of 1 year, F*yw estimated from the TTD-based method was higher than the corresponding estimates obtained from the IRLS and WWT methods (Figure 6A). Comparison of ensemble means (due to significant variability in F*yw estimates) for λ = 1 year indicated an F*yw ensemble mean ± 1σ of 34.9 ± 0.5% using the TTD-based method, compared to 11.4 ± 0.7% and 7.9 ± 0.2% for the IRLS and WWT methods, respectively.

Comparison of δ18O-based T*yw estimates
As with F*yw, the T*yw results showed significantly greater variability when estimated using the IRLS method compared to the WWT or TTD-based methods, especially for periods below 0.5 years (Figure 6B). Estimation of T*yw with any method requires a priori knowledge of the TTD parameters (Equation 12) that are identical to those used to calculate F*yw. For λ = 1 year, the ensemble T*yw means ± 1σ were 0.125 ± 0.0058 yrs (TTD), 0.008 ± 0.0013 yrs (IRLS), and 0.004 ± 0.0003 yrs (WWT).
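As a rough consistency check, the TTD-based values for λ = 1 year can be approximated from the reported δ18O-based Gamma parameters (α = 0.42, mTT = 0.82 years). The sketch below assumes the Gamma amplitude-damping expression of Kirchner (2016b) with β = mTT/α and replaces the MATLAB inverse Gamma function with a bisection search on a series evaluation of the Gamma cumulative distribution. It is not the authors' code, and the function names are hypothetical.

```python
import math


def amplitude_ratio(alpha, mtt, period=1.0):
    """A_Q/A_P for a Gamma TTD with shape alpha and mean transit time mtt."""
    beta = mtt / alpha  # scale parameter (years)
    return (1.0 + (2.0 * math.pi * beta / period) ** 2) ** (-alpha / 2.0)


def gamma_cdf(x, alpha):
    """Regularized lower incomplete Gamma function P(alpha, x), series form."""
    if x <= 0.0:
        return 0.0
    term = total = 1.0 / alpha
    n = 1
    while term > 1e-16 * total:
        term *= x / (alpha + n)
        total += term
        n += 1
    return total * math.exp(alpha * math.log(x) - x - math.lgamma(alpha))


def threshold_age(alpha, mtt, period=1.0, tol=1e-10):
    """Age (years) at which the Gamma CDF equals the amplitude ratio."""
    beta = mtt / alpha
    target = amplitude_ratio(alpha, mtt, period)
    lo, hi = 0.0, 1.0
    while gamma_cdf(hi, alpha) < target:  # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:                  # bisection on the dimensionless age
        mid = 0.5 * (lo + hi)
        if gamma_cdf(mid, alpha) < target:
            lo = mid
        else:
            hi = mid
    return beta * 0.5 * (lo + hi)


if __name__ == "__main__":
    print(amplitude_ratio(0.42, 0.82))  # young water fraction for lambda = 1 yr
    print(threshold_age(0.42, 0.82))    # threshold age in years
```

With these inputs the sketch gives an amplitude ratio of about 0.35 and a threshold age of about 0.12 years, in line with the TTD-based ensemble means reported above. The small differences are expected because the reported values are means over the uncertainty scenarios.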

Fyw
The time series of 3H in groundwater and streamflow (Fig. 3, inset) were too sparse and coarse for reliable estimation of AQ/AP using the IRLS or WWT methods (Equation 3). Instead, we used the TTD method to calculate Fyw with model parameters drawn from section 4.1.1. The Fyw was characterized by a gradual increase in dFyw/dλ (Figure 6C). For λ = 1 year, the ensemble mean-based Fyw was (1.6 ± 2.40) × 10⁻³ % (blue triangle in Figure 6C).

Although Tyw estimated using the TTD method gradually increased with period, dTyw/dλ gradually declined (Figure 6D). As with the Fyw estimates, large error bars reflect variability in the TTD parameters. For an annual cycle, the ensemble mean-based Tyw was 2.03 ± 2.22 yr (blue triangle in Figure 6D).

Discharge sensitivity of annual Fyw estimated from δ18O data
The discharge sensitivity of Fyw is the slope of Fyw vs. discharge, Q (Figure 7A). Following the methods of von Freyberg et al. (2018) and Gallart et al. (2020b) and for λ = 1 year, calculated discharge sensitivities at Marshall Gulch were 0.09 ± 0.02 day/mm and 0.11 ± 0.02 day/mm (mean ± standard error), respectively. However, discharge sensitivity depended on whether tracer data were weighted by streamflow: the values above were obtained without weighting, and they decreased to 0.03 ± 0.01 day/mm and 0.04 ± 0.01 day/mm when the tracer data were weighted by streamflow. These estimates were computed by fitting a sinusoidal cycle of an annual period to the observed stream water δ18O data.
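To illustrate the idea of a slope with a standard error, the sketch below regresses Fyw on Q by ordinary least squares. This is a deliberate simplification: the cited methods estimate the sensitivity by fitting a sinusoid whose amplitude depends on discharge, whereas the sketch assumes that Fyw values are already available for a set of discharge classes. The data in the usage example are hypothetical.

```python
import math


def slope_with_se(q, fyw):
    """Least-squares slope of Fyw vs. Q with its standard error.

    q   : discharge values (mm/day)
    fyw : young water fractions for the same flow classes
    Returns (slope in day/mm, intercept, standard error of the slope).
    """
    n = len(q)
    mean_q = sum(q) / n
    mean_f = sum(fyw) / n
    sxx = sum((x - mean_q) ** 2 for x in q)
    sxy = sum((x - mean_q) * (y - mean_f) for x, y in zip(q, fyw))
    slope = sxy / sxx
    intercept = mean_f - slope * mean_q
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(q, fyw))
    se = math.sqrt(sse / (n - 2) / sxx)
    return slope, intercept, se


if __name__ == "__main__":
    # Hypothetical flow classes and young water fractions.
    print(slope_with_se([0.5, 1.0, 2.0, 4.0], [0.10, 0.16, 0.24, 0.45]))
```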
Analysis of Fyw vs. Q suggests that Fyw initially decreases with increasing Q (Figure 7A). This pattern may be due to an evaporative increase in stream water δ18O under low-flow conditions, leading to an increase in AQ, and thus an increased AQ/AP ratio as Q decreases (Jasechko, 2019) [...] Given that the high flow observations correspond to periods immediately following high intensity precipitation, the results suggest an asymptotic nature of discharge sensitivity at higher flows.

[...], 1997; Dwivedi et al., 2021) and are therefore appropriate for subsurface storages with faster flow (Stewart et al., 2010). In contrast, 3H is generally considered applicable up to a period of 50 years (Suckow, 2014; Aggarwal, 2013) and is thus appropriate for estimating "hidden" or deep groundwater contributions to streamflow (Stewart et al., 2012). By using both tracers, the current study can separate the contributions of both quick and slow groundwater flow to streamflow in a headwater mountain catchment.

Subsurface storages
The δ18O tracer is ostensibly applicable to soil water storage at MGC because the residence time of soil water is expected to be low due to high hydraulic conductivity (Heidbüchel et al., 2013; Heidbüchel et al., 2012; van der Velde et al., 2014). As a result, the δ18O tracer-based TTD calculated by the current study is likely associated with soil water storage. The current work also suggests that short-term storage estimates depend on the method used to estimate F*yw and T*yw. If the short-term storage in a catchment is defined as the upper limit of water storage with age ≤ T*yw (or ≤ Tyw), and calculated as F*yw × T*yw × Q (Jasechko et al., 2016), estimates for short-term storage at MGC vary between 0.08 mm (WWT method), 0.22 mm (IRLS method), and 10.7 mm (TTD method). Using a method akin to IRLS, previously reported global short-term storages ranged between 1 and 55 mm (median 14 mm) (Jasechko et al., 2016). For catchments comparable in size to MGC and for which mean annual discharge data are reported, the range narrows to between 6 mm (Rietholzbach site) and 30 mm (McDonalds B site) (Jasechko et al., 2016). Thus, the [...]

In the traditional approach (see Rodriguez et al. (2021) [...]
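The short-term storage calculation described above is a simple product, and the reported values can be approximated from numbers given elsewhere in the text. The sketch assumes that Q is the mean annual streamflow of 247 mm reported for WY 2008 through 2017. That is an assumption, since the discharge value used by the authors is not stated at this point, and the names are hypothetical.

```python
MEAN_ANNUAL_Q_MM = 247.0  # WY 2008-2017 mean annual streamflow (assumed Q)

# (young water fraction, threshold age in years) for lambda = 1 year,
# taken from the ensemble means reported in the results.
ESTIMATES = {
    "WWT": (0.079, 0.004),
    "IRLS": (0.114, 0.008),
    "TTD": (0.349, 0.125),
}


def short_term_storage(f_yw, t_yw_years, q_mm_per_year=MEAN_ANNUAL_Q_MM):
    """Short-term storage (mm) as F_yw x T_yw x Q (Jasechko et al., 2016)."""
    return f_yw * t_yw_years * q_mm_per_year


if __name__ == "__main__":
    for method, (f_yw, t_yw) in ESTIMATES.items():
        print(method, round(short_term_storage(f_yw, t_yw), 2))
```

Under this assumption the products are close to the reported 0.08 mm, 0.22 mm, and 10.7 mm. The spread of more than two orders of magnitude comes from the fact that both factors, the fraction and the threshold age, are smaller for the IRLS and WWT methods.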

Previous work has proposed Fyw as a metric that can be used to compare hydrologic characteristics among catchments (Jasechko, 2016;Kirchner, 2016b;von Freyberg et al., 2018). Here, Fyw and Tyw estimates for MGC are compared to other study sites; in all cases, the estimates correspond to λ = 1 year.

Estimates from stable water isotopes
The TTD-based T*yw (0.125 ± 0.0058 yrs) was within the range reported by Kirchner (2016b), i.e., between 0.11 and 0.25 years for α ranging between 0.2 and 2, but F*yw and T*yw estimates from the IRLS and WWT methods were lower than the corresponding TTD-based metrics (Section 4.2). Gallart et al. (2020a) reported that F*yw estimates for λ = 1 year increased with higher sampling resolution (this finding was also corroborated by Stockinger et al. (2016)), with F*yw values of 10.3%, 22.6%, and 30.4% resultant from weekly, high-resolution (30-minute as well as flow-dependent sampling), and "virtual thorough" sampling that involved using 5-minute discharge along with the Fyw vs. Q relationship to estimate Fyw (Table S6). The results of the current work support Gallart et al. (2020a) insofar as the TTD-based F*yw represents a thorough sampling of flowpaths with transit time between 0 and T*yw years. In this way, the TTD-based F*yw results may be more reliable than estimates derived from the IRLS or WWT methods that potentially lack thorough sampling of flowpaths between the transit times of 0 and T*yw years. However, the literature on F*yw is mostly based on IRLS or similar methods with few studies reporting TTD-based results (Table 2).
In comparing the results for MGC with those of other studies ([...] Table 2); however, the fractured bedrock at MGC is functionally distinct from the "watertight" bedrock characterized by Gallart et al. (2020a) [...] Table 2).
Comparison of the F*yw at MGC to TTD-based Fyw estimates in the literature shows that MGC is at the lower end of the range reported by Wilusz et al. (2017) for humid Plynlimon catchments in the U.K. (Table 2). We attribute these differences to methodological inconsistencies including the [...] as opposed to various periods and wavelet analysis to determine the appropriate TTD type and its parameters at MGC (Dwivedi et al., 2021). A TTD-based F*yw estimate (1.5%) for an oceanic, forested catchment was significantly lower than at MGC, likely due to gently sloping topography at that site versus the steep topography at MGC (Rodriguez et al., 2021).

Estimates from 3H
The ensemble mean Fyw based on annual 3H cycles at MGC was 1.6 × 10⁻³ %, or effectively 0% (Figure 6C). [...] and 27 years. At MGC, this is true for AP at periods greater than 19 years, but not for AQ at any period up to 27 years (Section S3); consequently, the available data are inadequate for calculating Fyw. This is apparent in the lack of consistent annual periodicity in the Tucson Basin precipitation data (Figure 3), which may not be possible to overcome even with a much larger 3H dataset.

Dynamic catchment behavior revealed by the discharge sensitivity
The discharge sensitivity of Fyw at MGC suggests that flowpaths in shallow storages restructure and reorganize dynamically as catchment storage changes (Figure 7B). Using the discharge sensitivity of [...]

There is evidence for a global inverse relationship between topographic slope and Fyw (Jasechko et al., 2016; 2017). Topographic roughness and fractured bedrock permeability may also play roles in promoting infiltration to fractured-bedrock aquifers in steep mountainous catchments once shallow [...]

[...] (Table 2; Table S6). The current work contributes to the growing body of Fyw research by quantifying the variability of IRLS-based results at a mountain headwater catchment (Figure 6A). For an annual tracer cycle, Fyw from the IRLS method was one-third of Fyw from the TTD method, and it is therefore likely that previously reported Fyw estimates are underestimated. This would have significant implications for Fyw-based understanding of contaminant and nutrient transport, surface water quality (Kirchner, 2016a; Jasechko et al., 2016), and estimation of TTD parameters (Lutz et al., 2018). As a result, future studies utilizing IRLS or similar methods may wish to report Fyw for various periods, in addition to the annual period, in order to better constrain the variability of the results. Future studies that report TTD-based results would also be useful to characterize the methodological sensitivity of Fyw across a broader range of natural systems.
The use of a 3H-based Fyw metric has been recommended toward an improved understanding of deep and/or slow flowpaths contributing to streamflow (Jacobs et al., 2018; Jasechko, 2019). However, the current study highlights that the 3H-based Fyw metric may be inappropriate when there are insufficient deep groundwater data, which is a general limitation in groundwater aquifers including MGC (Rodriguez et al., 2021; Gleeson et al., 2015). This limitation can also lead to significant variability in the estimated Gamma TTD parameters when the 3H tracer is applied to the question of "hidden streamflow" (Stewart et al., 2010; Stewart et al., 2012; Seeger and Weiler, 2014; Jacobs et al., 2018). In contrast to Fyw, the 3H-based mTT metric does not depend on any particular period of tracer cycles in inflow and outflow, but aggregation errors may lead to estimates of mTT that are low by several orders of magnitude relative to known mTTs from virtual experiments, especially in heterogeneous catchments such as MGC (Kirchner, 2016b; Stewart et al., 2017). The current work also demonstrates that the 3H-based mTT can lead to greatly over-estimated total deep groundwater storage estimates. To address this issue, appropriate long-term discharge estimates, not including storm runoff, are critical to accurate storage calculations (Section 5.2). Finally, the current results support the use of multiple ("lumped") parameter models, qualified by site-specific hydrogeological information to reduce aggregation errors in real catchments, but acknowledge that model parameters may be difficult to constrain in the multiple parameter approach (Stewart et al., 2017; Jacobs et al., 2018; Hrachowitz et al., 2009).

[...] Application of the weighted wavelet transform (WWT), iteratively re-weighted least square (IRLS), and transit time distribution (TTD) methods to annual cycles of δ18O in stream water resulted in flux-based Fyw values of 7.9 ± 0.2%, 11.4 ± 0.7%, and 34.9 ± 0.5%, respectively.
The current study therefore constrains the degree to which Fyw depends on the method of estimation. At MGC, the Gamma TTD was preferred on the basis that it thoroughly sampled flow paths between a transit time of zero and a threshold age for young water. In comparison, the IRLS results were scattered over the periods of interest and were only approximately one-third of the TTD-based estimate for an annual period; WWT results were similar to IRLS but showed much less scatter owing to spectral smoothing of the data.
The Gamma TTD-based mTT using 3H data was 27 years. The same methodology yielded an mTT of 0.82 years when based on δ18O (Dwivedi et al., 2021); hence, we conclude that the former mTT may correspond to groundwater stored in fractured bedrock, whereas the latter applies to shallow storages in the soil profile. The shape parameters of the 3H-based Gamma TTD at MGC demonstrated significant variability arising from the short length and inconsistent seasonal cyclicity of the available 3H time series data that precluded adequate estimation of Fyw in fractured-bedrock groundwater.
Although data quality could be addressed by longer-term observation and attention to precision, variations of 3H in precipitation at some locations may restrict the applicability of this approach. In summary, using δ18O-based Fyw together with its discharge sensitivity was an effective method with which to quantify the dynamic nature of shallow groundwater flowpaths at MGC. Beyond a threshold Fyw in short-term storage, additional infiltration is likely to activate deeper groundwater flow paths.

[...] Table S3), based on uncertainty in amount-weighted 3H concentrations in precipitation and deeper groundwater.

[...] (h(τ)) and TTD parameters estimated using tritium, which complements previous stable water isotope-based TTD estimates at the same site (Dwivedi et al., 2021).

[...] (2) and (3) are based on the input and output functions shown in Figure 3. The TTD parameter statistics in columns (5) through (14) are based on the first set of model runs that consider amount-weighted 3H concentration uncertainty in precipitation and concentration uncertainty in deep groundwater (Table S3). Parameter 1 is the mean transit time (in years). Parameter 2 is not applicable for the PF TTD type and is the shape parameter α (unitless) for the Exp and Gam TTD and the Pe parameter for the ADE-1x and ADE-nx TTD types. The parameter α is set to 1 for the Exp TTD type. KGE' is the modified Kling-Gupta efficiency, which ranges between 0 (for the best fitting model) and infinity (for the worst fitting model).