Tandem use of transit time distribution and fraction of young water reveals the dynamic flow paths supporting streamflow at a mountain headwater catchment

Dwivedi, Ravindra; Eastoe, Christopher; Knowles, John F.; McIntosh, Jennifer; Meixner, Thomas; Ferre, Ty P. A.; Minor, Rebecca; Barron-Gafford, Greg; Abramson, Nathan; Stanley, Michael; Chorover, Jon

doi:10.5194/hess-2021-355

Preprints

https://doi.org/10.5194/hess-2021-355

08 Jul 2021

Status: this discussion paper is a preprint. It has been under review for the journal Hydrology and Earth System Sciences (HESS). The manuscript was not accepted for further review after discussion.

Ravindra Dwivedi, Christopher Eastoe, John F. Knowles, Jennifer McIntosh, Thomas Meixner, Ty P. A. Ferre, Rebecca Minor, Greg Barron-Gafford, Nathan Abramson, Michael Stanley, and Jon Chorover

Abstract. Current understanding of the dynamic flow paths and subsurface water storages that support streamflow in mountain catchments is inhibited by the lack of long-term hydrologic data and the frequent use of single age tracers that are not applicable to older groundwater reservoirs. To address this, the current study used both multiple metrics and tracers to characterize the transient nature of flow paths with respect to change in catchment storage at Marshall Gulch, a sub-humid headwater catchment in the Santa Catalina Mountains, Arizona, USA. The fraction of streamflow that was untraceable using stable water isotope tracers was also estimated. A Gamma-type transit time distribution (TTD) was appropriate for deep groundwater analysis, but there were errors in the TTD shape parameters arising from the short record length of ³H in deep groundwater and stream water, and inconsistent seasonal cyclicity of the precipitation ³H time series data. Overall, the mean transit time calculated from ³H data was more than two decades greater than the mean transit time based on δ¹⁸O at the same site. The fraction of young water (F_yw) in shallow groundwater was estimated from δ¹⁸O time series data using weighted wavelet transform (WWT), iteratively re-weighted least squares (IRLS), and TTD-based methods. Estimates of F_yw depended on sampling frequency, the method of estimation, bedrock geology, hydroclimate, and factors affecting streamflow generation processes. The coupled use of F_yw and discharge sensitivity indicated highly dynamic flow paths that reorganized with changes in shallow catchment storage. The utility of ³H for determining F_yw in deeper groundwater was limited by data quality. Given that F_yw, discharge sensitivity, and mean transit time all yield unique information, this work demonstrates how co-application of multiple methods can yield a more complete understanding of the transient flow paths and observable storage volumes that contribute to streamflow in mountain headwater catchments.

Received: 02 Jul 2021 – Discussion started: 08 Jul 2021

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: closed

RC1:
'Comment on hess-2021-355', Anonymous Referee #1, 14 Oct 2021
Recommendation to Editor:

I recommend this paper be rejected for publication in Hydrology and Earth System Sciences (HESS) in its current form. I recommend the authors resubmit after major revision. The topic is certainly of great interest in scientific hydrology. The combination of data sets might be better leveraged to make clearer inferences about the true range of water residence times in small headwater catchments. I provide three general criticisms here that are reiterated in a numbered list of specific comments about the manuscript and graphics.

One major criticism is that the work relies heavily on antiquated methodologies. Major portions of the results are based on application of the so-called lumped-parameter-transport model based on time-invariant transit-time distributions (TTDs; equation 1). The TTDs have time-invariant parameters, which assumes that the distribution of flow pathways within the landscape and associated water velocities are constant in time. This assumption defies intuition, but was applied out of convenience for decades [e.g., most works reviewed by McGuire and McDonnell, 2006]. A theoretical basis for analysis based on time-variable TTDs was presented as early as Lewis and Nir [1978]. The theory has been advanced by many recent works [Botter et al., 2010; 2011; Harman, 2015; Rinaldo et al., 2011; van der Velde et al., 2012]. That TTDs should be time-invariant was shown to be theoretically implausible for low-order watersheds with dynamic flow [Botter et al., 2011], a result that has been supported by empirical results from manipulative tracer experiments [e.g., Kim et al., 2016b]. I don’t think our premier disciplinary journals should continue publishing results based on this antiquated approach.
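
For readers less familiar with the approach under discussion, the lumped-parameter convolution is commonly written in the form below. This is a generic textbook statement and is not copied from the manuscript's equation 1, which is not reproduced in this discussion; the symbols C_Q, C_P, h, λ, α, and β are assumed notation.

```latex
\[
C_Q(t) = \int_{0}^{\infty} C_P(t-\tau)\, h(\tau)\, e^{-\lambda\tau}\, \mathrm{d}\tau,
\qquad
\int_{0}^{\infty} h(\tau)\, \mathrm{d}\tau = 1
\]
% One common choice for h: a Gamma-type TTD with shape \alpha and scale \beta,
% whose mean transit time is \alpha\beta.
\[
h(\tau) = \frac{\tau^{\alpha-1}}{\beta^{\alpha}\,\Gamma(\alpha)}\, e^{-\tau/\beta}
\]
```

Here C_P and C_Q are tracer concentrations in precipitation and streamflow, λ is the radioactive decay constant (zero for stable isotopes), and time invariance means that h depends only on the transit time τ and not on the time at which the water entered the catchment.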

A second major criticism is that there are significant shortcomings in the data, especially the measurements of Tritium in surface waters. There appear to be only 6 data points representing Tritium abundance in stream water. That’s not many, but the authors rely on that data to calibrate and compare a range of models. Also, the time interval over which precipitation is sampled is coarse. Much of the temporal dynamics of tracer concentration in that inflow will be lost. For these reasons, most of the parameters for the various TTD models are not uniquely identifiable. The different models generate markedly different estimates of different water age metrics. There is inadequate guidance, or rationale, for the reader to understand which, if any, should be considered correct.

The third major criticism is that the article does not clearly convey what outstanding question/problem in scientific hydrology is likely to be resolved through the elaborate set of methodologies employed here. The discussion section does not convey any new insights about flow processes in headwater catchments. Rather, that section seems to emphasize intricacies of the various technical approaches that lead to order-of-magnitude differences in water age metrics such as the fraction of young water and mean transit time.

My first recommendation for revision would include omission of methods and results that are based on lumped-parameter transport modeling using time-invariant TTDs. My second recommendation for revision would be a deep consideration of what is the specific gap in knowledge that the research would address, and a deeper discussion about what all these (somewhat abstract) age metrics tell us about flow processes, or the linkage between water age and catchment structure, that we didn’t already know. Emphasis should be placed on new knowledge that may be generalizable across watersheds. This is especially important since the study focuses on a single watershed where quite a lot of tracer-aided flow and transport studies have already been conducted [Heidbuchel et al., 2013; Heidbuchel et al., 2012; Lyon et al., 2008; 2009], including multiple previous works by the lead author.

Specific comments on content in the text:

Line 23: The phrase “single age tracer” is unclear and not conventional. Please omit or rephrase.

Lines 42-51: The rationale provided here is not very strong. To say that the processes being studied are “still incompletely understood” is not a very effective way to communicate (1) what aspects of the processes are well understood (because certainly a lot of prior knowledge does exist), (2) what is/are the explicit knowledge gap/s, and (3) how this research is designed to specifically address that knowledge gap.

Lines 59-61: Doesn’t really make sense to say that an underestimated quantitative metric affects actual substrate weathering in Earth’s crust. The conjunctive phrase “As a result” in the following sentence seems out of place. Consider rephrasing.

Lines 70-71: I think you should temper the language here. Robust? That implies the result should be representative across a range of systems. Yet the papers you cite apparently rely on simulated experiments in synthetic landscapes. Any evidence from the real world that you could cite?

Lines 74-76: To help emphasize the knowledge gap, I think you need to clarify what exactly is meant by “only one period”.

Line 90: The question “what is the appropriate TTD type” is somewhat unclear. Precipitation is episodic. There is not a continuous inflow of water volumes with different ages entering any watershed. Therefore, the distribution of transit times (i.e., exit time – entry time) must also be discontinuous. This fact is illustrated by real TTDs observed from active tracer introductions [e.g., Kim et al., 2016a]. They are quite messy and not continuous distributions. Any continuous function that is chosen as a TTD for application in lumped-parameter-transport modeling is therefore just an approximation of reality. If that is accepted as true, then it seems your question could be restated as “what mathematical distribution yields simulation results that best fit the data from this particular watershed?”. That is not a question of great relevance for scientific hydrology in general, in my opinion.

Lines 91-93: I have a very hard time interpreting this sentence. Please rephrase. Again, I would suggest carefully explaining, or omitting, the phrase “age tracers”. Are you meaning to distinguish stable isotopes from radioactive isotopes?

Lines 93-94: Suggest deleting “...as determined by stable water isotope tracers”. It implies to the reader that the answer to your more general question (i.e., “what is the discharge sensitivity of F_yw”) is somehow conditional on this particular data type? Is that in fact what you think? If so, it raises some concern about the generality of the results.

Lines 95-100: Suggest deleting all of this. A prelude to the methods elaborated on the following pages is unnecessary. The concluding paragraph of the introduction should highlight the identified knowledge gap then state the objectives of this study and how they address that gap. The final sentence raises some concern that the current work is partially redundant.

Lines 112-113: So it was a notably drier than average 9 years, or the PRISM results are biased high here?

Lines 130-131: This is a very coarse sampling resolution for the intended application of the data. Undoubtedly there are tremendous temporal dynamics in the stable-isotope composition of precipitation within and among individual storms that occur during 5-7 day intervals. The range of stable-isotope abundances in precipitation observed during individual storms may be comparable to or greater than the range observed among monthly-aggregated samples collected across years [e.g., Rozanski et al., 1993]. The true temporal dynamics of tracer concentration in precipitation are lost in a lumped sample that aggregates over 5-7 days. Any quantitative model that uses those tracer concentrations as input will be very limited in its ability to accurately simulate the temporal dynamics of the same tracer in the stream. That limitation seems very germane to the stated objectives of this study. Passive, sequential sampling devices are easy to make and deploy. Analysis of stable isotope abundances by laser spectrometry for large sample numbers is relatively inexpensive. This data limitation is hard to excuse.
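
The loss of temporal dynamics described in this comment can be illustrated with a small volume-weighted aggregation. The sketch below is only an illustration: the function names are hypothetical and the daily precipitation depths and δ¹⁸O values are invented numbers, not data from Marshall Gulch.

```python
def composite_delta(depths_mm, deltas_permil):
    """Volume-weighted isotopic value of one bulk sample; None if it stayed dry."""
    total = sum(depths_mm)
    if total == 0:
        return None
    return sum(p * d for p, d in zip(depths_mm, deltas_permil)) / total


def aggregate(depths_mm, deltas_permil, window_days):
    """Collapse a daily series into consecutive bulk samples of `window_days`."""
    composites = []
    for start in range(0, len(depths_mm), window_days):
        stop = start + window_days
        value = composite_delta(depths_mm[start:stop], deltas_permil[start:stop])
        if value is not None:
            composites.append(value)
    return composites


# Hypothetical daily precipitation depth (mm) and delta-18O (permil) for 14 days.
# Dry days carry a placeholder delta of 0.0 that receives zero weight.
daily_depth = [12.0, 0.0, 3.0, 0.0, 25.0, 1.0, 0.0,
               0.0, 8.0, 0.0, 15.0, 0.0, 2.0, 4.0]
daily_delta = [-6.0, 0.0, -11.5, 0.0, -14.0, -4.5, 0.0,
               0.0, -7.0, 0.0, -12.0, 0.0, -5.0, -9.5]

wet_day_values = [d for p, d in zip(daily_depth, daily_delta) if p > 0]
weekly_values = aggregate(daily_depth, daily_delta, window_days=7)

daily_range = max(wet_day_values) - min(wet_day_values)
weekly_range = max(weekly_values) - min(weekly_values)
print(f"range across wet days: {daily_range:.2f} permil")
print(f"range across 7-day composites: {weekly_range:.2f} permil")
```

With these invented numbers the spread among the two 7-day composites is much smaller than the spread among individual wet days, which is the effect the comment refers to.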

Lines 149-151: I can’t quite understand what this means. Please consider rephrasing.

Lines 164-166: Simplify the headings and sub-headings. Here and elsewhere there are sub-headings with no content underneath. Suggest deleting.

Line 171: When you say “thereafter”, do you mean over longer time increments than 1 month? Please rephrase to clarify.

Line 176: Some formatting inconsistencies with citations here and throughout the manuscript. Uneven use of open and closed parentheses and lack of spacing between cited papers within in-line citations. Proofread carefully. Suggest using “[(“ instead of duplicate parentheses. Also, I cannot find the Dwivedi 2019b entry in your bibliography. Is it missing? Put spaces between entries in the bibliography. It is terribly difficult to read through single spaced.

Line 177: “expand on these results” again seems to suggest this is somewhat redundant with the previous works from the same catchment.

Lines 185-190: Pretty sure h(tau) is the specified functional form of the TTD, but that is not stated in the paragraph.

Line 193: I am not familiar with the DownHill Simplex method. It is described in a single sentence, yet it is apparently the method for evaluating how appropriate is one versus the other TTD model. Could you please elaborate just a little bit on what this is for the unaffiliated reader? The KGE is used as the “model performance criteria” but you say that the Downhill Simplex was used to evaluate “the performance of each TTD”. This is confusing to me. Equation 1 is the model, but the variable performance of the model is due only to the selection of different functional forms of the TTD. So the performance of the model is a direct reflection of the performance of the function selected as TTD, no? Please clarify.
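
On the two roles raised in this comment: the downhill simplex method is commonly identified with the Nelder-Mead derivative-free search, while the KGE is the score that such a search would try to maximize. The sketch below computes the score only. It assumes the original KGE formulation, with an optional switch for the modified KGE′ variant named in the author reply; the function name `kge` and the sample numbers are hypothetical, and the manuscript's own implementation is not shown in this discussion.

```python
import math


def kge(sim, obs, modified=False):
    """Kling-Gupta efficiency of simulated versus observed values.

    modified=False: variability term is the ratio of standard deviations.
    modified=True:  variability term is the ratio of coefficients of variation.
    """
    n = len(obs)
    if n != len(sim) or n < 2:
        raise ValueError("sim and obs must have the same length (at least 2)")
    mu_s = sum(sim) / n
    mu_o = sum(obs) / n
    sd_s = math.sqrt(sum((s - mu_s) ** 2 for s in sim) / n)
    sd_o = math.sqrt(sum((o - mu_o) ** 2 for o in obs) / n)
    if sd_s == 0 or sd_o == 0 or mu_o == 0 or (modified and mu_s == 0):
        raise ValueError("KGE is undefined for constant series or zero means")
    cov = sum((s - mu_s) * (o - mu_o) for s, o in zip(sim, obs)) / n
    r = cov / (sd_s * sd_o)          # linear correlation
    beta = mu_s / mu_o               # bias ratio
    if modified:
        variability = (sd_s / mu_s) / (sd_o / mu_o)
    else:
        variability = sd_s / sd_o
    return 1.0 - math.sqrt((r - 1) ** 2 + (variability - 1) ** 2 + (beta - 1) ** 2)


# Hypothetical tritium values (TU): observed stream samples and one simulation.
observed = [4.1, 3.8, 4.4, 3.9, 4.0, 4.3]
simulated = [4.0, 3.9, 4.2, 4.0, 4.1, 4.2]
print(f"KGE  = {kge(simulated, observed):.3f}")
print(f"KGE' = {kge(simulated, observed, modified=True):.3f}")
```

A perfect simulation scores 1. An optimizer such as Nelder-Mead would vary the TTD parameters, rerun the transport model, and keep the parameter set with the highest score, so the optimizer and the criterion are distinct pieces.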

Line 228, equation 5: Use “C” with “Q” and “P” subscripted to indicate concentration in streamflow versus precipitation. You already adopted this notation in equation 1. Be consistent here and in subsequent equations.

Lines 348-357: What about all the other models? You only discuss PF and Gamma. The KGE of the 1d-ADE falls exactly between the values for the Gamma and PF models, yet the mTT estimated by the 1d-ADE is factors of 8-9 less than the mTT from those models, respectively. Why do you ignore the other models and what do you conclude from this order-of-magnitude difference? If I understand Figure 4 correctly, then only the “ADE-nx” and Exponential function as TTDs seem to generate uniquely identifiable parameters. Is that correct? Neither model is discussed at all here.

Lines 390-392: The data are also far too sparse to reliably fit the parameters for TTDs used in equation 1. Isn’t this confirmed by (1) the lack of unique solutions illustrated in most cases shown in Figure 4 and (2) the generally poor accuracy of all model simulations shown in Figure 5? I would argue yes.

Lines 400-414: The text makes no allusion at all to Figure 7D, which has unusual qualitative axes and cannot be easily interpreted by the reader. The figure caption only provides a citation to a previous work to explain the graphic. More explanation is needed in this section of the Results, or Figure 7B should be deleted.

Lines 437-440: I am unclear what is the importance or relevance of this concept of “short-term storage”. What is it and why does it matter? In any case, you present estimates of this metric based on three competing approaches that vary by a factor of approximately 125 (0.08, 0.22, and 10.7)! Which, if any, should we believe is correct, and why?

Lines 472-477: The results are highly dependent on the temporal resolution of the input time series. As I noted in a comment above, if a temporal dynamic in the tracer concentration in precipitation is hidden within a sample that accumulated over 5-7 days, then the model can’t possibly simulate the effects of that dynamic in streamflow. The results are entirely dependent on the resolution of sampling the tracer concentration in inflow, and the resolution used in this study is quite coarse.

Lines 499-501: You make a sweeping assumption here that the bedrock at several research sites is “water tight”. That seems quite speculative. What evidence supports this assumption? Most rocks are fractured and jointed to some extent. Even exposed, granitic plutons commonly have sufficient fracturing and water storage capacity to host woody-stemmed plant communities and support inter-storm flow from emergent springs. More support for this assumption is needed here, perhaps through more extensive synopsis of the geology of the sites used in these cited studies.

Lines 538-539 and 544-546: So, you’re saying the fraction of young water estimates are invalid when based on the use of Tritium as a tracer?

Lines 555-564: Here there are a series of sentences elaborating some intricate details of methodology which seem misplaced in the discussion. They lead to the ultimate conclusion at the end of the paragraph that “infiltration may activate deeper groundwater flowpaths”. That is not a novel conclusion in scientific hydrology, and it is not even stated definitively here (i.e., ..). This paper uses a wide ensemble of methodological approaches, which, from my view, has only created ambiguity in how the markedly contrasting results can be interpreted. I find no new insight into hydrological processes resulting from all this computational effort.

Comments on Figures and Tables:

Figure 3: Does the inset have a linear scale? If not, please make it linear. If so, please add more tick marks to the vertical axis so we can approximate the numeric values of the data points. Are these six data points all you have to calibrate the parameters of the TTD models used with equation 1? If so, that seems inadequate.

Figure 4: Is the vertical axis the KGE? If so, please label it that way. I am unclear what the variable “response surface” on the vertical axis indicates.

Table 1: Words and numbers should not be split between rows. Use emboldened lines, or no lines, to better delineate the content. This is not acceptable for a journal article. Please make it more presentable.

Figure 5: Y-axis labels should have “3” as a superscript preceding “H” to conform to established conventions of symbolizing isotopes. Would suggest compressing this into a single graph with a legend indicating the results from different TTD models. The gray dots are the same across all 5 subplots.

Figure 7: Here and elsewhere the font size is illegible. Please enlarge font on axes and in legends.

Works Cited:

Botter, G., E. Bertuzzo, and A. Rinaldo (2010), Transport in the hydrologic response: Travel time distributions, soil moisture dynamics, and the old water paradox, Water Resources Research, 46, W03514, doi:10.1029/2009wr008371.

Botter, G., E. Bertuzzo, and A. Rinaldo (2011), Catchment residence and travel time distributions: The master equation, Geophysical Research Letters, 38, L11403, doi:10.1029/2011gl047666.

Harman, C. J. (2015), Time-variable transit time distributions and transport: Theory and application to storage-dependent transport of chloride in a watershed, Water Resources Research, 51(1), 1-30, doi:10.1002/2014wr015707.

Heidbuchel, I., P. A. Troch, and S. W. Lyon (2013), Separating physical and meteorological controls of variable transit times in zero-order catchments, Water Resources Research, 49(11), 7644-7657, doi:10.1002/2012wr013149.

Heidbuchel, I., P. A. Troch, S. W. Lyon, and M. Weiler (2012), The master transit time distribution of variable flow systems, Water Resources Research, 48, W06520, doi:10.1029/2011wr011293.

Kim, M., L. A. Pangle, C. Cardoso, M. Lora, T. H. M. Volkmann, Y. Wang, C. J. Harman, and P. A. Troch (2016), Transit time distributions and StorAge Selection functions in a sloping soil lysimeter with time-varying flow paths: Direct observation of internal and external transport variability, Water Resources Research, 52(9), 7105-7129, doi:10.1002/2016WR018620.

Lewis, S., and A. Nir (1978), On tracer theory in geophysical systems in the steady and non-steady state. Part II. Non-steady state - theoretical introduction, Tellus, 30, 260-271.

Lyon, S. W., S. L. E. Desilets, and P. A. Troch (2008), Characterizing the response of a catchment to an extreme rainfall event using hydrometric and isotopic data, Water Resources Research, 44(6), W06413, doi:10.1029/2007wr006259.

Lyon, S. W., S. L. E. Desilets, and P. A. Troch (2009), A tale of two isotopes: differences in hydrograph separation for a runoff event when using delta D versus delta O-18, Hydrological Processes, 23(14), 2095-2101, doi:10.1002/hyp.7326.

McGuire, K. J., and J. J. McDonnell (2006), A review and evaluation of catchment transit time modeling, Journal of Hydrology, 330(3-4), 543-563, doi:10.1016/j.jhydrol.2006.04.020.

Rinaldo, A., K. J. Beven, E. Bertuzzo, L. Nicotina, J. Davies, A. Fiori, D. Russo, and G. Botter (2011), Catchment travel time distributions and water flow in soils, Water Resources Research, 47, doi:10.1029/2011wr010478.

Rozanski, K., L. Araguas-Araguas, and R. Gonfiantini (1993), Isotopic patterns in modern global precipitation, in Climate Change in Continental Isotopic Records, Geophysical Monograph Series, 78, 1-36.

van der Velde, Y., P. J. J. F. Torfs, S. E. A. T. M. van der Zee, and R. Uijlenhoet (2012), Quantifying catchment-scale mixing and its effect on time-varying travel time distributions, Water Resources Research, 48(6), W06536, doi:10.1029/2011wr011310.
Citation: https://doi.org/10.5194/hess-2021-355-RC1
- AC3:
  'Reply on RC1', Ravindra Dwivedi, 27 Oct 2021
  RC1: 'Comment on hess-2021-355', Anonymous Referee #1, 14 Oct 2021
  Recommendation to Editor:
  1.1 I recommend this paper be rejected for publication in Hydrology and Earth System Sciences (HESS) in its current form. I recommend the authors resubmit after major revision. The topic is certainly of great interest in scientific hydrology. The combination of data sets might be better leveraged to make clearer inferences about the true range of water residence times in small headwater catchments. I provide three general criticisms here that are reiterated in a numbered list of specific comments about the manuscript and graphics.
  Dear reviewer, Thank you for your in-depth review of our paper. We plan to address each of your comments on the novelty and organization of the paper in the revised manuscript. Please see also our responses to comments #1.2 and #1.4 below.
  1.2 One major criticism is that the work relies heavily on antiquated methodologies. Major portions of the results are based on application of the so-called lumped-parameter-transport model based on time-invariant transit-time distributions (TTDs; equation 1). The TTDs have time-invariant parameters, which assumes that the distribution of flow pathways within the landscape and associated water velocities are constant in time. This assumption defies intuition, but was applied out of convenience for decades [e.g., most works reviewed by McGuire and McDonnell, 2006]. A theoretical basis for analysis based on time-variable TTDs was presented as early as Lewis and Nir [1978]. The theory has been advanced by many recent works [Botter et al., 2010; 2011; Harman, 2015; Rinaldo et al., 2011; van der Velde et al., 2012]. That TTDs should be time-invariant was shown to be theoretically implausible for low-order watersheds with dynamic flow [Botter et al., 2011]—a result that has been supported by empirical results from manipulative tracer experiments [e.g., Kim et al., 2016b]. I don’t think our premier disciplinary journals should continue publishing results based on this antiquated approach.
  Our response: We appreciate this concern but respectfully disagree. The papers cited in the above comment mostly focus on the highly dynamic part of a catchment’s flow system rather than the relatively stable part (e.g., baseflow through fractured bedrock aquifers). The papers cited above, those listed at the end of the reviewer’s comments, and similar papers on dynamic transit time distributions (TTDs) have used either stable water isotopes (e.g., Heidbüchel et al. [2013]; Heidbüchel et al. [2012]) or similar tracers (e.g., chloride tracer in Harman [2015]; Hrachowitz et al. [2016]), which are applicable on shorter time scales [Suckow, 2014]. In contrast, our study uses tritium sampled under low flow or baseflow conditions, and this tracer is applicable over a longer time scale than stable water isotope tracers [Suckow, 2014]. Most importantly, the use of any time-variable transit time method (e.g., rSAS functions of Harman [2015], the master equation method [Botter et al., 2011; Heidbüchel et al., 2012], or wavelet analysis methods [Dwivedi et al., 2021]) requires high-frequency data on hydrologic fluxes and tracer concentrations in inflow and outflow. These high-frequency datasets of hydrologic fluxes and tritium concentrations are generally not available on the time scale for which tritium-based TTDs are estimated; this includes our study site, Marshall Gulch catchment, as well as many global sites, as acknowledged in Gleeson et al. [2015]. Therefore, several studies that model TTDs using tritium sampled under baseflow conditions or in groundwater use steady-state TTDs (e.g., Stewart et al. [2017], a paper published recently in HESS). Even at a coarser resolution, tritium observations are able to shed light on long-period dynamics in the transit time distributions for deep fractured bedrock aquifer groundwaters. In the revised version of our paper, we plan to clearly highlight the aforementioned point as a rationale for using the steady-state version of TTDs.
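
A steady-state tritium calculation of the kind referred to in this response can be sketched as a discrete convolution of the input series with a Gamma TTD and a radioactive decay term. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the midpoint-rule discretization, and the renormalization of the TTD over the available record length are all assumptions, and the input series is a hypothetical constant.

```python
import math

TRITIUM_HALF_LIFE_YR = 12.32                       # tritium half-life in years
DECAY_PER_YR = math.log(2) / TRITIUM_HALF_LIFE_YR  # decay constant lambda


def gamma_ttd(tau, alpha, beta):
    """Gamma probability density with shape alpha and scale beta (mean alpha*beta)."""
    if tau <= 0:
        return 0.0
    log_h = ((alpha - 1) * math.log(tau) - tau / beta
             - alpha * math.log(beta) - math.lgamma(alpha))
    return math.exp(log_h)


def outflow_at(c_in, i, dt, alpha, beta, decay=DECAY_PER_YR):
    """Simulated outflow concentration at time index i.

    c_in[k] is the input concentration at time k*dt (oldest first, dt in years).
    The TTD is evaluated at interval midpoints, which avoids tau = 0 when alpha < 1.
    Assumption: weights are renormalized over the i+1 available time steps.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for j in range(i + 1):
        tau = (j + 0.5) * dt
        w = gamma_ttd(tau, alpha, beta) * dt
        weighted_sum += c_in[i - j] * w * math.exp(-decay * tau)
        weight_total += w
    return weighted_sum / weight_total


# Hypothetical check: constant input of 1 TU for 400 years, exponential TTD
# (alpha = 1) with a 10-year mean transit time.
dt = 0.05
c_in = [1.0] * 8000
simulated = outflow_at(c_in, len(c_in) - 1, dt, alpha=1.0, beta=10.0)
analytic = 1.0 / (1.0 + DECAY_PER_YR * 10.0)
print(f"simulated {simulated:.4f} TU, analytic {analytic:.4f} TU")
```

For a constant input and an exponential TTD with mean T, the analytical steady-state output is the input divided by (1 + λT), which gives a simple check on the discretization. Note that the TTD parameters are fixed in time here, which is exactly the simplification debated in this exchange.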
  1.3 A second major criticism is that there are significant shortcomings in the data, especially the measurements of Tritium in surface waters. There appear to be only 6 data points representing Tritium abundance in stream water. That’s not many, but the authors rely on that data to calibrate and compare a range of models. Also, the time interval over which precipitation is sampled is coarse. Much of the temporal dynamics of tracer concentration in that inflow will be lost. For these reasons, most of the parameters for the various TTD models are not uniquely identifiable. The different models generate markedly different estimates of different water age metrics. There is inadequate guidance, or rationale, for the reader to understand which, if any, should be considered correct.
  Our response: The purpose of this study was not to explicitly simulate high temporal resolution dynamics of tracer concentrations in stream water. In contrast, by using the stable water isotope tracers, our study seeks to estimate the fraction of young water metric in a time averaged sense to compare and contrast the value and information contained in that metric at our field site to literature-reported data from other sites, in order to better understand dynamic flow path behavior. The temporal resolution of the stable water isotope data in precipitation and stream water was sufficient to meet these study objectives, based on estimated Fyw values that were within the range reported in the literature using the TTD-based method. Please also note that the stable water isotope data (collected during water year 2008 through 2012) in our study were collected by the Santa Catalina Mountains and Jemez River Basin Critical zone observatory prior to this study. In our study, the TTD model performance was not only assessed by the value of the Kling-Gupta efficiency or KGE’, but also by: (i) evaluating the reliability of the estimated optimal model parameters by running the same model three times with separate initial model parameter guesses and (ii) determining whether the estimated model parameters are within the permissible parameter space. Thus, while KGE’ model performance may be lower than the KGE’ obtained from a Gamma TTD, the response surface for an ADE-1x TTD type (Figure 4C) suggests that the estimated model parameters were not reliable and sometimes at the edge of the permissible parameter space. In contrast, the estimated model parameters with a Gamma TTD were unique (Figure 4B vs. 4C). A similar explanation also applies when comparing a Gamma to an exponential TTD type (Figure 4A vs. 4B) where the response surface of an ADE-nx TTD type is similar to the response surface for the exponential TTD in the sense that the estimated model parameters are at the edge of their permissible parameter spaces. Therefore, ADE-1x, ADE-nx and Exponential TTD types do not meet our set criteria for selecting an appropriate TTD type and its parameters. For the sake of brevity, we only discussed piston flow and gamma TTD types in section 4.1.1. In section S3 in supporting information we provide more in depth information about the performance of each TTD type for three separate model runs. To address this comment, we plan on including some more details about the other TTD types in the revised version of this section. Please see also our response to comment # 1.2 above.
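
The two reliability criteria described in this response (agreement among runs started from different initial guesses, and estimates lying away from the edge of the permissible parameter space) could be checked with a helper like the one below. This is a minimal sketch, not the authors' code: the function name, the tolerance values, the parameter names, the bounds, and every numeric value are hypothetical.

```python
def assess_runs(runs, bounds, rel_tol=0.05, edge_frac=0.01):
    """Summarize calibration runs that started from different initial guesses.

    runs:      list of dicts mapping parameter name -> fitted value
    bounds:    dict mapping parameter name -> (lower, upper) permissible limits
    rel_tol:   largest spread among runs, as a fraction of the permitted range,
               that still counts as a consistent estimate (assumed threshold)
    edge_frac: fraction of the permitted range treated as "at the edge"
    """
    report = {}
    for name, (lower, upper) in bounds.items():
        values = [run[name] for run in runs]
        span = upper - lower
        margin = edge_frac * span
        report[name] = {
            "consistent": (max(values) - min(values)) <= rel_tol * span,
            "interior": all(lower + margin < v < upper - margin for v in values),
        }
    return report


# Hypothetical bounds and three hypothetical fits for a two-parameter TTD.
bounds = {"alpha": (0.01, 10.0), "beta_yr": (0.1, 200.0)}
stable_runs = [
    {"alpha": 0.82, "beta_yr": 30.1},
    {"alpha": 0.80, "beta_yr": 29.5},
    {"alpha": 0.83, "beta_yr": 30.6},
]
# A second hypothetical set where one parameter drifts to its upper limit.
edge_runs = [
    {"alpha": 0.82, "beta_yr": 199.9},
    {"alpha": 0.81, "beta_yr": 200.0},
    {"alpha": 0.84, "beta_yr": 120.0},
]

for label, runs in (("stable", stable_runs), ("edge", edge_runs)):
    print(label, assess_runs(runs, bounds))
```

A parameter flagged as inconsistent or not interior would fail the stated selection criteria regardless of how high the efficiency score of that model is.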
  1.4 The third major criticism is that the article does not clearly convey what outstanding question/problem in scientific hydrology is likely to be resolved through the elaborate set of methodologies employed here. The discussion section does not convey any new insights about flow processes in headwater catchments. Rather, that section seems to emphasize intricacies of the various technical approaches that lead to order-of-magnitude differences in water age metrics such as the fraction of young water and mean transit time.
  Our response: We used multiple metrics including the state-of-the-art fraction of young water (Fyw) metric and the mean transit time metric in conjunction with both young and old groundwater residence time tracers to better understand the dynamic nature of hydrologic flowpaths at a sub-humid mountain catchment in Arizona, USA. Please note that our work builds upon previous efforts [Kirchner, 2016a; b; Stewart et al., 2017] that show that spatial and temporal aggregation errors are lower when using the fraction of young water metric compared to the mean transit time metric. However, as acknowledged by the reviewer, these efforts are made using a virtual experimental setup. Therefore, the principal contribution of our study is co-application of these two metrics, i.e., fraction of young water and mean transit time, using multiple tracer types for a real world catchment. Additionally, our study makes the following specific contributions:
  1. Use of multiple methods to estimate Fyw that demonstrate the variability associated with sampling frequency, hydroclimate, the method used in its estimation, and the processes that dictate streamflow generation.
  2. As most of the existing literature on the fraction of young water metric focuses on Fyw for annual or seasonal tracer cycles, we estimated Fyw not only for annual cycles but also for several periods ranging from 2 days to 5 years using stable water isotope tracers.
  3. Delineation of a consistent mathematical framework to estimate Fyw using both young- and old-groundwater age tracers. Our proposed framework is flexible and can be easily employed at sites with long-term observations of young and old groundwater age tracers in inflow and outflow.
  4. Characterization and discussion of the limitations of tritium-based Fyw estimates due to a common lack of long-term data, coarse and/or sparse tritium concentration time series observations, and/or lack of measurement precision.
  5. Description of an alternative approach to more reliably estimate deep subsurface storage.
  6. Characterization of dynamic flowpaths that reorganize and restructure with catchment storage through the use of multiple metrics.
  7. Identification of a threshold short-term storage that, once reached, increases the propensity for precipitation to infiltrate and activate deeper flow paths.
  In light of these contributions, we believe our study will be broadly useful to hydrological researchers and practitioners that rely on either Fyw or mean transit time metrics to understand the subsurface residence time of water, or that aim to constrain the links between water quantity and quality as water moves along subsurface flow paths. However, based on this reviewer’s comment, we now recognize that we did not do a sufficient job of communicating these contributions to hydrologic scientists, and will improve the discussion in this regard in the revision.
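
For orientation, the fraction of young water is often approximated as the ratio of the seasonal-cycle amplitude in streamflow to that in precipitation. The sketch below fits each amplitude by ordinary least squares, which is a simpler stand-in for the IRLS and weighted wavelet methods named in the abstract; the function names are hypothetical and the two isotope series are synthetic, not data from the study.

```python
import math


def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, 3):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, 4):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, 3))
        x[i] = (aug[i][3] - tail) / aug[i][i]
    return x


def seasonal_amplitude(t_years, values, cycles_per_year=1.0):
    """Least-squares fit of a*cos + b*sin + k; returns the amplitude sqrt(a^2 + b^2)."""
    omega = 2 * math.pi * cycles_per_year
    rows = [(math.cos(omega * t), math.sin(omega * t), 1.0) for t in t_years]
    # Normal equations (X^T X) p = X^T y for the coefficients (a, b, k).
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(3)]
    a, b, _ = solve3(xtx, xty)
    return math.hypot(a, b)


def young_water_fraction(t_precip, c_precip, t_stream, c_stream):
    """Amplitude-ratio estimate: stream amplitude divided by precipitation amplitude."""
    return seasonal_amplitude(t_stream, c_stream) / seasonal_amplitude(t_precip, c_precip)


# Synthetic monthly delta-18O series over three years (permil).
t = [k / 12 for k in range(36)]
precip = [-9.0 + 3.0 * math.cos(2 * math.pi * x) for x in t]
stream = [-9.0 + 0.6 * math.cos(2 * math.pi * x - 0.8) for x in t]
print(f"F_yw estimate: {young_water_fraction(t, precip, t, stream):.3f}")
```

Because the fit uses explicit sample times, the same functions accept irregularly sampled series; changing `cycles_per_year` gives the amplitude for cycles other than the annual one, although this sketch applies no volume weighting or robust re-weighting.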
  1.5 My first recommendation for revision would include omission of methods and results that are based on lumped-parameter transport modeling using time-invariant TTDs. My second recommendation for revision would be a deep consideration of what is the specific gap in knowledge that the research would address, and a deeper discussion about what all these (somewhat abstract) age metrics tell us about flow processes, or the linkage between water age and catchment structure, that we didn’t already know. Emphasis should be placed on new knowledge that may be generalizable across watersheds. This is especially important since the study focuses on a single watershed where quite a lot of tracer-aided flow and transport studies have already been conducted [Heidbuchel et al., 2013; Heidbuchel et al., 2012; Lyon et al., 2008; 2009], including multiple previous works by the lead author.
  Our response: Thank you. We will revise accordingly. Please see our response to comments 1.1 and 1.2 above.
  Specific comments on content in the text:
  Line 23: The phrase “single age tracer” is unclear and not conventional. Please omit or rephrase.
  
  Our response: Thank you. The rephrased sentence between lines 21 and 23 now reads: “Current understanding of the dynamic flow paths and subsurface water storages that support streamflow in mountain catchments is inhibited by the lack of long-term hydrologic data and the frequent use of short residence time tracers that are not applicable to older groundwater reservoirs.”
  Lines 42-51: The rationale provided here is not very strong. To say that the processes being studied are “still incompletely understood” is not a very effective way to communicate (1) what aspects of the processes are well understood (because certainly a lot of prior knowledge does exist), (2) what is/are the explicit knowledge gap/s, and (3) how this research is designed to specifically address that knowledge gap.
  
  Our response: In the revised version of the main document, we plan on revising text between lines 42 and 51 to clearly state existing knowledge gaps and study objectives. Please see also our response to comment # 1.4 above.
  Lines 59-61: Doesn’t really make sense to say that an underestimated quantitative metric affects actual substrate weathering in Earth’s crust. The conjunctive phrase “As a result” in the following sentence seems out of place. Consider rephrasing.
  
  Our response: Our intention was to point out that an underestimated residence time can lead to inaccurate understanding or estimate of subsurface mineral weathering rate (as shown by Frisbee et al. [2013]). In the revised version of the document, we plan to revise text between lines 59 to 62 to address this comment.
  Lines 70-71: I think you should temper the language here. Robust? That implies the result should be representative across a range of systems. Yet the papers you cite apparently rely on simulated experiments in synthetic landscapes. Any evidence from the real world that you could cite?
  
  Our response: In the revised document, we used “more accurate” instead of the word “robust”. Please note that the use of “robust” in the original document refers to only one site and not to any range of systems. Our use of the word is also motivated by the results reported by previous efforts [Kirchner, 2016a; b; Stewart et al., 2017] that show that the spatial and temporal aggregation errors are much smaller when using the fraction of young water metric than when using the mean transit time metric. However, as acknowledged by the reviewer, these efforts were made using a virtual experimental setup. Therefore, co-application of these two metrics, i.e., fraction of young water and mean transit time, using multiple tracer types for a real catchment, is the novel contribution of our study.
  Lines 74-76: To help emphasize the knowledge gap, I think you need to clarify what exactly is meant by “only one period”.
  
  Our response: Here, our intention was to highlight that most of the literature on fraction of young water metric is focused on Fyw for annual or seasonal tracer cycles. In our work, we estimated Fyw for not only annual time frames but also for shorter periods ranging from 2 days to 5 years (Figure 6A and B).
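  For illustration only, the amplitude-ratio idea behind Fyw for a chosen cycle period (following Kirchner [2016a]) can be sketched as below. This is not the WWT or IRLS procedure used in the manuscript: it is a plain Fourier projection that assumes evenly spaced samples covering a whole number of cycles, and the function names and the synthetic δ18O series are hypothetical.

```python
import math

def cycle_amplitude(values, times, period):
    """Amplitude of the sinusoid with the given period (same units as times).

    Assumption: samples are evenly spaced and span a whole number of periods,
    so a simple Fourier projection recovers the cycle amplitude.
    """
    n = len(values)
    mean = sum(values) / n
    w = 2.0 * math.pi / period
    a = 2.0 / n * sum((v - mean) * math.cos(w * t) for v, t in zip(values, times))
    b = 2.0 / n * sum((v - mean) * math.sin(w * t) for v, t in zip(values, times))
    return math.hypot(a, b)

def young_water_fraction(precip, stream, times, period=365.0):
    """Approximate Fyw as the stream-to-precipitation amplitude ratio."""
    return cycle_amplitude(stream, times, period) / cycle_amplitude(precip, times, period)

# Hypothetical daily series over two annual cycles (values in per mil).
times = list(range(730))
precip = [-8.0 + 4.0 * math.sin(2.0 * math.pi * t / 365.0) for t in times]
stream = [-8.0 + 1.0 * math.sin(2.0 * math.pi * t / 365.0 - 1.0) for t in times]

fyw = young_water_fraction(precip, stream, times)
print(round(fyw, 3))
```

  Passing a different `period` evaluates the same ratio for a shorter or longer cycle, provided the record still covers a whole number of those cycles and the sampling interval resolves them.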
  Line 90: The question “what is the appropriate TTD type” is somewhat unclear. Precipitation is episodic. There is not a continuous inflow of water volumes with different ages entering any watershed. Therefore, the distribution of transit times (i.e., exit time – entry time) must also be discontinuous. This fact is illustrated by real TTDs observed from active tracer introductions [e.g., Kim et al., 2016a]. They are quite messy and not continuous distributions. Any continuous function that is chosen as a TTD for application in lumped-parameter-transport modeling is therefore just an approximation of reality. If that is accepted as true, then it seems your question could be restated as “what mathematical distribution yields simulation results that best fit the data from this particular watershed?”. That is not a question of great relevance for scientific hydrology in general, in my opinion.
  
  Our response: Note that our complete research question # 1 is “what is the appropriate TTD type and mTT for the deep groundwater system that supports streamflow?” Thus, our focus is on finding an appropriate transit time distribution type and mean transit time for deep groundwater. Please see also our response to comment # 1.2 above.
  Lines 91-93: I have a very hard time interpreting this sentence. Please rephrase. Again, I would suggest carefully explaining, or omitting, the phrase “age tracers”. Are you meaning to distinguish stable isotopes from radioactive isotopes?
  
  Our response: As most of the existing literature on the fraction of young water metric is based on stable water isotope or chloride tracers, the aim here was to extend this literature by including tritium tracer-based fraction of young water estimates. The rephrased question 2 between lines 91 and 93 reads “How do the Fyw and storage estimates vary between shallow and deep groundwaters for a high elevation mountainous catchment?” Please see also our response to comment # 1.2 above.
  Lines 93-94: Suggest deleting “...as determined by stable water isotope tracers”. It implies to the reader that the answer to your more general question (i.e., “what is the discharge sensitivity of F_yw”) is somehow conditional on this particular data type? Is that in fact what you think? If so, it raises some concern about the generality of the results.
  
  Our response: Thank you. We plan on revising the third research question in the revised version of the main document.
  Lines 95-100: Suggest deleting all of this. A prelude to the methods elaborated on the following pages is unnecessary. The concluding paragraph of the introduction should highlight the identified knowledge gap then state the objectives of this study and how they address that gap. The final sentence raises some concern that the current work is partially redundant.
  
  Our response: In the revised version of the main document, we plan on revising the text between lines 94 and 100 to better highlight the knowledge gap and to more concisely state the study objectives and how these study objectives address the identified knowledge gaps.
  Lines 112-113: So it was a notably drier than average 9 years, or the PRISM results are biased high here?
  
  Our response: It was notably drier.
  Lines 130-131: This is a very coarse sampling resolution for the intended application of the data. Undoubtedly there are tremendous temporal dynamics in the stable-isotope composition of precipitation within and among individual storms that occur during 5-7 day intervals. The range of stable-isotope abundances in precipitation observed during individual storms may be comparable or greater to the range observed among monthly-aggregated samples collected across years [e.g., Rozanski et al., 1993]. The true temporal dynamics of tracer concentration in precipitation are lost in a lumped sample that aggregates over 5-7 days. Any quantitative model that uses those tracer concentrations as input will be very limited in its ability to accurately simulate the temporal dynamics of the same tracer in the stream. That limitation seems very germane to the stated objectives of this study. Passive, sequential sampling devices are easy to make and deploy. Analysis of stable isotope abundances by laser spectrometry for large sample numbers is relatively inexpensive. This data limitation is hard to excuse.
  
  Our response: The purpose of this study was not to explicitly simulate high temporal resolution dynamics of tracer concentrations in stream water. In contrast, by using the stable water isotope tracers, our study seeks to estimate the fraction of young water metric in a time averaged sense to compare and contrast the value and information contained in that metric at our field site to literature-reported data from other sites, in order to better understand dynamic flow path behavior. It is important to note that:
  Whatever sampling interval is chosen, there will be shorter periods of data variation that are not sampled, a characteristic Nyquist frequency, and a range of frequency responses that cannot be addressed. We have limited our interpretations to responses that can be addressed using the data available.

  The physics of our system will act to filter out very high frequency variations in isotopes in precipitation. Soils are wetted by rain and remain wet as more rainwater is added – mixing is inevitable. Runoff flowing in the main stem of a stream is a mixture of flow from small tributaries of different length, water held on wet leaves and water held in leaf litter or very shallow soil. Given that most of our summer storms are < 1 hour in length, mixing at shorter time scales is likely.

  High-frequency variations (which indeed cannot be addressed in our study because of the 5-7 day sampling) are of less interest than lower-frequency, i.e., longer-period, phenomena, as we seek estimates of the mean transit time and fraction of young water metrics in a time-averaged sense.
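  As a worked illustration of the sampling limit discussed in the three points above, and assuming samples regularly spaced at an interval Δt (the 5-7 day precipitation sampling is only approximately regular, so this is a rough bound rather than an exact one), the Nyquist relation gives the shortest period that can be resolved:

```latex
f_N = \frac{1}{2\,\Delta t},
\qquad
T_{\min} = \frac{1}{f_N} = 2\,\Delta t,
\qquad
\Delta t = 5\text{--}7\ \mathrm{d}
\;\Rightarrow\;
T_{\min} \approx 10\text{--}14\ \mathrm{d}.
```

  Under this assumption, precipitation tracer variations with periods shorter than roughly 10 to 14 days cannot be distinguished from longer-period signals in the sampled series.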
  
  Lines 149-151: I can’t quite understand what this means. Please consider rephrasing.
  
  Our response: In the revised version of the main document, we plan on revising the sentence between lines 148 and 151 for better readability.
  Line 164-166: Simplify the headings and sub-headings. Here and elsewhere there are sub-headings with no content underneath. Suggest deleting.
  
  Our response: In the revised version of the document, we plan on providing an improved structure of each section. We further plan to simplify section headings.
  Line 171: When you say “thereafter”, do you mean over longer time increments than 1 month? Please rephrase to clarify.
  
  Our response: On line 171, when we say “thereafter”, we meant for longer periods. We plan to rephrase this line for clarity in the revised version of the main document.
  Line 176: Some formatting inconsistencies with citations here and throughout the manuscript. Uneven use of open and closed parentheses and lack of spacing between cited papers within in-line citations. Proofread carefully. Suggest using “[(“ instead of duplicate parentheses. Also, I cannot find the Dwivedi 2019b entry in your bibliography. Is it missing? Put spaces between entries in the bibliography. It is terribly difficult to read through single spaced.
  
  Our response: Thank you. In the revised version of our paper, we plan to address this comment by: (i) using a consistent representation for multiple citations, (ii) providing the complete reference for the Dwivedi et al. [2019] citation, and (iii) using double spacing for the whole main document for better readability.
  Line 177: “expand on these results” again seems to suggest this is somewhat redundant with the previous works from the same catchment.
  
  Our response: Our statement regarding “expanding on these results” when referring to previous work of Ajami et al. [2011] or Dwivedi et al. [2019], is meant to say that our present work aims to further improve our understanding of deep groundwater flow paths by characterizing their transit time distributions and evaluating if the state-of-the-art fraction of young water metric is appropriate for deep groundwater. This has not been reported in the literature, and thus our work makes a novel contribution of assessing the appropriateness of Fyw metric for deep groundwater flow paths. Please also note that Ajami et al. [2011] have not considered either transit time distribution or Fyw for deep groundwater and Dwivedi et al. [2019] have not considered various TTD types for deep groundwater.
  Lines 185-190: Pretty sure h(tau) is the specified functional form of the TTD, but that is not stated in the paragraph.
  
  Our response: In our work, a TTD is referred to as h(τ) where τ is the transit time (in years). To address this comment, in the revised version of the paper h(τ) will be used between lines 185-190.
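  Equation 1 of the manuscript is not reproduced in this response. For readers unfamiliar with the notation, a commonly used form of the lumped-parameter convolution and of a gamma-type h(τ) is sketched below; the symbols α (shape), β (scale), and λ (decay constant) are assumptions of this sketch and may differ from the manuscript's notation.

```latex
C_Q(t) = \int_0^{\infty} C_P(t-\tau)\, h(\tau)\, e^{-\lambda \tau}\, d\tau,
\qquad
h(\tau) = \frac{\tau^{\alpha-1}}{\beta^{\alpha}\,\Gamma(\alpha)}\, e^{-\tau/\beta},
\qquad
\mathrm{mTT} = \alpha\beta .
```

  Here λ = ln 2 / t_{1/2}, with t_{1/2} ≈ 12.32 years for ³H, and λ = 0 for stable isotopes such as δ¹⁸O, which do not decay.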
  Line 193: I am not familiar with the DownHill Simplex method. It is described in a single sentence, yet it is apparently the method for evaluating how appropriate is one versus the other TTD model. Could you please elaborate just a little bit on what this is for the unaffiliated reader? The KGE is used as the “model performance criteria” but you say that the Downhill Simplex was used to evaluate “the performance of each TTD”. This is confusing to me. Equation 1 is the model, but the variable performance of the model is due only to the selection of different functional forms of the TTD. So the performance of the model is a direct reflection of the performance of the function selected as TTD, no? Please clarify.
  
  Our response: We agree with the reviewer that the performance of the model, i.e., Equation 1 in our paper, is based on the performance of a selected TTD type. As far as the Downhill Simplex method is concerned, it is a way to “search” for the model parameters (e.g., mean age and the shape parameters for a gamma TTD) such that the model performance is optimal, which in our work is assessed by using the modified Kling-Gupta efficiency. In the revised version of our paper, we plan to explain this method in more detail for an unfamiliar reader.
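  To make the roles of the two pieces concrete, the sketch below shows only the performance criterion that a simplex search would try to maximize. It assumes the modified Kling-Gupta efficiency takes the widely used form of Kling et al. (2012), built from correlation, bias ratio, and variability ratio; the manuscript's exact definition is not restated here, and the function name and sample values are hypothetical.

```python
import math

def modified_kge(sim, obs):
    """Modified Kling-Gupta efficiency (assumed form: Kling et al., 2012).

    KGE' = 1 - sqrt((r - 1)^2 + (beta - 1)^2 + (gamma - 1)^2), where
    r is the linear correlation, beta the ratio of means, and gamma the
    ratio of coefficients of variation (simulated over observed).
    Not guarded against zero variance or zero means.
    """
    n = len(obs)
    mean_s = sum(sim) / n
    mean_o = sum(obs) / n
    sd_s = math.sqrt(sum((s - mean_s) ** 2 for s in sim) / n)
    sd_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs) / n)
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(sim, obs)) / n
    r = cov / (sd_s * sd_o)
    beta = mean_s / mean_o
    gamma = (sd_s / mean_s) / (sd_o / mean_o)
    return 1.0 - math.sqrt((r - 1.0) ** 2 + (beta - 1.0) ** 2 + (gamma - 1.0) ** 2)

# Hypothetical tracer concentrations: a perfect fit and a fit biased by a factor of 2.
obs = [2.0, 4.0, 6.0]
perfect = modified_kge(obs, obs)
biased = modified_kge([4.0, 8.0, 12.0], obs)
print(perfect, biased)
```

  A simplex optimizer then repeatedly proposes TTD parameters, runs the convolution model to obtain the simulated series, and keeps the parameters that minimize `1 - modified_kge(sim, obs)`. One widely available implementation of such a search is SciPy's `scipy.optimize.minimize` with `method="Nelder-Mead"`; whether the authors used that implementation is not stated in this response.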
  Line 228, equation 5: Use “C” with “Q” and “P” subscripted to indicate concentration in streamflow versus precipitation. You already adopted this notation in equation 1. Be consistent here and in subsequent equations.
  
  Our response: Please note that Equation (5) on line 228 in the original version of the paper is for input tracer flux, which is a product of tracer concentration (C) and precipitation flux (P). Similarly, Equation (6) is for tracer flux in stream water, which is the product of tracer concentration C and streamflow (Q).
  Lines 348-357: What about all the other models? You only discuss PF and Gamma. The KGE of the 1d-ADE falls exactly between the values for the Gamma and PF models, yet the mTT estimated by the 1d-ADE is factors of 8-9 less than the mTT from those models, respectively. Why do you ignore the other models and what do you conclude from this order-of-magnitude difference? If I understand Figure 4 correctly, then only the “ADE-nx” and Exponential function as TTDs seem to generate uniquely identifiable parameters. Is that correct? Neither model is discussed at all here.
  
  Our response: Please see our response to comment # 1.3 above.
  Lines 390-392: The data are also far too sparse to reliably fit the parameters for TTDs used in equation 1. Isn’t this confirmed by (1) the lack of unique solutions illustrated in most cases shown in Figure 4 and (2) the generally poor accuracy of all model simulations shown in Figure 5? I would argue yes.
  
  Our response: Between lines 390-392 in the main text, our emphasis is on fitting sinusoidal curves to the tritium concentration data for tritium-based Fyw estimation. We noted that sinusoidal curve fitting was not appropriate for the tritium tracer data due to the coarse resolution of our dataset, which also led to a higher standard error in the estimated tracer cycle amplitude. However, we were able to identify unique solutions for gamma distribution type TTD parameters when using the tritium tracer under low-flow conditions (Figure 4B in the main document). Please note that Figure 4 shows the response surface for various TTD types. Therefore, neither applicability nor poor fit of a particular TTD type should be considered as an indication of data limitation alone. For example, assumptions made during derivation of the model equation for a particular TTD type may also contribute to a poor data fit. A case in point is the multiple-paths advection and dispersion model type TTD (ADE-nx) of Kirchner et al. [2001]. When deriving the model for this TTD type, it is assumed that the recharge to an aquifer is spatially distributed. When tracer concentration predictions from this TTD are tested against the observations for our study site, the results show a very poor model fit (Figures 4D and 5D). Therefore, a poor fit for the ADE-nx model can be hypothesized as indicating that “recharge to the fractured bedrock aquifer is not spatially distributed at our field site”. This is an important finding, because such recharge pathways can lead to replenishment of deep subsurface storages that support mountain block recharge to valley fill aquifers.
  Lines 400-414: The text makes no allusion at all to Figure 7D, which has unusual qualitative axes and cannot be easily interpreted by the reader. The figure caption only provides a citation to a previous work to explain the graphic. More explanation is needed in this section of the Results, or Figure 7B should be deleted.
  
  Our response: We are confused by this comment because Figure 7 in our work only has A and B panels: Figure 7A is cited in section 3.3 (line # 302) and in section 4.4 (line # 400), Figure 7B is cited on line 549 and 550 in section 5.4.
  Lines 437-440: I am unclear what is the importance or relevance of this concept of “short-term storage”. What is it and why does it matter? In any case, you present estimates of this metric based on three competing approaches that vary by a factor of approximately 125 (0.08, 0.22, and 10.7)! Which, if any, should we believe is correct, and why?
  
  Our response: In contrast to the traditional approach for estimating subsurface storage that requires values for a subsurface property (e.g., porosity), the short-term storage can be estimated without requiring information on porosity by simply using the Fyw metric in conjunction with streamflow and threshold age for young water [Jasechko et al., 2016]. By quantifying this metric for our study site, we noted (between lines 562 and 564 of the original paper) “Thus, after a threshold of 0.05 m short-term near-surface storage at MGC, the current study supports that infiltration may activate deeper groundwater flowpaths [Dwivedi et al., 2019].” Before our reported value of this storage, no such estimates were available in the literature for our study site. Please see also our response to comment # 1.1.
  Lines 472-477: The results are highly dependent on the temporal resolution of the input time series. As I noted in a comment above, if a temporal dynamic in the tracer concentration in precipitation is hidden within a sample that accumulated over 5-7 days, then the model can’t possibly simulate the effects of that dynamic in streamflow. The results are entirely dependent on the resolution of sampling the tracer concentration in inflow, and the resolution used in this study is quite coarse.
  
  Our response: We agree with the reviewer that we cannot ask a model to reveal a higher resolution pattern than the resolution of the data used in the model. Further, we also agree with the reviewer that the Fyw results are dependent on the temporal resolution of the data used in computing Fyw, which is the point we tried to make in the lines cited by the reviewer. However, we respectfully disagree with the reviewer that “the resolution used in this study is quite coarse”. Please note that for the stable water isotope-based Fyw, our data have a high resolution, i.e., daily for streamflow and approximately weekly for precipitation, as it does not rain every day at our study site [Heidbüchel et al., 2012]. For the tritium-based Fyw, while our data have coarse resolution because we sampled low-flow conditions, they nonetheless facilitate a better understanding of the longer-period component, which would otherwise be hidden when sampling dynamic flow conditions from a catchment.
  Lines 499-501: You make a sweeping assumption here that the bedrock at several research sites is “water tight”. That seems quite speculative. What evidence supports this assumption? Most rocks are fractured and jointed to some extent. Even exposed, granitic plutons commonly have sufficient fracturing and water storage capacity to host woody-stemmed plant communities and support inter-storm flow from emergent springs. More support for this assumption is needed here, perhaps through more extensive synopsis of the geology of the sites used in these cited studies.
  
  Our response: Thank you for your comment. We concur with you completely. However, our use of the term “water tight” was a quotation from Gallart et al. [2020] who were describing their field site. We have made no assumptions about the tightness of the bedrock aquifer at other sites. Between lines 499 to 501 in the main document we stated the following “however, the fractured bedrock at MGC is functionally distinct than the "watertight" bedrock characterized by Gallart et al. [2020] and the majority of humid sites in Jasechko et al. [2016] that are comparable in size to MGC.”
  Lines 538-539 and 544-546: So, you’re saying the fraction of young water estimates are invalid when based on the use of Tritium as a tracer?
  
  Our response: To avoid ambiguity regarding the Fyw metric when using tritium for low flow conditions, we stated “A negligible F_yw at MGC calls into question of the suitability of the ³H-based F_yw approach for deeper groundwater.” Thus, our intent is to suggest (based on our study results) that this metric is unsuitable when applied to the baseflow or deeper groundwater component of a catchment’s flow system due to its longer residence time and significant mixing-amplitude dampening in the subsurface.
  Lines 555-564: Here there are a series of sentences elaborating some intricate details of methodology which seem misplaced in the discussion. They lead to the ultimate conclusion at the end of the paragraph that “infiltration may activate deeper groundwater flowpaths”. That is not a novel conclusion in scientific hydrology, and it is not even stated definitively here (i.e., may..). This paper uses a wide ensemble of methodological approaches, which, from my view, has only created ambiguity in how the markedly contrasting results can be interpreted. I find no new insight into hydrological processes resulting from all this computational effort.
  
  Our response: Thank you! The three cases mentioned in these lines will be placed in the methods sections in the revised main document. Please see also our response to comment # 1.1 above.
  Comments on Figures and Tables:
  Figure 3: Does the inset have a linear scale? If not, please make it linear. If so, please add more tick marks to the vertical axis so we can approximate the numeric values of the data points. Are these six data points all you have to calibrate the parameters of the TTD models used with equation 1? If so, that seems inadequate.
  
  Our response: The inset plot has a logarithmic scale, as does the main plot. This will be clearly stated in the revised version of this figure. Please see also our response to comment # 1.2 above.
  Figure 4: Is the vertical axis the KGE? If so, please label it that way. I am unclear what the variable “response surface” on the vertical axis indicates.
  
  Our response: A revised version of this figure that addresses your comment will be provided with the revised main document.
  Table 1: Words and numbers should not be split between rows. Use emboldened lines, or no lines, to better delineate the content. This is not acceptable for a journal article. Please make it more presentable.
  
  Our response: A revised version of Table 1 that addresses your comment will be provided with the revised version of the main document.
  Figure 5: Y-axis labels should have “3” as a superscript preceding “H” to conform to established conventions of symbolizing isotopes. Would suggest compressing this into a single graph with a legend indicating the results from different TTD models. The gray dots are the same across all 5 subplots.
  
  Our response: In the revised version of Figure 5, we plan on: (i) including all TTDs in a single plot and (ii) properly labeling the y-axis.
  Figure 7: Here and elsewhere the font size is illegible. Please enlarge font on axes and in legends.
  
  Our response: Thank you. This figure will be revised in the next version of the main document.
  Works Cited:
  Botter, G., E. Bertuzzo, and A. Rinaldo (2010), Transport in the hydrologic response: Travel time distributions, soil moisture dynamics, and the old water paradox, Water Resources Research, 46, W03514, doi:10.1029/2009wr008371.
  Botter, G., E. Bertuzzo, and A. Rinaldo (2011), Catchment residence and travel time distributions: The master equation, Geophysical Research Letters, 38, L11403, doi:10.1029/2011gl047666.
  Harman, C. J. (2015), Time-variable transit time distributions and transport: Theory and application to storage-dependent transport of chloride in a watershed, Water Resources Research, 51(1), 1-30, doi:10.1002/2014wr015707.
  Heidbuchel, I., P. A. Troch, and S. W. Lyon (2013), Separating physical and meteorological controls of variable transit times in zero-order catchments, Water Resources Research, 49(11), 7644-7657, doi:10.1002/2012wr013149.
  Heidbuchel, I., P. A. Troch, S. W. Lyon, and M. Weiler (2012), The master transit time distribution of variable flow systems, Water Resources Research, 48, W06520, doi:10.1029/2011wr011293.
  Kim, M., L. A. Pangle, C. Cardoso, M. Lora, T. H. M. Volkmann, Y. Wang, C. J. Harman, and P. A. Troch (2016), Transit time distributions and StorAge Selection functions in a sloping soil lysimeter with time-varying flow paths: Direct observation of internal and external transport variability, Water Resources Research, 52(9), 7105-7129, doi:10.1002/2016WR018620.
  Lewis, S., and A. Nir (1978), On tracer theory in geophysical systems in the steady and non-steady state. Part II. Non-steady state - theoretical introduction, Tellus, 30, 260-271.
  Lyon, S. W., S. L. E. Desilets, and P. A. Troch (2008), Characterizing the response of a catchment to an extreme rainfall event using hydrometric and isotopic data, Water Resources Research, 44(6), W06413, doi:10.1029/2007wr006259.
  Lyon, S. W., S. L. E. Desilets, and P. A. Troch (2009), A tale of two isotopes: differences in hydrograph separation for a runoff event when using delta D versus delta O-18, Hydrol. Process., 23(14), 2095-2101, doi:10.1002/hyp.7326.
  McGuire, K. J., and J. J. McDonnell (2006), A review and evaluation of catchment transit time modeling, J. Hydrol., 330(3-4), 543-563, doi:10.1016/j.jhydrol.2006.04.020.
  Rinaldo, A., K. J. Beven, E. Bertuzzo, L. Nicotina, J. Davies, A. Fiori, D. Russo, and G. Botter (2011), Catchment travel time distributions and water flow in soils, Water Resources Research, 47, doi:10.1029/2011wr010478.
  Rozanski, K., L. Araguas-Araguas, and R. Gonfiantini (1993), Isotopic patterns in modern global precipitation, American Geophysical Union Monographs, 78, 1-36.
  
  van der Velde, Y., P. J. J. F. Torfs, S. E. A. T. M. van der Zee, and R. Uijlenhoet (2012), Quantifying catchment-scale mixing and its effect on time-varying travel time distributions, Water Resour. Res., 48(6), W06536, doi:10.1029/2011wr011310.
  References
  Ajami, H., P. A. Troch, T. Maddock, T. Meixner, and C. Eastoe (2011), Quantifying mountain block recharge by means of catchment-scale storage-discharge relationships, Water Resources Research, 47(4), 1-14.
  Botter, G., E. Bertuzzo, and A. Rinaldo (2011), Catchment residence and travel time distributions: The master equation, Geophysical Research Letters, 38(11), 1-6.
  Dwivedi, R., T. Meixner, J. McIntosh, P. A. T. Ferré, C. J. Eastoe, G.-Y. Niu, R. L. Minor, G. Barron-Gafford, and J. Chorover (2019), Hydrologic functioning of the deep Critical Zone and contributions to streamflow in a high elevation catchment: testing of multiple conceptual models, Hydrological Processes, 33, 476-494, doi: 10.1002/hyp.13363.
  Dwivedi, R., C. Eastoe, J. F. Knowles, L. Hamann, T. Meixner, P. A. T. Ferre, C. Castro, W. E. Wright, G.-Y. Niu, R. Minor, G. A. Barron-Gafford, N. Abramson, B. Mitra, S. A. Papuga, M. Stanley, and J. Chorover (2021), An improved practical approach for estimating catchment-scale response functions through wavelet analysis, Hydrological Processes, 35(3), 1-20.
  Frisbee, M. D., J. L. Wilson, J. D. Gomez-Velez, F. M. Phillips, and A. R. Campbell (2013), Are we missing the tail (and the tale) of residence time distributions in watersheds?, Geophysical Research Letters, 4633–4637.
  Gallart, F., M. Valiente, P. Llorens, C. Cayuela, M. Sprenger, and J. Latron (2020), Investigating young water fractions in a small Mediterranean mountain catchment: Both precipitation forcing and sampling frequency matter, Hydrological Processes, 34(17), 3618-3634.
  Gleeson, T., K. M. Befus, S. Jasechko, E. Luijendijk, and M. B. Cardenas (2015), The global volume and distribution of modern groundwater, Nature Geoscience, doi: 10.1038/ngeo2590.
  Harman, C. J. (2015), Time-variable transit time distributions and transport: Theory and application to storage-dependent transport of chloride in a watershed, Water Resources Research, 51, 1-30.
  Heidbüchel, I., P. A. Troch, and S. W. Lyon (2013), Separating physical and meteorological controls of variable transit times in zero-order catchments, Water Resources Research, 49(11), 7644-7657.
  Heidbüchel, I., P. A. Troch, S. W. Lyon, and M. Weiler (2012), The master transit time distribution of variable flow systems, Water Resources Research, 48(6), 1-19.
  Hrachowitz, M., P. Benettin, B. M. van Breukelen, O. Fovet, N. J. K. Howden, L. Ruiz, Y. van der Velde, and A. J. Wade (2016), Transit times-the link between hydrology and water quality at the catchment scale, Wiley Interdisciplinary Reviews: Water, doi: 10.1002/wat2.1155.
  Jasechko, S., J. W. Kirchner, J. M. Welker, and J. J. McDonnell (2016), Substantial proportion of global streamflow less than three months old, Nature Geoscience, 9(2), 126-129.
  Kirchner, J. W. (2016a), Aggregation in environmental systems- Part 1: Seasonal tracer cycles quantify young water fractions, but not mean transit times, in spatially heterogeneous catchments, Hydrology and Earth System Sciences, 20(1), 279-297.
  Kirchner, J. W. (2016b), Aggregation in environmental systems -Part 2: Catchment mean transit times and young water fractions under hydrologic nonstationarity, Hydrology and Earth System Sciences, 20(1), 299-328.
  Kirchner, J. W., X. Feng, and C. Neal (2001), Catchment-scale advection and dispersion as a mechanism for fractal scaling in stream tracer concentrations, Journal of Hydrology, 254, 82-101.
  Stewart, M. K., U. Morgenstern, M. A. Gusyev, and P. Maloszewski (2017), Aggregation effects on tritium-based mean transit times and young water fractions in spatially heterogeneous catchments and groundwater systems, and implications for past and future applications of tritium, Hydrology and Earth System Sciences, 21, 4615–4627.
  Suckow, A. (2014), The age of groundwater – Definitions, models and why we do not need this term, Applied Geochemistry, 50, 222-230.
  
  Citation: https://doi.org/10.5194/hess-2021-355-AC3
RC2:
'Comment on hess-2021-355', Anonymous Referee #2, 15 Oct 2021

The work of Dwivedi et al. studies travel times in the Marshall Gulch research catchment, Arizona, for a better understanding of flow paths and storage in a mountain catchment. This is done through a strong data set of stable isotopes and tritium. The paper is mostly well written. I like and appreciate the combination of tritium and stable isotopes, I believe that this is an important endeavor. However, the current manuscript has a range of serious limitations.

First, the introduction reads like a patchwork of ideas and concepts but stays vague and thus does not convincingly outline a limitation or research gap. Thus, the research question came somewhat out of the blue for me. I was not able to find any information on whether research on these objectives is needed or not. After reading the full manuscript, I felt that this is even more important, as the work reads like a compilation of applied methods without a clear strategy, concluding only that there are different results depending on tracer and method.

Second, the methods are an issue. It is unclear why the methods were chosen. It feels like a range of methods was applied to see what comes out; I cannot find a clear strategy behind them. Even more critically, by applying time-invariant approaches for travel time distributions, the methodology lags a decade behind recent developments in the field (see the wide range of work, even cited, on time-variant TTDs and SAS functions). The young water fraction is state of the art, but here the work again suffers from the lack of a clear strategy. In addition, stable isotope and tritium tracers should ideally be used in a joint calibration to obtain travel times consistent with both the tritium and stable isotope observations (cf. Rodriguez et al., 2021). You might even be able to calibrate the multimodal age distributions of your travel times by doing so; however, this is just speculation. Yet, this could be a really nice contribution to the field of TTDs.

Overall, I think that the manuscript would need a very major rework to be publishable. This would include a full adaptation of the methods to the state of the art. I doubt that this can be done within a major revision.

Citation: https://doi.org/10.5194/hess-2021-355-RC2
- AC2: 'Reply on RC2', Ravindra Dwivedi, 27 Oct 2021
  
  RC2: 'Comment on hess-2021-355', Anonymous Referee #2
  2.1 The work of Dwivedi et al. studies travel times in the Marshall Gulch research catchment, Arizona, for a better understanding of flow paths and storage in a mountain catchment. This is done through a strong data set of stable isotopes and tritium. The paper is mostly well written. I like and appreciate the combination of tritium and stable isotopes, I believe that this is an important endeavor. However, the current manuscript has a range of serious limitations.
  Our response: We appreciate the reviewer recognizing the benefits of using multiple tracers. To address your concerns and comments, we plan to make major revisions to our paper, as described below. Please also see our specific responses to comments # 2.2 and 2.3.
  2.2 First, the introduction reads like a patchwork of ideas and concepts but stays vague and thus does not convincingly outline a limitation or research gap. Thus, the research question came somewhat out of the blue for me. I was not able to find any information on whether research on these objectives is needed or not. After reading the full manuscript, I felt that this is even more important, as the work reads like a compilation of applied methods without a clear strategy, concluding only that there are different results depending on tracer and method.
  Our response: We acknowledge that some reorganization and restructuring of our paper is clearly warranted. As a roadmap forward, please see our response to comment # 1.2 from the first reviewer, where we more clearly describe the novel contributions of our study. In the revised version of our paper, we plan to reorganize the introduction section to better highlight the research gaps. We further plan to use the revised introduction to guide the reorganization of the other parts of the paper, e.g., methods, results, and discussion. Thank you for your suggestions.
  2.3 Second, the methods are an issue. It is unclear why the methods were chosen. It feels like a range of methods was applied to see what comes out; I cannot find a clear strategy behind them. Even more critically, by applying time-invariant approaches for travel time distributions, the methodology lags a decade behind recent developments in the field (see the wide range of work, even cited, on time-variant TTDs and SAS functions). The young water fraction is state of the art, but here the work again suffers from the lack of a clear strategy. In addition, stable isotope and tritium tracers should ideally be used in a joint calibration to obtain travel times consistent with both the tritium and stable isotope observations (cf. Rodriguez et al., 2021). You might even be able to calibrate the multimodal age distributions of your travel times by doing so; however, this is just speculation. Yet, this could be a really nice contribution to the field of TTDs.
  Our response: Our aim was to apply the fraction of young water metric in conjunction with the mean transit time metric to improve scientific understanding of transient flow paths in high elevation mountain systems. That said, we respectfully disagree with the criticism that our methods are lagging behind recent developments in the field. Please see our response to comment # 1.2 from the first reviewer. We also highlight that Rodriguez et al. [2021], who used both stable water isotopes and tritium to jointly assess a suitable TTD type, sampled both tracers under similar dynamic flow conditions, which contrasts with our sampling conditions for tritium.
  2.4 Overall, I think that the manuscript would need a very major rework to be publishable. This would include a full adaptation of the methods to the state of the art. I doubt that this can be done within a major revision.
  Our response: Please see our responses to comments #1.1, 1.2, 2.2, and 2.3 above.
  References
  Rodriguez, N. B., L. Pfister, E. Zehe, and J. Klaus (2021), Testing the truncation of travel times with StorAge Selection functions using deuterium and tritium as tracers, Hydrol. Earth Syst. Sci., 25, 401–428.
  
  Citation: https://doi.org/10.5194/hess-2021-355-AC2
RC3:
'Comment on hess-2021-355, Anonymous Referee #', Anonymous Referee #3, 17 Oct 2021

General comments

According to the reviewer, the paper needs major changes for the following reasons: (1) the overall manuscript is confusing, and (2) the contributions of this study are not clear.

Although some scientific contributions can be easily seen, it is necessary to make the contributions of this study clear. In general, the manuscript is difficult to read because there is no clear common thread; some sections seem to come out of the blue. Some sections are inconsistent with previous affirmation, which reduces the truthfulness of the manuscript. The overall manuscript is too long, especially the Methods and Results sections.

General comment of each section.

The introduction section is confusing and does not show specific novelty and sound scientific value at global scale. I suggest restructuring the introduction, reinforcing the state of the art of previous studies using isotopes and TT models, and afterwards explaining the novelties of this paper. The terms deep and shallow groundwater flow at mountain range are used without previous explanation. Due to the complexity of these terms, I recommend defining them beforehand. The term fractured rock system is used on some occasions, but it is not clear whether this applies to the study area. If it does, it would be appropriate to explain how fractures are going to be taken into account.

At some point it seems that the authors try to reproduce their previous work, Dwivedi et al. (2021), but for the “deep groundwater”; however, the authors say that this has already been done by Ajami et al. (2011) and Dwivedi et al. (2019b). Again, detailed reasoning about why this paper is novel is needed; it is confusing. The authors highlight the contribution of using multiple years of isotope data, yet only one year of 3H data is used, which is again confusing.

The Data section needs a better description of the collector type to understand the representativeness of the data. I strongly suggest improving Fig. 1. In the manuscript, Fig. 1A and 1B are referred to, but there is no A and B in the figure. I suggest incorporating the rivers, a standard scale (0, 0.5 and 1 km, for example) and a higher resolution DEM; it looks poor.

The Methodology section reads more like a state-of-the-art review of existing methods than something new. There are detailed descriptions of some methods that make the reader lose sight of the main goal of each approximation/estimation. I suggest deleting all dispensable information. The authors say in the introduction that one of the novelties is the use of multiple years of data, yet only one year of 3H data is used. There is a repeated need to redefine the main goals and the novelty of this study.

The Results section is first organized by method, then changes to shallow and deep groundwater and the mix between isotope type and TTD method, and finally includes Fyw and Tyw. This section is difficult to follow.

Explaining which method is better or most reliable in each case, instead of only discussing the existence of differences, would strongly improve the Discussion section. It is not surprising to obtain different results with different methods. I suggest directing the discussion toward explaining lines 577-579.

Specific comments

Line 52: Water stable isotopes: although this term has been used in other works, the term "stable water isotopes" is not correct. Water itself does not have isotopes. The correct term is stable 18O and 2H isotopes of water.

Line 61: Underestimating or overestimating transit times has consequences beyond the correct understanding of water chemistry. It would be appropriate to explain the most important ones.

Line 92: The second goal is not clear; I do not understand what you are trying to study.

Line 180: h(τ) needs to be defined here instead of line 242.

Line 296: Why only one year period?

Line 426: I would say 2-3 years.

Line 440: 10.7 “mm”

Citation: https://doi.org/10.5194/hess-2021-355-RC3
- AC1: 'Reply on RC3', Ravindra Dwivedi, 27 Oct 2021
  
  RC3: 'Comment on hess-2021-355, Anonymous Referee #', Anonymous Referee #3
  General comments
  
  3.1 According to the reviewer, the paper needs major changes for the following reasons: (1) the overall manuscript is confusing, and (2) the contributions of this study are not clear.
  
  Although some scientific contributions can be easily seen, it is necessary to make the contributions of this study clear. In general, the manuscript is difficult to read because there is no clear common thread; some sections seem to come out of the blue. Some sections are inconsistent with previous affirmation, which reduces the truthfulness of the manuscript. The overall manuscript is too long, especially the Methods and Results sections.
  Our response: We appreciate Reviewer 3’s constructive review. These comments are also consistent with those of Reviewers 1 and 2, so it is clear that we need to revise our introduction and justification. Please see our specific responses to comments # 1.1 and 1.2. If “Some sections are inconsistent with previous affirmation” is meant to refer to our previous TTD work [Dwivedi et al., 2021], we note that that study was based on wavelet analysis of high-density tracer-flux time series data, and that neither tritium tracer concentrations nor the fraction of young water metric was used in that study.
  General comment of each section.
  
  3.2 The introduction section is confusing and does not show specific novelty and sound scientific value at global scale. I suggest restructuring the introduction, reinforcing the state of the art of previous studies using isotopes and TT models, and afterwards explaining the novelties of this paper. The terms deep and shallow groundwater flow at mountain range are used without previous explanation. Due to the complexity of these terms, I recommend defining them beforehand. The term fractured rock system is used on some occasions, but it is not clear whether this applies to the study area. If it does, it would be appropriate to explain how fractures are going to be taken into account.
  Our response: Please see our response to comments # 1.1 and 2.1. In the revised version of our paper, we will clearly state our definitions of both shallow and deep groundwater. We note that there are no deep wells within our study site, and therefore the nature of the fracture network within the study area is not known. For this reason, any impacts of the fracture network on groundwater flow paths in our study area are considered to be represented by the tritium TTD.
  3.3 At some point it seems that the authors try to reproduce their previous work, Dwivedi et al. (2021), but for the “deep groundwater”; however, the authors say that this has already been done by Ajami et al. (2011) and Dwivedi et al. (2019b). Again, detailed reasoning about why this paper is novel is needed; it is confusing. The authors highlight the contribution of using multiple years of isotope data, yet only one year of 3H data is used, which is again confusing.
  Our response: Please see our response to comment # 3.1 above. That said, please note that neither Ajami et al. [2011] nor Dwivedi et al. [2019] used or evaluated the fraction of young water metric to reveal the dynamic nature of flow paths at the same or similar sites. Please also note that our statement about the use of multiple years of data applies to the fraction of young water metric, because those data represent water years 2008 through 2012. While our compiled and collected tritium data are sparse, they include sampling dates from 2009 through 2018.
  3.4 The Data section needs a better description of the collector type to understand the representativeness of the data. I strongly suggest improving Fig. 1. In the manuscript, Fig. 1A and 1B are referred to, but there is no A and B in the figure. I suggest incorporating the rivers, a standard scale (0, 0.5 and 1 km, for example) and a higher resolution DEM; it looks poor.
  Our response: Thank you. In response to your comment, Figure 1 will be improved and properly labeled in the revised version of the paper. Additionally, the data section will be updated to describe the collector type.
  3.5 The Methodology section reads more like a state-of-the-art review of existing methods than something new. There are detailed descriptions of some methods that make the reader lose sight of the main goal of each approximation/estimation. I suggest deleting all dispensable information. The authors say in the introduction that one of the novelties is the use of multiple years of data, yet only one year of 3H data is used. There is a repeated need to redefine the main goals and the novelty of this study.
  Our response: Please see our response to comment # 3.3 above. In the revised version of the paper, the methodology section will be shortened and reorganized to remove non-essential information. Please see also our response to comments # 1.1 and 2.1.
  3.6 The Results section is first organized by method, then changes to shallow and deep groundwater and the mix between isotope type and TTD method, and finally includes Fyw and Tyw. This section is difficult to follow.
  Our response: In the revised version of the paper, the results section will be reorganized to ensure that our study findings are conveyed as clearly and accurately as possible.
  3.7 Explaining which method is better or most reliable in each case, instead of only discussing the existence of differences, would strongly improve the Discussion section. It is not surprising to obtain different results with different methods. I suggest directing the discussion toward explaining lines 577-579.
  Our response: The revised version of the document will include an explanation of method reliability when using multiple methods (e.g., for the fraction of young water, Fyw). We appreciate your suggestions related to Fyw and its discharge sensitivity.
  
  Specific comments
  
  3.8 Line 52: Water stable isotopes: although this term has been used in other works, the term "stable water isotopes" is not correct. Water itself does not have isotopes. The correct term is stable 18O and 2H isotopes of water.
  Our response: Our use of the term stable water isotope is similar to use of this term in other studies (e.g., Ajami et al. [2011]; Heidbüchel et al. [2013]; Heidbüchel et al. [2012]). In the revised document, we will clearly state that by stable water isotopes we mean stable δ¹⁸O and δ²H isotopes of water.
  3.9 Line 61: Underestimating or overestimating transit times has consequences beyond the correct understanding of water chemistry. It would be appropriate to explain the most important ones.
  Our response: The sentence between lines 59 and 61 is rephrased to read “Underestimated transit times can have cascading impacts on our understanding of subsurface weathering rates, leading to incorrect understanding of stream water chemistry [Clow et al., 2018; Frisbee et al., 2013].”
  3.10 Line 92: The second goal is not clear; I do not understand what you are trying to study.
  Our response: Our second study objective is to estimate the fraction of young water metric and the corresponding subsurface storage for the dynamic and slow flow components of a catchment system. This objective is achieved by using stable water isotope and tritium tracers sampled during different flow conditions. Please note that when using the fraction of young water metric, short-term subsurface storage can be estimated without knowing aquifer properties [Jasechko et al., 2016]. Thus, the subsurface storage supporting streamflow through baseflow in a high-elevation catchment can be estimated without knowing the effective aquifer properties of fractured bedrock aquifers, which is a great advantage for sites where fractured bedrock aquifers are not well characterized, e.g., our study site. In the revised document, we plan to state this goal more clearly.
  3.11 Line 180: h(τ) needs to be defined here instead of line 242.
  Our response: This will be properly addressed in the revised version of the paper document.
  3.12 Line 296: Why only one year period?
  Our response: We invoked an annual tracer cycle or period on line 296 of the original version of the main document because the previous literature on the use of Fyw has mostly focused on annual tracer cycles. However, using our proposed mathematical model, we evaluated Fyw for various periods using tritium tracers (e.g., Table S5 in the original version of the supporting information document).
  3.13 Line 426: I would say 2-3 years.
  
  Our response: Agreed, we will make this change.
  3.14 Line 440: 10.7 “mm”
  Our response: Thank you!
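  As an illustration of the point made in response 3.12, the way the damping of a tracer cycle depends on its period can be sketched with the closed-form amplitude ratio for a gamma TTD given by Kirchner (2016a), A_S/A_P = (1 + (2πβ/T)²)^(−α/2), where T is the cycle period. This is a generic sketch and not the mathematical model proposed in the manuscript; the function name and the shape and scale values are hypothetical and chosen only for the example.

```python
import math


def gamma_amplitude_ratio(alpha, beta, period):
    """Amplitude ratio A_S/A_P of a sinusoidal tracer cycle of the given
    period after passing through a gamma TTD (shape alpha, scale beta).
    beta and period must share the same time unit."""
    f = 1.0 / period  # cycle frequency
    return (1.0 + (2.0 * math.pi * f * beta) ** 2) ** (-alpha / 2.0)


# Hypothetical parameters: shape 0.5, scale 2 years
# (mean transit time = alpha * beta = 1 year).
for period_yr in (0.5, 1.0, 2.0, 5.0):
    ratio = gamma_amplitude_ratio(0.5, 2.0, period_yr)
    print(f"period = {period_yr:>3} yr  ->  A_S/A_P = {ratio:.3f}")
```

  Longer cycles are damped less by the same TTD, so the amplitude ratio, and the young water fraction it approximates, refers to a different age threshold for each period. The annual cycle is simply the one most often used in the literature.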
  References
  Ajami, H., P. A. Troch, T. Maddock, T. Meixner, and C. Eastoe (2011), Quantifying mountain block recharge by means of catchment-scale storage-discharge relationships, Water Resources Research, 47(4), 1-14.
  Clow, D. W., M. A. Mast, and J. O. Sickman (2018), Linking transit times to catchment sensitivity to atmospheric deposition of acidity and nitrogen in mountains of the western United States, Hydrological Processes, 32(16), 2456-2470.
  Dwivedi, R., T. Meixner, J. McIntosh, P. A. T. Ferré, C. J. Eastoe, G.-Y. Niu, R. L. Minor, G. Barron-Gafford, and J. Chorover (2019), Hydrologic functioning of the deep Critical Zone and contributions to streamflow in a high elevation catchment: testing of multiple conceptual models, Hydrological Processes, 33, 476-494, doi: 10.1002/hyp.13363.
  Dwivedi, R., C. Eastoe, J. F. Knowles, L. Hamann, T. Meixner, P. A. T. Ferre, C. Castro, W. E. Wright, G.-Y. Niu, R. Minor, G. A. Barron-Gafford, N. Abramson, B. Mitra, S. A. Papuga, M. Stanley, and J. Chorover (2021), An improved practical approach for estimating catchment-scale response functions through wavelet analysis, Hydrological Processes, 35(3), 1-20.
  Frisbee, M. D., J. L. Wilson, J. D. Gomez-Velez, F. M. Phillips, and A. R. Campbell (2013), Are we missing the tail (and the tale) of residence time distributions in watersheds?, Geophysical Research Letters, 4633–4637.
  Heidbüchel, I., P. A. Troch, and S. W. Lyon (2013), Separating physical and meteorological controls of variable transit times in zero-order catchments, Water Resources Research, 49(11), 7644-7657.
  Heidbüchel, I., P. A. Troch, S. W. Lyon, and M. Weiler (2012), The master transit time distribution of variable flow systems, Water Resources Research, 48(6), 1-19.
  Jasechko, S., J. W. Kirchner, J. M. Welker, and J. J. McDonnell (2016), Substantial proportion of global streamflow less than three months old, Nature Geoscience, 9(2), 126-129.
  
  Citation: https://doi.org/10.5194/hess-2021-355-AC1

Status: closed

RC1:
'Comment on hess-2021-355', Anonymous Referee #1, 14 Oct 2021
Recommendation to Editor:

I recommend this paper be rejected for publication in HESS in its current form. I recommend the authors resubmit after major revision. The topic is certainly of great interest in scientific hydrology. The combination of data sets could be better leveraged to make clearer inferences about the true range of water residence times in small headwater catchments. I provide three general criticisms here that are reiterated in a numbered list of specific comments about the manuscript and graphics.

One major criticism is that the work relies heavily on antiquated methodologies. Major portions of the results are based on application of the so-called lumped-parameter-transport model based on transit-time distributions (TTDs; equation 1). The TTDs have time-invariant parameters, which assumes that the distribution of flow pathways within the landscape and associated water velocities are constant in time. This assumption defies intuition, but was applied out of convenience for decades [e.g., most works reviewed by , 2006]. A theoretical basis for analysis based on TTDs was presented as early as Lewis and Nir [1978]. The theory has been advanced by many recent works [, 2010; 2011; , 2015; , 2011; , 2012]. That TTDs should be time-invariant was shown to be theoretically implausible for low-order watersheds with dynamic flow [, 2011]—a result that has been supported by empirical results from manipulative tracer experiments [e.g., , 2016b]. I don’t think our premier disciplinary journals should continue publishing results based on this antiquated approach.
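For readers unfamiliar with the approach under discussion, the time-invariant lumped-parameter model can be sketched as a discrete convolution of the precipitation tracer series with a fixed TTD. The following is a minimal illustration and not the authors' implementation: the gamma form, the monthly time step, the function names, and the parameter values are assumptions made for the example, and the decay term applies only to a radioactive tracer such as ³H.

```python
import math

TRITIUM_HALF_LIFE_YR = 12.32  # years; relevant only for radioactive tracers


def gamma_ttd_weights(alpha, beta, dt, n_steps):
    """Discretised gamma TTD h(tau), evaluated at bin midpoints and
    normalised so the weights sum to 1 over the truncated window."""
    taus = [(i + 0.5) * dt for i in range(n_steps)]
    pdf = [
        t ** (alpha - 1.0) * math.exp(-t / beta) / (beta ** alpha * math.gamma(alpha))
        for t in taus
    ]
    total = sum(pdf)
    return taus, [p / total for p in pdf]


def simulate_stream_tracer(c_precip, alpha, beta, dt, half_life=None):
    """C_Q(t) = sum over tau of C_P(t - tau) * h(tau) * exp(-lambda * tau).

    alpha and beta are the same for every output time: this is the
    time-invariance assumption."""
    n = len(c_precip)
    taus, weights = gamma_ttd_weights(alpha, beta, dt, n)
    lam = math.log(2.0) / half_life if half_life else 0.0
    c_stream = []
    for t in range(n):
        acc = 0.0
        for k in range(t + 1):  # only inputs that have already occurred
            acc += c_precip[t - k] * weights[k] * math.exp(-lam * taus[k])
        c_stream.append(acc)
    return c_stream


# Hypothetical run: 20 years of monthly input at constant concentration.
stream = simulate_stream_tracer([1.0] * 240, alpha=1.0, beta=2.0, dt=1.0 / 12.0,
                                half_life=TRITIUM_HALF_LIFE_YR)
```

Because the same weights are applied under wet and dry conditions alike, the sketch makes the assumption being criticized explicit. Early output values are also affected by the truncated input history (spin-up), which a real application would need to handle.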

A second major criticism is that there are significant shortcomings in the data, especially the measurements of Tritium in surface waters. There appear to be only 6 data points representing Tritium abundance in stream water. That’s not many, but the authors rely on that data to calibrate and compare a range of models. Also, the time interval over which precipitation is sampled is coarse. Much of the temporal dynamics of tracer concentration in that inflow will be lost. For these reasons, most of the parameters for the various TTD models are not uniquely identifiable. The different models generate markedly different estimates of different water age metrics. There is inadequate guidance, or rationale, for the reader to understand which, if any, should be considered correct.

The third major criticism is that the article does not clearly convey what outstanding question/problem in scientific hydrology is likely to be resolved through the elaborate set of methodologies employed here. The discussion section does not convey any new insights about flow processes in headwater catchments. Rather, that section seems to emphasize intricacies of the various technical approaches that lead to order-of-magnitude differences in water age metrics such as the fraction of young water and mean transit time.

My first recommendation for revision would include omission of methods and results that are based on lumped-parameter transport modeling using time-invariant TTDs. My second recommendation for revision would be a deep consideration of what is the specific gap in knowledge that the research would address, and a deeper discussion about what all these (somewhat abstract) age metrics tell us about flow processes, or the linkage between water age and catchment structure, that we didn’t already know. Emphasis should be placed on new knowledge that may be generalizable across watersheds. This is especially important since the study focuses on a single watershed where quite a lot of tracer-aided flow and transport studies have already been conducted [, 2013; , 2012; , 2008; 2009], including multiple previous works by the lead author.

Specific comments on content in the text:

Line 23: The phrase “single age tracer” is unclear and not conventional. Please omit or rephrase.

Lines 42-51: The rationale provided here is not very strong. To say that the processes being studied are “still incompletely understood” is not a very effective way to communicate (1) what aspects of the processes are well understood (because certainly a lot of prior knowledge does exist), (2) what is/are the explicit knowledge gap/s, and (3) how this research is designed to specifically address that knowledge gap.

Lines 59-61: Doesn’t really make sense to say that an underestimated quantitative metric affects actual substrate weathering in Earth’s crust. The conjunctive phrase “As a result” in the following sentence seems out of place. Consider rephrasing.

Lines 70-71: I think you should temper the language here. Robust? That implies the result should be representative across a range of systems. Yet the papers you cite apparently rely on simulated experiments in synthetic landscapes. Any evidence from the real world that you could cite?

Lines 74-76: To help emphasize the knowledge gap, I think you need to clarify what exactly is meant by “only one period”.

Line 90: The question “what is the appropriate TTD type” is somewhat unclear. Precipitation is episodic. There is not a continuous inflow of water volumes with different ages entering any watershed. Therefore, the distribution of transit times (i.e., exit time – entry time) must also be discontinuous. This fact is illustrated by real TTDs observed from active tracer introductions [e.g., , 2016a]. They are quite messy and not continuous distributions. Any continuous function that is chosen as a TTD for application in lumped-parameter-transport modeling is therefore just an approximation of reality. If that is accepted as true, then it seems your question could be restated as “what mathematical distribution yields simulation results that best fit the data from this particular watershed?”. That is not a question of great relevance for scientific hydrology in general, in my opinion.

Lines 91-93: I have a very hard time interpreting this sentence. Please rephrase. Again, I would suggest carefully explaining, or omitting, the phrase “age tracers”. Are you meaning to distinguish stable isotopes from radioactive isotopes?

Lines 93-94: Suggest deleting “...as determined by stable water isotope tracers”. It implies to the reader that the answer to your more general question (i.e., “what is the discharge sensitivity of F_yw”) is somehow conditional on this particular data type? Is that in fact what you think? If so, it raises some concern about the generality of the results.

Lines 95-100: Suggest deleting all of this. A prelude to the methods elaborated on the following pages is unnecessary. The concluding paragraph of the introduction should highlight the identified knowledge gap then state the objectives of this study and how they address that gap. The final sentence raises some concern that the current work is partially redundant.

Lines 112-113: So it was a notably drier than average 9 years, or the PRISM results are biased high here?

Lines 130-131: This is a very coarse sampling resolution for the intended application of the data. Undoubtedly there are tremendous temporal dynamics in the stable-isotope composition of precipitation within and among individual storms that occur during 5-7 day intervals. The range of stable-isotope abundances in precipitation observed during individual storms may be comparable to or greater than the range observed among monthly-aggregated samples collected across years [e.g., , 1993]. The true temporal dynamics of tracer concentration in precipitation are lost in a lumped sample that aggregates over 5-7 days. Any quantitative model that uses those tracer concentrations as input will be very limited in its ability to accurately simulate the temporal dynamics of the same tracer in the stream. That limitation seems very germane to the stated objectives of this study. Passive, sequential sampling devices are easy to make and deploy. Analysis of stable isotope abundances by laser spectrometry for large sample numbers is relatively inexpensive. This data limitation is hard to excuse.

Lines 149-151: I can’t quite understand what this means. Please consider rephrasing.

Line 164-166: Simplify the headings and sub-headings. Here and elsewhere there are sub-headings with no content underneath. Suggest deleting.

Line 171: When you say “thereafter”, do you mean over longer time increments than 1 month? Please rephrase to clarify.

Line 176: Some formatting inconsistencies with citations here and throughout the manuscript. Uneven use of open and closed parentheses and lack of spacing between cited papers within in-line citations. Proofread carefully. Suggest using “[(“ instead of duplicate parentheses. Also, I cannot find the Dwivedi 2019b entry in your bibliography. Is it missing? Put spaces between entries in the bibliography. It is terribly difficult to read through single spaced.

Line 177: “expand on these results” again seems to suggest this is somewhat redundant with the previous works from the same catchment.

Lines 185-190: Pretty sure h(tau) is the specified functional form of the TTD, but that is not stated in the paragraph.

Line 193: I am not familiar with the DownHill Simplex method. It is described in a single sentence, yet it is apparently the method for evaluating how appropriate is one versus the other TTD model. Could you please elaborate just a little bit on what this is for the unaffiliated reader? The KGE is used as the “model performance criteria” but you say that the Downhill Simplex was used to evaluate “the performance of each TTD”. This is confusing to me. Equation 1 is the model, but the variable performance of the model is due only to the selection of different functional forms of the TTD. So the performance of the model is a direct reflection of the performance of the function selected as TTD, no? Please clarify.
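The distinction raised in this comment can be made concrete: in a typical calibration, the Downhill Simplex (Nelder-Mead) method is the search algorithm, and the KGE is the objective it tries to maximise. The sketch below is an illustration under assumptions, not the manuscript's code: the function names are hypothetical, and a plain exhaustive search over a candidate list stands in for the simplex search to keep the example self-contained.

```python
import math
import statistics


def kling_gupta_efficiency(simulated, observed):
    """KGE = 1 - sqrt((r - 1)^2 + (a - 1)^2 + (b - 1)^2), with r the linear
    correlation, a the ratio of standard deviations and b the ratio of means
    (simulated over observed). A perfect simulation scores 1."""
    mean_s, mean_o = statistics.fmean(simulated), statistics.fmean(observed)
    sd_s, sd_o = statistics.pstdev(simulated), statistics.pstdev(observed)
    cov = sum((s - mean_s) * (o - mean_o)
              for s, o in zip(simulated, observed)) / len(observed)
    r = cov / (sd_s * sd_o)
    return 1.0 - math.sqrt(
        (r - 1.0) ** 2 + (sd_s / sd_o - 1.0) ** 2 + (mean_s / mean_o - 1.0) ** 2
    )


def best_parameter(candidates, simulate, observed):
    """Exhaustive search standing in for the Downhill Simplex optimiser:
    return the candidate whose simulation gives the highest KGE."""
    return max(candidates,
               key=lambda p: kling_gupta_efficiency(simulate(p), observed))


# Toy demonstration with a one-parameter linear "model" (hypothetical data).
observed = [2.0 * x for x in range(1, 11)]
best = best_parameter([1.0, 2.0, 3.0],
                      lambda p: [p * x for x in range(1, 11)],
                      observed)
print(best)
```

In this framing, the optimiser only locates the best-scoring parameters for a given TTD form; comparing TTD forms then amounts to comparing their best KGE values.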

Line 228, equation 5: Use “C” with “Q” and “P” subscripted to indicate concentration in streamflow versus precipitation. You already adopted this notation in equation 1. Be consistent here and in subsequent equations.

Lines 348-357: What about all the other models? You only discuss PF and Gamma. The KGE of the 1d-ADE falls exactly between the values for the Gamma and PF models, yet the mTT estimated by the 1d-ADE is factors of 8-9 less than the mTT from those models, respectively. Why do you ignore the other models and what do you conclude from this order-of-magnitude difference? If I understand Figure 4 correctly, then only the “ADE-nx” and Exponential function as TTDs seem to generate uniquely identifiable parameters. Is that correct? Neither model is discussed at all here.

Lines 390-392: The data are also far too sparse to reliably fit the parameters for TTDs used in equation 1. Isn’t this confirmed by (1) the lack of unique solutions illustrated in most cases shown in Figure 4 and (2) the generally poor accuracy of all model simulations shown in Figure 5? I would argue yes.

Lines 400-414: The text makes no allusion at all to Figure 7D, which has unusual qualitative axes and cannot be easily interpreted by the reader. The figure caption only provides a citation to a previous work to explain the graphic. More explanation is needed in this section of the Results, or Figure 7B should be deleted.

Lines 437-440: I am unclear what is the importance or relevance of this concept of “short-term storage”. What is it and why does it matter? In any case, you present estimates of this metric based on three competing approaches that vary by a factor of approximately 125 (0.08, 0.22, and 10.7)! Which, if any, should we believe is correct, and why?

Lines 472-477: The results are highly dependent on the temporal resolution of the input time series. As I noted in a comment above, if a temporal dynamic in the tracer concentration in precipitation is hidden within a sample that accumulated over 5-7 days, then the model can’t possibly simulate the effects of that dynamic in streamflow. The results are entirely dependent on the resolution of sampling the tracer concentration in inflow, and the resolution used in this study is quite coarse.

Lines 499-501: You make a sweeping assumption here that the bedrock at several research sites is “water tight”. That seems quite speculative. What evidence supports this assumption? Most rocks are fractured and jointed to some extent. Even exposed, granitic plutons commonly have sufficient fracturing and water storage capacity to host woody-stemmed plant communities and support inter-storm flow from emergent springs. More support for this assumption is needed here, perhaps through more extensive synopsis of the geology of the sites used in these cited studies.

Lines 538-539 and 544-546: So, you’re saying the fraction of young water estimates are invalid when based on the use of Tritium as a tracer?

Lines 555-564: Here there are a series of sentences elaborating some intricate details of methodology which seem misplaced in the discussion. They lead to the ultimate conclusion at the end of the paragraph that “infiltration may activate deeper groundwater flowpaths”. That is not a novel conclusion in scientific hydrology, and it is not even stated definitively here (i.e., ..). This paper uses a wide ensemble of methodological approaches, which, from my view, has only created ambiguity in how the markedly contrasting results can be interpreted. I find no new insight into hydrological processes resulting from all this computational effort.

Comments on Figures and Tables:

Figure 3: Does the inset have a linear scale? If not, please make it linear. If so, please add more tick marks to the vertical axis so we can approximate the numeric values of the data points. Are these six data points all you have to calibrate the parameters of the TTD models used with equation 1? If so, that seems inadequate.

Figure 4: Is the vertical axis the KGE? If so, please label it that way. I am unclear what the variable “response surface” on the vertical axis indicates.

Table 1: Words and numbers should not be split between rows. Use emboldened lines, or no lines, to better delineate the content. This is not acceptable for a journal article. Please make it more presentable.

Figure 5: Y-axis labels should have “3” as a superscript preceding “H” to conform to established conventions of symbolizing isotopes. Would suggest compressing this into a single graph with a legend indicating the results from different TTD models. The gray dots are the same across all 5 subplots.

Figure 7: Here and elsewhere the font size is illegible. Please enlarge font on axes and in legends.

Works Cited:

Botter, G., E. Bertuzzo, and A. Rinaldo (2010), Transport in the hydrologic response: Travel time distributions, soil moisture dynamics, and the old water paradox, Water Resour. Res., 46, W03514, doi:10.1029/2009wr008371.

Botter, G., E. Bertuzzo, and A. Rinaldo (2011), Catchment residence and travel time distributions: The master equation, Geophys. Res. Lett., 38, L11403, doi:10.1029/2011gl047666.

Harman, C. J. (2015), Time-variable transit time distributions and transport: Theory and application to storage-dependent transport of chloride in a watershed, Water Resour. Res., 51(1), 1-30, doi:10.1002/2014wr015707.

Heidbüchel, I., P. A. Troch, and S. W. Lyon (2013), Separating physical and meteorological controls of variable transit times in zero-order catchments, Water Resour. Res., 49(11), 7644-7657, doi:10.1002/2012wr013149.

Heidbüchel, I., P. A. Troch, S. W. Lyon, and M. Weiler (2012), The master transit time distribution of variable flow systems, Water Resour. Res., 48, W06520, doi:10.1029/2011wr011293.

Kim, M., L. A. Pangle, C. Cardoso, M. Lora, T. H. M. Volkmann, Y. Wang, C. J. Harman, and P. A. Troch (2016), Transit time distributions and StorAge Selection functions in a sloping soil lysimeter with time-varying flow paths: Direct observation of internal and external transport variability, Water Resour. Res., 52(9), 7105-7129, doi:10.1002/2016WR018620.

Lewis, S., and A. Nir (1978), On tracer theory in geophysical systems in the steady and non-steady state. Part II. Non-steady state - theoretical introduction, Tellus, 30, 260-271.

Lyon, S. W., S. L. E. Desilets, and P. A. Troch (2008), Characterizing the response of a catchment to an extreme rainfall event using hydrometric and isotopic data, Water Resour. Res., 44(6), W06413, doi:10.1029/2007wr006259.

Lyon, S. W., S. L. E. Desilets, and P. A. Troch (2009), A tale of two isotopes: differences in hydrograph separation for a runoff event when using delta D versus delta O-18, Hydrol. Process., 23(14), 2095-2101, doi:10.1002/hyp.7326.

McGuire, K. J., and J. J. McDonnell (2006), A review and evaluation of catchment transit time modeling, J. Hydrol., 330(3-4), 543-563, doi:10.1016/j.jhydrol.2006.04.020.

Rinaldo, A., K. J. Beven, E. Bertuzzo, L. Nicotina, J. Davies, A. Fiori, D. Russo, and G. Botter (2011), Catchment travel time distributions and water flow in soils, Water Resour. Res., 47, doi:10.1029/2011wr010478.

Rozanski, K., L. Araguas-Araguas, and R. Gonfiantini (1993), Isotopic patterns in modern global precipitation, in Climate Change in Continental Isotopic Records, Geophys. Monogr. Ser., vol. 78, pp. 1-36.

van der Velde, Y., P. J. J. F. Torfs, S. E. A. T. M. van der Zee, and R. Uijlenhoet (2012), Quantifying catchment-scale mixing and its effect on time-varying travel time distributions, Water Resour. Res., 48(6), W06536, doi:10.1029/2011wr011310.
Citation: https://doi.org/10.5194/hess-2021-355-RC1
- AC3:
  'Reply on RC1', Ravindra Dwivedi, 27 Oct 2021
  RC1: 'Comment on hess-2021-355', Anonymous Referee #1, 14 Oct 2021
  Recommendation to Editor:
  1.1 I recommend this paper be rejected for publication in Hydrology and Earth System Sciences (HESS) in its current form. I recommend the authors resubmit after major revision. The topic is certainly of great interest in scientific hydrology. The combination of data sets might be better leveraged to make clearer inferences about the true range of water residence times in small headwater catchments. I provide three general criticisms here that are reiterated in a numbered list of specific comments about the manuscript and graphics.
  Dear reviewer, Thank you for your in-depth review of our paper. We plan on addressing each of your comments related to novelty and organization of our paper in the revised version of the paper documents. Please see also our responses to the comments # 1.2 and #1.4 below.
  1.2 One major criticism is that the work relies heavily on antiquated methodologies. Major portions of the results are based on application of the so-called lumped-parameter-transport model based on time-invariant transit-time distributions (TTDs; equation 1). The TTDs have time-invariant parameters, which assumes that the distribution of flow pathways within the landscape and associated water velocities are constant in time. This assumption defies intuition, but was applied out of convenience for decades [e.g., most works reviewed by McGuire and McDonnell, 2006]. A theoretical basis for analysis based on time-variable TTDs was presented as early as Lewis and Nir [1978]. The theory has been advanced by many recent works [Botter et al., 2010; 2011; Harman, 2015; Rinaldo et al., 2011; van der Velde et al., 2012]. That TTDs should be time-invariant was shown to be theoretically implausible for low-order watersheds with dynamic flow [Botter et al., 2011]—a result that has been supported by empirical results from manipulative tracer experiments [e.g., Kim et al., 2016b]. I don’t think our premier disciplinary journals should continue publishing results based on this antiquated approach.
  Our response: We appreciate this concern but respectfully disagree. The papers cited in the above comment mostly focus on the highly dynamic and not the relatively stable part of a catchment’s flow system (e.g., baseflow through fractured bedrock aquifers). The papers cited above, those listed at the end of the reviewer’s comments, and similar papers on dynamic transit time distributions (TTDs) have used either stable water isotopes (e.g., Heidbüchel et al. [2013]; Heidbüchel et al. [2012]) or similar tracers (e.g., chloride tracer in Harman [2015]; Hrachowitz et al. [2016]), which are applicable on shorter time scales [Suckow, 2014]. In contrast, our study uses tritium sampled under low flow or baseflow conditions, and this tracer is applicable over longer time scales than stable water isotope tracers [Suckow, 2014]. Most importantly, the use of any time-variable transit time method (e.g., rSAS functions of Harman [2015], master equation method [Botter et al., 2011; Heidbüchel et al., 2012] or wavelet analysis methods [Dwivedi et al., 2021]) requires high-frequency data on hydrologic fluxes and tracer concentrations in inflow and outflow. These high-frequency datasets of hydrologic fluxes and tritium concentrations are generally not available on the time scale for which tritium-based TTDs are estimated; this includes our study site, Marshall Gulch catchment, as well as many global sites as acknowledged in Gleeson et al. [2015]. Therefore, several studies that model TTDs using tritium sampled under baseflow conditions or in groundwater use steady state TTDs (e.g., Stewart et al. [2017]; a paper published recently in HESS). Even at a coarser resolution, tritium observations are able to shed light on long-period dynamics in the transit time distributions for deep fractured bedrock aquifer groundwaters. In the revised version of our paper, we plan to clearly highlight the aforementioned point as a rationale for using the steady state version of TTDs.
  1.3 A second major criticism is that there are significant shortcomings in the data, especially the measurements of Tritium in surface waters. There appear to be only 6 data points representing Tritium abundance in stream water. That’s not many, but the authors rely on that data to calibrate and compare a range of models. Also, the time interval over which precipitation is sampled is coarse. Much of the temporal dynamics of tracer concentration in that inflow will be lost. For these reasons, most of the parameters for the various TTD models are not uniquely identifiable. The different models generate markedly different estimates of different water age metrics. There is inadequate guidance, or rationale, for the reader to understand which, if any, should be considered correct.
  Our response: The purpose of this study was not to explicitly simulate high temporal resolution dynamics of tracer concentrations in stream water. Instead, by using the stable water isotope tracers, our study seeks to estimate the fraction of young water metric in a time-averaged sense to compare and contrast the value and information contained in that metric at our field site to literature-reported data from other sites, in order to better understand dynamic flow path behavior. The temporal resolution of the stable water isotope data in precipitation and stream water was sufficient to meet these study objectives, based on estimated Fyw values that were within the range reported in the literature using the TTD-based method. Please also note that the stable water isotope data (collected during water years 2008 through 2012) in our study were collected by the Santa Catalina Mountains and Jemez River Basin Critical Zone Observatory prior to this study. In our study, the TTD model performance was not only assessed by the value of the modified Kling-Gupta efficiency (KGE’), but also by: (i) evaluating the reliability of the estimated optimal model parameters by running the same model three times with separate initial model parameter guesses and (ii) determining whether the estimated model parameters are within the permissible parameter space. Thus, while KGE’ model performance may be lower than the KGE’ obtained from a Gamma TTD, the response surface for an ADE-1x TTD type (Figure 4C) suggests that the estimated model parameters were not reliable and sometimes at the edge of the permissible parameter space. In contrast, the estimated model parameters with a Gamma TTD were unique (Figure 4B vs. 4C). A similar explanation also applies when comparing a Gamma to an exponential TTD type (Figure 4A vs. 4B). The response surface of an ADE-nx TTD type is similar to the response surface for the exponential TTD in the sense that the estimated model parameters are at the edge of their permissible parameter spaces. Therefore, ADE-1x, ADE-nx and Exponential TTD types do not meet our set criteria for selecting an appropriate TTD type and its parameters. For the sake of brevity, we only discussed piston flow and gamma TTD types in section 4.1.1. In section S3 of the supporting information we provide more in-depth information about the performance of each TTD type for three separate model runs. To address this comment, we plan on including some more details about the other TTD types in the revised version of this section. Please see also our response to comment # 1.2 above.
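The two acceptance checks described in this response (agreement between runs started from separate initial guesses, and estimates lying inside rather than on the edge of the permissible parameter space) could be sketched as follows. This is a minimal illustration, not the authors' code: the function name `check_parameter_reliability`, the tolerances `rel_tol` and `edge_frac`, and all numeric values are hypothetical.

```python
def check_parameter_reliability(runs, bounds, rel_tol=0.05, edge_frac=0.01):
    """Screen calibrated parameters from several optimizer runs.

    runs   : list of parameter tuples, one per initial guess
    bounds : list of (lower, upper) limits, one per parameter
    Both thresholds are illustrative assumptions, not values from the paper.
    """
    report = []
    for j, (lo, hi) in enumerate(bounds):
        vals = [run[j] for run in runs]
        spread = max(vals) - min(vals)
        scale = max(abs(v) for v in vals) or 1.0
        # Runs agree if their spread is small relative to the estimate itself.
        consistent = spread <= rel_tol * scale
        # An estimate is interior if it stays clear of both bounds.
        margin = edge_frac * (hi - lo)
        interior = all(lo + margin < v < hi - margin for v in vals)
        report.append({"consistent": consistent, "interior": interior})
    return report


# Hypothetical three-run results for a two-parameter TTD (shape, mean transit time).
bounds = [(0.01, 5.0), (1.0, 200.0)]
stable_runs = [(0.62, 27.1), (0.63, 27.0), (0.62, 27.3)]
unstable_runs = [(1.0, 199.9), (1.0, 60.0), (1.0, 200.0)]
print(check_parameter_reliability(stable_runs, bounds))
print(check_parameter_reliability(unstable_runs, bounds))
```

A TTD type would pass such a screen only if every parameter is flagged both consistent and interior.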
  1.4 The third major criticism is that the article does not clearly convey what outstanding question/problem in scientific hydrology is likely to be resolved through the elaborate set of methodologies employed here. The discussion section does not convey any new insights about flow processes in headwater catchments. Rather, that section seems to emphasize intricacies of the various technical approaches that lead to order-of-magnitude differences in water age metrics such as the fraction of young water and mean transit time.
  Our response: We used multiple metrics including the state-of-the-art fraction of young water (Fyw) metric and the mean transit time metric in conjunction with both young and old groundwater residence time tracers to better understand the dynamic nature of hydrologic flowpaths at a sub-humid mountain catchment in Arizona, USA. Please note that our work builds upon previous efforts [Kirchner, 2016a; b; Stewart et al., 2017] that show that spatial and temporal aggregation errors are lower when using the fraction of young water metric compared to the mean transit time metric. However, as acknowledged by the reviewer, these efforts are made using a virtual experimental setup. Therefore, the principal contribution of our study is co-application of these two metrics, i.e., fraction of young water and mean transit time, using multiple tracer types for a real world catchment. Additionally, our study makes the following specific contributions:
  1. Use of multiple methods to estimate Fyw that demonstrate the variability associated with sampling frequency, hydroclimate, the method used in its estimation, and the processes that dictate streamflow generation.
  2. As most of the existing literature on fraction of young water metric is focused on Fyw for annual or seasonal tracer cycles, in our work we estimated Fyw for not only annual cycles but also for several periods ranging from 2 days to 5 years when using stable water isotope tracers.
  3. Delineation of a consistent mathematical framework to estimate Fyw using both young and old groundwater age tracers. Our proposed framework is flexible and can be easily employed at sites with long-term observations of young and old groundwater age tracers in inflow and outflow.
  4. Characterization and discussion of the limitations of tritium-based Fyw estimates due to a common lack of long-term data, coarse and/or sparse tritium concentration time series observations, and/or lack of measurement precision
  5. Description of an alternative approach to more reliably estimate deep subsurface storage
  6. Characterization of dynamic flowpaths that reorganize and restructure with catchment storage through the use of multiple metrics
  7. Identification of a threshold short-term storage that once reached, increases the propensity for precipitation to infiltrate and activate deeper flow paths
  In light of these contributions, we believe our study will be broadly useful to hydrological researchers and practitioners that rely on either Fyw or mean transit time metrics to understand the subsurface residence time of water, or that aim to constrain the links between water quantity and quality as water moves along subsurface flow paths. However, based on this reviewer’s comment, we now recognize that we did not do a sufficient job of communicating these contributions to hydrologic scientists, and will improve the discussion in this regard in the revision.
  1.5 My first recommendation for revision would include omission of methods and results that are based on lumped-parameter transport modeling using time-invariant TTDs. My second recommendation for revision would be a deep consideration of what is the specific gap in knowledge that the research would address, and a deeper discussion about what all these (somewhat abstract) age metrics tell us about flow processes, or the linkage between water age and catchment structure, that we didn’t already know. Emphasis should be placed on new knowledge that may be generalizable across watersheds. This is especially important since the study focuses on a single watershed where quite a lot of tracer-aided flow and transport studies have already been conducted [Heidbuchel et al., 2013; Heidbuchel et al., 2012; Lyon et al., 2008; 2009], including multiple previous works by the lead author.
  Our response: Thank you. We will revise accordingly. Please see our response to comments 1.1 and 1.2 above.
  Specific comments on content in the text:
  Line 23: The phrase “single age tracer” is unclear and not conventional. Please omit or rephrase.
  
  Our response: Thank you. The rephrased sentence between lines 21 and 23 now reads: “Current understanding of the dynamic flow paths and subsurface water storages that support streamflow in mountain catchments is inhibited by the lack of long-term hydrologic data and the frequent use of short residence time tracers that are not applicable to older groundwater reservoirs.”
  Lines 42-51: The rationale provided here is not very strong. To say that the processes being studied are “still incompletely understood” is not a very effective way to communicate (1) what aspects of the processes are well understood (because certainly a lot of prior knowledge does exist), (2) what is/are the explicit knowledge gap/s, and (3) how this research is designed to specifically address that knowledge gap.
  
  Our response: In the revised version of the main document, we plan on revising text between lines 42 and 51 to clearly state existing knowledge gaps and study objectives. Please see also our response to comment # 1.4 above.
  Lines 59-61: Doesn’t really make sense to say that an underestimated quantitative metric affects actual substrate weathering in Earth’s crust. The conjunctive phrase “As a result” in the following sentence seems out of place. Consider rephrasing.
  
  Our response: Our intention was to point out that an underestimated residence time can lead to an inaccurate understanding or estimate of the subsurface mineral weathering rate (as shown by Frisbee et al. [2013]). In the revised version of the document, we plan to revise the text between lines 59 and 62 to address this comment.
  Lines 70-71: I think you should temper the language here. Robust? That implies the result should be representative across a range of systems. Yet the papers you cite apparently rely on simulated experiments in synthetic landscapes. Any evidence from the real world that you could cite?
  
  Our response: In the revised document, we used “more accurate” instead of the word “robust”. Please note that the use of “robust” in the original document refers to only one site and not to any range of systems. Our use of the word is also motivated by the results of previous efforts [Kirchner, 2016a; b; Stewart et al., 2017] that show that the spatial and temporal aggregation errors are much smaller when using the fraction of young water metric than when using the mean transit time metric. However, as acknowledged by the reviewer, these efforts were made using a virtual experimental setup. Therefore, co-application of these two metrics, i.e., fraction of young water and mean transit time, using multiple tracer types for a real catchment, is the novel contribution of our study.
  Lines 74-76: To help emphasize the knowledge gap, I think you need to clarify what exactly is meant by “only one period”.
  
  Our response: Here, our intention was to highlight that most of the literature on fraction of young water metric is focused on Fyw for annual or seasonal tracer cycles. In our work, we estimated Fyw for not only annual time frames but also for shorter periods ranging from 2 days to 5 years (Figure 6A and B).
  Line 90: The question “what is the appropriate TTD type” is somewhat unclear. Precipitation is episodic. There is not a continuous inflow of water volumes with different ages entering any watershed. Therefore, the distribution of transit times (i.e., exit time – entry time) must also be discontinuous. This fact is illustrated by real TTDs observed from active tracer introductions [e.g., Kim et al., 2016a]. They are quite messy and not continuous distributions. Any continuous function that is chosen as a TTD for application in lumped-parameter-transport modeling is therefore just an approximation of reality. If that is accepted as true, then it seems your question could be restated as “what mathematical distribution yields simulation results that best fit the data from this particular watershed?”. That is not a question of great relevance for scientific hydrology in general, in my opinion.
  
  Our response: Note that our complete research question # 1 is “what is the appropriate TTD type and mTT for the deep groundwater system that supports streamflow?” Thus, our focus is on finding an appropriate transit time distribution type and mean transit time for deep groundwater. Please see also our response to comment # 1.2 above.
  Lines 91-93: I have a very hard time interpreting this sentence. Please rephrase. Again, I would suggest carefully explaining, or omitting, the phrase “age tracers”. Are you meaning to distinguish stable isotopes from radioactive isotopes?
  
  Our response: As most of the existing literature on the fraction of young water metric is based on the stable water isotope or chloride tracers, the aim here was to extend this literature by including tritium tracer-based fraction of young water estimates. The rephrased question 2 between lines 91 and 93 reads “How do the Fyw and storage estimates vary between shallow and deep groundwaters for a high elevation mountainous catchment?” Please see also our response to comment # 1.2 above.
  Lines 93-94: Suggest deleting “...as determined by stable water isotope tracers”. It implies to the reader that the answer to your more general question (i.e., “what is the discharge sensitivity of F_yw”) is somehow conditional on this particular data type? Is that in fact what you think? If so, it raises some concern about the generality of the results.
  
  Our response: Thank you. We plan on revising the third research question in the revised version of the main document.
  Lines 95-100: Suggest deleting all of this. A prelude to the methods elaborated on the following pages is unnecessary. The concluding paragraph of the introduction should highlight the identified knowledge gap then state the objectives of this study and how they address that gap. The final sentence raises some concern that the current work is partially redundant.
  
  Our response: In the revised version of the main document, we plan on revising the text between lines 94 and 100 to better highlight the knowledge gap and to more concisely state the study objectives and how these study objectives address the identified knowledge gaps.
  Lines 112-113: So it was a notably drier than average 9 years, or the PRISM results are biased high here?
  
  Our response: It was notably drier.
  Lines 130-131: This is a very coarse sampling resolution for the intended application of the data. Undoubtedly there are tremendous temporal dynamics in the stable-isotope composition of precipitation within and among individual storms that occur during 5-7 day intervals. The range of stable-isotope abundances in precipitation observed during individual storms may be comparable to or greater than the range observed among monthly-aggregated samples collected across years [e.g., Rozanski et al., 1993]. The true temporal dynamics of tracer concentration in precipitation are lost in a lumped sample that aggregates over 5-7 days. Any quantitative model that uses those tracer concentrations as input will be very limited in its ability to accurately simulate the temporal dynamics of the same tracer in the stream. That limitation seems very germane to the stated objectives of this study. Passive, sequential sampling devices are easy to make and deploy. Analysis of stable isotope abundances by laser spectrometry for large sample numbers is relatively inexpensive. This data limitation is hard to excuse.
  
  Our response: The purpose of this study was not to explicitly simulate high temporal resolution dynamics of tracer concentrations in stream water. In contrast, by using the stable water isotope tracers, our study seeks to estimate the fraction of young water metric in a time averaged sense to compare and contrast the value and information contained in that metric at our field site to literature-reported data from other sites, in order to better understand dynamic flow path behavior. It is important to note that:
  Whatever sampling interval is chosen, there will be shorter periods of data variation that are not sampled, a characteristic Nyquist frequency, and a range of frequency responses that cannot be addressed. We have limited our interpretations to responses that can be addressed using the data available.
  
  The physics of our system will act to filter out very high frequency variations in isotopes in precipitation. Soils are wetted by rain and remain wet as more rainwater is added, so mixing is inevitable. Runoff flowing in the main stem of a stream is a mixture of flow from small tributaries of different length, water held on wet leaves and water held in leaf litter or very shallow soil. Given that most of our summer storms are < 1 hour in length, mixing at shorter time scales is likely.
  
  High frequency variations (which indeed cannot be addressed in our study because of the 5-7 day sampling) are of less interest than lower-frequency, i.e., longer-period, phenomena, as we seek estimates of mean transit time and fraction of young water metrics in a time-averaged sense.
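As a worked illustration of the sampling limit invoked in the three points above, a series sampled at a regular interval cannot resolve periods shorter than twice that interval. The symbols $\Delta t$, $f_N$, and $T_{\min}$ are introduced here for illustration and do not appear in the manuscript.

```latex
f_N = \frac{1}{2\,\Delta t}, \qquad T_{\min} = \frac{1}{f_N} = 2\,\Delta t
\quad\Longrightarrow\quad
\Delta t = 5\text{--}7\ \mathrm{d} \;\Rightarrow\; T_{\min} = 10\text{--}14\ \mathrm{d}
```

This treats the 5-7 day composite precipitation samples as if they were regularly spaced point samples, which is a simplification.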
  
  Lines 149-151: I can’t quite understand what this means. Please consider rephrasing.
  
  Our response: In the revised version of the main document, we plan on revising the sentence between lines 148 and 151 for better readability.
  Line 164-166: Simplify the headings and sub-headings. Here and elsewhere there are sub-headings with no content underneath. Suggest deleting.
  
  Our response: In the revised version of the document, we plan on providing an improved structure of each section. We further plan to simplify section headings.
  Line 171: When you say “thereafter”, do you mean over longer time increments than 1 month? Please rephrase to clarify.
  
  Our response: On line 171, when we say “thereafter”, we meant for longer periods. We plan to rephrase this line for clarity in the revised version of the main document.
  Line 176: Some formatting inconsistencies with citations here and throughout the manuscript. Uneven use of open and closed parentheses and lack of spacing between cited papers within in-line citations. Proofread carefully. Suggest using “[(“ instead of duplicate parentheses. Also, I cannot find the Dwivedi 2019b entry in your bibliography. Is it missing? Put spaces between entries in the bibliography. It is terribly difficult to read through single spaced.
  
  Our response: Thank you. In the revised version of our paper, we plan to address this comment by: (i) using a consistent representation for multiple citations, (ii) providing the complete reference for the Dwivedi et al. [2019] citation, and (iii) using double spacing throughout the main document for better readability.
  Line 177: “expand on these results” again seems to suggest this is somewhat redundant with the previous works from the same catchment.
  
  Our response: Our statement regarding “expanding on these results” when referring to previous work of Ajami et al. [2011] or Dwivedi et al. [2019], is meant to say that our present work aims to further improve our understanding of deep groundwater flow paths by characterizing their transit time distributions and evaluating if the state-of-the-art fraction of young water metric is appropriate for deep groundwater. This has not been reported in the literature, and thus our work makes a novel contribution of assessing the appropriateness of Fyw metric for deep groundwater flow paths. Please also note that Ajami et al. [2011] have not considered either transit time distribution or Fyw for deep groundwater and Dwivedi et al. [2019] have not considered various TTD types for deep groundwater.
  Lines 185-190: Pretty sure h(tau) is the specified functional form of the TTD, but that is not stated in the paragraph.
  
  Our response: In our work, a TTD is referred to as h(τ) where τ is the transit time (in years). To address this comment, in the revised version of the paper h(τ) will be used between lines 185-190.
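For readers following the notation, the standard steady-state lumped-parameter convolution and a Gamma-type $h(\tau)$ can be written as below. This is the textbook form and is assumed, not confirmed, to match Equation 1 of the manuscript, which is not reproduced in this discussion. The symbols $\lambda$ (decay constant), $\alpha$ (shape), and $\beta$ (scale) are introduced here for illustration.

```latex
C_Q(t) = \int_0^{\infty} C_P(t-\tau)\, h(\tau)\, e^{-\lambda \tau}\, d\tau,
\qquad
h(\tau) = \frac{\tau^{\alpha-1}}{\beta^{\alpha}\,\Gamma(\alpha)}\, e^{-\tau/\beta},
\qquad
\mathrm{mTT} = \int_0^{\infty} \tau\, h(\tau)\, d\tau = \alpha\beta
```

For tritium, $\lambda = \ln 2 / t_{1/2} \approx 0.0563\ \mathrm{yr^{-1}}$ with $t_{1/2} = 12.32$ yr; for stable water isotopes $\lambda = 0$.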
  Line 193: I am not familiar with the DownHill Simplex method. It is described in a single sentence, yet it is apparently the method for evaluating how appropriate is one versus the other TTD model. Could you please elaborate just a little bit on what this is for the unaffiliated reader? The KGE is used as the “model performance criteria” but you say that the Downhill Simplex was used to evaluate “the performance of each TTD”. This is confusing to me. Equation 1 is the model, but the variable performance of the model is due only to the selection of different functional forms of the TTD. So the performance of the model is a direct reflection of the performance of the function selected as TTD, no? Please clarify.
  
  Our response: We agree with the reviewer that the performance of the model, i.e., Equation 1 in our paper, is based on the performance of a selected TTD type. As far as the Downhill Simplex method is concerned, it is a way to “search” for the model parameters (e.g., mean age and the shape parameters for a gamma TTD) such that the model performance is optimal, which in our work is assessed by using the modified Kling-Gupta efficiency. In the revised version of our paper, we plan to explain this method in more detail for readers unfamiliar with it.
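The relationship between the search method and the performance criterion can be sketched in a few lines. The example below is an illustration under stated assumptions, not the authors' implementation: it assumes the commonly used modified KGE form (correlation, bias ratio, and ratio of coefficients of variation), uses a basic Nelder-Mead (downhill simplex) routine written from scratch, and replaces Equation 1 with a hypothetical two-parameter linear toy model fitted to synthetic data.

```python
import math

def kge_prime(sim, obs):
    """Modified Kling-Gupta efficiency: 1 - sqrt((r-1)^2 + (beta-1)^2 + (gamma-1)^2)."""
    n = len(obs)
    mu_s, mu_o = sum(sim) / n, sum(obs) / n
    sd_s = math.sqrt(sum((s - mu_s) ** 2 for s in sim) / n)
    sd_o = math.sqrt(sum((o - mu_o) ** 2 for o in obs) / n)
    if 0.0 in (sd_s, sd_o, mu_s, mu_o):
        return float("-inf")  # metric undefined; treat as the worst possible score
    cov = sum((s - mu_s) * (o - mu_o) for s, o in zip(sim, obs)) / n
    r = cov / (sd_s * sd_o)                  # linear correlation
    beta = mu_s / mu_o                       # bias ratio
    gamma = (sd_s / mu_s) / (sd_o / mu_o)    # ratio of coefficients of variation
    return 1.0 - math.sqrt((r - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)

def downhill_simplex(f, x0, step=0.5, max_iter=500, tol=1e-10):
    """Minimize f with Nelder-Mead moves: reflect, expand, contract, shrink."""
    n = len(x0)
    simplex = [list(x0)] + [[x + (step if i == j else 0.0) for j, x in enumerate(x0)]
                            for i in range(n)]
    for _ in range(max_iter):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        if abs(f(worst) - f(best)) < tol:
            break
        centroid = [sum(v[j] for v in simplex[:-1]) / n for j in range(n)]
        reflect = [c + (c - w) for c, w in zip(centroid, worst)]
        if f(reflect) < f(best):
            expand = [c + 2.0 * (c - w) for c, w in zip(centroid, worst)]
            simplex[-1] = expand if f(expand) < f(reflect) else reflect
        elif f(reflect) < f(simplex[-2]):
            simplex[-1] = reflect
        else:
            contract = [c + 0.5 * (w - c) for c, w in zip(centroid, worst)]
            if f(contract) < f(worst):
                simplex[-1] = contract
            else:  # shrink every vertex halfway toward the best one
                simplex = [best] + [[b + 0.5 * (x - b) for b, x in zip(best, v)]
                                    for v in simplex[1:]]
    return min(simplex, key=f)

# Synthetic "observations" from a hypothetical toy model y = 2x + 1.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
obs = [2.0 * xi + 1.0 for xi in x]

def objective(params):
    a, b = params
    sim = [a * xi + b for xi in x]
    return 1.0 - kge_prime(sim, obs)   # minimizing this maximizes KGE'

best = downhill_simplex(objective, [1.0, 0.0])
best_kge = 1.0 - objective(best)
print(best, best_kge)
```

In this framing the simplex routine only proposes parameter sets, while KGE’ scores each proposal, so differences in the final score between TTD types reflect the TTD form rather than the search method.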
  Line 228, equation 5: Use “C” with “Q” and “P” subscripted to indicate concentration in streamflow versus precipitation. You already adopted this notation in equation 1. Be consistent here and in subsequent equations.
  
  Our response: Please note that Equation (5) on line 228 in the original version of the paper is for input tracer flux, which is a product of tracer concentration (C) and precipitation flux (P). Similarly, Equation (6) is for tracer flux in stream water, which is the product of tracer concentration C and streamflow (Q).
  Lines 348-357: What about all the other models? You only discuss PF and Gamma. The KGE of the 1d-ADE falls exactly between the values for the Gamma and PF models, yet the mTT estimated by the 1d-ADE is factors of 8-9 less than the mTT from those models, respectively. Why do you ignore the other models and what do you conclude from this order-of-magnitude difference? If I understand Figure 4 correctly, then only the “ADE-nx” and Exponential function as TTDs seem to generate uniquely identifiable parameters. Is that correct? Neither model is discussed at all here.
  
  Our response: Please see our response to comment # 1.3 above.
  Lines 390-392: The data are also far too sparse to reliably fit the parameters for TTDs used in equation 1. Isn’t this confirmed by (1) the lack of unique solutions illustrated in most cases shown in Figure 4 and (2) the generally poor accuracy of all model simulations shown in Figure 5? I would argue yes.
  
  Our response: Between lines 390-392 in the main text, our emphasis is on fitting sinusoidal curves to the tritium concentration data for tritium-based Fyw estimation. We noted that sinusoidal curve fitting was not appropriate for the tritium tracer data due to the coarse resolution of our dataset, which also led to a higher standard error in the estimated tracer cycle amplitude. However, we were able to identify unique solutions for the gamma distribution type TTD parameters when using the tritium tracer under low-flow conditions (Figure 4B in the main document). Please note that Figure 4 shows the response surface for various TTD types. Therefore, neither the applicability nor the poor fit of a particular TTD type should be considered an indication of data limitation alone. For example, assumptions made when deriving the model equation of a particular TTD type may also contribute to a poor data fit. A case in point is the multiple-paths advection and dispersion model type TTD (ADE-nx) of Kirchner et al. [2001]. When deriving the model for this TTD type, it is assumed that the recharge to an aquifer is spatially distributed. When the tracer concentration predicted from this TTD is tested against the observations for our study site, the results show a very poor model fit (Figures 4D and 5D). Therefore, a poor fit for the ADE-nx model can be hypothesized as indicating that “recharge to the fractured bedrock aquifer is not spatially distributed at our field site”. This is an important finding, because such recharge pathways can lead to replenishment of deep subsurface storages that support mountain block recharge to valley fill aquifers.
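The sinusoidal fitting mentioned in this response can be illustrated with a short sketch. It assumes the common amplitude-ratio approach, in which Fyw is approximated by the ratio of the seasonal tracer-cycle amplitude in streamflow to that in precipitation. It uses ordinary least squares rather than the iteratively re-weighted least squares named in the paper, applies no flow weighting, and runs on synthetic data. The function names are hypothetical.

```python
import numpy as np


def seasonal_amplitude(t_years, conc):
    """Fit conc = a*cos(2*pi*t) + b*sin(2*pi*t) + k and return the cycle amplitude."""
    design = np.column_stack([
        np.cos(2 * np.pi * t_years),
        np.sin(2 * np.pi * t_years),
        np.ones_like(t_years),
    ])
    coef, *_ = np.linalg.lstsq(design, conc, rcond=None)
    return float(np.hypot(coef[0], coef[1]))


def young_water_fraction(t_precip, c_precip, t_stream, c_stream):
    """Amplitude ratio of the stream cycle to the precipitation cycle."""
    return seasonal_amplitude(t_stream, c_stream) / seasonal_amplitude(t_precip, c_precip)


# Synthetic weekly series over three years (hypothetical values).
t = np.arange(0.0, 3.0, 1 / 52)
precip = -8.0 + 2.0 * np.sin(2 * np.pi * t)        # input cycle, amplitude 2.0
stream = -8.0 + 0.5 * np.sin(2 * np.pi * t - 0.8)  # damped, lagged cycle, amplitude 0.5
f_yw = young_water_fraction(t, precip, t, stream)
print(f"F_yw estimate: {f_yw:.2f}")
```

With only a handful of samples per year, the three fitted coefficients are poorly constrained, which is one way to see why a coarse record inflates the standard error of the estimated amplitude.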
  Lines 400-414: The text makes no allusion at all to Figure 7D, which has unusual qualitative axes and cannot be easily interpreted by the reader. The figure caption only provides a citation to a previous work to explain the graphic. More explanation is needed in this section of the Results, or Figure 7B should be deleted.
  
  Our response: We are confused by this comment because Figure 7 in our work only has A and B panels: Figure 7A is cited in section 3.3 (line # 302) and in section 4.4 (line # 400), Figure 7B is cited on line 549 and 550 in section 5.4.
  Lines 437-440: I am unclear what is the importance or relevance of this concept of “short-term storage”. What is it and why does it matter? In any case, you present estimates of this metric based on three competing approaches that vary by a factor of approximately 125 (0.08, 0.22, and 10.7)! Which, if any, should we believe is correct, and why?
  
  Our response: In contrast to the traditional approach for estimating subsurface storage, which requires values for a subsurface property (e.g., porosity), the short-term storage can be estimated without information on porosity by simply using the Fyw metric in conjunction with streamflow and the threshold age for young water [Jasechko et al., 2016]. By quantifying this metric for our study site, we noted (between lines 562 and 564 of the original paper document) “Thus, after a threshold of 0.05 m short-term near-surface storage at MGC, the current study supports that infiltration may activate deeper groundwater flowpaths [Dwivedi et al., 2019].” Before our reported value, no estimates of this storage were available in the literature for our study site. Please see also our response to comment # 1.1.
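As a rough illustration of how such a storage estimate can be formed without porosity, the sketch below assumes the simple product form (young water fraction times discharge times threshold age) associated with the cited approach. The function name and the input values are hypothetical and are not taken from the study.

```python
def young_water_storage(f_yw, discharge_m_per_yr, threshold_age_yr):
    """Storage depth (m) holding water younger than the threshold age.

    Assumed relation: storage = F_yw * Q * tau_yw, with Q expressed as a depth
    per year over the catchment and tau_yw in years.
    """
    if not 0.0 <= f_yw <= 1.0:
        raise ValueError("f_yw must lie between 0 and 1")
    return f_yw * discharge_m_per_yr * threshold_age_yr


# Hypothetical inputs: 20 % young water, 0.5 m/yr of streamflow, 3-month threshold age.
storage_m = young_water_storage(f_yw=0.2, discharge_m_per_yr=0.5, threshold_age_yr=0.25)
print(f"short-term storage: {storage_m:.3f} m")
```

Because the result scales linearly with each input, differences in the Fyw estimate or in the assumed threshold age carry through directly to the storage value.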
  Lines 472-477: The results are highly dependent on the temporal resolution of the input time series. As I noted in a comment above, if a temporal dynamic in the tracer concentration in precipitation is hidden within a sample that accumulated over 5-7 days, then the model can’t possibly simulate the effects of that dynamic in streamflow. The results are entirely dependent on the resolution of sampling the tracer concentration in inflow, and the resolution used in this study is quite coarse.
  
  Our response: We agree with the reviewer that we cannot ask a model to reveal a pattern at a higher resolution than that of the data used in the model. Further, we also agree with the reviewer that the Fyw results depend on the temporal resolution of the data used in computing Fyw, which is the point we tried to make in the lines cited by the reviewer. However, we respectfully disagree with the reviewer that “the resolution used in this study is quite coarse”. Please note that for the stable water isotope-based Fyw, our data have a high resolution, i.e., daily for streamflow and approximately weekly for precipitation, as it does not rain every day at our study site [Heidbüchel et al., 2012]. For the tritium-based Fyw, our data have a coarse resolution because we sampled low-flow conditions, but they nonetheless facilitate a better understanding of the longer-period component, which would otherwise be hidden when sampling dynamic flow conditions in a catchment.
  Lines 499-501: You make a sweeping assumption here that the bedrock at several research sites is “water tight”. That seems quite speculative. What evidence supports this assumption? Most rocks are fractured and jointed to some extent. Even exposed, granitic plutons commonly have sufficient fracturing and water storage capacity to host woody-stemmed plant communities and support inter-storm flow from emergent springs. More support for this assumption is needed here, perhaps through more extensive synopsis of the geology of the sites used in these cited studies.
  
  Our response: Thank you for your comment. We concur with you completely. However, our use of the term “water tight” was a quotation from Gallart et al. [2020] who were describing their field site. We have made no assumptions about the tightness of the bedrock aquifer at other sites. Between lines 499 to 501 in the main document we stated the following “however, the fractured bedrock at MGC is functionally distinct than the "watertight" bedrock characterized by Gallart et al. [2020] and the majority of humid sites in Jasechko et al. [2016] that are comparable in size to MGC.”
  Lines 538-539 and 544-546: So, you’re saying the fraction of young water estimates are invalid when based on the use of Tritium as a tracer?
  
  Our response: To avoid ambiguity regarding the Fyw metric when using tritium for low flow conditions, we stated “A negligible F_yw at MGC calls into question of the suitability of the ³H-based F_yw approach for deeper groundwater.” Thus, our intent is to suggest (based on our study results) that this metric is unsuitable when applied to the baseflow or deeper groundwater component of a catchment’s flow system due to its longer residence time and significant mixing-amplitude dampening in the subsurface.
  Lines 555-564: Here there are a series of sentences elaborating some intricate details of methodology which seem misplaced in the discussion. They lead to the ultimate conclusion at the end of the paragraph that “infiltration may activate deeper groundwater flowpaths”. That is not a novel conclusion in scientific hydrology, and it is not even stated definitively here (i.e., may..). This paper uses a wide ensemble of methodological approaches, which, from my view, has only created ambiguity in how the markedly contrasting results can be interpreted. I find no new insight into hydrological processes resulting from all this computational effort.
  
  Our response: Thank you! The three cases mentioned in these lines will be placed in the methods sections in the revised main document. Please see also our response to comment # 1.1 above.
  Comments on Figures and Tables:
  Figure 3: Does the inset have a linear scale? If not, please make it linear. If so, please add more tick marks to the vertical axis so we can approximate the numeric values of the data points. Are these six data points all you have to calibrate the parameters of the TTD models used with equation 1? If so, that seems inadequate.
  
  Our response: The inset plot has a logarithmic scale, as does the main plot. This will be clearly stated in the revised version of this figure. Please see also our response to comment # 1.2 above.
  Figure 4: Is the vertical axis the KGE? If so, please label it that way. I am unclear what the variable “response surface” on the vertical axis indicates.
  
  Our response: A revised version of this figure that addresses your comment will be provided with the revised main document.
  Table 1: Words and numbers should not be split between rows. Use emboldened lines, or no lines, to better delineate the content. This is not acceptable for a journal article. Please make it more presentable.
  
  Our response: A revised version of Table 1 that addresses your comment will be provided with the revised version of the main document.
  Figure 5: Y-axis labels should have “3” as a superscript preceding “H” to conform to established conventions of symbolizing isotopes. Would suggest compressing this into a single graph with a legend indicating the results from different TTD models. The gray dots are the same across all 5 subplots.
  
  Our response: In the revised version of Figure 5, we plan to (i) include all TTDs in a single plot and (ii) properly label the y-axis.
  Figure 7: Here and elsewhere the font size is illegible. Please enlarge font on axes and in legends.
  
  Our response: Thank you. This figure will be revised in the next version of the main document.
  Works Cited:
  Botter, G., E. Bertuzzo, and A. Rinaldo (2010), Transport in the hydrologic response: Travel time distributions, soil moisture dynamics, and the old water paradox, Water Resources Research, 46, W03514, doi:10.1029/2009wr008371.
  Botter, G., E. Bertuzzo, and A. Rinaldo (2011), Catchment residence and travel time distributions: The master equation, Geophysical Research Letters, 38, L11403, doi:10.1029/2011gl047666.
  Harman, C. J. (2015), Time-variable transit time distributions and transport: Theory and application to storage-dependent transport of chloride in a watershed, Water Resources Research, 51(1), 1-30, doi:10.1002/2014wr015707.
  Heidbuchel, I., P. A. Troch, and S. W. Lyon (2013), Separating physical and meteorological controls of variable transit times in zero-order catchments, Water Resources Research, 49(11), 7644-7657, doi:10.1002/2012wr013149.
  Heidbuchel, I., P. A. Troch, S. W. Lyon, and M. Weiler (2012), The master transit time distribution of variable flow systems, Water Resources Research, 48, W06520, doi:10.1029/2011wr011293.
  Kim, M., L. A. Pangle, C. Cardoso, M. Lora, T. H. M. Volkmann, Y. Wang, C. J. Harman, and P. A. Troch (2016), Transit time distributions and StorAge Selection functions in a sloping soil lysimeter with time-varying flow paths: Direct observation of internal and external transport variability, Water Resources Research, 52(9), 7105-7129, doi:10.1002/2016WR018620.
  Lewis, S., and A. Nir (1978), On tracer theory in geophysical systems in the steady and non-steady state. Part II. Non-steady state - theoretical introduction, Tellus, 30, 260-271.
  Lyon, S. W., S. L. E. Desilets, and P. A. Troch (2008), Characterizing the response of a catchment to an extreme rainfall event using hydrometric and isotopic data, Water Resources Research, 44(6), W06413, doi:10.1029/2007wr006259.
  Lyon, S. W., S. L. E. Desilets, and P. A. Troch (2009), A tale of two isotopes: differences in hydrograph separation for a runoff event when using delta D versus delta O-18, Hydrol. Process., 23(14), 2095-2101, doi:10.1002/hyp.7326.
  McGuire, K. J., and J. J. McDonnell (2006), A review and evaluation of catchment transit time modeling, J. Hydrol., 330(3-4), 543-563, doi:10.1016/j.jhydrol.2006.04.020.
  Rinaldo, A., K. J. Beven, E. Bertuzzo, L. Nicotina, J. Davies, A. Fiori, D. Russo, and G. Botter (2011), Catchment travel time distributions and water flow in soils, Water Resources Research, 47, doi:10.1029/2011wr010478.
  Rozanski, K., L. Araguas-Araguas, and R. Gonfiantini (1993), Isotopic patterns in modern global precipitation, American Geophysical Union Monographs, 78, 1-36.
  
  van der Velde, Y., P. J. J. F. Torfs, S. E. A. T. M. van der Zee, and R. Uijlenhoet (2012), Quantifying catchment-scale mixing and its effect on time-varying travel time distributions, Water Resour. Res., 48(6), W06536, doi:10.1029/2011wr011310.
  References
  Ajami, H., P. A. Troch, T. Maddock, T. Meixner, and C. Eastoe (2011), Quantifying mountain block recharge by means of catchment-scale storage-discharge relationships, Water Resources Research, 47(4), 1-14.
  Botter, G., E. Bertuzzo, and A. Rinaldo (2011), Catchment residence and travel time distributions: The master equation, Geophysical Research Letters, 38(11), 1-6.
  Dwivedi, R., T. Meixner, J. McIntosh, P. A. T. Ferré, C. J. Eastoe, G.-Y. Niu, R. L. Minor, G. Barron-Gafford, and J. Chorover (2019), Hydrologic functioning of the deep Critical Zone and contributions to streamflow in a high elevation catchment: testing of multiple conceptual models, Hydrological Processes, 33, 476-494, doi: 10.1002/hyp.13363.
  Dwivedi, R., C. Eastoe, J. F. Knowles, L. Hamann, T. Meixner, P. A. T. Ferre, C. Castro, W. E. Wright, G.-Y. Niu, R. Minor, G. A. Barron-Gafford, N. Abramson, B. Mitra, S. A. Papuga, M. Stanley, and J. Chorover (2021), An improved practical approach for estimating catchment-scale response functions through wavelet analysis, Hydrological Processes, 35(3), 1-20.
  Frisbee, M. D., J. L. Wilson, J. D. Gomez-Velez, F. M. Phillips, and A. R. Campbell (2013), Are we missing the tail (and the tale) of residence time distributions in watersheds?, Geophysical Research Letters, 4633–4637.
  Gallart, F., M. Valiente, P. Llorens, C. Cayuela, M. Sprenger, and J. Latron (2020), Investigating young water fractions in a small Mediterranean mountain catchment: Both precipitation forcing and sampling frequency matter, Hydrological Processes, 34(17), 3618-3634.
  Gleeson, T., K. M. Befus, S. Jasechko, E. Luijendijk, and M. B. Cardenas (2015), The global volume and distribution of modern groundwater, Nature Geoscience, doi: 10.1038/ngeo2590.
  Harman, C. J. (2015), Time-variable transit time distributions and transport: Theory and application to storage-dependent transport of chloride in a watershed, Water Resources Research, 51, 1-30.
  Heidbüchel, I., P. A. Troch, and S. W. Lyon (2013), Separating physical and meteorological controls of variable transit times in zero-order catchments, Water Resources Research, 49(11), 7644-7657.
  Heidbüchel, I., P. A. Troch, S. W. Lyon, and M. Weiler (2012), The master transit time distribution of variable flow systems, Water Resources Research, 48(6), 1-19.
  Hrachowitz, M., P. Benettin, B. M. van Breukelen, O. Fovet, N. J. K. Howden, L. Ruiz, Y. van der Velde, and A. J. Wade (2016), Transit times-the link between hydrology and water quality at the catchment scale, Wiley Interdisciplinary Reviews: Water, doi: 10.1002/wat2.1155.
  Jasechko, S., J. W. Kirchner, J. M. Welker, and J. J. McDonnell (2016), Substantial proportion of global streamflow less than three months old, Nature Geoscience, 9(2), 126-129.
  Kirchner, J. W. (2016a), Aggregation in environmental systems- Part 1: Seasonal tracer cycles quantify young water fractions, but not mean transit times, in spatially heterogeneous catchments, Hydrology and Earth System Sciences, 20(1), 279-297.
  Kirchner, J. W. (2016b), Aggregation in environmental systems -Part 2: Catchment mean transit times and young water fractions under hydrologic nonstationarity, Hydrology and Earth System Sciences, 20(1), 299-328.
  Kirchner, J. W., X. Feng, and C. Neal (2001), Catchment-scale advection and dispersion as a mechanism for fractal scaling in stream tracer concentrations, Journal of Hydrology, 254, 82-101.
  Stewart, M. K., U. Morgenstern, M. A. Gusyev, and P. Maloszewski (2017), Aggregation effects on tritium-based mean transit times and young water fractions in spatially heterogeneous catchments and groundwater systems, and implications for past and future applications of tritium, Hydrology and Earth System Sciences, 21, 4615–4627.
  Suckow, A. (2014), The age of groundwater – Definitions, models and why we do not need this term, Applied Geochemistry, 50, 222-230.
  
  Citation: https://doi.org/10.5194/hess-2021-355-AC3
RC2:
'Comment on hess-2021-355', Anonymous Referee #2, 15 Oct 2021

The work of Dwivedi et al. studies travel times in the Marshall Gulch research catchment, Arizona, for a better understanding of flow paths and storage in a mountain catchment. This is done through a strong data set of stable isotopes and tritium. The paper is mostly well written. I like and appreciate the combination of tritium and stable isotopes, I believe that this is an important endeavor. However, the current manuscript has a range of serious limitations.

First, the introduction reads like a patchwork of ideas and concepts but stays vague and thus does not convincingly outline a limitation/research gap. Thus, the research question came somewhat out of the blue for me. I was not able to find any information on whether research on these objectives is needed or not. After reading the full manuscript, I felt that this is even more important, as the work reads like a compilation of applied methods without a clear strategy, concluding that there are different results depending on tracer and methods.

Second, the methods are an issue. It is unclear why the methods were chosen. It feels like an application of a range of methods to see what comes out. I cannot find a clear strategy behind it. Even more critical, by applying time-invariant approaches for travel time distributions, the paper's methodology is lagging a decade behind recent developments in the field (see the wide range of work, even cited, on time-variant TTD and SAS functions). The young water fraction is state of the art though, but here the work again suffers from the lack of a clear strategy. In addition, stable isotope and tritium tracers should ideally be used in a joint calibration to obtain a travel time consistent with the tritium and stable isotope observations (cf. Rodriguez et al., 2021). You might even be able to calibrate the multimodal age distributions of your travel time doing so – however, this is just speculation. Yet, this could be a really nice contribution to the field of TTDs.

Overall, I think that the manuscript would need a very major rework to be publishable. This would include a full adaptation of the methods to the state of the art. I doubt that this can be done within a major revision.

Citation: https://doi.org/10.5194/hess-2021-355-RC2
- AC2: 'Reply on RC2', Ravindra Dwivedi, 27 Oct 2021
  
  RC2: 'Comment on hess-2021-355', Anonymous Referee #2
  2.1 The work of Dwivedi et al. studies travel times in the Marshall Gulch research catchment, Arizona, for a better understanding of flow paths and storage in a mountain catchment. This is done through a strong data set of stable isotopes and tritium. The paper is mostly well written. I like and appreciate the combination of tritium and stable isotopes, I believe that this is an important endeavor. However, the current manuscript has a range of serious limitations.
  Our response: We appreciate the reviewer recognizing the benefits of using multiple tracers. To address your concerns and comments, we plan to make major revisions to our paper, as described below. Please see also our specific responses to comments # 2.2 and 2.3.
  2.2 First, the introduction reads like a patchwork of ideas and concepts but stays vague and thus does not convincingly outline a limitation/research gap. Thus, the research question came somewhat out of the blue for me. I was not able to find any information on whether research on these objectives is needed or not. After reading the full manuscript, I felt that this is even more important, as the work reads like a compilation of applied methods without a clear strategy, concluding that there are different results depending on tracer and methods.
  Our response: We acknowledge that some reorganization and restructuring of our paper is clearly warranted. As a roadmap forward, please see our response to comment # 1.2 from the first reviewer, where we more clearly describe the novel contributions of our study. In the revised version of our paper, we plan to reorganize the introduction section to better highlight the research gaps. We further plan to use the revised introduction section to guide the reorganization of the other parts of the paper, e.g., methods, results, and discussion. Thank you for your suggestions.
  2.3 Second, the methods are an issue. It is unclear why the methods were chosen. It feels like an application of a range of methods to see what comes out. I cannot find a clear strategy behind it. Even more critical, by applying time-invariant approaches for travel time distributions, the paper's methodology is lagging a decade behind recent developments in the field (see the wide range of work, even cited, on time-variant TTD and SAS functions). The young water fraction is state of the art though, but here the work again suffers from the lack of a clear strategy. In addition, stable isotope and tritium tracers should ideally be used in a joint calibration to obtain a travel time consistent with the tritium and stable isotope observations (cf. Rodriguez et al., 2021). You might even be able to calibrate the multimodal age distributions of your travel time doing so – however, this is just speculation. Yet, this could be a really nice contribution to the field of TTDs.
  Our response: Our aim was to apply the fraction of young water metric in conjunction with the mean transit time metric to improve scientific understanding of transient flow paths in high elevation mountain systems. That said, we respectfully disagree with the criticism that our methods are lagging behind recent developments in the field. Please see our response to comment # 1.2 from the first reviewer. We also highlight that Rodriguez et al. [2021], who used both stable water isotopes and tritium in jointly assessing a suitable TTD type, sampled both tracers under similar dynamic flow conditions, which contrasts with our sampling conditions for tritium.
  2.4 Overall, I think that the manuscript would need a very major rework to be publishable. This would include a full adaptation of the methods to the state of the art. I doubt that this can be done within a major revision.
  Our response: Please see our responses to comments #1.1, 1.2, 2.2, and 2.3 above.
  References
  Rodriguez, N. B., L. Pfister, E. Zehe, and J. Klaus (2021), Testing the truncation of travel times with StorAge Selection functions using deuterium and tritium as tracers, Hydrol. Earth Syst. Sci., 25, 401–428.
  
  Citation: https://doi.org/10.5194/hess-2021-355-AC2
RC3:
'Comment on hess-2021-355, Anonymous Referee #', Anonymous Referee #3, 17 Oct 2021

General comments

According to the reviewer, the paper needs major changes for the following reasons: (1) the overall manuscript is confusing and (2) the contributions of this study are not clear.

Although some scientific contributions can be easily seen, it is necessary to make the contributions of this study clear. In general, the manuscript is difficult to read because there is no clear common thread, and some sections seem to come out of the blue. Some sections are inconsistent with previous affirmation, which reduces the truthfulness of the manuscript. The overall manuscript is too long, especially the Methods and Results sections.

General comment of each section.

The introduction section is confusing and does not show specific novelty and sound scientific value at global scale. I suggest restructuring the introduction, reinforcing the state of the art of previous studies using isotopes and TT models, and afterwards explaining the novelties of this paper. The terms deep and shallow groundwater flow in a mountain range are used without prior explanation. Due to the complexity of these terms, I would recommend defining them beforehand. The term fractured rock system is used on some occasions, but it is not clear if this applies to the study area. If this is the case, it would be appropriate to explain how fractures are going to be taken into account.

At some point it seems that the authors try to reproduce their previous work, Dwivedi et al. (2021), but for the “deep groundwater”; however, the authors say that this has already been done by Ajami et al. (2011) and Dwivedi et al. (2019b). Again, detailed reasoning about why this paper is novel is needed; it is confusing. The authors highlighted the contribution of using multiple-year isotope data; however, only one year of 3H data is used, which is again confusing.

The Data section needs a better description of the collector type to understand the representativeness of the data. I strongly suggest improving Fig. 1. In the manuscript, Fig. 1A and B are referred to, but there is no A and B in the figure. I suggest incorporating the rivers, a standard scale (0, 0.5 and 1 km, for example) and a higher-resolution DEM; it looks poor.

The Methodology section seems to be more a state of the art of the existing methods than something new. There are detailed descriptions of some methods that make the reader lose sight of the main goal of each approximation/estimation. I suggest deleting all dispensable information. The authors say in the introduction that one of the novelties is the use of multiple-year data, yet only one year of 3H data is used. There is a repeated need to redefine the main goals and the novelty of this study.

The Results section is first organized by method, then changes to shallow and deep groundwater and the mix between isotope type and TTD method, and finally includes Fyw and Tyw. This section is difficult to follow.

Explaining which is the better method or the most reliable one in each case, instead of only talking about the existing differences, will strongly improve the Discussion section. It is not surprising to obtain different results with different methods. I would suggest directing the discussion to explain lines 577-579.

Specific comments

Line 52: Water stable isotopes: although this term has been used in other works, the term "stable water isotopes" is not correct. Water itself does not have isotopes. The correct term is stable 18O, 2H isotopes of water.

Line 61: Underestimating or overestimating transit times has consequences beyond the correct understanding of the water chemistry. It would be appropriate to explain the most important ones.

Line 92: The second goal is not clear; I do not understand what you are trying to study.

Line 180: â(ð) needs to be defined here instead of at line 242.

Line 296: Why only one year period?

Line 426: I would say 2-3 years.

Line 440: 10.7 “mm”

Citation: https://doi.org/10.5194/hess-2021-355-RC3
- AC1: 'Reply on RC3', Ravindra Dwivedi, 27 Oct 2021
  
  RC3: 'Comment on hess-2021-355, Anonymous Referee #', Anonymous Referee #3
  General comments
  
  3.1 According to the reviewer, the paper needs major changes for the following reasons: (1) the overall manuscript is confusing and (2) the contributions of this study are not clear.
  
  Although some scientific contributions can be easily seen, it is necessary to make the contributions of this study clear. In general, the manuscript is difficult to read because there is no clear common thread, and some sections seem to come out of the blue. Some sections are inconsistent with previous affirmation, which reduces the truthfulness of the manuscript. The overall manuscript is too long, especially the Methods and Results sections.
  Our response: We appreciate Reviewer 3’s constructive review. These comments are also consistent with those of Reviewers 1 and 2, and so it is clear that we need to revise our introduction and justification. Please see our specific responses to comments # 1.1 and 1.2. If “Some sections are inconsistent with previous affirmation” is meant to refer to our previous TTD work [Dwivedi et al., 2021], we note that that study was based on wavelet analysis of high-density tracer-flux time series data, and that neither tritium tracer concentrations nor the fraction of young water metric were used in that study.
  General comment of each section.
  
  3.2 The introduction section is confusing and does not show specific novelty and sound scientific value at global scale. I suggest restructuring the introduction, reinforcing the state of the art of previous studies using isotopes and TT models, and afterwards explaining the novelties of this paper. The terms deep and shallow groundwater flow in a mountain range are used without prior explanation. Due to the complexity of these terms, I would recommend defining them beforehand. The term fractured rock system is used on some occasions, but it is not clear if this applies to the study area. If this is the case, it would be appropriate to explain how fractures are going to be taken into account.
  Our response: Please see our response to comments # 1.1 and 2.1. In the revised version of our paper, we will clearly state our definitions of both shallow and deep groundwater. We note that there are no deep wells within our study site, and therefore the nature of the fracture network within the study area is not known. For this reason, any impacts of the fracture network on groundwater flow paths in our study area are considered to be represented by the tritium TTD.
  3.3 At some point it seems that the authors try to reproduce their previous work, Dwivedi et al. (2021), but for the “deep groundwater”; however, the authors say that this has already been done by Ajami et al. (2011) and Dwivedi et al. (2019b). Again, detailed reasoning about why this paper is novel is needed; it is confusing. The authors highlighted the contribution of using multiple-year isotope data; however, only one year of 3H data is used, which is again confusing.
  Our response: Please see our response to comment # 3.1 above. That said, please note that neither Ajami et al. [2011] nor Dwivedi et al. [2019] have used/evaluated the usefulness of the fraction of young water metric to reveal the dynamic nature of flow paths either at the same or similar sites. Please also note that our statement related to use of multiple years’ data is applicable to the fraction of young water metric because the data represent water years 2008 through 2012. While our compiled/collected data for tritium is sparse, it includes sampling dates from 2009 through 2018.
  3.4 The Data section needs a better description of the collector type to understand the representativeness of the data. I strongly suggest improving Fig. 1. In the manuscript, Fig. 1A and 1B are referred to, but there is no A and B in the figure. I suggest incorporating the rivers, a standard scale (0, 0.5 and 1 km, for example) and a higher-resolution DEM; the current figure looks poor.
  Our response: Thank you. In response to your comment, Figure 1 will be improved and properly labeled in the revised version of the paper. Additionally, the data section will be updated to describe the collector type.
  3.5 The Methodology section reads more like a review of the state of the art of existing methods than something new. There are detailed descriptions of some methods that make the reader lose sight of the main goal of each approximation/estimation. I suggest deleting all dispensable information. The authors say in the introduction that one of the novelties is the use of multiple-year data, yet only one year of ³H data is used. There is a repeated need to redefine the main goals and the novelty of this study.
  Our response: Please see our response to comment # 3.3 above. In the revised version of the paper, the methodology section will be shortened and reorganized to remove non-essential information. Please see also our response to comments # 1.1 and 2.1.
  3.6 The Results section is first organized by method, then changes to shallow and deep groundwater and a mix of isotope type and TTD method, and finally includes Fyw and Tyw. This section is difficult to follow.
  Our response: In the revised version of the paper, the results section will be reorganized to ensure that our study findings are conveyed as clearly and accurately as possible.
  3.7 Explaining which method is better or most reliable in each case, instead of only discussing the existing differences, would strongly improve the Discussion section. It is not surprising to obtain different results with different methods. I suggest directing the discussion toward explaining lines 577-579.
  Our response: The revised version of the document will include an explanation of method reliability when using multiple methods (e.g., for the fraction of young water, Fyw). We appreciate your suggestions related to Fyw and its discharge sensitivity.
  
  Specific comments
  
  3.8 Line 52: Water stable isotopes: although this term has been used in other works, the term “stable water isotopes” is not correct. Water itself does not have isotopes. The correct term is stable ¹⁸O and ²H isotopes of water.
  Our response: Our use of the term stable water isotope is similar to use of this term in other studies (e.g., Ajami et al. [2011]; Heidbüchel et al. [2013]; Heidbüchel et al. [2012]). In the revised document, we will clearly state that by stable water isotopes we mean stable δ¹⁸O and δ²H isotopes of water.
  3.9 Line 61: Underestimating or overestimating transit times has consequences beyond the correct understanding of the water chemistry. It would be appropriate to explain the most important ones.
  Our response: The sentence between lines 59 and 61 is rephrased to read “Underestimated transit times can have cascading impacts on our understanding of subsurface weathering rates, leading to incorrect understanding of stream water chemistry [Clow et al., 2018; Frisbee et al., 2013].”
  3.10 Line 92: The second goal is not clear; I do not understand what you are trying to study.
  Our response: Our second study objective is to estimate the fraction of young water metric and the corresponding subsurface storage for the dynamic and slow flow components of a catchment system. This objective is achieved by using stable water isotope and tritium tracers sampled during different flow conditions. Please note that when using the fraction of young water metric, subsurface storage in terms of short-term storage can be estimated without knowing aquifer properties [Jasechko et al., 2016]. Thus, the subsurface storage supporting streamflow through baseflow in a high-elevation catchment can be estimated without knowing the effective aquifer properties of fractured bedrock aquifers, which is a great advantage for sites where fractured bedrock aquifers are not well characterized, e.g., our study site. In the revised document, we plan to state this goal more clearly.
  3.11 Line 180: h(τ) needs to be defined here instead of on line 242.
  Our response: This will be properly addressed in the revised version of the paper document.
  3.12 Line 296: Why only one year period?
  Our response: We invoked an annual tracer cycle or period on line 296 of the original version of the main document because the previous literature on the use of Fyw has mostly focused on annual tracer cycles. However, using our proposed mathematical model, we evaluated Fyw for various periods using tritium tracers (e.g., Table S5 in the original version of the supporting information document).
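  As an illustration of the annual tracer cycle idea in this response, the following is a minimal sketch of the sine-wave amplitude-ratio estimate of Fyw that is widely used in the young-water literature: a one-year sinusoid is fitted to the precipitation and streamflow δ¹⁸O series, and Fyw is approximated by the ratio of the two fitted amplitudes. It assumes ordinary (optionally weighted) least squares rather than the IRLS or WWT methods named in the manuscript. The function names and the synthetic series are hypothetical and are not data or code from the study.

```python
import numpy as np


def seasonal_amplitude(t_years, values, weights=None):
    """Fit values ~ a*cos(2*pi*t) + b*sin(2*pi*t) + k and return sqrt(a^2 + b^2).

    t_years is time in decimal years, so the fitted cycle has a one-year period.
    weights (optional) are per-sample weights, e.g. precipitation volumes.
    """
    t = np.asarray(t_years, dtype=float)
    y = np.asarray(values, dtype=float)
    X = np.column_stack([np.cos(2 * np.pi * t),
                         np.sin(2 * np.pi * t),
                         np.ones_like(t)])
    if weights is not None:
        # Weighted least squares: scale rows by the square root of the weights.
        w = np.sqrt(np.asarray(weights, dtype=float))
        X = X * w[:, None]
        y = y * w
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.hypot(coef[0], coef[1]))


def fraction_young_water(t_p, tracer_p, t_s, tracer_s, precip_weights=None):
    """Approximate Fyw as the stream-to-precipitation seasonal amplitude ratio."""
    a_p = seasonal_amplitude(t_p, tracer_p, precip_weights)
    a_s = seasonal_amplitude(t_s, tracer_s)
    return a_s / a_p


# Synthetic example only (hypothetical values, not measurements from the site):
# five years of roughly biweekly samples with a damped, lagged stream signal.
t = np.arange(0.0, 5.0, 1.0 / 24.0)
precip_d18o = -8.0 + 3.0 * np.sin(2 * np.pi * t)
stream_d18o = -8.0 + 0.6 * np.sin(2 * np.pi * t - 0.8)

fyw = fraction_young_water(t, precip_d18o, t, stream_d18o)
print(f"Estimated Fyw: {fyw:.3f}")
```

  Evaluating other tracer periods, as the response describes for tritium, would amount to replacing the fixed one-year period in the design matrix with the period of interest; that extension is not shown here.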
  3.13 Line 426: I would say 2-3 years.
  
  Our response: Agreed, we will make this change.
  3.14 Line 440: 10.7 “mm”
  Our response: Thank you!
  References
  Ajami, H., P. A. Troch, T. Maddock, T. Meixner, and C. Eastoe (2011), Quantifying mountain block recharge by means of catchment-scale storage-discharge relationships, Water Resources Research, 47(4), 1-14.
  Clow, D. W., M. A. Mast, and J. O. Sickman (2018), Linking transit times to catchment sensitivity to atmospheric deposition of acidity and nitrogen in mountains of the western United States, Hydrological Processes, 32(16), 2456-2470.
  Dwivedi, R., T. Meixner, J. McIntosh, P. A. T. Ferré, C. J. Eastoe, G.-Y. Niu, R. L. Minor, G. Barron-Gafford, and J. Chorover (2019), Hydrologic functioning of the deep Critical Zone and contributions to streamflow in a high elevation catchment: testing of multiple conceptual models, Hydrological Processes, 33, 476-494, doi: 10.1002/hyp.13363.
  Dwivedi, R., C. Eastoe, J. F. Knowles, L. Hamann, T. Meixner, P. A. T. Ferre, C. Castro, W. E. Wright, G.-Y. Niu, R. Minor, G. A. Barron-Gafford, N. Abramson, B. Mitra, S. A. Papuga, M. Stanley, and J. Chorover (2021), An improved practical approach for estimating catchment-scale response functions through wavelet analysis, Hydrological Processes, 35(3), 1-20.
  Frisbee, M. D., J. L. Wilson, J. D. Gomez-Velez, F. M. Phillips, and A. R. Campbell (2013), Are we missing the tail (and the tale) of residence time distributions in watersheds?, Geophysical Research Letters, 40(17), 4633-4637.
  Heidbüchel, I., P. A. Troch, and S. W. Lyon (2013), Separating physical and meteorological controls of variable transit times in zero-order catchments, Water Resources Research, 49(11), 7644-7657.
  Heidbüchel, I., P. A. Troch, S. W. Lyon, and M. Weiler (2012), The master transit time distribution of variable flow systems, Water Resources Research, 48(6), 1-19.
  Jasechko, S., J. W. Kirchner, J. M. Welker, and J. J. McDonnell (2016), Substantial proportion of global streamflow less than three months old, Nature Geoscience, 9(2), 126-129.
  
  Citation: https://doi.org/10.5194/hess-2021-355-AC1

Supplement

https://doi.org/10.5194/hess-2021-355-supplement

Short summary

This study applies multiple metrics, including the fraction of young water, its discharge sensitivity, and mean transit time, together with young and old groundwater age tracers, to improve understanding of the dynamic nature of hydrologic flow paths at a sub-humid mountainous site. The results show that these metrics yield unique information and help clarify the nature of transient flow paths and observable storage volumes that contribute to streamflow.

