I thank the authors for taking the time to respond to the previous comments and revise the manuscript. However, there are still several important points on which I cannot agree with them. I also still think that the authors miss some critical points and rely too heavily on their model results to support their conclusions. These issues obscure what readers can learn from this study. I believe that this manuscript still has potential, but some issues need to be resolved before it can be considered for publication.
THE LIMITATIONS OF THE DATA AND THE STUDY SITE
I noticed that the authors discuss many limitations of this study well in one section, but some crucial limitations remain that can significantly obscure their major arguments. I list some of those limitations here, together with my opinions.
Figure 3 shows that there are no tritium samples at high flow conditions. Thus, one cannot learn about transport dynamics at high flow from 3H, no matter which model is used. I therefore believe that any arguments based on the 3H-based model results at high flow conditions (e.g., the TTDs at high flow conditions) are risky, because such results are merely "extrapolations" made by the model. In interpreting their results, the authors mostly used the discharge-weighted TTD, which is based in part on the TTDs estimated at high flow conditions and even gives more weight to those TTDs. Thus, many (indeed most) of the arguments in the abstract and the conclusion, such as "Tritium and stable isotopes both had the ability to reveal short travel times in streamflow", "The travel time differences were small compared to previous studies, and contrary to prior expectations, we found that these differences were more pronounced for young water than for old water", "our results highlight that stable isotopes and tritium have different information contents on travel times but they can still result in similar TTDs.", and "We conclude that stable isotopes do not seem to systematically underestimate travel times or storage compared to tritium", are based on extrapolation. Drawing scientific conclusions from extrapolation is risky and not good practice.
Another problem is that travel times are relatively short in this catchment. It has been argued that tritium is beneficial because it allows us to examine long time-scale transport dynamics (e.g., > ~4 years in Stewart et al., 2010). However, in this catchment, travel times are relatively short in general, and a considerable fraction of the TTD (> ~90%) lies at travel times of less than 5 years (based on Table 3). These short travel times obscure the relative importance of using 3H to examine longer time-scale transport dynamics, because longer time-scale transport is less important (or negligible) in reproducing the tracer dynamics in this catchment. Therefore, I question whether the study site is adequate, and I am not sure about the value of most of their arguments, such as "The travel time differences were small compared to previous studies, and contrary to prior expectations, we found that these differences were more pronounced for young water than for old water", "we did not find that stable isotopes are blind to old water fractions as suggested by earlier travel time studies", "Based on the results in our experimental catchment in Luxembourg, we conclude that the perception that stable isotopes systematically truncate the tails of TTDs is not valid", "our results highlight that stable isotopes and tritium have different information contents on travel times but they can still result in similar TTDs.", and "We conclude that stable isotopes do not seem to systematically underestimate travel times or storage compared to tritium". Can a similar conclusion be expected for a catchment with longer travel times? Or does it arise only because the studied catchment has short water travel times in general? If the latter is the case, what can be said about the well-known importance of the tritium tracer from studying this catchment?
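One way to make this concern concrete would be to report how much of the discharge weight falls on flows above the highest flow at which tritium was actually sampled. The sketch below is only an illustration: the function name and all numbers are hypothetical and are not taken from the manuscript, and it assumes a discharge series with equal time steps.

```python
def extrapolated_weight_fraction(discharge, sampled_discharge):
    """Share of total discharge that occurs at flows above the largest sampled flow.

    discharge: flow values at equal time steps (assumption).
    sampled_discharge: flow values at the times tracer samples were taken.
    """
    if not discharge or not sampled_discharge:
        raise ValueError("both series must be non-empty")
    total = sum(discharge)
    if total <= 0:
        raise ValueError("total discharge must be positive")
    q_max_sampled = max(sampled_discharge)
    # Flows above the sampled range are covered only by model extrapolation.
    above = sum(q for q in discharge if q > q_max_sampled)
    return above / total


# Synthetic numbers for illustration only (not the authors' data).
daily_q = [0.2, 0.3, 0.25, 1.5, 4.0, 2.0, 0.4, 0.3]
tritium_sample_q = [0.2, 0.3, 0.4]
frac = extrapolated_weight_fraction(daily_q, tritium_sample_q)
print(f"Fraction of discharge weight outside the sampled flow range: {frac:.2f}")
```

If this fraction is large for the real data, the discharge-weighted TTD is dominated by conditions that the tritium samples never observed.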
THE MODEL AND THE RESULT INTERPRETATIONS
Again, their model cannot reproduce some short time-scale transport dynamics (based on Figure 5 and the low NSE values). Such an inadequate model structure underestimates the information content of 2H, because it yields poorly constrained posterior parameter distributions for the behavioral models. If they had a model that captured short time-scale dynamics well, the posterior distributions of the associated parameters could be more constrained than those of a 3H-based model. Thus, more information could be learned from 2H, relative to a 3H-based model, than is described in this paper. Therefore, I disagree with the authors' argument that a better performing model would not change the conclusion of this study. For example, their argument "Tritium was slightly more informative than stable isotopes for travel time analysis despite a lower number of tracer samples" is sensitive to their model structure. Tritium was useless at wet conditions (because they have no samples at wet conditions), and the 2H-based model was not well constrained at wet conditions because of its structural problem (which leads their analysis method to underestimate the information content of 2H).
Also, the authors reported that the 3H-based model learned more information (4.47 bits) than the 2H-based model (4.08 bits). The authors argued, based on the posterior distributions of the model parameters (line 504), that this is because 3H informed the model about ET processes more than 2H did. However, they give no scientific reason why 3H would inform more about ET processes. I do not think that there is any literature on this, and I personally cannot think of any reason. Without a scientific basis, the result seems to be just an artifact of their model structure and their method of analysis. Why would 3H inform more about the ET processes than 2H?
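For reference, the NSE penalizes exactly this kind of missed short-term variability. The sketch below uses the standard Nash-Sutcliffe definition with synthetic δ2H values (not the authors' data; the function and variable names are my own) to show how a simulation that damps an event response scores poorly even when the baseline is matched.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations about their mean."""
    if len(observed) != len(simulated) or len(observed) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observations have zero variance")
    return 1.0 - ss_res / ss_tot


# Synthetic stream signal with one event response (illustrative values only).
obs = [-50.0, -50.0, -30.0, -50.0, -50.0]
sim_damped = [-50.0, -50.0, -46.0, -50.0, -50.0]  # event response largely missed
sim_good = [-50.0, -50.0, -32.0, -50.0, -50.0]    # event response captured

nse_damped = nash_sutcliffe(obs, sim_damped)
nse_good = nash_sutcliffe(obs, sim_good)
print(f"damped: {nse_damped:.3f}, good: {nse_good:.3f}")
```

A model of the first kind cannot use the event-scale signal in 2H to constrain its parameters, which is the structural issue raised above.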
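To illustrate why I consider such a number conditional on the model structure, the sketch below assumes that the information gain is measured as a Kullback-Leibler divergence between binned posterior and prior parameter distributions, in bits. This is my assumption and the manuscript's exact measure may differ; the function name and the histograms are hypothetical.

```python
import math


def kl_divergence_bits(posterior, prior):
    """Kullback-Leibler divergence D(posterior || prior) in bits for binned distributions."""
    if len(posterior) != len(prior):
        raise ValueError("distributions must have the same number of bins")
    if abs(sum(posterior) - 1.0) > 1e-9 or abs(sum(prior) - 1.0) > 1e-9:
        raise ValueError("distributions must each sum to 1")
    total = 0.0
    for p, q in zip(posterior, prior):
        if p > 0:
            if q <= 0:
                raise ValueError("posterior has mass where the prior has none")
            total += p * math.log2(p / q)
    return total


# Hypothetical uniform prior over 8 parameter bins.
prior = [1 / 8] * 8
# A posterior narrowed to two bins, and one narrowed to a single bin.
posterior_two_bins = [0.5, 0.5, 0, 0, 0, 0, 0, 0]
posterior_one_bin = [1.0, 0, 0, 0, 0, 0, 0, 0]

gain_two = kl_divergence_bits(posterior_two_bins, prior)
gain_one = kl_divergence_bits(posterior_one_bin, prior)
print(gain_two, gain_one)
```

Under this kind of measure, the gain depends entirely on how strongly the posterior narrows, and that in turn depends on whether the model structure lets the tracer signal reach the parameter at all.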
Line 424: Typo in the interval "ST ∈ [0,+∞[" (the closing "[").
Line 471: “The travel time and storage measures estimated from a joint use of 2H and 3H are the highest (tables 3 and 4).” This result is counter-intuitive. Why does the joint use give the highest travel time and storage, rather than something in between?
The 2010-2015 data that were used for the spin-up period need to be presented.