The manuscript by McGill et al. examines spatiotemporal variability of stream thermal sensitivities for two watersheds in Washington, USA. They collected water and air temperature data from 73 sites distributed across the Snoqualmie and Wenatchee basins. Most loggers ran for seven years. The data were used in statistical models to estimate thermal sensitivities (the slope coefficient of air temperature in a linear regression model). Seasonal models were applied, as well as time-varying coefficient models. Clustering analysis was performed to group sites that shared similar thermal sensitivities. These clusters were then used to explore how thermal sensitivity varied with climate and landscape variables. They argue that thermal sensitivity showed the strongest relationships with elevation, snow water equivalent, and variables representing groundwater influence. Some variables that were expected to be related to thermal sensitivity, such as percent riparian forest cover, did not vary in a systematic way. Some key conclusions made by the authors are: (1) it is essential to acknowledge the non-stationarity of the relationship between air and water temperature, (2) snow and geological characteristics shape the relationships between air and water temperatures at the study site, and (3) classifying rivers based on thermal sensitivity is a powerful tool when planning for global change.
This is my first review of this manuscript (I did not review the original submission). Overall, the manuscript is well written and covers a topic suitable for HESS. My general feeling is that there is a considerable amount of unexplained variability in the thermal sensitivity estimates. Reporting these sorts of noisy findings can be useful, but some of the key conclusions are stated more firmly than the results seem to support. I share a couple of key comments, followed by some specific feedback. I reviewed the first round of reviewer comments and replies, and I echo some of those comments; I feel that the current version could still be improved with regard to structure and the results/discussion.
1) Improving structure and flow of the manuscript
As the other reviewers pointed out, the presentation of the study would be much improved if structured around some key hypotheses. The authors replied that this is an exploratory study and that they want to avoid the use of null hypothesis significance testing. I am sympathetic to those concerns (I'm glad that a p-value is nowhere in sight), but the authors could focus on framing the study around scientific (distinct from statistical) hypotheses informed by the extensive literature on spatiotemporal variability of river temperature. Even exploratory studies will have hypotheses driving the direction of the exploration. The content for doing this is already more or less in the manuscript; however, the organization is challenging to follow. For example, some hypothesized drivers are listed in the hybrid Table 3, but this table isn't referenced until page 10. In addition, there is considerable geological context provided in the discussion that could be introduced earlier so that it doesn't feel so unexpected. I suggest using a paragraph or two at the end of the introduction to better frame the study and expected outcomes from the analyses.
2) Challenges in linking results to underlying controls
Much of the discussion tries to link the patterns in thermal sensitivities to underlying process controls. This is difficult since the study is exploratory, focuses on correlations, and uses statistical abstractions that can be a few steps removed from the actual observational data. For example, although the coefficient estimate associated with the air temperature term in the regression models gives you an idea of how water temperature co-varies with air temperature, you lose some information about the thermal regime at that site. Since there isn't a systematic relationship between air temperature and thermal sensitivity (Figure 3a), comparing thermal sensitivities can't tell you whether a particular stream is colder or warmer than another stream during the summer, for example. However, this can be important information for diagnosing key controls on stream thermal regimes (e.g., we might expect a stream that is colder in summer to have more groundwater influence). This challenge is further compounded in this study because the thermal sensitivities are then used within clustering analyses and regression trees. The authors are familiar with these sites and know which streams have colder vs warmer or stable vs dynamic thermal regimes (or what the dominant geology of each site is), but as a reader, I find it difficult to follow these connections. Understanding these connections is crucial for interpreting how the results support the conclusions of this study. I challenge the authors to rethink how this information is presented. I provide some suggestions below, but it might be helpful to show more of the actual stream temperature time series for the individual sites (maybe in the supporting information) and to refer back to those data when making interpretations.
Partly related to the above, in my own experience I have found that there can be considerable uncertainty in the estimate of the slope coefficient/thermal sensitivity. What kind of uncertainties were associated with these estimates for this study? Could these be added as uncertainty intervals to the figures?
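To illustrate the kind of interval I have in mind, here is a minimal sketch (not the authors' method) that attaches an approximate 95% interval to a single-site thermal sensitivity estimate from an ordinary least squares fit. The function name `slope_with_interval`, the synthetic data, and the use of a normal approximation are all my own assumptions for illustration. Daily temperature series are autocorrelated, so an interval computed this way would likely be too narrow for real data; a block bootstrap or a model with correlated errors may be more appropriate.

```python
import math
import random
from statistics import NormalDist


def slope_with_interval(air, water, level=0.95):
    """Return (slope, lower, upper) for an OLS fit of water on air temperature.

    The interval uses a normal approximation to the sampling distribution of
    the slope, which is reasonable only for fairly large samples with
    independent residuals (an assumption, see the caveat above).
    """
    n = len(air)
    mean_air = sum(air) / n
    mean_water = sum(water) / n
    sxx = sum((a - mean_air) ** 2 for a in air)
    sxy = sum((a - mean_air) * (w - mean_water) for a, w in zip(air, water))
    slope = sxy / sxx
    intercept = mean_water - slope * mean_air
    # Residual sum of squares and the standard error of the slope.
    rss = sum((w - (intercept + slope * a)) ** 2 for a, w in zip(air, water))
    se = math.sqrt(rss / (n - 2) / sxx)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return slope, slope - z * se, slope + z * se


# Hypothetical data: 200 daily means with a true thermal sensitivity of 0.5.
rng = random.Random(0)
air = [rng.uniform(0.0, 25.0) for _ in range(200)]
water = [2.0 + 0.5 * a + rng.gauss(0.0, 1.5) for a in air]

slope, lower, upper = slope_with_interval(air, water)
print(f"thermal sensitivity = {slope:.3f} ({lower:.3f}, {upper:.3f})")
```

Intervals like these could be drawn as error bars or shaded bands on the thermal sensitivity figures.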
Finally, there does not appear to be any assessment of the performance of the CART model. Many of the key conclusions rely on the results of this modelling; therefore, showing the overall performance of these models seems important. These CART models have a tendency to overfit and can be sensitive to individual data points. I recommend including some sort of evaluation of the models (e.g., leave-one-out cross validation).
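As a rough illustration of the evaluation I am suggesting, the sketch below runs leave-one-out cross validation on a one-split regression tree and compares its error against a mean-only baseline. The one-split tree is a deliberately simplified stand-in for a full CART model, and the function names, the synthetic elevation data, and the step-shaped response are all hypothetical; in practice the authors would wrap their own fitted model in the same loop.

```python
import random


def fit_stump(x, y):
    """Fit a one-split regression tree; returns (threshold, left_mean, right_mean).

    Assumes x contains at least two distinct values.
    """
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical predictor values
        left = [v for _, v in pairs[:i]]
        right = [v for _, v in pairs[i:]]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((v - left_mean) ** 2 for v in left)
               + sum((v - right_mean) ** 2 for v in right))
        if best is None or sse < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def predict_stump(model, xi):
    threshold, left_mean, right_mean = model
    return left_mean if xi <= threshold else right_mean


def fit_mean(x, y):
    """Baseline model that ignores the predictor."""
    return sum(y) / len(y)


def predict_mean(model, xi):
    return model


def loocv_rmse(x, y, fit, predict):
    """Refit the model with each site held out and score the held-out site."""
    squared_errors = []
    for i in range(len(x)):
        model = fit(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        squared_errors.append((y[i] - predict(model, x[i])) ** 2)
    return (sum(squared_errors) / len(squared_errors)) ** 0.5


# Hypothetical data: 73 sites, sensitivity drops above 800 m elevation.
rng = random.Random(1)
elevation = [rng.uniform(100.0, 1800.0) for _ in range(73)]
sensitivity = [(0.7 if e < 800.0 else 0.3) + rng.gauss(0.0, 0.05)
               for e in elevation]

tree_rmse = loocv_rmse(elevation, sensitivity, fit_stump, predict_stump)
baseline_rmse = loocv_rmse(elevation, sensitivity, fit_mean, predict_mean)
print(f"LOOCV RMSE: tree = {tree_rmse:.3f}, mean-only = {baseline_rmse:.3f}")
```

Reporting the cross-validated error next to a simple baseline would show readers how much explanatory power the tree actually adds beyond predicting the overall mean.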
Specific comments:
L54-56: I would perhaps qualify this as '... is often the most important...' since there are conditions when solar radiation is a secondary driver of river temperature (e.g., winter periods for well-shaded reaches - see Leach et al. (2023) and maybe references within for some examples).
L58-59: I'm not sure that Webb and Zhang (1999) or Mohseni and Stefan (1999) are the best references to support the statement that runoff composition and groundwater inflow are important influences on river temperature. The former focused on essentially point-scale heat budgets with an emphasis on energy exchanges at the air-water interface, and the latter looked at air-water temperature relationships. A better reference might be Cadbury et al. (2008).
L77: Typo.
L85-86: The second objective is awkwardly worded. It seems to state whether clusters of air-water temperature correlations differ from clusters based on air and water temperature. Before reading the rest of the manuscript, this objective seemed to me to be asking the same thing. Consider rephrasing for clarity.
L98-109: This paragraph would benefit from some specifics. For example, provide mean January and July air temperatures and give some idea of precipitation amounts. 'Wenatchee receives a greater proportion of winter precipitation as snow' - how much greater? Figure 2 provides some context, but include some summary statistics within the text, as readers unfamiliar with this region will have little context for these general statements.
L116: Was air temperature also logged hourly?
L164: This question may not make sense, as I'm not familiar with TVCMs: What window size (in days) corresponds with a bandwidth of 0.2?
L220-221: How were clusters with mean Jaccard coefficients between 0.5 and 0.75 treated?
L240-242: I see you include these long-term air temperature and precipitation values here. As I noted above, I suggest moving some of these long-term values up to the study site description. Also, what do you mean by 'long-term'? Figure S1 seems to suggest 1901-2000, but this is not clear. Also, are these DayMet output? Weather station data (if so, which stations)? Please clarify where these values come from.
L257: I thought the data only focused on total SWE, but this statement suggests a relationship with 'snowmelt events'. It's not clear when and how the analysis focused on snowmelt events. Or are the authors assuming a single snowmelt event occurring in the spring? Is this reasonable to assume? My guess is that, given the region, these watersheds are located within a transient snow zone and snowpacks can form and melt multiple times per winter, but maybe that's an incorrect assumption?
L254-262: I was waiting to see if there was any explanation of how these landscape variables were calculated/estimated. There is a reference to Hill et al. 2016 in Table 1, but that citation is not in the reference list. I would guess that mean slope and elevation were derived from a DEM, but I have no idea where a hydraulic conductivity estimate would come from. Is this estimate for the channel bed? Surficial geology of the upslope area?
L282-301: Can the number of sites within each cluster be included in the text? It's done for a few clusters, but including all of them would limit the need to reference back to the table.
L302: What is meant by 'hydrogeology' here?
L317-319: I don't understand this statement, especially '... reflect aspects of river dynamics not redundant with water and air temperature.' But aren't air temperature and climate related? Also, do the results of this study support this statement? It seems like most of the landscape variables (I assume some of these are what the authors mean by 'geology') have very weak correlations with thermal sensitivity. Even the CART analysis seems to suggest minimal explanatory power of these variables.
L335: Perhaps include Kelleher et al. (2021) here. Although focused on river temperature trends, not thermal sensitivity, they make a similar key point that seasonal trends can differ from annual or just summer patterns.
L356-357: How are the processes controlling river temperatures 'more diverse' in spring/summer than in fall/winter? I would argue all the same energy exchange processes are occurring (radiative and turbulent exchanges, advection, etc.), it is just the relative magnitudes that differ seasonally.
L372: This is the first mention of glacial influence in these watersheds. How much glacial coverage is there? Which sites had upstream glaciers? Why wasn't glacial coverage included as a landscape variable?
L387: This seems to be the first time that 'geologic controls' is clarified to mean baseflow index, hydraulic conductivity and soil depth. Although this may seem obvious to some readers, I think this should be clearly stated earlier in the manuscript. Baseflow index can be influenced by factors other than groundwater (e.g., persistent, high-elevation snowpacks, glaciers, or flow regulation - especially downstream of a dam/lake, which seems to be the case for some of these sites). In addition, there are no details on where these hydraulic conductivity and soil depth estimates come from and what they represent.
L393: Are 'groundwater metrics' clearly important? Some of the variables that could be associated with groundwater influence often have relative variable importance values of less than 10% - that doesn't seem very important to me. Also, there is no performance evaluation of the CART model.
L401-403: It is difficult to follow the logic here. The authors highlight that the relationships between thermal sensitivities and groundwater metrics were mixed (and in some cases they were counter-intuitive). They note uncertainty in using these metrics to capture groundwater influence, especially in mountain headwater streams. They then conclude that thermal sensitivity is a promising indicator of groundwater influence. I don't see how the results of this study support this statement.
L410-411: Looking at Figure 6, I can't tell that soil depth, hydraulic conductivity and baseflow index are high in streams that overlay the lower portion of the watershed. Can these be shown more clearly and convincingly?
L404-432: A lot of geological context is suddenly provided in this section. Have the authors considered putting some of this context within the study area description? Also, are there maps to show where the measurement sites are relative to these geological features?
L471: Did the authors explore the sites that were located downstream of reservoirs and lakes? Could that explain some of the spatial variability observed in this study? A number of studies have highlighted that reservoirs and lakes can have a strong influence on downstream thermal regimes.
Figure 1: Why is there a dashed line for thermal sensitivity = 0.5?
Figure 2: Where were the SWE and precipitation data collected? How representative are these values for the entire watersheds?
Figure 3: Please label the subplots with (A), (B), and (C), as indicated in the caption. Also, it would be interesting to see thermal sensitivity plotted against mean summer stream temperature.
Figures 4 and 5: Can the number of sites within each cluster be shown on these figures (e.g., by changing the facet labels to 'Cluster 1 (n = XX)')?
Table 1: What is the Hill et al. 2016 data source? It is not listed in the reference list.
Table 2: How were the data grouped to compute these metrics? This is not clear to me. Are these simply summaries of daily mean air and water temperatures grouped by site, season and year? Or are these the means of the inter-annual thermal sensitivities estimated by the time-varying coefficient models?
Figure S1: Please show precipitation anomaly in SI units.
References
Cadbury, S. L., Hannah, D. M., Milner, A. M., Pearson, C. P., & Brown, L. E. (2008). Stream temperature dynamics within a New Zealand glacierized river basin. River Research and Applications, 24(1), 68-89.
Kelleher, C. A., Golden, H. E., & Archfield, S. A. (2021). Monthly river temperature trends across the US confound annual changes. Environmental Research Letters, 16(10), 104006.
Leach, J. A., Kelleher, C., Kurylyk, B. L., Moore, R. D., & Neilson, B. T. (2023). A primer on stream temperature processes. Wiley Interdisciplinary Reviews: Water, e1643.