Reply on RC1

RC: Lane et al. present an analysis of high flow metrics for 346 catchments in a future climate, considering 30 different parameter fields and 12 different RCMs. Although there is really nothing substantially wrong with the approach, especially when the goal is to support policy making, I think scientifically there are some missed opportunities which makes that the scientific gain of this study is currently limited.

Thank you for your helpful feedback and for taking the time to write a review. Whilst we agree the results of the paper will be useful to support policy making, it is primarily a science-based paper. As such, the paper explores the complex issues of uncertainty evaluation within national catchment responses under climate change. It highlights the value in considering hydrological model parameter uncertainties (found to be especially important for catchments in southeast England), and investigates how predicted responses to climatic change differ across regional and national scales and depend on catchment characteristics and geoclimatic regimes.

Response:
We do not agree with this assertion. Ultimately all climate impact modelling papers will have made different methodological choices and simplifications. Here, we focused on analysing uncertain climate impacts at the national scale, which is already computationally demanding within a high resolution (1.2 km² average HRU scale) distributed model. Additionally, we evaluate the model parameter uncertainties and impacts of these cascades on streamflow changes. This goes beyond previous national studies for Britain, which have not considered hydrological parameter uncertainties over such a large number of catchments.
We respond to comments on each choice separately below. In response to these comments we will add to the methods and data section to ensure that it is clear why our methodological choices were made.
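As an illustration of what comparing RCM and hydrological parameter uncertainty can look like in practice, the following is a minimal sketch and not the analysis used in the manuscript. It assumes a hypothetical table of percentage changes in a high-flow metric for one catchment, with one row per RCM member and one column per parameter set. The function name `ensemble_spread` and the toy values are illustrative assumptions only.

```python
from statistics import median


def ensemble_spread(change):
    """Summarise spread in a table of flow changes for one catchment.

    change: list of rows, one row per RCM member, one column per
    hydrological parameter set (hypothetical layout, values in %).
    """
    # Median over parameter sets for each RCM member, then the range of
    # those medians: a simple indicator of RCM-driven spread.
    rcm_medians = [median(row) for row in change]
    # Median over RCM members for each parameter set, then the range of
    # those medians: a simple indicator of parameter-driven spread.
    param_medians = [median(col) for col in zip(*change)]
    all_values = [value for row in change for value in row]
    return {
        "rcm_range": max(rcm_medians) - min(rcm_medians),
        "param_range": max(param_medians) - min(param_medians),
        "total_range": max(all_values) - min(all_values),
    }


# Toy table: 2 RCM members x 2 parameter sets (made-up numbers).
toy = [[0.0, 2.0], [10.0, 12.0]]
result = ensemble_spread(toy)
print(result)
```

In this toy table the spread across RCM members is larger than the spread across parameter sets; with real output the balance could differ by catchment, which is the kind of contrast the response refers to for southeast England.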

RC: All RCMs are based on one single GCM. Why only one, and why this one? It reads as if this is not necessarily the GCM that performs best in the region of GB. Only RCP8.5 is considered, why only this one and what would that imply for the results?
Response: The aim of this study was to explore hydrological model parameter uncertainties within a national climate impact study. We selected the UKCP18 climate projections to help us meet this aim as they have many advantages over other products, including 1) they were the nationally recognised highest resolution RCM climate model outputs available for a continuous time period over GB, 2) they were specifically developed for the UK and previous UKCP products have formed the basis of UK climate policy (Murphy et al. 2018), 3) they include a measure of climate uncertainty through the use of an RCM ensemble, 4) as RCM projections they are high resolution (12 km) and have full spatial and temporal coherence which is needed to evaluate future climate change impacts on high flows in a spatially distributed hydrological model, 5) they are the newest national climate projections for GB, including the latest developments in climate modelling capability and scientific understanding, and have not yet been comprehensively analysed in other studies.
The UKCP18 projections only included RCM simulations for a single GCM, but still explored some climate uncertainties through the use of an RCM ensemble. This approach was also used for the UKCP09 climate projections which have been used in many UK climate impact studies (e.g. Prudhomme et al. 2013, Bell et al. 2016, Kay et al. 2018). The RCM ensemble was considered sufficient for our aim of assessing the hydrological model uncertainties within a national climate impact study. Importantly, we also found that minor differences between the RCM runs resulted in a huge variation of hydrological implications, showing that the RCM parameterisations which may be expected to be less influential were crucial after all. We are aware that the use of a different GCM would produce differing results, and this limitation is recognised in our discussion (lines 445-452). In response to all reviewers commenting on the use of a single GCM, we will clarify why the UKCP18 product was chosen in section 2.3. We will also make the limitations of this clearer in the discussion, adding that other GCMs may result in different precipitation trends into the future.
UKCP18 also only included RCM projections for the RCP8.5 scenario. We considered this the most important scenario to look at for two reasons: 1) it shows the 'worst case' and so will most likely show the largest expected changes, 2) the emissions in RCP8.5 are in close agreement with historical total cumulative CO2 emissions and increasingly look like a plausible future (Schwalm et al. 2020). But again we recognise that our results would have been different if an alternative scenario had been used, and we acknowledge that it is best to use multiple scenarios if the information is available. In response to reviewer comments, we will 1) add a sentence to section 2.3 emphasizing that RCP8.5 was the only available scenario and gives a 'worst case' and at this time 'most likely' future projection, 2) expand on the discussion of missing uncertainty sources to include emissions scenario and the impact this might have.
We previously developed a modelling framework that uses spatial parameterisations which are nationally consistent and reflect core parametric uncertainties, constrained by available data using MPR (Lane et al. 2021). Our next step was to apply the model setup from Lane et al. (2021) to evaluate the hydrological modelling uncertainties when driving the model with a set of climate change scenarios. This manuscript presents the first GB-wide study to include hydrological model parameter uncertainties alongside RCM uncertainties, and further explores the relationship between catchment characteristics, climatic changes and changing high flows across a large sample of catchments. For this climate impact study, it was important to use the DECIPHeR model structure based on topographic flow gradients as a primary metric to define hydrological similarity as it has already been evaluated and parameterised across Great Britain (Coxon et al. 2019; Lane et al. 2021). We would like to note that DECIPHeR is not TOPMODEL; the two are alike only in that topography is an important driver of hydrological response.
Whilst testing model structural variability was not an aim of this study, we agree with you that DECIPHeR gives a great opportunity to test different model structures and this is an ongoing research goal. Adding further structures for different geologies, general climate variations, land management practices and human impacts will take time to develop from our findings of Lane et al. (2021). However, that does not detract from the importance of, and need for, quantifying uncertainties in hydrological impacts with our recent national model simulations.

RC: Why were different datasets used for P and PET for the bias correction? Both datasets seem to have both variables.
Response: The CHESS dataset uses CEH-GEAR as its rainfall data; we refer to the original dataset as is standard practice and to make clear the source of the data. The CEH-GEAR rainfall dataset does not include PET.

RC: Why was snow not included as process in the hydrological model? I was surprised that T was not required as input for the hydrological model (indeed, to simulate snow processes), to only find at the end of the discussion that snow was not accounted for, with a reference to eastern Scotland where the snow fraction can be up to 0.17. There might be valid reasons for each of these choices, but they can be better explained.
Response: Snow was not included as a process in the hydrological model because it affects so few catchments (95% of the catchments included in this study have less than 6% of precipitation falling as snow). The exception to this is in the Cairngorm mountains in Scotland where the fraction of precipitation falling as snow can reach 17% (Coxon et al. 2020). Consequently, we included this in the discussion so that readers were fully aware of this limitation. We will ensure that modelling decisions are better explained in the revised manuscript.

RC: At several places, it is mentioned that caution should be taken when interpreting the results (related to snow, see later point, and to potential groundwater flow). Not even widely discussed are the catchments with reservoirs/regulated flow (they are only mentioned as being excluded for the analysis of the evaluation of the model chain, but how many of the 346 are (heavily) regulated? are they spatially clustered? and how can we somehow validate the simulation of these catchments if the regulation is not included in the model structure?). Taken into account these three factors (snow, groundwater, regulation) that require caution in the interpretation, it becomes a bit difficult to determine which numbers have meaning, and which don't. Given that uncertainty estimation is one aspect of this study, it are precisely these kind of aspects that might need more attention.

Response:
Of the 346 catchments analysed, the majority (60%) have no reservoirs in the catchment. 71 gauges (20%) have 1-5 reservoirs upstream, and 20% of gauges have more than 5 reservoirs upstream. While 40% of gauges do have a reservoir in the catchment, the capacity of the reservoirs is an additional important indicator of their impact on the flow time series, as many of these reservoirs have a small capacity relative to the average precipitation and flow at the gauge. Of the 346 gauges, only 20 (5%) had a reservoir capacity greater than 10% of mean annual rainfall.
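For readers who want to reproduce this kind of screening, the capacity comparison can be sketched as below. This is a minimal illustration under stated assumptions, not the calculation used for the figures above: we assume the ratio is total upstream reservoir capacity divided by the mean annual rainfall volume over the catchment, and the function names and example values are hypothetical.

```python
def capacity_ratio(capacity_m3, mean_annual_rainfall_mm, area_km2):
    """Upstream reservoir capacity as a fraction of mean annual rainfall volume.

    Assumed definition (not confirmed by the source):
    capacity / (rainfall depth x catchment area).
    """
    # 1 mm of rainfall over 1 km^2 equals 0.001 m x 1e6 m^2 = 1000 m^3.
    rainfall_volume_m3 = mean_annual_rainfall_mm * area_km2 * 1000.0
    return capacity_m3 / rainfall_volume_m3


def share_above(ratios, threshold=0.10):
    """Fraction of gauges whose capacity ratio exceeds the threshold."""
    return sum(r > threshold for r in ratios) / len(ratios)


# Made-up example: 5 million m^3 of storage in a 100 km^2 catchment
# receiving 1000 mm of rainfall per year.
ratio = capacity_ratio(5_000_000, 1000.0, 100.0)
print(ratio)
```

Under this assumed definition, the made-up catchment falls below the 10% threshold mentioned above; applying `share_above` to the ratios for all gauges would give the fraction exceeding it.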
We excluded catchments with reservoirs/regulated flows from the evaluation of the model chain because we would not expect the model to accurately simulate these flows without reservoir information in the model structure. However, in response to this comment we explored how the model error in simulating AMAX, Q1, Q10 and Q50 varied between the catchments with regulated flows/reservoirs and those without. We found that there was not a reduction in performance between the catchments with regulated flows/reservoirs and those without, and we will add these extra plots to the supplementary information.
To respond to this comment, we will include information about reservoirs in the supplementary information.

RC: Besides that many choices are not well substantiated, I think there are some missed opportunities in analyzing and presenting the effect of uncertainty (in this case, introduced by RCMs and parameters). For instance in Figure 7, one could expect that the parameters have influence on the non-linear relation between precip and Q1, while it is precisely the median of the parameter sets that is displayed here (same for the runoff coefficient). Therefore, after reading the manuscript, I still don't have the feeling I fully comprehend the uncertainty in the projections and their implications for the results.
Response: Again, we do not agree that 'many' of our choices are not well substantiated. When creating Figure 7 we explored using different hydrological parameter sets, and ended up selecting the median as the choice of parameter set made little difference to the overall picture. We will add plots to the supplementary information to demonstrate how this plot changes for different hydrological model parameter set selections, as well as to demonstrate how this plot differs for other precipitation quantiles and flow statistics.

Response: Thank you for highlighting these papers; we will add references into the discussion to further support section 4.3. We note that two of these studies are focused on a single multi-nested river basin, albeit of considerable size, and that Melsen et al. focused purely on the sign of change and did not use any form of weightings to define the quality and fit of the ensembles of model simulations. We shall discuss these issues in our revised manuscript.

RC: Last minor thing: In the discussion (l. 430) the selection of a metric is referred to as a source of uncertainty. I'm not sure I entirely agree with that. Different metrics will lead to different results, simply because they evaluate different things. That is, in my opinion, not uncertainty but simply the result of a (hopefully deliberate) choice. It does show, however, that it can be useful to evaluate multiple metrics.

RC: In spite of what is written in the introduction
Response: We agree with you on this. We will alter the text from "The selection of metrics used to explore climate impacts was a further source of uncertainty; the picture of climate change impacts on flows differed between the four metrics presented here." to "The overall picture of climate change impact on flows differed between the four selected metrics, showing the importance of metric selection and consideration of multiple metrics in model evaluation and impact studies."

RC: l. 244; average flow is not necessarily equal to median flow.
Response: We agree and will change 'average' to 'median.'

RC: If this review sounds a bit harsh, it is because I know the authors can do better. Most of the material is already there, therefore I am confident that the authors will be able to improve the manuscript such that it will add more to the scientific literature.