Comment on hess-2021-174

Bondy et al. investigate the role of soil storage capacity and field capacity in the long-term water balance of several catchments within the Budyko framework. They use a simple lumped hydrological model to explore how two parameters representing soil storage capacity and field capacity, respectively, affect the partitioning of precipitation into evapotranspiration and streamflow. The authors find that storage capacity is a strong control on modelled evapotranspiration, and to a lesser extent field capacity. They then discuss how these findings might relate to hydro-pedological processes and catchment coevolution.

The paper is well written and within the scope of HESS. Understanding second-order controls on the long-term water balance is an important topic and the general idea of the paper is very interesting. I think, however, that the study requires major revisions before it can be published. Generally, I think it lacks some depth and some of the arguments are not sufficiently backed up by either the results or references. I will outline my major and minor comments below.

Major comments:
As you discuss briefly in Sect. 4.1.2., there are (significant) interactions between the model parameters. As the model is fairly simple, but most (all?) conclusions are based on varying the model parameters, I think it would be good to spend some more time on exploring parameter interactions (e.g. by performing a more thorough sensitivity analysis), not just of the two main parameters, but of all four parameters. How important are the other two parameters for the water balance of the model? How realistic is it to fix three parameters and vary the fourth with respect to real systems (in which potentially only certain combinations of parameters occur)? This is particularly important for your discussion of co-evolution later in the manuscript: is a model with three fixed parameters and one varied parameter really representing a meaningful evolutionary trajectory?
I would also be more cautious when it comes to interpreting the parameter values. You look at a simple lumped model, so I wonder if there's a way to be more convincing about what these parameters actually mean. There is some discussion about their range being reasonable, but I think that's all a bit too general, and does not relate explicitly to any of the catchments you studied here. You could use some catchment attributes (e.g. soil texture) to see whether the optimised parameters match what you would expect from independent data or expert knowledge (e.g. are the calibrated parameter values different for catchments with sandy/silty/clayey soils?).
You justify using a (nowadays) rather small sample by saying that you want to focus on 16 distinctly different catchments. But then there is very little discussion or information about the catchments you study, except for the climatic attributes shown in Fig. 1. I miss a more detailed description of the catchments studied, in particular as a way to back up the model results with some real-world system characteristics. If the catchments are selected based primarily on (widely available) climatic attributes, then I wonder why you didn't use a larger sample of catchments to obtain more robust conclusions.
I am not convinced by the idea of co-evolution, at least how it is presented here. I think this section needs much more supporting evidence than a relatively simple modelling experiment (which mostly focuses on one variable parameter). I think a more in-depth discussion about the catchments you study would help in that respect (see also my comment above). You cite a few papers, but Hartmann et al. (2020), for instance, look at pro-glacial moraines, and you use it to back up the very general statement that "Both porosity and the fraction of silt and clay increase with time".
I am also wondering over what time scales such a co-evolution might happen. This is something many discussions about co-evolution do not mention explicitly. Many of the catchments you study are in areas that are perhaps not directly impacted by water abstractions etc. right now, but they've been impacted by humans for a long time. For example, the catchments in the south east of BW are in an area with a lot of agriculture, and also the Black Forest has been heavily impacted/managed in many areas. How does such a long history of agriculture or forestry affect co-evolution?

Minor comments:
There are a few typos, which I won't list individually, but I would suggest rereading the manuscript carefully.
L.9: I would write "Budyko framework" instead of "Budyko curve".
L.10: I suggest writing validated instead of verified (also in L.36).
L.10: "wide range of climates and settings": what kind of settings?
L.17: I suggest defining evaporation ratio (as ET/P) here instead of introducing the acronym.
L.33: Perhaps a bit picky, but stating that Budyko "observed a considerable degree of clustering around the Budyko curve" sounds a bit odd, as he probably didn't think of it as the "Budyko curve".
L.37 and following: I like the idea of presenting how you came up with the project by looking at Peruvian catchments, but in the end you only use one catchment in Peru. Weren't there more Peruvian catchments available?
Table A-1: It would be interesting to also know the names of the rivers/gauging stations.
Section 2.4: What exactly is the special similarity to HBV? The model just seems to be one (of many) bucket-type models.
L.223: Why did you not use daily values to calculate KGE?
Why did you not choose an independent evaluation period to ensure the model isn't overfitted (e.g. a 15y-15y split)?
You consider a water balance error of < 15% to be small; that does not sound that small to me, especially given that you study the long-term water balance.
It seems like in the arid catchments, you systematically overestimate the water balance by approx. 10%. Is that just more pronounced because streamflow is lower here and hence a similar absolute error appears bigger? What if you used relative error in ET instead of Q, since you mostly focus on ET?
Table 2: The upper limit for Smax here is 800 mm, but later you mention that you vary it from 1 to 2000 mm.
L.292: "a clear clustering at 5-15%": to me this doesn't look like a clear clustering, as half of the points are above 20% (and two are not shown).
Figure 5 (right): I would suggest writing the definition of the normalised storage volume (e.g. Smax/P) on the y-axis. Also consider using (a), (b), etc. as labels for subplots.
L.368 and 400: write "parameter values" instead of "parameterisation".
L.440: "The lower the number of rainy days…" That's only true if the total rainfall depth is the same. If there are 50 rainy days with a total of 500 mm, and 100 days with a total of 1000 mm, the depth per event would be the same. It would be useful to perform a quick check, as it seems like total rainfall is correlated with the number of rainy days.
L.573: I am not sure what you mean here. What's the alternative approach? A modelling experiment with artificial data?
L.576: "tangible catchment properties"… that's an example of a statement I find too general, an issue I already raised in my major comments. Yes, you discuss the ranges of the parameter values, but you're looking at calibrated parameter values from a lumped model without any in-depth catchment analysis. I don't think that's necessarily a problem, but I think you should discuss more clearly what your study can tell us and what not.
Lastly, one general comment about the role of storage: what about storage in (deeper) layers and the role of groundwater? While some plants might have access to groundwater, catchments underlain by permeable aquifers might also "lose" water, which might limit evapotranspiration.
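
To make two of the arithmetic points in the minor comments concrete (the relative error in Q versus ET, and the rainy-day argument at L.440), here is a minimal Python sketch. The rainy-day numbers are the ones from my comment. The water balance values (P, Q_obs, Q_sim) are purely hypothetical and not taken from the manuscript, and the sketch assumes that long-term ET can be approximated as P - Q with negligible storage change and that each rainy day counts as one event.

```python
def mean_depth_per_rainy_day(total_rain_mm, rainy_days):
    """Mean rainfall depth per rainy day (mm/day), treating each rainy day as one event."""
    return total_rain_mm / rainy_days


def relative_error(simulated, observed):
    """Signed relative error of a simulated value with respect to the observed value."""
    return (simulated - observed) / observed


# L.440: fewer rainy days do not imply larger events if total rainfall scales too.
depth_a = mean_depth_per_rainy_day(500.0, 50)    # 50 rainy days, 500 mm in total
depth_b = mean_depth_per_rainy_day(1000.0, 100)  # 100 rainy days, 1000 mm in total

# Hypothetical long-term water balance of an arid catchment (mm/yr).
# These values are illustrative assumptions, not data from the manuscript.
P = 600.0       # precipitation
Q_obs = 60.0    # observed streamflow
Q_sim = 66.0    # simulated streamflow (6 mm/yr too high)

# Assumption: long-term ET = P - Q (storage change neglected).
ET_obs = P - Q_obs
ET_sim = P - Q_sim

rel_err_Q = relative_error(Q_sim, Q_obs)     # same absolute error, small denominator
rel_err_ET = relative_error(ET_sim, ET_obs)  # same absolute error, large denominator

print(f"depth per rainy day: {depth_a:.1f} mm vs {depth_b:.1f} mm")
print(f"relative error in Q:  {rel_err_Q:+.1%}")
print(f"relative error in ET: {rel_err_ET:+.1%}")
```

With these assumed numbers, the same absolute error of 6 mm/yr is a much larger fraction of Q than of ET, which is why reporting the error relative to ET might give a different picture for the arid catchments.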