Exploring the role of soil storage capacity for explaining deviations from the Budyko curve using a simple water balance model

The Budyko curve is a widely used framework for predicting the steady-state water balance based solely on the hydro-climatic setting of river basins. While this framework has been tested and verified across a wide range of climates and settings around the globe, numerous catchments have been reported to deviate considerably from the predicted behavior. Here, we hypothesize that storage capacity and field capacity of the root zone are important controls of the water limitation of evapotranspiration and thus of deviations of the mean annual water balance from the Budyko curve. To test our hypothesis, we selected 16 catchments with different climatic settings, varied the corresponding parameters of a simple water balance model that was previously calibrated against long-term data, and investigated the resulting variations of the simulated water balance in the Budyko space. We found that total soil storage capacity, by controlling water availability and limitation of evapotranspiration, explains deviations of the evaporation ratio (EVR) from the Budyko curve. Similarly, but to a lesser extent, the evaporation ratio showed sensitivity to alterations of the field capacity. In most cases, the parameter variations generated evaporation ratios enveloping the Budyko curve. The distinct soil storage volumes that matched the Budyko curve clustered at a normalized storage capacity equivalent to 5-15 % of mean annual precipitation. The second, capillarity-related soil parameter clustered at around 0.6-0.8, which is in line with its hydropedological interpretation. A simultaneous variation of both parameters provided additional insights into their interrelation and their joint control on offsets from the Budyko curve. Here we found three different sensitivity patterns, and we conclude the study with a reflection relating these offsets to the concept of catchment coevolution.
The results of this study could also be useful to facilitate evaluation of the water balance in data-scarce regions, as they help constrain parameterizations for hydrological models a priori using the Budyko curve as a predictor.


Introduction
Reliable a priori estimates of the catchment water balance based on minimum data requirements still represent the holy grail for many hydrologists (Sivapalan, 2003). Budyko (1974) postulated a framework to address exactly this issue, based on a top-down estimate of the steady-state energy and water balance of hydrological systems. By relating the normalized actual evaporative release of water vapor to the atmosphere to the corresponding normalized atmospheric demand, using rainfall supply for normalization, he observed a considerable degree of clustering around the Budyko curve. Ever since, the Budyko framework has been successfully used for catchment classification studies at continental scale (Berghuijs et al., 2014; Wagener et al., 2007), for reducing equifinality in conceptual models by constraining the catchment water balance (Gharari et al., 2014), or for verifying uncalibrated predictions of the catchment water balance using thermodynamic optimality approaches (Porada et al., 2011). Due to this widely reported success and its theoretical underpinning (Wang et al., 2015; Westhoff et al., 2016), it appeared straightforward to us to use the Budyko framework to constrain the mean water balance of data-scarce Peruvian catchments (subcatchments of the Chillón and Lurín rivers), which contribute to the fresh water supply of the city of Lima.
However, upon comparing the water balance of a gauged catchment in the region with the Budyko curve, we noticed a considerable offset. The question thus arose whether the deviation relates to poor data quality or whether it can be explained by physiographic catchment characteristics. While both climatic and physiographic factors control the steady-state water balance, evaporation itself is commonly conceptualized as either energy or water limited. Water limitation of evaporation, however, strongly relates to root zone storage supply and thus root zone storage capacity, because evaporation is two or three orders of magnitude slower than surface runoff. Root zone storage capacity determines the amount of plant-available water and can be characterized by its total storage volume as well as capillarity-related properties like the storage at field capacity held against gravity. While free soil water above field capacity feeds groundwater and ultimately streamflow, the water content between field capacity and wilting point (effective field capacity) sustains evaporation. These catchment properties controlling root zone storage and recharge capacities are co-evolutionary fingerprints of the climate and the geological setting (Gentine et al., 2012; Troch et al., 2015).
https://doi.org/10.5194/hess-2021-174 Preprint. Discussion started: 30 March 2021 © Author(s) 2021. CC BY 4.0 License.
Budyko-type models have long been used for studies providing empirical relationships for first-order controls of the climate (dryness) on the mean annual water balance (e.g., Ol'dekop, 1911; Schreiber, 1904). It was, however, the curve proposed by Budyko (1974), based on the analysis of over one thousand European catchments, that gained widespread attention and was used in numerous studies. Its underlying empirical equation corresponds to the geometric mean of the equations proposed in the two aforementioned studies by Ol'dekop and Schreiber.
The Budyko curve describes the mean annual evaporation ratio (EVR = ETa/P) as a function of the climatic dryness index (ϕ = ETp/P), where ETa and ETp are the mean annual actual and potential evapotranspiration and P is the mean annual precipitation (all values represent long-term annual averages). The Budyko framework assumes that the macro-climate expressed by the dryness index can be used to classify and distinguish climates and biomes, and that it is a dominant control on the steady-state partitioning of annual rainfall into runoff and evaporation. The Budyko framework uses a steady-state supply-demand concept, in which the two-dimensional Budyko space is bounded by the water and energy limits that represent the physical boundaries for the mean evaporation flux. The framework was developed for long timescales and large spatial scales where the macro-climate dominates and subscale variability averages out. While the Budyko curve explains a significant degree of the between-catchment variance of mean water partitioning, the scatter around the curve, with more or less considerable offsets, leads to the question of how these deviations can be explained and whether they can be generalized in an extended similarity framework.
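The relationship described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the study: it assumes the commonly quoted forms of the Schreiber equation, 1 − exp(−ϕ), and the Ol'dekop equation, ϕ tanh(1/ϕ), and takes their geometric mean as stated in the introduction. The function names are hypothetical.

```python
import math


def budyko_evr(phi: float) -> float:
    """Evaporation ratio ETa/P on the original (non-parametric) Budyko curve.

    phi is the dryness index ETp/P. The curve is taken as the geometric
    mean of the Schreiber and Ol'dekop equations (assumed standard forms).
    """
    if phi <= 0:
        raise ValueError("dryness index must be positive")
    schreiber = 1.0 - math.exp(-phi)       # Schreiber (1904)
    oldekop = phi * math.tanh(1.0 / phi)   # Ol'dekop (1911)
    return math.sqrt(schreiber * oldekop)


def evr_upper_limit(phi: float) -> float:
    """Upper bound of the Budyko space: energy limit (EVR <= phi) for
    phi < 1 and water limit (EVR <= 1) otherwise."""
    return min(phi, 1.0)


if __name__ == "__main__":
    for phi in (0.3, 0.5, 1.0, 1.5, 2.0):
        print(f"phi={phi:.1f}  EVR={budyko_evr(phi):.3f}  limit={evr_upper_limit(phi):.2f}")
```

For any positive ϕ the curve stays below both limits, which is why offsets in the positive EVR direction are bounded while negative offsets are not.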
The offsets from the Budyko curve, occurring in both directions along the EVR axis, represent the combined effects of second-order controls on the mean water balance. There is broad agreement that second-order controls and potentially resulting offsets from the Budyko curve are caused by both subscale climate variability and physiographic characteristics of the catchment. For instance, Milly (1993) and Milly (1994) explored the influence of soil water storage on the annual average water balance, using a 1D vertical soil water balance model with a stochastic meteorological forcing. While Milly's approach was simplified with respect to variability of the forcing, it nevertheless explained 85 % of the variance in water balances in the contiguous USA east of the Rocky Mountains. Milly identified the dryness index, the ratio of plant-available water holding capacity to annual average precipitation, and the number of precipitation events per year as main controls.
Another widely applied approach relies on parametric versions of a Budyko-type model, introducing a catchment-specific parameter (Choudhury, 1999; Fu, 1981). Numerous studies use these parametrized equations, relating parameter values to physiographic catchment characteristics by fitting the model to observation data (e.g., Abatzoglou and Ficklin, 2017; Bai et al., 2020; Li et al., 2018). Donohue et al. (2007) used the catchment-specific parameter to incorporate vegetation information (e.g., leaf area, photosynthetic capacity and rooting depth) in the Budyko framework in order to explain and correct offsets from the original curve. Roderick and Farquhar (2011) used the parametrized equation to investigate changes in climate conditions and catchment characteristics. Physiographic catchment characteristics that affect the water balance are nevertheless manifold and interrelated, and it thus appears inappropriate to represent them by a single parameter. Reaver et al.
(2020)  There is thus a need to better understand second-order controls on the long-term water balance and their relationship to the original, non-parametric Budyko curve.
Our main objective is to explore the role of specific soil characteristics in the steady-state water balance, and we hypothesize that root zone storage is an important physiographic control of offsets from the Budyko curve. Instead of using a parameterized version of the Budyko framework based on a lumped parameter, we propose a modeling approach to target specific model parameters that are more relatable to physiographic characteristics of a catchment. To that end, we selected 16 catchments covering a wide range of climate and landscape settings and calibrated a simple hydrological model (HBV-type) on the long-term water balance (30 years). Specifically, we used the calibrated models to investigate how variations in total soil storage and capillary storage fraction affect offsets from the Budyko curve, and to look for similarities in terms of the storage configurations that match the Budyko curve.

2 Methods, data and model

Selection of study catchments
In order to represent a broad range of climate settings, we based our study on several publicly available datasets from around the globe. Our choice was also conditioned by the type of available data (precipitation, potential evapotranspiration and streamflow), a minimum time series length of 30 years and the degree of preprocessing (especially spatial aggregation to catchment area) to allow for a multi-catchment approach. We finally selected 16 study catchments from the three datasets listed in Table 1, according to the following criteria:
-A wide range of climatic dryness indices: In order to integrate catchments covering a large climatic gradient, we selected catchments spreading over a dryness index range from ϕ = 0.3 to ϕ = 2. For more extreme dryness values, such as in desert regions or in extremely humid or cold regions (e.g., polar regions), rainfall partitioning into runoff and evaporation is not expected to relate to soil water storage characteristics. In the case of the MOPEX dataset, where several catchments at similar or identical dryness indices are available, we picked a random subset of catchments.
-Catchment area: We selected lower mesoscale catchments ranging from around 50 to 1000 km². Larger catchments potentially contain climate gradients and need to be represented by more complex distributed models. This hinders identification of clear causal relations.
-Minimum anthropogenic influence: In line with the Budyko framework, which was developed on the basis of pristine catchments, we excluded catchments with significant anthropogenic disturbance from the study. According to the MOPEX dataset documentation, its catchments are subject to little anthropogenic disturbance. The BaWue catchments were drawn from a preselection from which catchments with anthropogenic influences in the form of extractions or inlets were excluded. The selected headwater catchment in the Peruvian Andes is sparsely populated due to its elevation and only has a few smaller reservoirs that are not expected to alter the annual catchment water balance significantly.
-A closed water balance: We preferred catchments with a closed long-term water balance (within 5 % error), because this is a precondition for applying the Budyko framework and it facilitates water balance modeling.
-No significant snow/ice dynamics: We did not select catchments with significant snow and ice storage, to ensure that water limitation is mainly controlled by storage in the root zone.

Data and preprocessing
The Budyko framework was derived empirically and is applicable at steady-state, climatological timescales at which interannual storage changes in the catchment become negligible. In terms of modeling input (meteorological forcing) and output (streamflow) for the study, daily data for 30 consecutive years were retrieved for each catchment to fulfil that premise. In the following, we briefly outline the necessary preprocessing steps to prepare the different datasets for modeling.

Preparation of the BaWue dataset
The German Meteorological Service (DWD) provides 1x1 km Germany-wide raster datasets for several climatological and meteorological variables, stemming for example from the interpolation of point-wise monitoring data (e.g., from rainfall gauges) or from processing within the framework of the spatially distributed agrometeorological AMBAV model. For the BaWue dataset, catchment averages of daily precipitation (DWD, 2020a) and potential evapotranspiration (DWD, 2020b) were derived from the corresponding raster datasets. The potential evapotranspiration estimates are essentially based on the Penman-Monteith method. Streamflow data were obtained from the environmental agency of the State of Baden-Württemberg (LUBW, 2020).

Preparation of the MOPEX dataset (USA)
The MOPEX dataset (Duan et al., 2006) provides complete, catchment-averaged time series of precipitation, minimum/maximum daily temperature, NOAA climatological pan evaporation as well as streamflow data for a total of 438 catchments. Since NOAA climatological pan evaporation is based on seasonal averages with the same values recurring every year, it was considered to be less suited as forcing data for a hydrological model. Instead, potential evapotranspiration was estimated based on daily minimum and maximum temperature (Samani, 2000).
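A temperature-based estimate of this kind can be illustrated with the widely used Hargreaves-Samani form. The sketch below is an assumption-laden illustration rather than the exact procedure of the study: it uses the commonly quoted default coefficient of 0.0023, whereas Samani (2000) and regional calibrations may use a different coefficient, and it takes extraterrestrial radiation (expressed as evaporation equivalent in mm/d) as a given input instead of computing it from latitude and day of year. The function and argument names are hypothetical.

```python
import math


def hargreaves_samani_pet(t_min_c: float, t_max_c: float,
                          ra_mm_per_day: float, coeff: float = 0.0023) -> float:
    """Daily potential evapotranspiration (mm/d) from temperature extremes.

    t_min_c, t_max_c : daily minimum and maximum air temperature (deg C)
    ra_mm_per_day    : extraterrestrial radiation as evaporation equivalent
                       (mm/d); assumed to be supplied by the caller
    coeff            : empirical coefficient (0.0023 is the common default,
                       an assumption here, not a value taken from the study)
    """
    t_mean = 0.5 * (t_min_c + t_max_c)
    t_range = max(t_max_c - t_min_c, 0.0)  # guard against inverted inputs
    pet = coeff * ra_mm_per_day * (t_mean + 17.8) * math.sqrt(t_range)
    return max(pet, 0.0)                   # no negative evaporative demand


if __name__ == "__main__":
    print(round(hargreaves_samani_pet(10.0, 20.0, 15.0), 2))
```

Because the daily temperature range enters through a square root, the estimate responds to day-to-day variability, which is the property missing from the climatological pan evaporation values.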

2.2.3 Preparation of the Peruvian data set
SENAMHI, the Peruvian Meteorological Service, provides a national gridded precipitation data product (Aybar et al., 2020). The precipitation interpolation model relies on a combination of ground-based rainfall gauges as well as remotely sensed information used to derive spatial and seasonal patterns of precipitation and cold cloud duration. The gridded PISCO data were used to calculate catchment average precipitation for the Obrajillo (P-1) catchment. Station data from the SENAMHI station "Canta" as well as a regionally calibrated Hargreaves-Samani model were used to estimate potential evapotranspiration. For the purpose of gap filling and obtaining catchment averages from the point-wise measurements, we made use of linear correlations to nearby stations as well as the elevation dependency of the temperature. Streamflow data for the Obrajillo catchment were provided by SENAMHI. For this catchment, streamflow data were available only for roughly 19 out of the 30 years of meteorological data used to compute long-term water balances.

2.3 Characteristics of selected catchments
We finally selected 16 catchments for the study, seven from Germany (IDs: "B-x"), eight from the US (IDs: "M-x") and one from Peru ("P-1"). For the sake of readability, the original catchment/stream gauge IDs from the datasets were modified. […] show the geographic locations of the catchments. Figure 1 provides an overview of the catchment and climate characteristics spanned by the selected catchments. The catchments cover areas between 50 and 1000 km². The catchments in Baden-Württemberg in Germany cover the most humid climate settings with dryness indexes from 0.31 to 0.77, while some of the drier MOPEX catchments, mostly due to significantly higher potential evaporation, range between 1.05 and 1.55. In all catchments, annual total precipitation exceeds 750 mm/year. The most humid catchments in Germany reach annual totals of up to 1600 mm/year. The variation of the dryness index largely stems from the higher variations in energy supply. This is reflected in the spread of the annual potential evapotranspiration between 500 mm/year and 1350 mm/year. Potential evapotranspiration is quite evenly distributed among the catchments in Germany, whereas precipitation is more heterogeneous.
Figure 1 also provides the number of rainy days per year (a rainy day is defined as P > 1 mm/d). For most catchments, the number of rainy days correlates with mean annual precipitation. However, in the Peruvian catchment (P-1), 150 rainy days occur per year, a frequency similar to that of the far more humid catchments in Germany. In the more arid MOPEX catchments, the number of rainy days per year is generally lower, ranging between 80 and 100. Catchment M-7, however, has the lowest number of rainy days, despite a total annual precipitation of 1075 mm.
Rainfall seasonality was calculated according to Walsh and Lawler (1981):
$$SI_i = \frac{1}{P_i} \sum_{j=1}^{12} \left| P_{ij} - \frac{P_i}{12} \right| \tag{1}$$
where Pi is annual precipitation for year i and Pij is monthly precipitation for month j in year i. For the multiannual timescale, the annual seasonality indexes were averaged. Rainfall seasonality is higher in the drier catchments, in particular in catchments P-1 and M-7 (Figure 1).
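The calculation of the seasonality index can be sketched as follows, assuming the standard Walsh and Lawler (1981) definition (sum of absolute deviations of monthly totals from one twelfth of the annual total, divided by the annual total) and a simple arithmetic mean over years. The function names and the input layout (one list of 12 monthly totals per year) are illustrative assumptions.

```python
from typing import Sequence


def seasonality_index(monthly_p: Sequence[float]) -> float:
    """Seasonality index for one year from its 12 monthly precipitation totals."""
    if len(monthly_p) != 12:
        raise ValueError("expected 12 monthly totals")
    annual = sum(monthly_p)
    if annual <= 0:
        raise ValueError("annual precipitation must be positive")
    return sum(abs(p - annual / 12.0) for p in monthly_p) / annual


def mean_seasonality_index(years: Sequence[Sequence[float]]) -> float:
    """Multiannual value: average of the annual indexes, as described in the text."""
    return sum(seasonality_index(y) for y in years) / len(years)


if __name__ == "__main__":
    uniform = [100.0] * 12             # perfectly even rainfall
    one_month = [1200.0] + [0.0] * 11  # all rain in a single month
    print(seasonality_index(uniform), seasonality_index(one_month))
```

The index is 0 for perfectly uniform monthly rainfall and reaches its maximum of 11/6 when all precipitation falls in a single month.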

Hydrological modeling
The conceptual hydrological model we use for this study is a simplified version of the HBV model (Lindström et al., 1997).
HBV is a widely used hydrological model, capable of reproducing catchment dynamics across numerous hydrological settings (e.g., Booij, 2005; Osuch, 2015; Uhlenbrook et al., 1999). In the following section, we explain our slightly altered and simplified derivative.

Conceptual model structure
Our modeling approach for the water balance is fully lumped and thus based on catchment-scale averaged values, with daily precipitation and potential evapotranspiration as meteorological forcing. The model consists of the HBV soil store to model runoff generation and actual evapotranspiration, and a single linear reservoir for daily streamflow (Figure 2).
The soil store is characterized by the total storage volume Smax, its field capacity FC, and the β parameter. Smax corresponds to the product of effective porosity and soil depth, while FC describes the threshold below which actual evapotranspiration drops below the potential one. The water balance of the soil bucket is:
$$\frac{dSM}{dt} = P - ET_a - Q_d \tag{2}$$
with soil water storage SM (mm), precipitation P (mm/d), actual evapotranspiration ETa (mm/d) and direct runoff Qd (mm/d).
Direct runoff per time step is calculated based on the relative saturation using a power law with β as parameter (Eq. 3). The remaining water infiltrates and feeds evapotranspiration, while direct runoff goes to the linear reservoir:
$$Q_d = P \left( \frac{SM}{S_{max}} \right)^{\beta} \tag{3}$$
Actual evapotranspiration is a linear function of soil moisture SM below FC, as given by Eq. (4) and (5):
$$ET_a = ET_p \, \frac{SM}{FC} \quad \text{for } SM < FC \tag{4}$$
$$ET_a = ET_p \quad \text{for } SM \geq FC \tag{5}$$
Contrary to the usual reservoir series used in the original HBV model, we use a single linear reservoir to simulate streamflow. It is characterized by a recession constant kres (1/d) and its reservoir storage S(t), as described by Eq. (6):
$$Q(t) = k_{res} \, S(t) \tag{6}$$
This model is rather simple, but fits the purpose of annual water balance simulations (Uhlenbrook et al., 2010) and a multi-catchment approach. Here, we focus on two qualitatively different types of storage. The model accounts for the capillarity-bound storage fraction SM < FC and the corresponding water limitation of evaporation, while for SM > FC evaporation is not water limited. Runoff production increases nonlinearly with SM until Smax. In order to characterize the relative portion of both storage fractions, we define the capillary storage fraction FCfrac as FCfrac = FC/Smax.
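The model structure described above can be sketched as a daily bucket simulation. This is a minimal illustration under stated assumptions, not the implementation used in the study: the explicit daily update order (runoff, then evapotranspiration, then routing), the spill of storage exceeding Smax into direct runoff, the initial soil storage at field capacity, and all function and argument names are assumptions made for this sketch.

```python
from typing import Sequence, Tuple, List


def simulate_water_balance(precip: Sequence[float], etp: Sequence[float],
                           smax: float, fc_frac: float, beta: float,
                           k_res: float) -> Tuple[List[float], List[float], float, float]:
    """Run the simplified HBV-type model at a daily time step.

    precip, etp : daily precipitation and potential evapotranspiration (mm/d)
    smax        : total soil storage volume (mm)
    fc_frac     : capillary storage fraction FC/Smax (0 < fc_frac <= 1)
    beta        : exponent of the runoff power law
    k_res       : recession constant of the linear reservoir (1/d, <= 1 here)
    Returns daily ETa, daily streamflow Q, final soil storage, final reservoir storage.
    """
    fc = fc_frac * smax
    sm = fc          # assumed initial condition: soil at field capacity
    s = 0.0          # assumed initial condition: empty linear reservoir
    eta_series, q_series = [], []
    for p, e in zip(precip, etp):
        # Direct runoff from relative saturation (power law with beta)
        qd = p * (sm / smax) ** beta
        sm += p - qd
        if sm > smax:                 # assumption: excess above Smax also runs off
            qd += sm - smax
            sm = smax
        # Actual ET: linear in SM below FC, at potential rate above FC
        eta = min(e * min(sm / fc, 1.0), sm)
        sm -= eta
        # Single linear reservoir for streamflow
        s += qd
        q = k_res * s
        s -= q
        eta_series.append(eta)
        q_series.append(q)
    return eta_series, q_series, sm, s


if __name__ == "__main__":
    p = [10.0, 0.0, 5.0, 0.0, 20.0, 0.0, 0.0, 3.0] * 50   # synthetic forcing
    e = [2.0] * len(p)
    eta, q, sm_end, s_end = simulate_water_balance(p, e, 200.0, 0.7, 2.0, 0.2)
    print("EVR =", round(sum(eta) / sum(p), 3))
```

The synthetic forcing in the usage example is purely illustrative. Summing ETa and dividing by the summed precipitation gives the simulated evaporation ratio that is compared with the Budyko curve.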

Model calibration and objective functions
In order to reproduce the catchment water balance, we had to calibrate the hydrological model's parameters. Meteorological forcing data (P, ETp) and discharge data described in section 2.2 were used to optimize the model parameters. Due to the simple fully lumped model structure and our objective to reproduce the annual water balance, the model parameters were optimized for monthly discharge values using the Kling-Gupta efficiency (KGE) (Gupta et al., 2009) as objective function.
An acceptable simulation of the water balance at the monthly scale was deemed sufficient for exploring the partitioning of rainfall into runoff and evapotranspiration at the annual and interannual scales. The calibration was performed on the entire datasets covering 30 consecutive years, excluding the first year as model spin-up phase. In order to make a final catchment selection based on model performance, not only the monthly KGE but also the resulting mean biased water balance error (MBE) was taken into account. The MBE was calculated according to Eq. (8) from the annually aggregated simulated and observed streamflows Qi of the i-th hydrological year. While a threshold of 0.7 was set for the monthly KGE as acceptable model performance, a water balance error of MBE ≤ 15 % was considered sufficiently small.
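Both performance measures can be sketched briefly. The KGE follows the decomposition of Gupta et al. (2009) into correlation, variability ratio and bias ratio. The MBE form shown here, the relative difference between summed simulated and summed observed annual streamflow in percent, is an assumption made for illustration, since the exact definition of Eq. (8) is not reproduced in the text. The function names are hypothetical.

```python
import math
from typing import Sequence


def kge(sim: Sequence[float], obs: Sequence[float]) -> float:
    """Kling-Gupta efficiency (Gupta et al., 2009) for paired series."""
    n = len(obs)
    mean_s, mean_o = sum(sim) / n, sum(obs) / n
    sd_s = math.sqrt(sum((x - mean_s) ** 2 for x in sim) / n)
    sd_o = math.sqrt(sum((x - mean_o) ** 2 for x in obs) / n)
    cov = sum((a - mean_s) * (b - mean_o) for a, b in zip(sim, obs)) / n
    r = cov / (sd_s * sd_o)          # linear correlation
    alpha = sd_s / sd_o              # variability ratio
    bias_ratio = mean_s / mean_o     # bias ratio (not the HBV beta parameter)
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (bias_ratio - 1) ** 2)


def mbe_percent(q_sim_annual: Sequence[float], q_obs_annual: Sequence[float]) -> float:
    """Assumed MBE form: relative bias of total simulated streamflow in percent.
    Positive values mean the model overestimates discharge."""
    total_obs = sum(q_obs_annual)
    return 100.0 * (sum(q_sim_annual) - total_obs) / total_obs


if __name__ == "__main__":
    obs = [1.0, 2.0, 3.0, 4.0]
    print(kge(obs, obs), mbe_percent([1.1 * x for x in obs], obs))
```

With this sign convention, a positive MBE corresponds to an overestimation of mean annual discharge and therefore to an underestimation of the evaporation ratio.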

Sensitivity of the water balance to soil storage parameters
We investigated the behavior of mean annual water balances across a wide range of catchments with different soil water storage properties. To this end, the calibrated models with their optimized parameter sets were used to vary the two parameters characterizing soil water storage, Smax and FCfrac, in three different variation schemes:
[…]
iii. Combined variation of both soil storage parameters: all possible parameter combinations of Smax and FCfrac, given the same boundaries and increments as in i. and ii.
For each parameter combination resulting from the iterative variation process, we ran a long-term simulation (30 years at a daily time step) and calculated the mean annual evaporation ratio (EVR). Observed EVR values were estimated based on the assumption that, at multiannual timescales, catchment storage changes are negligible and that mean actual evaporation thus equals the difference between mean annual precipitation and mean annual observed discharge (ETa = P − Q).
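The combined variation scheme and the resulting offsets from the Budyko curve can be sketched as a grid evaluation. The sketch is illustrative only: the parameter grids in the usage example are placeholders (the boundaries and increments of the schemes are not reproduced here), `simulate_evr` stands for any hypothetical callable that runs the long-term simulation and returns the mean annual EVR, and the Budyko curve is taken as the geometric mean of the Schreiber and Ol'dekop equations in their commonly quoted forms.

```python
import math
from typing import Callable, Dict, Iterable, Tuple


def budyko_evr(phi: float) -> float:
    """Original Budyko curve (assumed standard form) for dryness index phi."""
    return math.sqrt((1.0 - math.exp(-phi)) * phi * math.tanh(1.0 / phi))


def evr_observed(p_mean: float, q_mean: float) -> float:
    """Observed evaporation ratio, assuming negligible storage change: ETa = P - Q."""
    return (p_mean - q_mean) / p_mean


def sweep_offsets(simulate_evr: Callable[[float, float], float], phi: float,
                  smax_values: Iterable[float],
                  fcfrac_values: Iterable[float]) -> Dict[Tuple[float, float], float]:
    """EVR offset from the Budyko curve for every (Smax, FCfrac) combination."""
    reference = budyko_evr(phi)
    fc_list = list(fcfrac_values)
    return {(smax, fc): simulate_evr(smax, fc) - reference
            for smax in smax_values for fc in fc_list}


if __name__ == "__main__":
    # Toy stand-in for the long-term model run (purely illustrative)
    toy = lambda smax, fc: budyko_evr(1.0) + (smax - 100.0) / 1000.0
    offsets = sweep_offsets(toy, 1.0, [50.0, 100.0, 200.0], [0.5, 0.7])
    for key, value in sorted(offsets.items()):
        print(key, round(value, 3))
```

Negative offsets mean that the simulated catchment evaporates less than the Budyko curve predicts for its dryness index, positive offsets that it evaporates more.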

Water balance simulations
The model performed acceptably for the selected study catchments, with monthly KGE > 0.8 and a water balance MBE within ±15 % (Figure 3). While the MBE does not exceed 5 % for catchments with lower dryness indexes, it is noticeably higher for the more arid ones, reaching errors close to +15 %, which indicates slight overestimations of the mean annual discharge.
The calibrated model parameters cover their predefined parameter ranges without reaching the boundaries (Figure 3 and Table C-1 in Appendix C). The calibrated β parameter varies between 0.8 and 4.7, indicating a large spread between strong and moderate growth of the area contributing to runoff with relative saturation. Smax ranges between 70 mm and 800 mm. Assuming a porosity of, e.g., 0.4, this corresponds to an average root zone depth between 0.175 and 2 m. Field capacity ranges between 40 and 90 % of total root zone storage, suggesting either a rather small or a strong influence of capillarity on root zone storage. The kres parameter is quite uniformly distributed for the more humid catchments, with values around 0.1-0.2, whereas it shows greater variability throughout the drier catchments, with values between 0.26 and 0.77.

Variation of total storage volume Smax
The selected catchments spread across a dryness range from 0.30 to 1.55, while simulated evaporation ratios (EVRsim), caused by the incremental variation of Smax, range between 0.05 and 0.92 (Figure 4). Generally, a higher total storage volume Smax corresponds to a larger evaporative fraction, as visualized by the color code of the plots. At the minimal total storage volume of Smax = 1 mm, the catchments' evaporation ratios are around 0.1, almost independent of the dryness, as nearly 90 % of the precipitation would run off. An increase in Smax by only 20 mm causes EVR to jump from 0.25 to 0.4. The total range of the EVR varies for the different catchments, with smaller ranges for the more humid systems, which tend to approach the energy limit at a certain point. MOPEX catchment M-7 shows the largest EVR range.
The offset from the Budyko curve is a nonlinear function of total storage volume, normalized with annual precipitation, for most study catchments (Figure 5, left). With increasing storage, the initially negative offsets shrink steeply at small normalized storage volumes, and the curve then flattens to an almost asymptotic shape at larger normalized storage volumes. This appears plausible, as the EVR is bounded by the energy limit as an asymptote. When the latter is reached, the curve becomes horizontal, as can be seen for the humid catchments reaching the energy limit. Catchment M-7 is an exception: it is characterized by a gradual and steady increase in EVR, with a quasilinear development up to an Smax/Pann-avg ratio of about 0.5, and never really shows this asymptotic tendency.
The EVR offset of most catchments is zero at a distinct normalized total storage volume. A comparison of these distinct total storage volumes revealed a clear clustering at 5-15 % of the annual rainfall (Figure 5, right). Exceptions are the Peruvian catchment P-1 as well as the U.S. catchment M-1, which do not reach the Budyko curve at all. It is also important to note that the catchment with the highest dryness index, M-8, meets the Budyko curve at a normalized total storage volume of 1.2.
[…] keeping Smax constantly at the calibrated value. The lower the FCfrac parameter, the more water evaporates (evaporation being subject to water limitation in the soil), which implies higher evaporation ratios in the Budyko space (Figure 6). The total spread of EVR is generally smaller than for the variation of the total storage volume. The min-max extent of simulated EVR varies throughout the catchments, the majority of which generate EVR ranges scattering in a narrow envelope around the […] For humid catchments, many of those are located close to the energy limit. Catchments with dryness indices above one also reach high evaporation ratios.
For instance, catchments M-8 and M-4 show simulated EVR values of around 0.9-0.95 at their lowest FCfrac values, which is close to the water limit.
For most catchments, the gradual increase of the capillary storage fraction FCfrac causes a decrease in simulated EVR, which is initially quite slow at low FCfrac values, indicating little sensitivity in this parameter range (Figure 7). At higher FCfrac values of around 0.4-0.6, the reduction becomes steeper. Note that 50 % of the catchments, mostly humid ones, reach the Budyko curve at distinct capillary storage fractions clustering between 0.6 and 0.75. For another group of four catchments, this distinct capillary storage fraction clusters at 0.9, which corresponds to the maximum. For two other catchments, the Peruvian P-1 and the German B-1, the distinct capillary storage fractions are around 0.2. Both show a quasilinear dependency of the evaporation ratio on FCfrac. The M-7 catchment, as in the previous exercise, does not reach the Budyko curve.
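The distinct parameter values at which a catchment meets the Budyko curve can be located as zero crossings of the offset along the varied parameter. The sketch below uses linear interpolation between neighboring increments, which is an assumption for illustration and not necessarily how the values reported above were determined. The function name and the example numbers are hypothetical.

```python
from typing import List, Sequence


def matching_values(param_values: Sequence[float],
                    offsets: Sequence[float]) -> List[float]:
    """Parameter values at which the EVR offset from the Budyko curve is zero.

    param_values : increasing values of the varied parameter
                   (e.g., Smax normalized by annual precipitation, or FCfrac)
    offsets      : simulated EVR minus Budyko EVR at each parameter value
    Returns an empty list if the curve is never reached.
    """
    roots = []
    for i in range(len(param_values) - 1):
        x0, x1 = param_values[i], param_values[i + 1]
        y0, y1 = offsets[i], offsets[i + 1]
        if y0 == 0.0:
            roots.append(x0)
        elif y0 * y1 < 0.0:
            # Linear interpolation between the two bracketing increments
            roots.append(x0 - y0 * (x1 - x0) / (y1 - y0))
    if offsets and offsets[-1] == 0.0:
        roots.append(param_values[-1])
    return roots


if __name__ == "__main__":
    print(matching_values([0.0, 0.1, 0.2], [-0.2, -0.1, 0.1]))   # one crossing
    print(matching_values([0.0, 0.1, 0.2], [-0.3, -0.2, -0.1]))  # never reached
```

An empty result corresponds to catchments that do not reach the Budyko curve within the varied parameter range.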

Simultaneous parameter variation
The simultaneous variation of the total storage volume and capillary storage fraction revealed three main types of two-dimensional Budyko offset and EVR sensitivity patterns. Each type is visualized using representative catchments in Figure 8.
-Type 1: humid, close to energy limit
Almost all parameter combinations result in an evaporation ratio close to the Budyko curve (with the exception of the minimum Smax value of 1 mm).

-Type 2: intermediate dryness, little seasonality
There is a parameter domain whose combinations result in an evaporation ratio close to the Budyko curve (hereinafter referred to as "Budyko domain", with EVR offsets within ±0.05 from the Budyko curve). At low FCfrac values, the Budyko domain is very sensitive to an increase of Smax, while at higher FCfrac values, this sensitivity is inverted. In between the two extremes, there is a transition zone with intermediate sensitivity of the Budyko domain to both sorts of parameter changes. In this example, this transition zone extends over normalized total storage volumes of roughly 15-50 % and capillary storage fractions of 0.55-0.8 (see yellow square in Figure 8).

-Type 3: dry (ϕ > 1) catchments with pronounced seasonality
The two catchments with strongly seasonal climate, M-7 and P-1, revealed similar EVR patterns. The Budyko domain is reached for normalized total storage volumes of more than 60 % and even 90 % of annual rainfall, respectively, at

Discussion
In this study, we used a conceptual hydrological model to conduct a systematic variation of two soil storage-related parameters (Smax, FCfrac) for selected catchments across a variety of climate and landscape settings. The main goal was to investigate their role as second-order controls on the steady-state catchment water balance and in particular their suitability to explain offsets from the Budyko curve.
We start our discussion with the performance of the water balance modeling, the calibrated parameterizations, and the implications of the approach taken for the presented results. Secondly, we reflect on the sensitivity of the water balance to soil storage variations, and the related findings for different groups of catchments. Thirdly, we discuss the relative offsets from the Budyko curve, and the clustering we found in the distinct soil characteristics for matching the Budyko curve. Finally, we interpret our results in terms of catchment coevolution in the Budyko framework, concentrating on patterns that emerged during the simultaneous parameter variations.

Model performance and hydrologic processes representation
The hydrological model, despite its simplicity, proved capable of reproducing monthly discharge dynamics as well as the catchments' annual and interannual water balance over the 30-year simulation period. The usefulness of similar HBV model versions for simulating discharge and water balance dynamics has been shown in numerous studies at comparable spatio-temporal scales (e.g., Lindström et al., 1997; Osuch et al., 2015; Seibert, 1999; Uhlenbrook et al., 2010). The performance of the model was slightly inferior for more arid catchments, perhaps due to more interannual variability in the annual water balances (and potentially also the rainfall-runoff mechanisms), which is more likely in drier climates (Koster and Suarez, 1999).
The hydrological model conceptualizes and simplifies hydrological processes. The chosen modeling approach is primarily focused on catchments where soil water storage plays a crucial role in the partitioning of rainfall into runoff and evapotranspiration, which includes a large number of catchments around the globe. For other settings, e.g., with considerable impact of snow cover or Hortonian overland flow, the dominant processes would not be well represented by the model used for this study. On the other hand, our soil storage-based reasoning is not as relevant for these types of catchments, since the water partitioning is conditioned by influences not related to soil storage volumes. Other processes that are not explicitly modelled include soil moisture redistribution due to percolation and capillary rise, and the effective cutting of evapotranspiration below the permanent wilting point. The effects of these processes on the water balance, however, are potentially compensating, depending on the individual conditions in a catchment. While percolation into deeper layers and the introduction of a permanent wilting point are likely to reduce evapotranspiration, capillary rise would rather increase evapotranspiration. The successful calibration suggests that the model yields robust estimates of mean annual water partitioning, the lack of process detail notwithstanding. The simple hydrological model we used thus seemed adequate for the main purpose of the study.

Catchment parametrizations and parameter interrelations
The hydrological setting of the studied catchments is represented by their calibrated parameter combinations. We found notable variability in the calibrated parameterizations not only between the three global regions, but also among the catchments within one region, for example in the geographically limited region of Baden-Württemberg in Germany (B-x). The catchments thus cover a range of relevant conditions, while the limited number allowed us to keep track of more detailed characteristics of each catchment. The systematic selection helped us explore the influence of storage volume-related parameters on mean hydrologic partitioning and their relationship to the Budyko curve for different hydroclimates and catchments in a more direct way than would have been possible in a statistical analysis of an unsystematic collection including as many catchments as possible. Nonetheless, the selection of catchments is limited and does not cover all possible meteorological forcings and hydrologic responses. For instance, the drier regimes used in this study have climates with high potential evapotranspiration. The datasets and the selection process did not yield any catchments in the dry regime with low annual precipitation (cf. section 2.3).
Including such catchments in future studies would show whether additional variability in the calibrated parameterizations would also increase the sensitivity to the subsequent parameter variation conducted on that basis.
In this regard, the issue of parameter correlation also needs to be addressed. The calibration results show strongly contrasting β and Smax values for the drier catchments included in the study, suggesting that the interplay of the two parameters affects the monthly (monthly KGE calibration) and thus likely also the annual water balances. Evidently, total root zone storage capacity and field capacity are also closely related, as both increase with an increasing fraction of silt and clay in the soil. This naturally implies that the model parameters Smax and FCfrac are interrelated as well and interact with respect to the sensitivity of EVR (Figure 8). The separate variation of both parameters nevertheless yields information about their relative importance in controlling EVR. The significantly larger EVR ranges in the results showed that total storage capacity dominated over the subdivision of the total storage volume into free and capillary-controlled fractions. The simultaneous variation of both parameters provides a better understanding of the interactions and helps to infer distinct combinations that match the Budyko curve and to find behavioral parameter sets (Schaefli et al., 2011).

4.2 The role of soil storage characteristics for the evaporation ratio

Variation of the total storage volume
Hydrologically, soil water storage controls direct runoff generation and buffers intermittent rainfall to feed the much slower evapotranspiration process. When storage capacity increases, the soil is less likely to be water-saturated, leading to a higher saturation deficit and thus a higher infiltration potential, 1 - (SM/Smax)^β, during rainfall events. This causes an increased water stock in the root zone, which feeds evapotranspiration. In the extreme case of zero soil storage capacity, corresponding to an impervious soil surface, nearly all precipitation would run off as overland flow, and the EVR would tend to zero. This is also shown by the low evaporation ratios in the corresponding simulations with the lowest Smax value of 1 mm. When Smax was increased from the minimum to small and moderate values, water partitioning was very sensitive to changes in total soil storage capacity, with evaporation ratio ranges ΔEVR from 0.1 to 0.3 for most catchments. The observed variations in sensitivity among the catchments, however, suggest additional controls on the EVR resulting from the interplay of the meteorological forcing and the parametrization.
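The dependence of the infiltration potential on storage capacity can be illustrated with a minimal sketch. It assumes the HBV-type expression 1 - (SM/Smax)^β quoted above; the function name, the clamping of relative saturation to [0, 1], and the example values (30 mm of stored water, β = 2) are illustrative assumptions and are not taken from the calibrated model.

```python
def infiltration_fraction(sm: float, smax: float, beta: float) -> float:
    """Fraction of rainfall that infiltrates: 1 - (SM/Smax)**beta.

    sm   : current soil moisture storage (mm)
    smax : total soil storage capacity (mm)
    beta : shape parameter of the runoff generation curve
    """
    if smax <= 0.0:
        return 0.0  # no storage at all: everything runs off
    rel_sat = min(max(sm / smax, 0.0), 1.0)  # relative saturation in [0, 1]
    return 1.0 - rel_sat ** beta


# Same stored water (30 mm) in a small and a large reservoir.
# beta = 2 is an illustrative value, not a calibrated one.
for smax in (50.0, 200.0):
    frac = infiltration_fraction(30.0, smax, 2.0)
    print(f"Smax = {smax:5.0f} mm -> infiltrating fraction = {frac:.3f}")
```

For the same absolute soil moisture, the larger reservoir has the lower relative saturation and therefore passes a larger share of rainfall into the root zone, which is the mechanism described in the paragraph above.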
A correlation analysis revealed that the number of rainy days per year explains 93 % of the variability in the total EVR ranges (beyond minimal Smax = 1 mm) that occurred during the variation of total storage volumes (Figure 9). Interestingly, the catchment M-7, with the highest EVR range of ΔEVR = 0.6, is characterized by a comparably small number of rainy days, which indicates, given the total rainfall amount, rather intense rainfall events (catchment characteristics in Figure 1). This is in line with Milly (1994), who found that the number of rainy days is a sensitive variable in terms of the role of soil storage and for the mean annual water balance. The lower the number of rainy days, the higher the mean rainfall depth of the events, and the more storage and infiltration capacity a soil requires to retain the water and expose it to the atmospheric demand for evaporation. For almost all catchments, the sensitivity to a further increase of storage capacity vanished beyond a critical normalized total storage volume, with negligible to no changes in mean evaporation (see Figure 5). For humid, energy-limited systems, a further increase of total soil storage cannot increase evapotranspiration any further once the energy limit is reached. The other systems reach this quasi-asymptotic behavior when two competing soil moisture influences are balanced in the model. On the one hand, a further increase of Smax leads to a lower relative saturation, causing a higher infiltration potential and thus providing more water for subsequent evaporation. On the other hand, this decrease of relative soil moisture leads to an increasing reduction of the evaporation flux (imitating capillary forces), which in turn retains moisture longer in the soil and limits a further decrease of soil moisture.
The critical Smax value for reaching this behavior depends on forcing characteristics and parametrization.
Catchment M-7, characterized by an exceptionally low number of rainy days per year, does not reach this near-asymptotic behavior within the bounds of the Smax variation, again underlining the aforementioned influence of rainfall frequency on the importance of total water storage volume for mean annual partitioning. This finding is again in line with Milly (1994), who found a maximum value of water-holding capacity beyond which mean evapotranspiration no longer increases significantly.
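The interplay of storage capacity and rainfall frequency described above can be explored with a toy experiment. The following sketch is not the calibrated model of this study: it is a single-bucket daily water balance under strictly periodic rainfall, with an assumed HBV-type infiltration term and an assumed linear evapotranspiration reduction below FCfrac · Smax. All names, default values, and the two synthetic rainfall regimes (same total rainfall, different event frequency) are hypothetical.

```python
def simulate_evr(smax, fc_frac=0.7, beta=2.0, rain_depth=10.0,
                 rain_every=5, etp=2.0, n_days=3600):
    """Return the evaporation ratio ET/P of a toy single-bucket model.

    Rain of `rain_depth` mm falls every `rain_every` days; potential
    evapotranspiration `etp` (mm/d) is constant. Storage starts empty.
    """
    sm = 0.0
    total_p = 0.0
    total_et = 0.0
    for day in range(n_days):
        p = rain_depth if day % rain_every == 0 else 0.0
        # Infiltration: HBV-type potential, capped by the free storage volume.
        infil = max(0.0, min(p * (1.0 - (sm / smax) ** beta), smax - sm))
        sm += infil
        # ET at potential rate above fc_frac * smax, linearly reduced below,
        # and never more than the water actually stored.
        et = min(etp * min(1.0, sm / (fc_frac * smax)), sm)
        sm -= et
        total_p += p
        total_et += et
    return total_et / total_p


# Same total rainfall, delivered in frequent small or rare large events.
for rain_every, depth in ((5, 10.0), (20, 40.0)):
    sweep = [round(simulate_evr(s, rain_depth=depth, rain_every=rain_every), 3)
             for s in (1.0, 25.0, 100.0, 400.0)]
    print(f"rain every {rain_every:2d} d, Smax = 1/25/100/400 mm -> EVR {sweep}")
```

Printing the sweep for both regimes allows a qualitative check of whether the evaporation ratio levels off at lower Smax when rainfall arrives in many small events than when it arrives in few large ones. The absolute numbers carry no meaning for the study catchments.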

Variation of the capillary storage fraction in soil
The mean annual water balance was also sensitive to changes of the capillary storage fraction of the soil, FCfrac, spanning EVR ranges between around ΔEVR = 0.3 and ΔEVR = 0.5, with notable between-catchment variability reflecting different hydrologic behaviors. In general, higher FCfrac caused lower mean evaporation ratios, because FCfrac determines the onset of water-limited evapotranspiration. This reflects, in a simplified and linearized manner, the decrease in capillary matric potential and the reduction of capillary supply of upper soil layers losing water to sustain evapotranspiration. The conceptual soil water balance models used by Milly (1994) and Potter et al. (2005) neglect this effect. For the most humid systems in particular, the decline in mean evaporation ratio occurs at capillary storage fractions ≥ 0.6. In these humid climate regimes, relative soil moisture tends to be high, and in combination with lower capillary storage fractions, evapotranspiration occurs mostly without water limitation. With increasing FCfrac values, soil moisture is more likely to fall below the threshold at which evaporation is reduced; the soil then retains more moisture, which in turn enhances direct runoff production during rainfall. For other catchments, the sensitivity of mean EVR remains nearly uniform within the variation range of FCfrac, in particular for the two seasonal catchments (P-1 and M-7). This may be because these drier, seasonal catchments tend to have relative soil moistures below even the lower FCfrac values, making their mean water balance sensitive to FCfrac across the whole variation range.
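The threshold behavior described above can be sketched as follows. The sketch assumes a linear reduction of evapotranspiration below the capillary storage FCfrac · Smax, which is the common HBV-type formulation; the exact form used in the study's model is defined in its methods section, and the function name and example values here are illustrative only.

```python
def actual_et(etp: float, sm: float, smax: float, fc_frac: float) -> float:
    """Actual evapotranspiration (mm/d) with a linear water limitation.

    Above the threshold fc_frac * smax, ET proceeds at the potential rate.
    Below it, ET is scaled by sm / (fc_frac * smax). ET never exceeds the
    water that is actually stored.
    """
    threshold = fc_frac * smax
    if threshold <= 0.0:
        return min(etp, sm)  # no capillary storage: no water limitation
    return min(etp * min(1.0, sm / threshold), sm)


# Same soil state (60 mm stored, Smax = 100 mm), two capillary fractions.
for fc_frac in (0.4, 0.8):
    et = actual_et(4.0, 60.0, 100.0, fc_frac)
    print(f"FCfrac = {fc_frac:.1f} -> ET = {et:.2f} mm/d of ETp = 4.00 mm/d")
```

With the lower fraction, the soil moisture of 60 mm lies above the threshold and evapotranspiration is not water-limited; with the higher fraction, the same soil moisture lies below the threshold and evapotranspiration is reduced.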

Soil storage characteristics matching the Budyko curve
For most study catchments, the modelled EVR ranges intersect the Budyko curve at distinct values of Smax and FCfrac, respectively. The distinct storage parameters that made the systems reach the evaporation ratio predicted by the Budyko curve showed a clear clustering and can be interpreted hydropedologically. For most of the systems, in particular for the humid catchments, the distinct normalized total storage that matches the Budyko curve is between 5 and 15 % of the mean annual precipitation. In the case of a uniform annual rainfall regime, this corresponds roughly to the monthly precipitation amount and to an Smax of 60-180 mm given a mean annual precipitation of P = 1200 mm/a. Recalling that Smax equals the product of soil depth and porosity, and assuming porosity values of around 0.3-0.5, this suggests soil depths ranging between 120 and 540 mm. This is in the range of the root zone depths found in vegetated systems (Gentine et al., 2012). The distinct capillary storage fractions for matching the Budyko curve scattered between 0.18 and 0.90 of the total storage volume, which is the range to be expected for sandy and clayey soils, respectively. For half of the catchments, the distinct FCfrac ranged between 0.6 and 0.8, equivalent to loamy soils. There was, however, no clear trend of distinct capillary storage fractions with dryness index.
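The conversion from normalized storage to absolute storage capacity quoted above is simple arithmetic and can be reproduced as a short check. The variable names are illustrative; the input values (P = 1200 mm/a, 5-15 % normalized storage) are those given in the text.

```python
mean_annual_p = 1200.0            # mean annual precipitation (mm/a), from the text
normalized_storage = (0.05, 0.15)  # Smax / P range matching the Budyko curve

# Absolute storage capacity corresponding to the normalized range (mm).
smax_low, smax_high = (f * mean_annual_p for f in normalized_storage)

# Monthly precipitation under a uniform annual rainfall regime (mm/month).
monthly_p = mean_annual_p / 12.0

print(f"Smax range: {smax_low:.0f}-{smax_high:.0f} mm")
print(f"Uniform monthly precipitation: {monthly_p:.0f} mm")
```

The monthly precipitation under a uniform regime falls inside the resulting Smax range, which is the basis for the statement that the matching storage corresponds roughly to the monthly precipitation amount.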
From our findings, we can conclude that soil storage characteristics are important second-order controls on the mean annual water balance, which can help explain observed offsets of catchments from the Budyko curve. The fact that some catchments did not reach the Budyko curve through independent variation of the soil storage parameters, however, also underlines that other second-order controls, such as temporal variability and seasonality of the forcing, or their interplay with soil storage, can play an important role for hydrologic partitioning. Among our catchments, the two drier and in particular seasonal catchments, P-1 and M-7, stand out in this respect. While Fu and Wang (2019) show that seasonality can indeed have a significant influence on the position in the Budyko space, Potter et al. (2005) pointed out in their study on Australian catchments that seasonality by itself was not able to explain the inter-catchment variance in the observed mean evaporation ratio.
Another possibility is that a catchment's evapotranspiration can be transport-limited when the vapor pressure gradient in the lower boundary layer is low and the air is moisture-saturated. It is conceivable that a strongly seasonal environment presents a more favorable setting for transport limitation, since the atmosphere is more likely to be moisture-saturated when the entire annual precipitation occurs within a limited number of months during the rainy season. A straightforward supplementary analysis of relative humidity data of stations in the vicinity of our study catchments revealed that mean relative humidity during the rainy season for the Peruvian catchment was around 85 %, and higher than the year-round average for all other catchments.
Assuming that most of the annual evapotranspiration in the Peruvian catchment occurs during the period of abundant soil moisture storage in the rainy season, it stands to reason that transport limitation can play a role in the impediment of the mean evaporation flux. This reasoning, however, does not apply to the other seasonal catchment (M-7), where different meteorological conditions in the boundary layer (e.g., in terms of advection) might counteract or limit the potential impediment of upward transport.

Interpretation in terms of catchment coevolution and behavioral model parameterization
Our findings fit well into the perspective that a catchment's form and functioning are co-evolutionary (Troch et al., 2015), which also implies that the development of total storage volume and of capillary storage fraction are not independent of each other. The successive variation of the two soil storage parameters sheds light on the role of the soil formation process resulting from two weathering mechanisms. While the first one (Smax) represents the generation of soil storage volume, i.e., porosity, the second one (FCfrac) relates to the transformation of coarse to increasingly fine-grained material with higher capillary forces.
Both porosity and the fraction of silt and clay increase with time (Hartmann et al., 2020). The parameter variations corresponding to these two mechanisms resulted in opposite effects on the mean evaporation ratio.
This could imply that catchments with their related soil formation processes converge towards an optimal state with regard to hydrologic partitioning. While in early stages of a catchment's evolution (probably starting far from the Budyko curve) the development of total storage Smax is likely to dominate the evolution, at later stages both parameters could continue to evolve simultaneously within the Budyko domain, thus keeping the water balance in a steady state in accordance with Budyko.
The idea of finding underlying organizing principles for the steady-state hydrologic partitioning described by the Budyko curve has been addressed by multiple studies in the past (cf. Berghuijs et al., 2020). Westhoff et al. (2016) showed in a backward approach that the Budyko curve can be derived using the Maximum Power principle as a constraint. Porada et al. (2011) simulated the water balance of the 35 largest basins on Earth using the SIMBA model and inferred parameters controlling root water uptake by maximizing entropy production. Simulations were in line with the Budyko framework. Milly (1994), referring also to similar conclusions by Milly and Dunne (1994), stated that simulated threshold values of water-holding capacities, beyond which evaporation no longer changes significantly, were in proximity to the observed ones, which led him to hypothesize that ecosystems strive to maximize evapotranspiration. The Budyko curve could thus also represent the strategy to maximize evapotranspiration by approaching the supply and demand limits, yet not reaching them due to limiting factors such as climate variability (Berghuijs et al., 2020).
If catchments were in fact to coevolve towards an optimal state of hydrologic partitioning, it would still remain difficult to infer which stage of coevolution a catchment is actually in.
The plots shown in Figure 8 can be helpful in this respect, as they connect a wide range of total storage volumes and capillary storage fractions to the resulting offsets of simulated EVR from the Budyko curve. This space represents all possible system configurations with respect to these two soil storage-related parameters, and thus the soil states a catchment might potentially go through.
We found groups of catchments emerging in terms of the "Budyko domain", which clustered with respect to their climate setting and their parameter combinations in a close range of ΔEVR = ±0.05 around the Budyko curve. The most humid [...] with the sensitivity of one parameter to the mean water balance being strongly conditioned by the other parameter. The domain highlighted by the yellow square in Figure 8 represents a parameter subspace where both parameters could develop whilst remaining within the Budyko domain. A catchment in that subset could evolve at "moderate pace" in terms of soil storage, while the water balance partitioning in terms of the Budyko framework would remain roughly constant. In the drier range of catchments, the two seasonal ones (P-1, M-7) are of particular interest. In both cases, the Budyko evaporation ratio is only reached at high Smax values and low FCfrac values. According to observed discharge and precipitation data, both are currently not inside the Budyko domain. Troch et al. (2015) introduced catchment forming factors (CFF) as quasi-independent drivers (boundary conditions) of catchment coevolution: climate, bedrock weatherability, tectonics and time, and discussed the concept of hydrologic age as the result of their combined effect. The latter is related to the amount of energy that has flowed through the catchment and to the amount of physical work expended thereby. In this context, one might speculate that catchments' hydrologic aging in a highly seasonal climate is slower than in humid settings.
The two dryness-defining variables (P, ETp) and their corresponding mediators, water and energy, interact simultaneously only for 4-6 months during the rainy season, which could lead to a slower evolution towards the Budyko state in these catchments. Fu and Wang (2019) showed a positive correlation between runoff coefficients and rainfall seasonality for a number of catchments, and that such seasonal catchments tend to yield evaporation ratios below the Budyko curve, which supports this point of view.
Our results illustrate the importance of soil storage volume characteristics for the position of a catchment in the Budyko space. When making model-based predictions or when assessing the water balance in ungauged basins, the Budyko curve can be used as a landmark for long-term simulations. Schaefli et al. (2011) and Li et al. (2014) both used the Budyko curve to determine "behavioral" parameter combinations. In a similar manner, if potential deviations due to soil storage volume are taken into account, model parameterizations could be oriented and constrained based on the behavior of the "Budyko domains" identified in section 3.4 for different climate types. In doing so, one could take into consideration that in catchments with high soil storage capacity, the actual water balance might exceed the evaporation ratio given by Budyko, or vice versa for catchments with little soil storage capacity (below 5-10 % of mean annual precipitation in humid climates). For example, highly erosive terrains with a steep topography could present a setting where soil storage is underdeveloped. Further research is needed, however, to address potential other second-order influences and their relative importance in comparison to soil storage characteristics for explaining offsets from the Budyko curve.
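Using the Budyko curve as a landmark for selecting behavioral parameter sets can be sketched as a simple filter. The sketch assumes the original non-parametric Budyko (1974) formulation, E/P = sqrt(φ · tanh(1/φ) · (1 - exp(-φ))) with dryness index φ = ETp/P, and the tolerance of ΔEVR = ±0.05 used for the "Budyko domain" above. The function names, the dryness index, and the simulated evaporation ratios of the candidate parameter sets are hypothetical placeholders, not results of this study.

```python
import math


def budyko_evr(dryness_index: float) -> float:
    """Evaporation ratio E/P from the Budyko (1974) curve, phi = ETp/P > 0."""
    phi = dryness_index
    return math.sqrt(phi * math.tanh(1.0 / phi) * (1.0 - math.exp(-phi)))


def in_budyko_domain(sim_evr: float, dryness_index: float,
                     tol: float = 0.05) -> bool:
    """True if a simulated EVR lies within +/- tol of the Budyko curve."""
    return abs(sim_evr - budyko_evr(dryness_index)) <= tol


# Hypothetical simulated EVR for three (Smax in mm, FCfrac) combinations
# of one catchment with an assumed dryness index of 0.8.
phi = 0.8
candidates = {(50.0, 0.6): 0.48, (150.0, 0.7): 0.60, (400.0, 0.3): 0.75}

behavioral = [params for params, evr in candidates.items()
              if in_budyko_domain(evr, phi)]
print(f"Budyko EVR at phi = {phi}: {budyko_evr(phi):.3f}")
print("Parameter sets inside the Budyko domain:", behavioral)
```

In an application, the tolerance could be shifted or widened where soil storage is known to be unusually small or large, in line with the reasoning above that such catchments may plot below or above the curve.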

Conclusions
The attempt to use the empirical Budyko curve to evaluate observed water balances or to constrain them in modeling in data-scarce regions motivated us to better understand second-order controls on the steady-state water balance and potentially resulting offsets from the Budyko curve. To that end, we conducted a model study to explore the relationship between two parameters related to the soil storage volume and the Budyko curve. The modeling approach was built on observation data (P, ETp, Q) and thus did not purely take place in the realm of simulations. The fully lumped and simple hydrological model, similar to the widely used HBV model, proved to be an effective and efficient tool to simulate multiple catchment water balances at scales between 50 and 1000 km². Instead of using parametrized Budyko models based on a lumped parameter integrating all physiographic catchment characteristics, our study singles out specific root zone characteristics, namely total soil storage volume and the capillary storage fraction. This approach allows us to relate them to tangible catchment properties and judge their physical meaningfulness.
We show the important role of soil storage characteristics as second-order controls on the mean annual water balance and potential offsets from the water balance predicted by the Budyko curve. In most cases, the parameter variations generated evaporation ratio envelopes enclosing the Budyko curve; in a few cases the Budyko curve was not reached through the variations. As suggested by other studies, the number of rainy days per year appeared to be a sensitive climate characteristic in the role soil storage plays for the water balance. We observed a clustering in terms of normalized soil storage required to match the Budyko curve at around 5-15 % of mean annual precipitation, which translates roughly to the monthly precipitation, and which reasonably corresponds to soil storage capacities commonly found in nature. Similarly, the second parameter (capillary storage fraction) clustered in a range that agrees well with hydropedological interpretation.
Not unexpectedly, some catchments deviated from the behavior of the other catchments. In particular, the two strongly seasonal catchments, among them the Peruvian one, stood out repeatedly in the course of the analysis. Thus, while the soil storage characteristics are likely to be part of the reason for the significant offset found in the observation data of the Peruvian catchment, other second-order influences seem to be of importance as well. Several potential explanations for the deviating behavior of the seasonal catchments were elaborated, from transport limitation due to prevailing atmospheric moisture conditions to a coevolution-based argument that these catchments are still evolving towards the Budyko state.
Given the prominent role seasonal catchments played in our study, it would be interesting to conduct further research on the different mechanisms by which precipitation seasonality (and, in combination, also runoff seasonality) can influence the position in the Budyko space.
In terms of potential applications, the results of this study could be helpful in the evaluation of water balances in data-scarce regions, if soil storage volumes are known to be particularly limited or, inversely, particularly abundant in a catchment.
Analysis of climate type-dependent patterns of two-dimensional parameter spaces and the relative position of the Budyko domain can help provide parameter constraints for hydrological models.

Author contributions
Jan Bondy conceptualized the study, conducted the analysis and wrote the paper. Jan Wienhöfer provided insights into interpretation and discussion, and improved the structure of the paper. Laurent Pfister contributed with discussions on the Budyko framework and helped shape the final manuscript. Erwin Zehe conceived of the general idea and supervised the study.