Evapotranspiration enhancement drives the European water-budget deficit during multi-year droughts

In a warming climate, periods with below-average precipitation will increase in frequency and intensity. During such periods, known as meteorological droughts, sparse but consistent pieces of evidence show that the decline in annual runoff may be proportionally larger than the corresponding decline in precipitation (e.g., -40% vs. -20%). The reasons behind this exacerbation of runoff deficit during dry periods remain largely unknown, which challenges generalization at larger scales (i.e., beyond the single catchment), as well as the predictability of when this exacerbation will occur and how intense it will be. Here, we tested the hypothesis that runoff-deficit exacerbation during droughts is a common feature of droughts across climates and is driven by evapotranspiration enhancement. We support this hypothesis by relying on multidecadal records of streamflow and precipitation for more than 200 catchments across various European climates, which distinctively show the emergence of periods of exacerbated runoff deficit similar to those identified in previous studies, i.e., runoff on the order of 20% to 40% less than what would be expected from the precipitation deficit. The magnitude of this exacerbation is two to three times larger for basins located in dry regions than for basins in wet regions and is qualitatively correlated with an increase in annual evapotranspiration during droughts, on the order of 11% and 33% over basins characterized by energy- and water-limited evapotranspiration regimes, respectively. Thus, enhanced atmospheric and vegetation demand for moisture during dry periods induces a nonlinear and potentially hysteretic precipitation-runoff relationship for low-flow regimes, which results in an unexpectedly large decrease in runoff during periods of already low water availability. Forecasting the onset, magnitude, and duration of these drops in runoff availability has paramount societal implications, especially in a warming climate, given the supporting role of runoff for water, food, and energy security. The outcome that water basins are prone to this exacerbation of runoff deficit for various climates and evapotranspiration regimes, compounded by the lack of specific parametrizations of this process in the majority of hydrological and land-surface models, makes further understanding of its patterns of predictability an urgent priority for water-resource planning and management in a warming and drier climate.
https://doi.org/10.5194/hess-2021-230 Preprint. Discussion started: 27 April 2021 © Author(s) 2021. CC BY 4.0 License.


Introduction
Timing and seasonality of runoff ($Q$) from a river basin are dictated by the interaction among incoming precipitation ($P$), atmospheric and vegetation water use (evapotranspiration, $ET$), and the variation in water stored in the basin ($\Delta S$): $Q = P - ET - \Delta S$ (Bales et al., 2018). While changes in precipitation will ultimately affect runoff, processes driving the precipitation-runoff relationship (Saft et al., 2016b) are complicated by the nonlinear and often delayed response of $ET$ and $\Delta S$ (Bales et al., 2018; Avanzi et al., 2020). Depending on the direction of precipitation change, evapotranspiration-precipitation feedback mechanisms may comprise vegetation expansion and/or mortality (Senf et al., 2020; Choat et al., 2018), wildfires (Bowd et al., 2019), a shift in vegetation water-use strategies (i.e., isohydric to anisohydric prevalent species), and depletion of regolith water storage and rock moisture (McDowell et al., 2008; Hahm et al., 2020; Rungee et al., 2019; Goulden and Bales, 2019; Klos

Multi-year drought definition
Multi-year droughts were identified based on the precipitation deficit. We used precipitation rather than runoff to characterize multi-year droughts because the runoff response is the object of our analysis and therefore should not be used to define the drought (Saft et al., 2016b). In particular, we followed the indications of Saft et al. (2016b) to define multi-year drought periods, but we relied on the Standardized Precipitation Index (SPI, McKee et al., 1993) rather than on precipitation anomalies. The following procedure was adopted:
1. Calculation of the SPI, by fitting annual precipitation accumulations with a variety of distributions (i.e., normal, lognormal, exponential, generalized extreme, Fisk, Weibull, Gamma) and selecting the one that best fits the data (i.e., the one with the maximum p-value obtained with the Kolmogorov-Smirnov test). SPI was calculated both on the mean annual precipitation and on precipitation smoothed with a three-year moving window. Smoothing was applied to prevent single wet years from interrupting a long and significantly dry period.
2. To reduce the blurring effect of the moving window, the exact end date of the dry period was determined through analysis of the unsmoothed SPI data from the last negative three-year anomaly. The end year was set as the last year of this three-year period unless: there was a year with a positive SPI > 0.15, in which case the end year was set to the year prior to that year; or the last two years had slightly positive SPI (but each < 0.15), in which case the end year was set to the first year of positive anomaly.
3. The first year of the drought remained the start of the first three-year negative SPI.
4. To ensure that the dry periods were sufficiently long and severe, we only used dry periods with the following characteristics: i) length over three years; ii) mean dry period anomaly < -0.8.
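The procedure above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. It assumes SciPy's default maximum-likelihood fits, uses only three of the listed distribution families, and implements only the first end-date rule (a year with SPI > 0.15). The function names (`spi`, `smoothed_spi`, `find_droughts`) and the `min_years` default are hypothetical; the text states "length over three years" but also reports a three-year minimum duration, so the threshold is left as a parameter.

```python
import numpy as np
from scipy import stats

# Subset of the candidate families named in the text (assumption: SciPy's
# default maximum-likelihood fit stands in for the authors' fitting method).
CANDIDATES = (stats.norm, stats.lognorm, stats.gamma)


def spi(precip):
    """Return SPI using the candidate with the highest KS-test p-value."""
    precip = np.asarray(precip, dtype=float)
    best_p, best_cdf = -1.0, None
    for dist in CANDIDATES:
        params = dist.fit(precip)
        p_value = stats.kstest(precip, dist.cdf, args=params).pvalue
        if p_value > best_p:
            best_p, best_cdf = p_value, dist.cdf(precip, *params)
    # Clip so the normal quantile stays finite at the tails.
    return stats.norm.ppf(np.clip(best_cdf, 1e-6, 1 - 1e-6))


def smoothed_spi(precip, window=3):
    """SPI of precipitation smoothed with a moving window (length n - window + 1)."""
    rolled = np.convolve(precip, np.ones(window) / window, mode="valid")
    return spi(rolled)


def find_droughts(spi_annual, spi_smoothed, min_years=3, max_mean=-0.8, wet=0.15):
    """Return (start, end) year indices of multi-year droughts.

    spi_smoothed[i] is assumed to describe the three years i..i+2.
    """
    droughts = []
    i, m = 0, len(spi_smoothed)
    while i < m:
        if spi_smoothed[i] >= 0:
            i += 1
            continue
        j = i
        while j + 1 < m and spi_smoothed[j + 1] < 0:
            j += 1
        # First year of the first negative window, last year of the last one.
        start, end = i, j + 2
        # End-date refinement on the unsmoothed SPI of the last window
        # (only the "SPI > 0.15" rule; the second rule is omitted here).
        for k in range(j, j + 3):
            if spi_annual[k] > wet:
                end = k - 1
                break
        long_enough = end - start + 1 >= min_years
        if long_enough and np.mean(spi_annual[start:end + 1]) < max_mean:
            droughts.append((start, end))
        i = j + 1
    return droughts
```

A typical call would be `find_droughts(spi(p), smoothed_spi(p))` for an annual precipitation series `p`; the returned indices are positions in that series, not calendar years.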
By defining drought in this way, we ended up with 210 basins out of 1043 having experienced at least one multi-year drought episode over the available period of record. Although relaxing the procedure for the multi-year drought definition would have yielded a larger sample of basins, we preferred to maintain this approach to obtain results consistent with previous studies (Saft et al., 2016b) and because doing so guaranteed that the period analysed coincided with a period of severe precipitation deficit. The above procedure resulted in a satisfactory multi-year drought definition (see Figure 2) that was validated with data found in the literature (Parry et al., 2012). Drought duration ranged from a minimum of three years to a maximum of eight years for a few basins (median duration of four years).
Shift in the precipitation-runoff relationship
We detected shifts in the precipitation-runoff relationship by fitting a multivariate regression across annual cumulative streamflow (target variable), basin-wide annual precipitation, and a categorical variable denoting drought and non-drought years (Avanzi et al., 2020; Saft et al., 2016b):
$$Q_{BC} = b_0 + b_1 I + b_2 P + \epsilon \qquad (1)$$
where $I$ is a categorical drought variable (1 for years characterized by multi-year drought and 0 otherwise), $b_0$, $b_1$, and $b_2$ are regression coefficients, $\epsilon$ is noise, and $Q_{BC}$ is annual streamflow transformed according to a Box-Cox transformation following the arguments in Avanzi et al. (2020):
$$Q_{BC} = \frac{Q^{\lambda} - 1}{\lambda} \qquad (2)$$
where $\lambda$ has been estimated from data to ensure linearity and homoscedasticity (i.e., the $\lambda$ that maximizes the log-likelihood function; Box and Cox, 1964).
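As an illustration of this regression, the sketch below transforms streamflow with a Box-Cox transformation and regresses it on an intercept, the drought indicator, and annual precipitation by ordinary least squares. It rests on several assumptions. The regression is taken to be of the form Q_BC = b0 + b1 I + b2 P + noise. `scipy.stats.boxcox` picks the exponent by maximizing the likelihood of the streamflow sample alone, which is a simplification. The two-sided t-test on b1 is one plausible way to judge whether a shift is significant, not necessarily the test used in the study. The function name `fit_shift` is hypothetical.

```python
import numpy as np
from scipy import stats


def fit_shift(q, p, drought):
    """Fit Q_BC = b0 + b1*I + b2*P by least squares on Box-Cox streamflow."""
    q = np.asarray(q, dtype=float)          # annual streamflow, must be > 0
    p = np.asarray(p, dtype=float)          # annual precipitation
    ind = np.asarray(drought, dtype=float)  # 1 in drought years, 0 otherwise

    # Box-Cox transform; lam is the maximum-likelihood exponent.
    q_bc, lam = stats.boxcox(q)

    # Design matrix: intercept, drought indicator, precipitation.
    X = np.column_stack([np.ones_like(p), ind, p])
    coef, *_ = np.linalg.lstsq(X, q_bc, rcond=None)

    # Standard OLS inference for the drought coefficient b1.
    resid = q_bc - X @ coef
    dof = len(q) - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    t_b1 = coef[1] / np.sqrt(cov[1, 1])
    p_b1 = 2.0 * stats.t.sf(abs(t_b1), dof)

    return {"lambda": lam, "b": coef, "p_value_b1": p_b1}
```

A negative `b[1]` with a small `p_value_b1` would correspond to a downward shift of the precipitation-runoff relationship during drought years.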
The relative magnitude of the shift in precipitation vs. runoff ($M_Q$) for each basin was calculated by using the approach suggested in Saft et al. (2016a):
$$M_Q = \frac{Q_{dry,PI} - Q_{dry,P}}{Q_{dry,P}} \qquad (3)$$
where $Q_{dry,PI}$ is the (predicted annual) runoff for a representative precipitation during dry periods according to the shifted precipitation-runoff relationship (Eq. 1, $I = 1$), while $Q_{dry,P}$ is the full-natural flow for the same precipitation according to the non-shifted relationship (Eq. 1, $I = 0$). As representative annual precipitation, we took the mean of the average and the minimum annual precipitation across the entire period of record.
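The shift magnitude can be illustrated with a short sketch. It assumes that the regression has the form Q_BC = b0 + b1 I + b2 P, that M_Q is the relative difference between shifted and non-shifted runoff, and that predictions are back-transformed from Box-Cox space to runoff units before taking the ratio. The back-transformation step and the function name `shift_magnitude` are assumptions made here for illustration.

```python
import numpy as np
from scipy.special import inv_boxcox


def shift_magnitude(b, lam, precip):
    """Relative shift M_Q at a representative dry-period precipitation."""
    b0, b1, b2 = b
    precip = np.asarray(precip, dtype=float)

    # Representative precipitation: mean of the average and the minimum.
    p_rep = 0.5 * (precip.mean() + precip.min())

    # Predicted runoff with (I = 1) and without (I = 0) the drought shift,
    # back-transformed from Box-Cox space.
    q_shifted = inv_boxcox(b0 + b1 + b2 * p_rep, lam)
    q_unshifted = inv_boxcox(b0 + b2 * p_rep, lam)

    return (q_shifted - q_unshifted) / q_unshifted
```

With this sign convention, a value of -0.4 means that runoff at the representative precipitation is 40% lower under the shifted relationship than under the non-shifted one.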

In this study, $I$ in Eq. 1 was estimated by using SPI calculated based on ERA5 precipitation ($I = 1$ during multi-year drought and $I = 0$ for the other years), while the annual precipitation $P$ was calculated based on the E-OBS precipitation dataset. There are three reasons for this choice: i) we wanted to maintain as much as possible the independence between the drought definition (i.e., $I$) and the annual precipitation ($P$) in Eq. 1, so as to avoid influences on the fitting of Eq. 1; ii) we wanted to have a consistent ERA5-based drought definition and evapotranspiration anomalies (both coming from the same dataset); and iii) we wanted to rely on a higher spatial resolution product (i.e., E-OBS) for relating precipitation and runoff within the basin.
(which were not taken into consideration here), it had a limited initial spatial extent and coherence on a regional basis, with

During these periods of severe precipitation deficit, 69 out of the 210 basins with at least one multi-year drought (i.e., 33%) showed a statistically significant shift in the water balance (i.e., a negative shift in the precipitation-runoff relationship, see Figure 3). This means that these 69 basins experienced significantly less runoff than would be expected based solely on the historical functional dependency of runoff on precipitation. This so-called negative shift is in contrast with experiencing no shift or a positive shift, where the runoff deficit during droughts would be equal to or smaller than that
year drought definition and in the fitting of the precipitation-runoff relationship. However, given the relatively high number of basins used here, the fact that only 2 basins show a statistically significant positive shift indicates a well-controlled experiment and high-quality data.

Evapotranspiration enhancement, catchment aridity, and water budget-deficit exacerbation
The distribution of basins with a statistically significant shift shows no obvious pattern of variability with the aridity index (see Figure 3c). Note that we assume the aridity index, calculated as the ratio between precipitation and potential evapotranspiration,
The magnitude of runoff-deficit exacerbation during droughts is strongly related to mean annual runoff, being larger for drier basins (Figure 4). This outcome qualitatively agrees with earlier findings that the pre-drought aridity index is an important predictor of shifts in the precipitation-runoff relationship in Australia (Saft et al., 2016b). Runoff exacerbation occurs in both rainfall- and cryosphere-dominated basins, as defined by the month of maximum daily discharge (see again Figure 4). Exacerbation also occurs in both energy- and water-limited regimes, as delimited using a standard Budyko framework (Budyko and Miller, 1974; Maurer et al., 2021; Figure 4b). This demonstrates that catchments may experience a shift in the precipitation-runoff relationship, and thus an exacerbation of runoff deficit during droughts, regardless of the predominant local climate (i.e., as defined by their long-term aridity index). Indeed, we found a statistically significant shift for 25% of the basins within the water-limited domain and for 35% of the basins in the energy-limited one (Figure 4b), including snow-dominated basins characterized by an annual runoff peak during late spring and summer. Nonetheless, drier catchments experience a larger runoff reduction during multi-year droughts than wetter catchments: shift magnitude asymptotically tends to -20% for wet catchments, while drier basins reach shift magnitudes as large as -80% (Figure 4).

Given the annual water balance ($Q = P - ET - \Delta S$), we explain this relationship between shift magnitude and aridity with the potentially enhanced contribution of evapotranspiration to the annual water budget, particularly for water-limited regimes during droughts. In basins located in water-limited regimes, atmospheric demand for moisture is generally well above the available water storage needed to support evapotranspiration, so that the latter will have a significant impact on already low runoff, especially at the beginning of a multi-year drought when water storage is comparatively large. In energy-limited environments, instead, evapotranspiration is mainly controlled by the available energy and may play a minor role in the annual allocation of incoming precipitation (Seneviratne et al., 2010).
The distribution of actual-evapotranspiration anomalies does show enhanced evapotranspiration during multi-year droughts compared to the remainder of the years, for catchments located in both energy- and water-limited regimes (Figure 5).
Catchments located in a water-limited regime show a larger increase compared to those located in the energy-limited one (33% vs 12%). A two-sample Kolmogorov-Smirnov test carried out between the distributions of evapotranspiration anomalies during droughts vs. during non-drought years confirms that the anomaly during droughts is statistically different (p<0.01) from that during non-drought years. This anomaly is generally larger for basins with the largest shifts (in absolute values, see Figure S2 in the supplementary material). Note that in Figure S2 we divided basins between those located above and those located below 50° N, because we did not observe basins with a positive anomaly in evapotranspiration below 30% in the water-limited domain (we assumed that northern basins are mainly energy limited).
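The comparison between drought and non-drought anomalies can be sketched as follows. This is a minimal example that assumes anomalies are expressed as percent deviations from the long-term mean of annual evapotranspiration. The text does not state how the anomalies were normalized, so this definition and the function name `compare_et_anomalies` are assumptions.

```python
import numpy as np
from scipy import stats


def compare_et_anomalies(et, drought):
    """Two-sample KS test on ET anomalies in drought vs. non-drought years."""
    et = np.asarray(et, dtype=float)
    mask = np.asarray(drought, dtype=bool)

    # Percent anomaly relative to the long-term mean (assumed definition).
    anomaly = 100.0 * (et - et.mean()) / et.mean()

    result = stats.ks_2samp(anomaly[mask], anomaly[~mask])
    return anomaly[mask].mean(), anomaly[~mask].mean(), result.pvalue
```

The function returns the mean anomaly in drought years, the mean anomaly in the remaining years, and the p-value of the test, which can be compared with a chosen significance level such as 0.01.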
This regime of enhanced evapotranspiration during droughts was previously suggested by Teuling et al. (2013) and points to generally warmer conditions during droughts leading to additional demand for moisture, as also suggested by Mastrotheodoros
(not shown) was also found by calculating evapotranspiration as $ET = P - Q$ and thus neglecting the contribution of the change in storage, as in Teuling et al. (2013).
The distribution of evapotranspiration anomalies in Figure 5 shows a larger spread during droughts than during non-drought years. We attribute this increased variability in evapotranspiration during droughts to the regulation operated by energy (that is, vapor pressure deficit) and available water (that is, storage) during these water-scarce periods. Figures 6a and 6b show two such examples, which also reiterate how a positive actual evapotranspiration anomaly is intimately coupled with runoff exacerbation (precipitation-runoff relationships for these two basins are shown in Figure S1). Figure 6a shows a multi-year drought period in the northern UK (1989-1994); this drought was characterized by both negative precipitation anomalies (-95% on average) and a positive anomaly in potential evapotranspiration (+79% on average). The result of this dry and warm period was a positive actual evapotranspiration anomaly (+18%) and a markedly negative runoff anomaly (-106%).

This situation significantly differs from 1996, a single dry year with i) much less precipitation than observed during many of the multi-year-drought years (e.g., 1990, 537 mm/y vs 368 mm/y) and, importantly, ii) a substantially lower potential evapotranspiration anomaly (-127%) denoting a much colder year with respect to 1989-1994. This cold-dry 1996 resulted in a negative actual evapotranspiration anomaly (-228%), which translated into a much smaller runoff deficit than during the multi-year drought (-79%, as opposed to -118% in 1990). This demonstrates that in such energy-limited environments, the emergence of an enhanced-evapotranspiration regime during droughts is regulated by the available energy: if this is not sufficient, then actual evapotranspiration will not increase.
Similar conclusions can be drawn for the basins located in central Spain (Figure 6b), with some notable differences in this water-limited region. The multi-year drought period 1991-1995 in this area was characterized by a close-to-zero anomaly in potential evapotranspiration (-2% on average) and below-average precipitation (-98%). This dry-mild period significantly differs from another single dry and warm year, 2012 (+255% of potential evapotranspiration and -183% precipitation).
Despite the much warmer and drier 2012, we observed a relatively larger runoff deficit during the multi-year drought period (-99% on average, 25.9 mm/year) than in 2012 (-44%, 46 mm/year). Unlike in the basin located in the northern UK (i.e., in an energy-limited region), the emergence of an enhanced-evapotranspiration regime in a water-limited region is much more complex and is regulated by both energy and available water storage (which can even result from carryover from previous years). Here, demand for moisture may also trigger plant-stomata closure, thus reducing transpiration. Therefore, in water-limited regimes the year-to-year comparison of runoff deficit and evapotranspiration anomaly is not straightforward and can be further complicated by the precipitation variability typical of Mediterranean regions (Seager et al., 2019). In any case, if storage is not sufficient, and/or other feedback mechanisms like stomata closure occur, then actual evapotranspiration will not increase and runoff may be substantially higher than in relatively wetter periods.

As basin storage (i.e., S) plays an important but frequently neglected role in modulating runoff deficit by sustaining evapotranspiration during multi-year droughts (Van Loon and Laaha, 2015), we compared the average rooting depth and the total available water content (TAWC) distribution for basins characterized by significant versus non-significant shifts (see sample Kolmogorov-Smirnov test with p-value<0.05). Basins with a statistically significant shift show both a slightly deeper rooting depth and a larger TAWC. These findings tally with the enhanced ET anomaly for shifting basins in Figure 5, because a deeper rooting depth may provide access to deeper storage during water stress and so sustain evapotranspiration even during dry periods. Nonetheless, these findings are only qualitative, given that the distributions in Figure 7 overlap.

In this study, we showed that exacerbation of runoff deficit compared to precipitation during droughts is a common feature of water basins across contrasting evapotranspiration regimes and aridity indices. This leads us to accept our initial hypothesis.
Runoff exacerbation is related to an increase in evapotranspiration occurring under two defined and concurrent preconditions: i) water storage can support ET during the drought period, and ii) there is a sufficient vapor-pressure deficit (mainly driven by the temperature increase) to generate evapotranspiration (this is always true over water-limited regimes, while it may not happen over energy-limited regimes, hence larger shifts in drier regions). When both conditions are met, the catchment water balance shifts toward a new regime in which ET weighs proportionally more than during wet periods. The macroscopic, bulk effect of this regime change is the shift in the precipitation-runoff relationship as observed earlier (Avanzi et al., 2020). This shift is more pronounced in drier catchments, because evapotranspiration tends to be proportionally higher as long as enough water is available to sustain atmospheric and vegetation demand for moisture. It is noteworthy that these drier catchments are areas of the world where water planners and ecosystem services are already challenged by limited water resources.
These results were obtained from an empirical, strictly data-based analysis, but are in line with earlier findings (Saft et al., 2016a, 2015; Avanzi et al., 2020), as well as with those inferred from blending data with mechanistic modelling across the European Alps (Mastrotheodoros et al., 2020). The evidence that basins may develop a new hydrological regime in response to evapotranspiration enhancement during droughts across aridity indices requires an understanding of catchment behaviour that goes beyond the assumption that runoff fluxes stationarily fluctuate as a function of precipitation variability, including the need to more comprehensively acknowledge the role of evapotranspiration in long-term streamflow patterns (Mastrotheodoros et al., 2020).
Achieving such a more holistic understanding of the water budget during droughts is relevant from both a scientific and an operational perspective. Conceptual rainfall-runoff models are still widely used in operational practice, as well as for many scientific purposes like climate-change studies, because they are parsimonious and computationally efficient, meaning they are easy to run in real time and provide a timely response (Pagano et al., 2014). Yet, these predictive tools may be inadequate during periods of runoff exacerbation like those we found here across Europe. The first reason is that calibration of these models remains inevitable (Beven and Freer, 2001), due to their typically oversimplified process representations (Gupta et al., 2012), the presence of data errors (Montanari and Di Baldassarre, 2013), and recurring epistemic uncertainty related to heterogeneity of hydrologic processes across the landscape. However, the available observation period is often very limited in both time and space due to the significant decline of global river discharge monitoring over the past few decades (Crochemore et al., 2020), especially for medium-to-small basins. Thus, these "historically calibrated" parameters implicitly contain an assumption of
et al., 2012; Merz et al., 2011; Vaze et al., 2010; Chiew et al., 2014; Fowler et al., 2020), as can happen during multi-year droughts. Shifts in the precipitation-runoff relationship and the associated exacerbation in runoff deficit may thus lead to unreliable runoff projections and a low efficacy of water planning and management measures if calibration did not adequately include shift-inducing drought periods.
The second reason goes beyond calibration uncertainty. Indeed, previous work found that standard hydrologic models are prone to drops in modeling accuracy during shifting droughts even if some of these droughts were included in the calibration window (Avanzi et al., 2020). Thus, standard hydrologic models may be exposed to inaccuracy during periods of exacerbation of runoff deficit regardless of the calibration protocol used. These drops in accuracy may thus be more related to conceptual rather than to parametric uncertainty; in other words, commonly used hydrologic models may lack the representation of specific