Diagnosis toward predicting mean annual runoff in ungauged basins

Prediction of mean annual runoff is of great interest but still poses a challenge in ungauged basins. The present work diagnoses how the prediction of mean annual runoff is affected by uncertainty in the estimated distribution of soil water storage capacity. Based on a distribution function, a water balance model for estimating mean annual runoff is developed, in which the effects of climate variability and the distribution of soil water storage capacity are explicitly represented. As such, the two parameters in the model have explicit physical meanings, and relationships between the parameters and controlling factors on mean annual runoff are established. The parameters estimated from existing data of watershed characteristics are applied to 35 watersheds. The results showed that the model could capture 88.2 % of the actual mean annual runoff on average across the study watersheds, indicating that the proposed water balance model is promising for estimating mean annual runoff in ungauged watersheds. The underestimation of mean annual runoff is mainly caused by the underestimation of the area percentage of low soil water storage capacity due to neglecting the effect of land surface and bedrock topography. Higher spatial variability of soil water storage capacity estimated through the height above the nearest drainage (HAND) and topographic wetness index (TWI) indicated that topography plays a crucial role in determining the actual soil water storage capacity. The performance of mean annual runoff prediction in ungauged basins can be improved by employing better estimation of soil water storage capacity including the effects of soil, topography, and bedrock. Overall, the newly developed process-based model leads to a better diagnosis of the data requirement for predicting mean annual runoff in ungauged basins.


Introduction
Hydrologists have a long-standing interest in mean annual water balance modeling and prediction. The factors controlling mean annual runoff have been studied in the literature. Mean climate has been identified as the first-order control on mean annual runoff and evaporation, and it has been quantified by the climate aridity index, which is defined as the ratio between the mean annual potential evapotranspiration ($E_p$) and precipitation ($P$) (Turc, 1954; Pike, 1964). Other controlling factors include the temporal variability of climate (Troch et al., 2002; Fu and Wang, 2019), vegetation (Zhang et al., 2001; Donohue et al., 2007; Gentine et al., 2012; Li et al., 2013), soil (Atkinson et al., 2002; Yokoo et al., 2008; Li et al., 2014), and topography (Woods, 2003; Abatzoglou and Ficklin, 2017). Mean annual runoff or evaporation has been modeled as a function of the climate aridity index, and the equation is usually called the Budyko equation (Budyko, 1958). The effects of other factors are represented by including a parameter in the Budyko equation (Fu, 1981; Yang et al., 2008; Wang and Tang, 2014). Among these factors, climate, including its mean and temporal variability, and soil water storage capacity, including its mean and spatial variability, are dominant catchment characteristics controlling mean annual runoff, especially for those catchments dominated by saturation excess runoff generation (Milly, 1994).
Intra- and inter-annual climate variability introduces non-steady-state conditions to finer timescale water balances, and the non-steady-state effect could propagate to the mean annual runoff. The effects of seasonal variations of precipitation and potential evaporation on long-term runoff have been examined in several studies. Milly (1994) showed, through a stochastic soil moisture model, that seasonality tends to increase mean annual runoff. The seasonality effects have been demonstrated through a top-down model by Hickel and Zhang (2006) and a classification study by Berghuijs et al. (2014). Mean annual water balance is also affected by climate variability at the inter-annual and daily timescales. Li (2014) showed that the inter-annual variability of precipitation and potential evaporation could increase the mean annual runoff by up to 10 % based on a stochastic soil moisture model. Shao et al. (2012) found that daily precipitation with a larger variation potentially increases mean annual runoff, especially in catchments where infiltration excess runoff is prevalent. Yao et al. (2020) quantified the relative contribution of daily, monthly, and inter-annual climate variabilities to mean annual runoff and showed that the contribution decreases, on average, from monthly to inter-annual scales and then daily scale.
Published by Copernicus Publications on behalf of the European Geosciences Union.
Soil water storage capacity is the maximum storage capacity from land surface to bedrock, which exerts a powerful control on mean annual runoff (Konapala and Mishra, 2016). A smaller soil water storage capacity creates favorable conditions for runoff generation, because the precipitation in excess of the available storage capacity would be lost as runoff directly, while catchments with a larger soil water storage capacity could hold more precipitation for evaporation (Sankarasubramanian and Vogel, 2002;Porporato et al., 2004;Chen et al., 2013). Soil water storage capacity is closely related to vegetation since the root structure of vegetation could affect soil water storage capacity significantly. Research has been conducted to reveal the role of soil water storage capacity through the linkage of vegetation and model parameter (Yang et al., 2008;Chen and Wang, 2015). Gerrits (2009) developed equations for transpiration and interception by considering the root zone and interception storage capacity as two of the most important catchment characteristics affecting evapotranspiration. In addition to the magnitude of the average soil water storage capacity, the spatial variability of soil water storage capacity within a catchment also influences precipitation partitioning at the event scale and further influences the cumulative runoff at the mean annual scale (Moore, 1985;Jothityangkoon et al., 2001;Gao et al., 2016). It has also been suggested that the spatial variability of soil water storage capacity could suppress the actual evaporation, because the maximum evaporation in areas with soil water storage capacity less than E p will be smaller than E p ; therefore, the average evaporation over the entire catchment is smaller than E p even though the average storage is greater than E p , resulting in more runoff generation compared to the situation when the soil water storage capacity is spatially uniform (Yao et al., 2020). 
Therefore, climate variability and soil water storage capacity need to be explicitly incorporated into the model for predicting mean annual runoff. The effect of climate variability could be taken into account by driving the model with daily precipitation and potential evaporation, which are usually available. The spatial distribution of soil water storage capacity could be modeled by a distribution function, and it is usually modeled by the generalized Pareto distribution (Moore, 1985; Zhao, 1992). The distribution function includes two parameters, i.e., the shape parameter and the maximum storage capacity over the watershed. In ungauged basins, soil water storage capacity and its spatial variability need to be estimated directly from available data. Gao et al. (2014) adopted the mass curve technique, which has been used for designing the storage capacity of reservoirs, to estimate the average water storage capacity of the root zone using precipitation and potential evaporation data. The shape parameter of the distribution function has been estimated from soil data (Huang et al., 2003). However, the parameters estimated with these methods introduce considerable uncertainty in runoff estimation, and the two parameters of the generalized Pareto distribution are usually estimated by model calibration using observed streamflow data (Wood et al., 1992; Kibler, 2018, 2019).
The objective of this paper is to develop a nonparametric mean annual water balance model for predicting mean annual runoff in ungauged basins, which has not yet been fully understood (Blöschl et al., 2013). The mean annual water balance model is forced by daily precipitation and potential evaporation; therefore, the climate variability at different timescales is represented explicitly in the climate input. The runoff generation is quantified by a distribution function for describing the spatial distribution of soil water storage capacity (Wang, 2018). The mean and the shape parameter of the distribution function need to be estimated from the available data in ungauged basins. Therefore, the model serves as a diagnosis tool for evaluating the data requirement for estimating soil water storage capacity. The mean soil water storage capacity is estimated from curve number and climate, because soil water storage capacity consists of the antecedent soil water storage and the potential maximum soil moisture retention which can be calculated through the Soil Conservation Service (SCS) curve number method. The estimation of the shape parameter is diagnosed in terms of the data requirement including soil, land surface topography, and bedrock topography. Section 2 introduces the new mean annual water balance model and the study watersheds. Results and discussion are presented in Sect. 3, followed by Sect. 4 for conclusions.

Mean annual runoff model
Climate variability is defined as the temporal variations of precipitation ($P$) and potential evapotranspiration ($E_p$), including their intra-monthly, intra-annual, and inter-annual variations. For example, the deviations of daily $P$ or $E_p$ from their monthly mean values are defined as the intra-monthly variations (Yao et al., 2020). As discussed in the introduction, the mean annual runoff model takes daily precipitation and potential evaporation as inputs; therefore, climate variability is explicitly included in the model. The developed model calculates daily soil wetting (infiltration) and evaporation by tracking the soil water storage. Mean annual runoff is estimated by aggregating the daily values. The daily soil wetting is calculated using the concept of saturation excess runoff generation by modeling the spatial variability of soil moisture and soil water storage capacity. To facilitate the parameter estimation of storage capacity distribution in ungauged basins, the following distribution function is used for modeling the spatial distribution of storage capacity (Wang, 2018):

$$F(C) = 1 - \frac{1}{a} + \frac{C + (1-a)S_b}{a\sqrt{(C+S_b)^2 - 2aS_bC}}, \tag{1}$$

where $F(C)$ is the cumulative distribution function (CDF), representing the fraction of the watershed area for which the soil water storage capacity is equal to or less than $C$; $a$ is the shape parameter of the distribution and varies between 0 and 2; and $S_b$ is the average soil water storage capacity over the watershed (i.e., the mean of the distribution). As shown in Wang (2018), this distribution function leads to the SCS curve number (SCS-CN) method when the initial storage is set to zero. Therefore, there is a linkage between $S_b$ and the "potential maximum retention after runoff begins" in the SCS-CN method, denoted as $S_{CN}$. Daily soil wetting and runoff generation are computed as a function of daily precipitation ($P$), initial storage ($S_0$), $a$, and $S_b$. As shown in Wang (2018), the average soil wetting ($W$) is computed by

$$W = \frac{P + S_b\sqrt{(m+1)^2 - 2am} - \sqrt{\left[P + (m+1)S_b\right]^2 - 2amS_b^2 - 2aS_bP}}{a}, \qquad m = \frac{S_0\,(2S_b - aS_0)}{2S_b\,(S_b - S_0)}. \tag{2}$$
Setting $S_0 = 0$ and dividing both sides of Eq. (2) by $P$, a Budyko-type equation representing $W/P$ as a function of $S_b/P$ is obtained (Wang and Tang, 2014), which has been used to model long-term soil wetting (Tang and Wang, 2017). Therefore, Eq. (2) can be interpreted as a non-steady-state Budyko equation which accounts for the effect of water storage. Daily evaporation ($E_d$) is computed as (Yao et al., 2020)

$$E_d = \frac{W + S_0}{S_b}\cdot\frac{E_p + S_b - \sqrt{(E_p + S_b)^2 - 2aS_bE_p}}{a}. \tag{3}$$

The first component on the right-hand side of Eq. (3), $(W + S_0)/S_b$, is the percentage of storage, and the second component is the evaporation for the condition when the entire watershed is saturated, i.e., the spatial distribution of soil water storage is the same as that of storage capacity (Yao et al., 2020). Dividing both sides by $W + S_0$, Eq. (3) represents $E_d/(W + S_0)$ as a function of $E_p/S_b$, and the function is the same as the Budyko-type equation derived by Wang and Tang (2014). Mean annual evaporation ($E$) is computed by aggregating the daily evaporation, and mean annual runoff ($Q$) is computed as the difference of mean annual precipitation and evaporation:

$$E = \frac{1}{Y}\sum_{y=1}^{Y}\sum_{d=1}^{D_y} E_d(y, d), \tag{4}$$

$$Q = P - E, \tag{5}$$

where $Y$ is the number of years, and $D_y$ is the number of days in year $y$; $y$ and $d$ represent year $y$ and day $d$, respectively; and $P$ in Eq. (5) denotes the mean annual precipitation. Note that the mean annual runoff includes surface runoff and baseflow, and both are impacted by climate variability (e.g., intra-annual variability) (Berghuijs et al., 2014; Fan et al., 2007). This mean annual water balance model applies two non-steady-state Budyko-type equations at the daily scale: one for daily soil wetting and the other for daily evaporation. Runoff routing is not necessary since the model is intended for long-term water balance analysis. As a result, the mean annual water balance model includes two parameters, i.e., the shape parameter ($a$) and the average soil water storage capacity ($S_b$).
For studies where a one-parameter Budyko equation is applied to the long-term scale directly, the effects of climate variability (seasonality, inter-annual variability, and daily storminess) on mean annual water balance are attributed to the single parameter of the Budyko equation (e.g., Fu, 1981; Zhang et al., 2001). This makes the single parameter challenging to estimate in ungauged basins. In contrast, the mean annual water balance model used in this paper takes daily precipitation and potential evaporation as inputs, so the effects of climate variability are taken into account explicitly. To achieve the goal of predicting mean annual runoff in ungauged basins, $a$ and $S_b$ need to be estimated from available data.
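The daily bookkeeping described above can be sketched as follows. This is a minimal illustration rather than the authors' code: it assumes the closed-form wetting and evaporation expressions implied by the distribution function of Wang (2018) and the description of Yao et al. (2020), it assumes that storage is updated each day as storage plus wetting minus evaporation, and the function names (`cumulative_wetting`, `soil_wetting`, `daily_evaporation`, `mean_annual_runoff`) are hypothetical. It requires 0 < a < 2 and an initial storage smaller than the average capacity.

```python
import math


def cumulative_wetting(x, a, s_b):
    """G(x): integral of 1 - F(C) from 0 to x for the assumed distribution function."""
    return (x + s_b - math.sqrt((x + s_b) ** 2 - 2.0 * a * s_b * x)) / a


def soil_wetting(p, s0, a, s_b):
    """Average daily soil wetting W for precipitation p and initial storage s0 (mm)."""
    # m * s_b is the capacity threshold below which the soil is already saturated.
    m = s0 * (2.0 * s_b - a * s0) / (2.0 * s_b * (s_b - s0))
    c_star = m * s_b
    return cumulative_wetting(c_star + p, a, s_b) - cumulative_wetting(c_star, a, s_b)


def daily_evaporation(w, s0, e_p, a, s_b):
    """Daily evaporation: storage fraction times evaporation of a saturated watershed."""
    return (w + s0) / s_b * cumulative_wetting(e_p, a, s_b)


def mean_annual_runoff(precip, pet, a, s_b, n_years, s_init=0.0):
    """Mean annual runoff Q = P - E from daily precipitation and potential evaporation."""
    storage = s_init
    total_p = 0.0
    total_e = 0.0
    for p, e_p in zip(precip, pet):
        w = soil_wetting(p, storage, a, s_b)
        e_d = daily_evaporation(w, storage, e_p, a, s_b)
        # Assumed bookkeeping: storage gains wetting and loses evaporation each day.
        storage = storage + w - e_d
        total_p += p
        total_e += e_d
    return (total_p - total_e) / n_years
```

With zero initial storage, `soil_wetting` reduces to the Budyko-type form in which W/P depends only on the ratio of average capacity to precipitation, which is a convenient check on any implementation.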

Average soil water storage capacity
Under a given soil moisture condition, soil water storage capacity is the sum of actual water storage and the remaining (or effective) storage capacity. The effective storage capacity corresponding to the normal antecedent moisture condition defined in the SCS-CN method, $S_{CN}$ (mm), is computed as a function of CN (SCS, 1972; Bartlett et al., 2016):

$$S_{CN} = \frac{25400}{\mathrm{CN}} - 254, \tag{6}$$

where CN is the composite curve number based on land use and land cover (LULC) and hydrologic soil group (HSG) for each watershed. The LULC data can be obtained from the National Land Cover Database (Homer et al., 2015), and the HSG data can be extracted from the Gridded Soil Survey Geographic (gSSURGO) database with a spatial resolution of 10 m (USDA, 2014). In HSG, soils are assigned to one of the four groups (A, B, C, and D) and three dual classes (A/D, B/D, and C/D) according to the rate of infiltration when the soils are not protected by vegetation and receive precipitation from long-duration storms. For the cells characterized by dual classes, the CN value is calculated as the average of the two CN values corresponding to the two soil groups. The average soil water storage capacity ($S_b$) is the sum of the actual storage under the normal condition ($S$) and its corresponding effective storage capacity:

$$S_b = S + S_{CN}. \tag{7}$$

The physical meaning of $S_b$ is the mean value of the soil water storage capacity over a watershed, which is defined in this study as the maximum storage from land surface to bedrock rather than the storage capacity of shallow soils.
Since the "normal antecedent moisture" can be interpreted as the steady-state soil moisture condition, $S$ is the long-term average storage over the watershed. The values of $S$ for 59 MOPEX (MOdel Parameter Estimation Experiment) watersheds are estimated based on the long-term water balance model in Yao et al. (2020), and these watersheds do not include any watersheds studied in this paper. The long-term water balance model used in their study has a similar model structure, but the two parameters, i.e., the mean value of the soil water storage capacity and its shape parameter in the distribution function, were obtained by model calibration. The ratio between $S$ and $S_b$ is defined as the long-term storage ratio $S/S_b$. It is found that the values of $S/S_b$ for all the watersheds were larger than 0.5. As shown in Fig. 1, $S/S_b$ has a linear relationship with the climate aridity index $E_p/P$ (Eq. 8). Substituting Eqs. (6) and (7) into Eq. (8), one can estimate the average soil water storage capacity as a function of curve number and climate aridity index (Eq. 9). Because $S_b = S + S_{CN}$, this estimate is equivalent to $S_b = S_{CN}/(1 - S/S_b)$, with $S_{CN}$ obtained from the curve number and $S/S_b$ obtained from the aridity index.
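The estimation of the average storage capacity from curve number and storage ratio can be sketched as below. This is an illustrative sketch only: the function names are hypothetical, the metric SCS-CN relation for the potential maximum retention is assumed, and the storage ratio is passed in by the caller because the coefficients of the linear relationship with the aridity index are not reproduced here.

```python
def potential_max_retention(cn):
    """S_CN in mm from a curve number, using the metric SCS-CN relation."""
    if not 0.0 < cn <= 100.0:
        raise ValueError("curve number must lie in (0, 100]")
    return 25400.0 / cn - 254.0


def dual_class_cn(cn_first_group, cn_second_group):
    """CN for a dual hydrologic soil group cell: average of the two group values."""
    return 0.5 * (cn_first_group + cn_second_group)


def average_storage_capacity(s_cn, storage_ratio):
    """S_b from S_b = S + S_CN with S = storage_ratio * S_b."""
    if not 0.0 <= storage_ratio < 1.0:
        raise ValueError("storage ratio S/S_b must lie in [0, 1)")
    return s_cn / (1.0 - storage_ratio)


# Illustration with an assumed ratio of 0.775 and S_CN = 162 mm.
s_b_example = average_storage_capacity(162.0, 0.775)  # about 720 mm
```

The ratio of 0.775 in the example is back-calculated from the values reported for the Fox River watershed (an $S_{CN}$ of 162 mm and an estimated average capacity of 721 mm); it is not taken from the regression itself.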

Shape parameter
The spatial variability of storage capacity is determined by the spatial distribution of point-scale pore space across the watershed. The volume of soil pores at the point scale can be determined by soil thickness and porosity in different soil layers. The porosity ($\theta_s$) for each layer is calculated from the soil bulk density:

$$\theta_s(j) = 1 - \frac{\rho_b(j)}{\rho}, \tag{10}$$

where $j$ denotes the $j$th soil layer; $\rho_b(j)$ is the bulk density of the $j$th soil layer; and $\rho$ is the particle density (2.65 g cm$^{-3}$). After obtaining the porosity, the point-scale storage capacity can be calculated with the following equation (Huang et al., 2003):

$$C = \sum_{j=1}^{n} z_j\,\theta_s(j), \tag{11}$$

where $C$ is the point-scale soil storage capacity; $n$ is the number of soil layers; and $z_j$ and $\theta_s(j)$ are the thickness and porosity of the $j$th soil layer, respectively. In the gSSURGO database, the soil thickness and bulk density for each layer are available for shallow soil from the land surface to ∼ 2 m soil depth. The total soil thickness at each point is the elevation difference from land surface to fresh bedrock. However, the bedrock topography is difficult to obtain, especially at the watershed scale. Alternatively, it is assumed that the spatial distribution of the actual soil water storage capacity is the same as the spatial distribution of water storage capacity computed from the gSSURGO database. In order to compare the shape parameter evaluated from the soil data with its counterparts evaluated from other methods, the point-scale storage capacity is normalized with the average storage capacity over the watershed, and Eq. (1) is rewritten as

$$F(x) = 1 - \frac{1}{a} + \frac{x + 1 - a}{a\sqrt{(x+1)^2 - 2ax}}, \tag{12}$$

where $x$ is the normalized storage capacity $C/S_b$ at the point scale, and $a$ is the shape parameter describing the spatial variability of soil water storage capacity. The shape parameter $a$ is then estimated by fitting the point-scale storage capacity data obtained from Eq. (11).
A nonlinear programming solver using the derivative-free method (i.e., MATLAB function "fminsearch") was used to calculate the optimal shape parameter by minimizing the root mean square error (RMSE). To demonstrate the sensitivity of mean annual runoff to the value of shape parameter, Fig. 2 presents mean annual runoff versus shape parameter based on the mean annual water balance (Yao et al., 2020). It can be found that mean annual runoff decreases significantly as the shape parameter increases, especially when shape parameter approaches its upper limit (i.e., 2). The negative relationship between the mean annual runoff and the shape parameter can be attributed to the fact that the larger shape parameter indicates that less watershed area has small values of point-scale storage capacity (Wang, 2018), and more precipitation could be retained underground for evaporation.

Study watersheds
The estimations of mean annual runoff in 35 watersheds are diagnosed in this paper. The number 35 was chosen by considering data availability (hydrologic soil group, land cover and land use, and DEM) and minimal snow effects and human activities (Wang and Hejazi, 2011), as well as to keep the effort of gSSURGO data processing at a reasonable level while still having a sufficient number of watersheds. The drainage area of the watersheds varies from 2044 to 9889 km$^2$. Table 1 shows the USGS (United States Geological Survey) gauge number and climate aridity index of these watersheds. Saturation excess is the dominant runoff generation mechanism in these watersheds. Daily precipitation and streamflow data during 1948-2003 are extracted from the MOPEX dataset (Duan et al., 2006), and the daily potential evaporation during this period is calculated based on the Hargreaves method (Hargreaves and Samani, 1985) by using the daily maximum, minimum, and mean temperature. The average soil water storage capacity and the shape parameter for these watersheds are estimated from the available data of climate, LULC, soil, and topography, and the predictions of mean annual runoff are diagnosed.
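For readers who want to reproduce the potential evaporation input, the sketch below shows one common form of the Hargreaves-Samani equation. It is an illustration under stated assumptions, not the processing used for this study: extraterrestrial radiation is computed with the standard FAO-56 expressions from latitude and day of year, the conversion factor 0.408 turns MJ m-2 day-1 into mm/day, negative results are clipped to zero as an added safeguard, and the function names are hypothetical.

```python
import math


def extraterrestrial_radiation(lat_deg, day_of_year):
    """Daily extraterrestrial radiation in MJ m-2 day-1 (FAO-56 expressions)."""
    phi = math.radians(lat_deg)
    d_r = 1.0 + 0.033 * math.cos(2.0 * math.pi * day_of_year / 365.0)
    delta = 0.409 * math.sin(2.0 * math.pi * day_of_year / 365.0 - 1.39)
    # Clamp so that polar day and polar night do not break acos.
    cos_ws = max(-1.0, min(1.0, -math.tan(phi) * math.tan(delta)))
    w_s = math.acos(cos_ws)
    return (24.0 * 60.0 / math.pi) * 0.0820 * d_r * (
        w_s * math.sin(phi) * math.sin(delta)
        + math.cos(phi) * math.cos(delta) * math.sin(w_s)
    )


def hargreaves_pet(t_max, t_min, t_mean, ra_mj):
    """Daily potential evaporation (mm/day) from temperatures in degrees Celsius."""
    ra_mm = 0.408 * ra_mj  # radiation expressed as equivalent evaporation
    temp_range = max(t_max - t_min, 0.0)
    return max(0.0, 0.0023 * ra_mm * (t_mean + 17.8) * math.sqrt(temp_range))
```

For example, `hargreaves_pet(30.0, 20.0, 25.0, extraterrestrial_radiation(40.0, 180))` gives a midsummer value for a site at 40° N.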

Estimated average soil water storage capacity
The potential maximum retention ($S_{CN}$) is calculated based on the average CN in each watershed (Table 1). Since the Spoon River watershed has a higher percentage of agricultural land and lower soil permeability, its average CN is higher than that for the Fox River watershed. Correspondingly, the calculated $S_{CN}$ in the Fox River watershed (162 mm) is higher than that in the Spoon River watershed (71 mm). The values of $S_{CN}$ over the study watersheds vary from 56 mm (Auglaize River watershed) to 182 mm (Chattahoochee River watershed) as shown in Table 1.
The average soil water storage capacity is estimated based on the computed $S_{CN}$ and the climate aridity index shown in Eq. (8). For example, the climate aridity index in the Fox River watershed is 1.12, which is the same as that in the Spoon River watershed. The estimated $S_b$ is 721 mm in the Fox River watershed and 314 mm in the Spoon River watershed. As shown in Table 1, the estimated $S_b$ varies from 177 mm (Chikaskia River watershed) to 1559 mm (Chattahoochee River watershed) over the study watersheds. Figure 4a shows the spatial distribution of the estimated $S_b$. Watersheds with higher $S_b$ are mostly distributed in the eastern US, where the aridity index is lower than that in the other watersheds.

Estimated shape parameter
The shape parameter ($a$) for the distribution of soil water storage capacity is estimated based on the soil data in the gSSURGO database. For example, the black circles in Fig. 5 show the normalized storage capacity for the Fox River watershed (Fig. 5a) and the Spoon River watershed (Fig. 5b) based on the soil data in the gSSURGO database. As shown in Fig. 5, the normalized CDF for both watersheds shows an S shape. The shape parameter estimated by fitting to the soil data is 1.996 for the Fox River watershed (RMSE = 0.58) and 1.990 for the Spoon River watershed (RMSE = 1.27). A higher value of the shape parameter indicates less spatial variability; therefore, the spatial variability in the Spoon River watershed is higher than that in the Fox River watershed. The mean value of RMSE for the 35 study watersheds is 0.06. Figure 4b shows the estimated shape parameters for the study watersheds, which vary from 1.830 to 1.998.

Diagnosing mean annual runoff prediction
The estimated values of $S_b$ and $a$ based on climate, LULC, and soil data are applied to the mean annual water balance model. The comparison of simulated and observed mean annual runoff for the study watersheds is shown in Fig. 6a. The RMSE for estimated mean annual runoff is 80 mm/yr. The water balance model captures 88.2 % of the mean annual runoff across the 35 study watersheds; therefore, the methods for estimating $S_b$ and $a$ based on the available data are promising for predicting mean annual runoff in ungauged basins. The water balance model with the estimated values of $S_b$ and $a$ underestimates the mean annual runoff in some watersheds, and the relative underestimation error is 11.8 % on average among all the study watersheds. The underestimation of mean annual runoff could be due to the biased estimation of the shape parameter. As described in Sect. 3, the spatial variability of soil water storage capacity is assumed to be equal to the spatial variability of the pore space in the shallow soil. The pore space at the point scale is calculated through the porosity and soil thickness. The thickness of the shallow soil in the gSSURGO database is quite uniformly distributed across the watershed, i.e., around 2 m, whereas the actual soil thickness including the weathered bedrock is the elevation difference between the land surface and fresh bedrock, and it can be highly heterogeneous due to the variable land surface and bedrock topography over the watershed.
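The two summary statistics quoted above can be computed along the following lines. This is a sketch under an assumption: the fraction of runoff "captured" is taken here as the average of the simulated-to-observed ratio over the watersheds, which may differ from the exact definition used for the reported 88.2 %. The function names are hypothetical.

```python
import math


def runoff_rmse(simulated, observed):
    """Root mean square error of mean annual runoff across watersheds (mm/yr)."""
    n = len(observed)
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(simulated, observed)) / n)


def mean_capture_fraction(simulated, observed):
    """Average simulated-to-observed runoff ratio (assumed definition of 'captured')."""
    ratios = [s / o for s, o in zip(simulated, observed)]
    return sum(ratios) / len(ratios)


# Two made-up watersheds, used only to show the calls.
rmse_value = runoff_rmse([90.0, 80.0], [100.0, 100.0])
capture = mean_capture_fraction([90.0, 80.0], [100.0, 100.0])
```

Under this definition, one minus the capture fraction is the average relative underestimation when the model underestimates runoff in every watershed.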
To diagnose the effect of land surface and bedrock topography on mean annual water balance, the shape parameter is calibrated using the observed streamflow. The streamflow data during 1948-2003 are divided into three periods: (1) the warm-up period (1948-1953), (2) the calibration period (1954-1973), and (3) the validation period (1974-2003). During the calibration, the estimated $S_b$ based on CN is used, and $a$ is the only free parameter to be calibrated. The calibration is conducted by minimizing the absolute error between the observed and simulated mean annual runoff through a global optimization method, i.e., the shuffled complex evolution method (Duan et al., 1992). As shown in Fig. 6b, most of the calibrated $a$ values are smaller than the estimated $a$ based on soil data only. The performance of predicted mean annual runoff (during the validation period) is improved with the calibrated shape parameter (Fig. 6c). The average absolute error for the mean annual runoff is 7.1 %.
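Because only one parameter is free, the calibration idea can be illustrated with a much simpler search than the shuffled complex evolution method used in the study. The sketch below substitutes bisection, assuming that simulated mean annual runoff decreases monotonically as the shape parameter increases; `simulate_runoff` is a hypothetical callable that runs the water balance model for a given shape parameter and returns mean annual runoff.

```python
def calibrate_shape_parameter(simulate_runoff, q_obs, lower=0.01, upper=1.999, tol=1e-6):
    """Find a in [lower, upper] whose simulated runoff matches q_obs by bisection.

    Assumes simulate_runoff(a) decreases monotonically with a.
    """
    if simulate_runoff(lower) <= q_obs:
        return lower  # even the most variable case yields too little runoff
    if simulate_runoff(upper) >= q_obs:
        return upper  # even the most uniform case yields too much runoff
    lo, hi = lower, upper
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if simulate_runoff(mid) > q_obs:
            lo = mid  # too much runoff: increase the shape parameter
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Toy stand-in for the model, used only to show the call.
a_cal = calibrate_shape_parameter(lambda a: 500.0 - 100.0 * a, 320.0)
```

When the observed runoff cannot be matched within the admissible range, the function returns the nearest bound, which still minimizes the absolute error under the monotonicity assumption.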
The overestimation of the shape parameter based on the soil porosity data underestimates the area percentage of low soil water storage capacity compared with the calibrated one, as shown in Fig. 5a for the Fox River watershed and Fig. 5b for the Spoon River watershed. The slope at the normalized soil water storage capacity around 1 for the estimated shape parameter is higher than that for the calibrated one. Therefore, the calibrated shape parameter indicates a larger spatial variability. The underestimation of catchment area with low soil water storage capacity could result from neglecting the effect of land surface and bedrock topography, which cannot be inferred from the soil database (gSSURGO), where the point-scale soil thickness is around 2 m.
To explore the impact of land surface topography on the spatial distribution of soil water storage capacity, the soil data (i.e., porosity) are combined with the height above the nearest drainage (HAND) method proposed by Gao et al. (2019). HAND is the vertical elevation difference from a point to its nearest drainage point. The distribution of HAND was used for estimating the shape parameter of the spatial distribution of storage capacity. Therefore, the HAND method uses land surface topography data only for estimating the shape parameter. In our analysis, the porosity of the soil beyond the bottom layer in the soil database is assigned the same value as the bottom layer. For example, if the HAND for a grid cell is 10.0 m and the porosity and depth of the bottom soil layer in the gSSURGO database are 0.2 and 2.0 m, respectively, then the porosity for the soil from 2.0 to 10.0 m depth is set to 0.2. Finally, the total volume of pores is calculated for each grid cell based on the soil porosity obtained from the gSSURGO database and the HAND value based on land surface topography.
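The porosity-HAND combination for a single grid cell can be sketched as follows. The function name is hypothetical, depths are in metres, and one detail is an assumption not stated in the text: when HAND is shallower than the soil profile in the database, the profile is truncated at the HAND depth.

```python
def pore_volume_to_hand(layer_thicknesses, layer_porosities, hand):
    """Pore volume per unit area (m) from the land surface down to the HAND depth."""
    volume = 0.0
    depth = 0.0
    for z, theta in zip(layer_thicknesses, layer_porosities):
        # Assumption: only the part of the layer above the HAND depth is counted.
        used = min(z, max(hand - depth, 0.0))
        volume += used * theta
        depth += z
    if hand > depth:
        # Below the deepest database layer, reuse the bottom-layer porosity.
        volume += (hand - depth) * layer_porosities[-1]
    return volume


# Example from the text: HAND of 10.0 m, bottom layer at 2.0 m depth with porosity 0.2.
example_volume = pore_volume_to_hand([2.0], [0.2], 10.0)  # about 2.0 m
```

Applying the function to every grid cell and normalizing by the watershed mean gives the values from which a porosity-HAND-based CDF can be built.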
The control of land surface topography on the hydrologic process has also been widely quantified through the topographic wetness index (TWI) of TOPMODEL (Beven and Kirkby, 1979). The spatial variability of soil storage capacity based on the TOPMODEL assumption has been demonstrated as a beneficial representation of the conceptual model (Sivapalan et al., 1997). Therefore, the heterogeneity of TWI in a watershed was proposed to be another surrogate of the heterogeneity of the soil storage capacity in this study, and the shape parameter estimated by fitting TWI against Eq. (12) through minimizing the root mean square error (RMSE) for the Maquoketa River in Iowa was compared with those obtained from other methods.
Figure 6. (a) Observed versus simulated mean annual runoff using shape parameter based on soil data; (b) soil data-based versus calibrated shape parameter; and (c) observed versus simulated mean annual runoff using shape parameter based on calibration.
The dashed blue line in Fig. 7 shows the porosity-HAND-based CDF of normalized soil water storage capacity for the Maquoketa River in Iowa (gauge #05418500). The stream initiation threshold used for calculating HAND is 40 km$^2$, which is 1 % of the maximum flow accumulation (Maidment, 2002). The threshold affects the value of HAND, but this is beyond the scope of this paper. The best fit value of $a$ for the porosity-HAND-based CDF is 1.779, which overestimates the spatial variability of storage capacity compared with the calibrated shape parameter ($a$ = 1.905). This is due to the assumption of the HAND method that the bedrock between a specific point and its nearest drainage point is horizontal and intersects the channel bed. However, the bedrock topography may have various slopes in a watershed (Troch et al., 2002). Therefore, the true value of $a$ (indicated by the calibrated one) potentially falls between the $a$ obtained from soil data and the $a$ based on soil and HAND. The bedrock topography from observation or models is needed to accurately estimate the shape parameter. The dashed-dotted red line in Fig. 7 displays the CDF of the normalized soil storage capacity based on TWI, and the corresponding value of $a$ is 1.967. The TWI-based $a$ value also presents a larger spatial variability than that derived from soil data alone, confirming the importance of topography in determining the heterogeneity of soil water storage capacity. The deviation of the TWI-based $a$ value from its calibrated counterpart could be due to the fact that the bedrock topography is not considered in TWI.
Figure 7. The effects of soil, land surface topography, bedrock topography, and topographic wetness index (TWI) on the shape parameter of the spatial distribution of soil water storage capacity.

Conclusions
A mean annual water balance model based on the concept of saturation excess runoff generation is used for diagnosing the potential for nonparametric modeling of mean annual runoff in ungauged basins. The model takes the effect of climate variability into account explicitly since it is driven by daily precipitation and potential evapotranspiration at the daily time step. The distribution function, which leads to the SCS curve number method, is used for describing the spatial distribution of soil water storage capacity. The mean (i.e., average soil water storage capacity) and the shape parameter (i.e., the spatial variability of soil storage capacity over the watershed) of the distribution function can be estimated from the available data. Based on the linkage of the distribution function and the SCS curve number method, a new method based on the existing observed data of watershed characteristics is proposed for estimating the average soil water storage capacity. The average soil water storage capacity (S b ), as one of the parameters in the model, was estimated as a function of climate aridity index and curve number which is calculated based on land cover and soil data.
The developed mean annual water balance model was applied to diagnose the estimation of the shape parameter ($a$) in this study. The shape parameter, describing the spatial variation of soil water storage capacity, was first estimated based on the porosity and soil thickness data in the soil database (gSSURGO). The estimated values of $a$ were tested in 35 watersheds. The results showed that the model with the estimated values of $S_b$ and $a$ underestimated the mean annual runoff by 11.8 % on average over all the study watersheds. The underestimation of runoff is mainly caused by the underestimation of the spatial heterogeneity of soil thickness over the watershed. The height above the nearest drainage (HAND) was then used as the total soil thickness for estimating the total volume of the pore space. The result showed that topography is of great importance for determining the spatial variability of soil water storage capacity. The estimated shape parameter from porosity-HAND overestimated the spatial variability of the storage capacity compared with the calibrated $a$, which may result from the bedrock geometry assumed in the HAND method. The topographic wetness index (TWI)-based shape parameter further indicated the importance of topography, including the land surface topography and bedrock topography. Future research will investigate alternative methods for better estimating the spatial variability of soil water storage capacity over watersheds and quantify the impacts of vegetation and climate variability (e.g., distribution of rainy days, the magnitude and the seasonality of climate variables).
Author contributions. DW designed the study; contributed to the methods, results, and discussion; and modified the text. YG quantified the parameters of the model and prepared the article with contributions from all co-authors. LY developed the model code, quantified the parameters, performed the simulations, and prepared the article with contributions from all co-authors. NBC contributed to the introduction and modified the text.
Competing interests. The authors declare that they have no conflict of interest.