Vegetation controls on soil moisture dynamics are challenging to measure and
translate into scale- and site-specific ecohydrological parameters for simple
soil water balance models. We hypothesize that empirical probability density
functions (pdfs) of relative soil moisture or soil saturation encode
sufficient information to determine these ecohydrological parameters.
Further, these parameters can be estimated through inverse modeling of the
analytical equation for soil saturation pdfs, derived from the commonly used
stochastic soil water balance framework. We developed a generalizable
Bayesian inference framework to estimate ecohydrological parameters
consistent with empirical soil saturation pdfs derived from observations at
point, footprint, and satellite scales. We applied the inference method to
four sites with different land cover and climate assuming (i) an annual
rainfall pattern and (ii) a wet season rainfall pattern with a dry season of
negligible rainfall. The Nash–Sutcliffe efficiencies of the analytical
model's fit to soil observations ranged from 0.89 to 0.99. The coefficient of
variation of posterior parameter distributions ranged from

The movement of water from soils, through plants, and back to the atmosphere via transpiration is a critical component of local and global hydrologic cycles and is the largest surface-to-atmosphere water pathway (Good et al., 2015). A realistic analytical description of soil moisture dynamics is key to understanding ecohydrological processes that regulate the productivity of natural and managed ecosystems. Rodriguez-Iturbe et al. (1999) introduced a simple framework using a bucket model of soil-column hydrology forced with stochastic precipitation inputs where soil water losses are only a function of relative soil moisture or soil saturation. Given this ecohydrological framework, the analytical equation for the probability density function (pdf) of soil saturation depends on simple abiotic characteristics such as average climate and soil texture, and biotic characteristics including soil saturation thresholds at which vegetation can influence soil water losses. However, the shapes of analytical soil saturation pdfs are generally not consistent with observations when literature values for model parameters are used (Miller et al., 2007). Some parameters such as field capacity and wilting point do not correspond to conventional definitions, because of simplifications made to describe soil water loss processes in the model, and need to be calibrated (Dralle and Thompson, 2016). To our knowledge, parameters of the analytical soil saturation pdfs have not been directly calibrated to empirical pdfs derived from measurements beyond the point scale. Observation networks provide freely available point-scale, spatially integrated soil moisture observations, while remotely sensed soil moisture observations are available through satellite products. These data sources create an opportunity to (i) evaluate whether analytical soil saturation pdfs are consistent with observations across a range of scales, and (ii) determine average ecohydrological parameters relevant to each scale.

Estimates of ecohydrological parameters are used in a large range of applications for which the stochastic soil water balance framework has been used and adapted, including the effects of climate, soil, and vegetation on soil moisture dynamics (Laio et al., 2001a; Rodriguez-Iturbe et al., 2001; Porporato et al., 2004); ecohydrological factors driving spatial and structural characteristics of vegetation (Caylor et al., 2006; Manfreda et al., 2017); soil salinization dynamics (Suweis et al., 2010); biological soil crusts (Whitney et al., 2017); vegetation stress; optimum plant water use strategies and plant hydraulic failure (Laio et al., 2001b; Manzoni et al., 2014; Feng et al., 2017); vertical root distributions (Laio et al., 2006); plant pathogen risk (Thompson et al., 2013); streamflow persistence in seasonally dry landscapes (Dralle et al., 2016); and soil water balance partitioning (Good et al., 2014, 2017). A survey of nearly 400 ecohydrology publications revealed that 40 % of studies relied heavily on simulation, rarely integrated empirical measurements, and were almost never coupled with experimental studies, suggesting a critical need to combine modeling and empirical approaches in ecohydrology (King and Caylor, 2011). Only a few studies have directly confronted the governing equations of the stochastic soil water balance model with observed soil moisture data, and even fewer studies have attempted to optimize model parameters to best fit soil moisture observations. Miller et al. (2007) calibrated soil saturation pdfs to project vegetation stress in a changing climate. Dralle and Thompson (2016) developed an analytical expression for annually integrated soil saturation pdfs under seasonal climates and then calibrated soil saturation thresholds between which evapotranspiration is maximum and zero to compare the model to soil moisture observations at a savanna site. Chen et al. (2008) related evapotranspiration observations at the stand scale to soil moisture values using a Bayesian inversion approach, and Volo et al. (2014) calibrated the soil moisture loss curve to investigate effects of irrigation scheduling and precipitation on soil moisture dynamics and plant stress. The functional form of the soil moisture losses was approximated using conditionally averaged precipitation (Salvucci, 2001; Saleem and Salvucci, 2002) and remotely sensed data (Tuttle and Salvucci, 2014). The timescale of soil moisture dry-downs, derived from the soil moisture loss equations, was parameterized using evapotranspiration measured at micro-meteorological stations (Teuling et al., 2006) and space-borne near-surface soil moisture observations (McColl et al., 2017). These studies indicate that the ecohydrological soil water balance framework is consistent with ground and larger-scale remotely sensed measurements.

Parameters representative of larger-scale observations are necessary to characterize ecohydrological processes at ecosystem scales and are more relevant to ecohydrological modeling. These larger-scale parameters integrate a range of ecohydrological interactions that are poorly understood and difficult to measure. Abiotic controlling factors of soil water balance including rainfall and soil texture can generally be assessed from readily available data, including site measurements, regionalized maps, and satellite observations, but vegetation controls on soil water dynamics are largely unknown and difficult to measure at hydrologically meaningful scales (Li et al., 2017). Vegetation water-use traits are generally observed at the species level and are not easily translated to the simple parameters necessary in soil water balance models. The rate of soil water losses from the near-surface soil layer, where soil moisture measurements are generally made, does not precisely correspond to evapotranspiration observed or calculated from meteorological stations. We thus focused on estimating parameters that are not directly observable, particularly the soil saturation thresholds at which vegetation controls soil water losses and the maximum rate of evapotranspiration from a near-surface soil layer. We use an inverse modeling approach and data that are commonly collected at environmental monitoring sites or measured from satellites. We present an inference framework that provides a means to quantify and compare the sensitivity of soil moisture dynamics at varying scales through estimates of simple ecohydrological parameters.

A number of studies have combined inverse modeling approaches with ground and remotely sensed soil moisture data to extract meaningful hydrologic information (Xu et al., 2006; Miller et al., 2007; Chen et al., 2008; Volo et al., 2014; Wang et al., 2016; Baldwin et al., 2017). Bayesian inference methods are effective in relating prior pdfs of observations to posterior estimates of model parameters (Xu et al., 2006; Chen et al., 2008; Baldwin et al., 2017). The soil water balance model provides a direct analytical equation for soil saturation pdfs that is convenient to use with the Bayesian paradigm because it is a low parameter model with few data inputs. We selected a Bayesian inversion approach instead of a least-squares or maximum likelihood approach because it quantifies the inference uncertainty and improves upon the work of Miller et al. (2007), which used a least-squares approach to calibrate soil saturation pdfs. Measures of inference uncertainty and parameter convergence diagnostics provided by the Bayesian approach can be used to evaluate the validity of model inversion and develop criteria to generalize the presented framework.

We assume that if a sufficient range of soil moisture values are observed at a site, the shape of the empirical soil saturation pdf is constrained by the ecohydrological factors driving soil moisture dynamics. We hypothesize that key information required to determine these ecohydrological factors is encoded in empirical soil saturation pdfs and that this information can be extracted by calculating the inverse of the commonly used stochastic soil water balance. The analysis of soil saturation pdfs is a more robust and integrated approach to investigate ecohydrological factors of soil water dynamics than is time series analysis. Soil saturation pdfs are less sensitive to the many sources of uncertainty, sensor noise, and common gaps in soil moisture observations and do not require high-quality, co-located, and concurrent hydrologic measurements that are often lacking. We tested three key assumptions embedded in the proposed method. (i) The analytical soil saturation pdfs properly describe empirical soil saturation pdfs observed in annual data. Annual soil moisture records can be affected by transitional dynamics between wet and dry seasons, and the appropriate level of model complexity must be used. We compare parameter identifiability using an annual and a seasonal formulation of the analytical soil saturation pdfs. (ii) Parameter estimates and their uncertainty at point, footprint, and satellite scales are different and reflect variability in soil water dynamics. We determine whether the inference approach can be applied at point, footprint, and satellite scales to provide appropriate scale-specific parameters for ecohydrological modeling. (iii) The range of realizable soil moisture values is captured by the selected time series and the soil saturation pdf determined from these observations is not truncated. 
We determine whether the inference method based on soil saturation pdfs is robust against reduced data availability by repeating the model inversions on subsets of the soil moisture time series and show that the method can be applied to sparse datasets.

Our goal was to match empirical soil saturation pdfs derived from point-, footprint-, and satellite-scale observations to a commonly used analytical model. We demonstrate the use of a Bayesian inversion framework to calibrate the ecohydrological parameters of a simple stochastic soil water balance model that best fit empirical soil saturation pdfs. We first present data sources, define the analytical model for soil saturation pdfs including parameter assumptions, and detail the algorithm used in the Bayesian inversion. Then, we present a summary of the goodness of fit of optimal analytical soil saturation pdfs and estimated parameter uncertainty. We evaluated results to test key method assumptions including model complexity and data availability. Finally, we discuss the potential of the approach to provide a simple means to investigate variability in ecohydrological controlling factors at varying spatial scales. Our work combines modeling and empirical approaches in ecohydrology to provide more realistic analytical descriptions of soil moisture dynamics. Estimates of ecohydrological parameters consistent with observed soil saturation pdfs, from point to ecosystem scales, are needed to better characterize site-specific ecohydrological processes.

We used daily soil moisture observations from three data products at three
spatial scales. We used point-scale soil moisture data at a depth of 10 cm
from the FLUXNET2015 data product
(

We selected four sites with soil moisture and rainfall data available for the
2012 calendar year (Fig. 1, Table 1). Selected sites spanned a range of land
cover types, including crop and grasslands, oak savanna, deciduous forest and
pine forest. We determined the dominant soil texture of the upper soil layer
from the Harmonized World Soil Database (HWSD) (version 1.2)
(FAO/IIASA/ISRIC/ISS-CAS/JRC, 2012) for each site. We used soil porosity values, derived from the HWSD
available as ancillary data through the ESA-CCI data product, for the
satellite-scale analysis. We used the maximum soil moisture observation
during the year 2012 as a site-specific soil porosity estimate for point- and
footprint-scale data products. We used soil porosity for each site to
calculate soil saturation
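
The conversion described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes soil saturation is volumetric soil moisture divided by porosity, and that, when no porosity is supplied, the maximum observed value stands in for it (as stated for the point- and footprint-scale products). The function name `soil_saturation` and the sample values are hypothetical.

```python
def soil_saturation(theta, porosity=None):
    """Convert volumetric soil moisture (m3 m-3) to relative saturation in [0, 1].

    Missing observations are passed as None and returned as None.
    If porosity is not given, the maximum observed value is used as a
    site-specific estimate (assumption taken from the text above).
    """
    valid = [v for v in theta if v is not None]
    if not valid:
        raise ValueError("no valid soil moisture observations")
    if porosity is None:
        porosity = max(valid)
    # Cap at 1.0 so that observations above the porosity estimate stay physical.
    return [None if v is None else min(v / porosity, 1.0) for v in theta]


# Hypothetical daily record with one gap.
theta_obs = [0.12, 0.18, None, 0.31, 0.25, 0.40]
s_obs = soil_saturation(theta_obs)
```

For the satellite-scale record, a porosity value taken from ancillary data would be passed explicitly through the `porosity` argument instead.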

Soil saturation and rainfall time series from

Selected study sites.

Latitude and longitude in parentheses correspond to the centroid of
the satellite area associated with the site location; MAT, mean annual
temperature from long-term FLUXNET2015 data; MAP, mean annual precipitation
from long-term FLUXNET2015 data; soil texture taken from the HWSD;

Our framework is based on a standard bucket model of soil column hydrology at a point forced with stochastic precipitation inputs and in which soil water losses are a function of soil saturation. We followed the simple formulation of soil water losses in Laio et al. (2001a). We applied two associated analytical formulations for the soil saturation pdf detailed below and derived under the assumption of steady state, wherein parameters are constant for a given period of time. The annual model assumed an annual rainfall pattern and the seasonal model accounted for a wet season rainfall pattern and a dry season of negligible rainfall.
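
To make the bucket model concrete, the sketch below simulates a daily soil saturation series under stochastic rainfall. It is an illustration under stated assumptions rather than the formulation used in this study: rainfall occurs on a given day with probability `lam` and has exponentially distributed depth with mean `alpha`; losses are zero below a threshold `s_w`, increase linearly to `e_max` at `s_star`, and stay at `e_max` above it; water above `s_fc` is assumed to drain instantly. All parameter names and values are hypothetical.

```python
import random


def loss_rate(s, s_w, s_star, e_max):
    """Loss (mm/day): zero below s_w, linear up to e_max at s_star, e_max above."""
    if s <= s_w:
        return 0.0
    if s < s_star:
        return e_max * (s - s_w) / (s_star - s_w)
    return e_max


def simulate_saturation(n_days, lam, alpha, porosity, z_r,
                        s_w, s_star, s_fc, e_max, s0=0.5, seed=0):
    """Daily explicit water balance for a soil layer of depth z_r (mm)."""
    rng = random.Random(seed)
    storage = porosity * z_r  # mm of water held at full saturation
    s, series = s0, []
    for _ in range(n_days):
        # Stochastic rainfall: occurrence with probability lam, exponential depth.
        rain = rng.expovariate(1.0 / alpha) if rng.random() < lam else 0.0
        s += rain / storage
        # Assumption: everything above field capacity is lost within the day.
        s = min(s, s_fc)
        # Saturation-dependent losses, normalized by storage.
        s = max(s - loss_rate(s, s_w, s_star, e_max) / storage, 0.0)
        series.append(s)
    return series


series = simulate_saturation(n_days=365, lam=0.3, alpha=10.0, porosity=0.45,
                             z_r=300.0, s_w=0.15, s_star=0.45, s_fc=0.70,
                             e_max=4.0)
```

A histogram of `series` over a long simulation approximates the steady-state soil saturation pdf that the analytical expressions describe.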

The soil water balance model is defined at a point and a daily time step,
for a soil with porosity

We adopted Dralle and Thompson's (2016) framework to account for transient
dynamics between wet and dry seasons. We defined the dry season as a period
of duration

We chose readily available data for rainfall characteristics (

We calculated rainfall characteristics
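
As an illustration of how rainfall characteristics might be derived from a daily record, the sketch below uses a common convention in the stochastic soil water balance literature: rainfall frequency is the fraction of days with rain, and mean event depth is the average depth on those days. The exact definitions used in this study are not reproduced here, so the function name, the wet-day threshold, and the sample data are assumptions.

```python
def rainfall_characteristics(daily_rain_mm, threshold=0.0):
    """Return (frequency in events per day, mean depth in mm per event).

    Days with missing data (None) are excluded from both statistics.
    """
    valid = [r for r in daily_rain_mm if r is not None]
    wet = [r for r in valid if r > threshold]
    if not valid or not wet:
        raise ValueError("need at least one valid day and one rainy day")
    frequency = len(wet) / len(valid)
    mean_depth = sum(wet) / len(wet)
    return frequency, mean_depth


# Hypothetical ten-day record (mm/day).
rain = [0.0, 5.0, 0.0, 0.0, 12.0, 3.0, 0.0, 0.0, 0.0, 0.0]
lam, alpha = rainfall_characteristics(rain)
```

For a seasonal analysis, the same function could be applied to the wet-season portion of the record only.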

We related

We used the Metropolis–Hastings Markov chain Monte Carlo (MH-MCMC) technique
to estimate the posterior distribution of

The MH-MCMC technique converges to a stationary distribution according to the
ergodicity theorem in Markov chain theory. The sampling algorithm consisted
of repeating two steps: (i) a proposing step, in which the algorithm
generates a new model
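
The propose-and-accept loop can be sketched generically as below. This is a minimal random-walk Metropolis sampler with symmetric Gaussian proposals, not the sampler configuration used in this study. The toy log posterior (a uniform prior on (0, 1) and a Gaussian misfit to five made-up observations) is purely hypothetical and stands in for the likelihood built from the analytical soil saturation pdf.

```python
import math
import random


def metropolis_hastings(log_post, theta0, step, n_iter, seed=0):
    """Random-walk Metropolis sampler; returns (chain, acceptance rate)."""
    rng = random.Random(seed)
    theta = list(theta0)
    lp = log_post(theta)
    chain, accepted = [], 0
    for _ in range(n_iter):
        # Proposing step: perturb each parameter with Gaussian noise.
        proposal = [t + rng.gauss(0.0, sd) for t, sd in zip(theta, step)]
        lp_new = log_post(proposal)
        # Accepting step: accept with probability min(1, exp(lp_new - lp)).
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < lp_new - lp:
            theta, lp = proposal, lp_new
            accepted += 1
        chain.append(list(theta))
    return chain, accepted / n_iter


# Hypothetical toy target: one parameter bounded in (0, 1).
OBS = [0.42, 0.38, 0.45, 0.40, 0.41]
SIGMA = 0.05


def log_posterior(theta):
    value = theta[0]
    if not 0.0 < value < 1.0:
        return -math.inf  # outside the uniform prior
    return -sum((o - value) ** 2 for o in OBS) / (2.0 * SIGMA ** 2)


chain, rate = metropolis_hastings(log_posterior, [0.5], [0.05], 5000)
posterior = [c[0] for c in chain[1000:]]  # discard burn-in samples
```

Running several chains from different starting points, rather than a single chain as here, is what allows convergence diagnostics to be computed afterwards.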

We did not have direct measurement to validate the parameters

Convergence of the Bayesian inversion: a GR diagnostic

Low uncertainty in parameter estimates: the posterior distributions of
parameter estimates are physically plausible and have coefficients of
variation

Goodness of fit: a quantile-level Nash–Sutcliffe efficiency (NSE)
(Müller et al., 2014)
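
The three evaluation criteria listed above can be computed along the following lines. This is a sketch under assumptions: the Gelman–Rubin statistic follows its standard definition for several equal-length chains, and the quantile-level NSE is taken to be the Nash–Sutcliffe efficiency applied to matching quantiles of the empirical and modeled distributions. Acceptance thresholds are deliberately not encoded, and the synthetic chains are hypothetical.

```python
import random
import statistics


def gelman_rubin(chains):
    """Potential scale reduction factor for equal-length chains of one parameter."""
    n = len(chains[0])
    within = statistics.fmean([statistics.variance(c) for c in chains])
    between_over_n = statistics.variance([statistics.fmean(c) for c in chains])
    pooled = (n - 1) / n * within + between_over_n
    return (pooled / within) ** 0.5


def coefficient_of_variation(samples):
    """Posterior coefficient of variation in percent."""
    return 100.0 * statistics.stdev(samples) / abs(statistics.fmean(samples))


def empirical_quantiles(samples, n=100):
    """The n - 1 cut points dividing the sample into n equal-probability bins."""
    return statistics.quantiles(samples, n=n)


def quantile_nse(q_obs, q_mod):
    """Nash-Sutcliffe efficiency between observed and modeled quantiles."""
    mean_obs = statistics.fmean(q_obs)
    num = sum((o - m) ** 2 for o, m in zip(q_obs, q_mod))
    den = sum((o - mean_obs) ** 2 for o in q_obs)
    return 1.0 - num / den


# Hypothetical example: three well-mixed chains drawn from the same distribution.
rng = random.Random(0)
chains = [[rng.gauss(0.4, 0.05) for _ in range(2000)] for _ in range(3)]
r_hat = gelman_rubin(chains)
cv = coefficient_of_variation(chains[0])
nse = quantile_nse(empirical_quantiles(chains[0]), empirical_quantiles(chains[1]))
```

In practice `q_mod` would come from the quantile function of the fitted analytical pdf rather than from a second sample.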

Major assumptions and limitations embedded in the proposed inference
framework were tested through the analysis detailed below. We assume, for
each scale and location, that the shape of the empirical soil saturation
pdfs is controlled by the physical constraints used to parameterize the
analytical model of soil saturation pdfs, and that these parameters can be
determined with some certainty and reflect variability in soil water dynamics. We
expect that estimated soil saturation thresholds have greater certainty when
the empirical soil saturation pdf is defined around those values and greater
uncertainty when fewer soil saturation values are observed around the
thresholds. We acknowledge that pre-defined rainfall characteristics and
physical soil parameters based on observations or literature values may not
be exactly representative of the processes at each location or scale and
could also create biases and uncertainties in the fitted parameters of
interest. We used model evaluation criteria (Sect. 2.4) to investigate the
applicability of the inference framework with varying model complexities,
scales, locations and data availability.

Analytical expressions for soil saturation pdfs were derived under the assumption of steady state. Annual soil moisture records can be affected by transitional dynamics between wet and dry seasons, and the appropriate level of model complexity must be used. We applied the inversion framework to annual soil saturation using variations of the analytical model for soil saturation pdfs of increasing complexity: (i) the annual model in Eq. (2) and (ii) the seasonal model in Eq. (3). We determined whether the added complexity of the dry season pdf increases the identifiability of ecohydrological parameters or if the simpler annual model is sufficiently consistent with annual empirical soil saturation pdfs.

We compared co-located parameter estimates and their uncertainty at point, footprint, and satellite scales for each site. We determine whether the inference approach can provide appropriate scale-specific parameters for ecohydrological modeling at each location.

We assumed that the whole range of realizable soil saturation values was captured within the selected time series at each scale and that the resulting soil saturation pdf was not truncated. If the range of observed values is not representative of the soil saturation pdf because it is truncated or affected by noise in the data, parameter estimates may be biased. Minimum and maximum observed soil saturation values during 2012 (Table 1) indicate the range of observed soil saturation values we used to estimate ecohydrological parameters. We determine whether the inference method based on soil saturation pdfs is robust against reduced data availability by repeating the model inversions on subsets of the soil saturation time series and show that the method can be applied to sparse datasets. We performed the model inversion using subsets of each soil saturation record by randomly resampling fractions of the data down to 10 % of the annual time series and computed goodness-of-fit statistics between the resulting analytical models and the empirical models based on the full annual record. We determined the number of data points necessary to infer converging model parameters that best match observations and whether the proposed inference method based on soil saturation pdfs can be reliably used to identify ecohydrological parameters from sparse datasets.
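
The resampling part of this test can be sketched as follows. The example omits the model inversion itself: it only draws random subsets of a record and uses a two-sample Kolmogorov–Smirnov distance to quantify how well each subset's empirical distribution represents the full record. The synthetic record, the chosen fractions, and the function names are assumptions for illustration. Ten repeats per fraction mirror the repetition count reported for the subsampling experiment.

```python
import bisect
import random
import statistics


def random_subsample(values, fraction, rng):
    """Draw a random subset (without replacement) of the given fraction."""
    k = max(1, round(fraction * len(values)))
    return rng.sample(values, k)


def ks_two_sample(a, b):
    """Maximum distance between the empirical CDFs of samples a and b."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    # Both empirical CDFs are step functions, so checking data points suffices.
    for x in a_sorted + b_sorted:
        fa = bisect.bisect_right(a_sorted, x) / len(a_sorted)
        fb = bisect.bisect_right(b_sorted, x) / len(b_sorted)
        d = max(d, abs(fa - fb))
    return d


rng = random.Random(1)
# Hypothetical one-year soil saturation record.
full_record = [rng.betavariate(2.0, 3.0) for _ in range(365)]

results = {}
for fraction in (1.0, 0.5, 0.25, 0.1):
    distances = [
        ks_two_sample(random_subsample(full_record, fraction, rng), full_record)
        for _ in range(10)
    ]
    results[fraction] = statistics.median(distances)
```

In the full workflow, each subset would be passed to the Bayesian inversion, and the fitted analytical model would then be compared with the empirical pdf of the full record.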

Estimated ecohydrological parameters and goodness of fit of analytical soil saturation pdfs.

Values in parentheses
correspond to the coefficient of variation of the posterior parameter
estimates in percentage.

For each of the four locations (Table 1), we obtained optimal analytical soil
saturation pdfs consistent with the empirical pdfs derived from soil
saturation observations using the Bayesian inversion framework and a MH-MCMC
algorithm. Model inversions for each site and scale and for both annual and
seasonal models met the evaluation criteria (see Sect. 2.4). Our results
indicated that the framework of Dralle and Thompson (2016) can be applied to
sites with low (US-MMS) and high (US-Ton) seasonality in rainfall patterns.
Posterior probability distributions of soil water balance
parameters (

Empirical versus modeled cumulative distribution functions (CDFs) and
soil saturation probability distribution (

Empirical versus modeled CDFs and soil saturation probability
distribution (

Empirical versus modeled CDFs and soil saturation probability
distribution (

Empirical versus modeled CDFs and soil saturation probability
distribution (

Goodness of fit and ecohydrological parameters inferred with decreasing
number of soil saturation observations (annual model). For each subsample
category, the median results of 10 repeats are plotted and results between the
90th and 10th percentiles are shaded. Colors correspond to the four sites in
the legend. KS, Kolmogorov–Smirnov statistic; NSE, quantile-level Nash–Sutcliffe
efficiency;

Parameter estimates were most constrained for scales and locations at which
soil water dynamics are more sensitive to the fitted ecohydrological
parameters of interest. In these cases, convergence of the model inversion
was attained less rapidly, but ultimately provided better goodness of fit.
Soil saturation states at drier sites may be more controlled by soil water
loss parameters, while soil saturation states at wetter sites may also be
controlled by rainfall characteristics. Estimated soil saturation thresholds
had greater certainty if the empirical soil saturation pdfs were defined
around those values and had greater uncertainty if there were fewer soil
saturation values observed around the thresholds. For example, uncertainty
of

Parameter uncertainty for satellite and footprint scales was greater than for the point scale. Estimates of larger-scale soil water balance parameters are more relevant to regional ecohydrological dynamics. Differences in parameter estimates among scales within a site may be associated with differences in soil texture properties, such as porosity and field capacity, that were determined separately for each record. Co-located and concurrent soil saturation pdfs are different at each scale (Figs. 2–5) and suggest variability in observed soil water dynamics at each scale. Differences in driving processes among scales were specifically determined from the model inversion for each scale and provided robust scale-specific parameters for ecohydrological modeling.

For each spatial scale and site, the annual model was inverted using random subsamples of 100 to 10 % of the 2012 time series (Fig. 6). For all sites and scales the number of observations did not significantly impact model inference. The NSE, Kolmogorov–Smirnov statistic, and parameter estimates were stable down to about 100 observations. Fitted model parameter values and the variability of parameter estimates among the 10 repetitions in each subsample category were not sensitive to the number of observations used. Results indicate the identifiability of ecohydrological parameters through the inversion of the analytical model of soil saturation pdfs was robust because the mean and standard deviation of the randomly selected subsets of annual data were representative of the full record. There was no correlation between the small differences in the mean and standard deviations of the subsamples and the model goodness of fit. The proposed inference method based on soil saturation pdfs can therefore reliably be used to identify ecohydrological parameters from sparse datasets. Inference methods that do not require continuous data are particularly relevant to large-scale soil moisture measurements, such as satellite products, that are not continuous.

We document a generalizable Bayesian inversion framework to infer parameter
values of the stochastic soil water balance model and their associated
uncertainty using freely available rainfall and soil moisture observations
at point, footprint, and satellite scales. Empirical pdfs derived from soil
saturation observations provided key information to determine unknown
ecohydrological parameters

We provide a method based on a parsimonious soil water balance model, requiring a minimum level of data inputs to estimate ecohydrological characteristics that are not directly observable and for which established estimation methods are not available. Our methods can be applied in future studies to better understand differences in soil water dynamics at different scales and to improve scaling of ecohydrological processes. Results demonstrate the value of large-scale near-surface soil moisture observations to improve characterization of soil water dynamics at ecosystem scales. Relations between the soil saturation threshold values inferred from the near-surface soil moisture data and dynamics in the full active rooting zone are unknown. The datasets we used are freely available from sensor networks and global satellite products, and methods can therefore be applied to a large range of sites or to global analyses to improve understanding of spatial patterns in ecohydrological parameters relevant for local and global water cycle analyses.

We downloaded all datasets from publicly available
sources. Point-scale soil moisture and rainfall data are available through
FLUXNET2015 (

The authors declare that they have no conflict of interest.

We thank Minghui Zhang, Marc Müller, David Dralle, Xue Feng, and editor Sally Thompson for their thoughtful reviews and useful feedback on an earlier draft of this paper. This material is based upon work supported by the National Science Foundation Graduate Research Fellowship under grant no. 1314109-DGE. Stephen P. Good acknowledges the financial support of the US National Aeronautics and Space Administration (NNX16AN13G). This work used the Extreme Science and Engineering Discovery Environment (XSEDE) via allocation DEB160018, supported by National Science Foundation grant number ACI-1548562. This work used data acquired and shared by the FLUXNET community, including these networks: AmeriFlux, AfriFlux, AsiaFlux, CarboAfrica, CarboEuropeIP, CarboItaly, CarboMont, ChinaFlux, Fluxnet-Canada, GreenGrass, ICOS, KoFlux, LBA, NECC, OzFlux-TERN, TCOS-Siberia, and USCCC. The FLUXNET eddy covariance data processing and harmonization were carried out by the European Fluxes Database Cluster, AmeriFlux Management Project, and Fluxdata project of FLUXNET, with the support of CDIAC and the ICOS Ecosystem Thematic Center and the OzFlux, ChinaFlux, and AsiaFlux offices.

Edited by: Sally Thompson
Reviewed by: Xue Feng, David Dralle, Marc F. Muller, and Minghui Zhang