Characterizing coarse-resolution watershed soil moisture heterogeneity using fine-scale simulations

Introduction Conclusions References


Introduction
Representation of the structure and dynamics of fine-scale spatial structure in hydrological states and fluxes has been shown to significantly influence coarse-scale surface energy budgets (e.g., evapotranspiration; Wood, 1997Wood, , 1998;;Vivoni et al., 2010), runoff and streamflow (Arrigo and Salvucci, 2005;Vivoni et al., 2007;Barrios and Frances, 2012), regional-scale feedbacks with the atmosphere (Nykanen and Foufoula-Georgiou, 2001), and biogeochemical responses (Dai et al., 2012;Zhang et al., 2012).It has been argued that the relevant spatial scale for hydrological state and flux heterogeneity is on the order of 100 m (Wood et al., 2011), while for biogeochemical dynamics it may be as small as 1 m (Burt and Pinay, 2005;Groffman et al., 2009;Frei et al., 2012;McClain et al., 2003).The current suite of land models representing coupled hydrological and biogeochemical cycles and used for analyses of water resources and water quality (e.g., HydroGeoSphere (Li et al., 2008), CATHY (Weill et al., 2011), PIHM (Qu and Duffy, 2007), tRIBS (Ivanov et al., 2004), Noah-MP + CATHY (Niu et al., 2013), GSFlow (Markstrom et al., 2008), LEAF-Hydro-Flood (Miguez-Macho and Fan, 2012), GEOtop (Rigon et Published by Copernicus Publications on behalf of the European Geosciences Union.al., 2006), MIKE-SHE (McMichael et al., 2006), WEP-L (Jia et al., 2006), and PAWS (Shen, 2009;Shen and Phanikumar, 2010)), as well as regional (e.g., Subin et al., 2011) and global (e.g., Koven et al., 2013;Tang et al., 2013) climate prediction are typically applied at resolutions that are orders of magnitude larger than these scales.Unfortunately, there are few large-scale observational datasets with which to test the impact of the discrepancies in scale between model representation and known variability of coupled hydrological and biogeochemical processes.This problem is particularly acute for biogeochemical dynamics, which generally depend strongly on the hydrological state.
Watershed-scale hydrological models are often tested against, or calibrated to, stream flow observations.The impact of these types of calibrations on the relative accuracy of surface soil moisture heterogeneity is not well characterized.For example, Nykanen and Foufoula-Georgiou (2001) used observations from the 1997 Southern Great Plains (SGP) experiment to investigate the impact of nonlinear soil moisture dependencies of parameters on the scale dependency of those parameters.They showed that failing to consider this scale dependency could cause large biases in predicted surface runoff.Gebremichael et al. (2009) compared scaling characteristics of spatial soil moisture fields from the same 1997 SGP experiment with predicted values from a distributed hydrologic model.Inconsistencies between the observed and predicted soil moisture mean and spatial scaling parameters indicated that while the model accurately reproduced outlet stream flow, the underlying mechanisms leading to runoff might have been inaccurately simulated.
Quantifying relationships between the statistical properties of the soil moisture field and spatial scale may allow for prediction of heterogeneity at scales finer than those resolved by the model.Since the pioneering work of Rodriguez-Iturbe (1995) and Wood (1998), who described the power law decay of variance as a function of the observation scale, many studies have quantified the variance-scale relationship.Hu et al. (1997) showed that the variance (σ 2 θ ) of the soil moisture (θ ) field at different spatial averaging areas (A) can be related to the ratio of those areas raised to a scaling exponent (γ , i.e., "simple scaling").They also showed that γ is related to the spatial correlation structure of the soil moisture field and that it decreases as soils dry.Their scaling analysis of higher-order moments indicated that soil moisture might not always follow simple scaling.A number of investigators have since demonstrated that the relationship between σ 2 θ and spatial scale is not log-log linear across all spatial scales, and that the relationship can depend on the mean soil moisture (µ θ ) field (e.g., Mascaro et al., 2010Mascaro et al., , 2011;;Famiglietti et al., 1999;Nykanen and Foufoula-Georgiou, 2001;Das and Mohanty, 2008;Joshi and Mohanty, 2010).
It has been further observed that soil moisture mean is often related to its variance and higher-order moments.Most commonly, an upward convex relationship between µ θ and σ 2 θ has been reported when a sufficiently large range of mean moistures is analyzed (e.g., Teuling and Troch, 2005;Lawrence and Hornberger, 2007;Teuling et al., 2007;Famiglietti et al., 2008;Pan and Peters-Lidard, 2008;Brocca et al., 2010Brocca et al., , 2012;;Tague et al., 2010;Rosenbaum et al., 2012;Li and Rodell, 2013;Choi and Jacobs, 2011).Theoretical analyses have indicated that an upward convex relationship is consistent with current understanding of soil moisture dynamics (e.g., Vereecken et al., 2007).Famiglietti et al. (2008) used over 36 000 soil moisture observations in four field campaigns to demonstrate that soil moisture variability generally increased with extent scale and followed fractal scaling.Their reported soil moisture standard deviation versus mean moisture content exhibited a convex-upward relationship, with the peak of their best-fit relationships occurring at ∼ 0.15 mean soil moisture.Brocca et al. (2012), using data from 46 sites over two years in two adjacent ∼ 200 km 2 areas, observed a peak in the convexupward relationship around 0.2-0.25 mean soil moisture.Choi and Jacobs (2011) studied observations from two years of the Walnut Creek watershed, Iowa, Soil Moisture Experiment (2002Experiment ( , 2005)).They observed a convex-upward relationship during the 2002 observations, when the soil moisture range extended down to ∼ 0.1, but not in 2005, when mean soil moisture did not drop below ∼ 0.15.Rosenbaum et al. (2012) concluded that the relationship between the 0-5 cm soil moisture spatial standard deviation and mean also had a convex-upward shape, but peaked at a higher mean soil moisture level (0.35-0.40), although their range of mean soil moisture extended substantially further (0.58) than the other studies cited above.
Several studies have also investigated the relationships between observed soil moisture mean and higher-order moments (skewness (s θ ) and kurtosis (k θ )).For example, Famiglietti et al. (1999) used observations from the SGP97 experiment to conclude that the distribution of surface soil moisture content evolved from negatively skewed under very wet conditions, to normal in the midrange of mean moisture, to positively skewed under dry conditions.For the same SGP97 data set, Ryu and Famiglietti (2005) discussed the bimodality of the soil moisture distributions (which will be reflected in k θ ), and concluded that it resulted primarily from fractional precipitation within the observational footprint.
A fewer number of studies have combined observations of soil moisture with distributed hydrological model predictions to investigate spatial scaling properties.Li and Rodell (2013) examined spatial statistics of in situ, satellite-retrieved, and modeled soil moisture over three large climate regions.The relationship between σ 2 θ and µ θ had an upward convex shape for the in situ measurements, but not for the modeled relationship.Manfreda et al. (2007) examined the statistical structure of soil moisture patterns using modeled soil moisture obtained from the North American Land Data Assimilation System (NLDAS).They concluded that σ 2 θ followed a power law relationship with averaging area, and the dynamics of the relationship were controlled by mean soil water content.Maxwell (2010) performed transient simulations of an arid mountain system and showed that the land-energy fluxes were spatially correlated and that the soil saturation vertical structure did not follow a simple scaling relationship.Ivanov et al. (2010) studied the relationship between soil moisture mean and its coefficient of variation using a numerical model applied to a small hillslope, and demonstrated hysteretic patterns during the wetting-drying cycle.They concluded that the system response is not unique given the same initial mean state, but that it depends on the magnitude of precipitation inputs.
The relationships between soil moisture mean and statistical moments potentially depend on a wide range of factors and on spatial extent.As reviewed in Brocca et al. (2007), soil moisture statistical properties can be impacted by lateral redistribution (Moore et al., 1988;Williams et al., 2003), radiation (Moore et al., 1993;Western et al., 1999), soil characteristics (Hu et al., 1997;Famiglietti et al., 1998;Seyfried, 1998), vegetation characteristics (Qiu et al., 2001;Hupet and Vanclooster, 2002), elevation above the drainage channel (Crave and GascuelOdoux, 1997), downslope gradient (Merot et al., 1995), bedrock topography (Chaplot and Walter, 2003), specific upslope area (Brocca et al., 2007), and landscape unit (Park and van de Giesen, 2004;Wilson et al., 2004).Famiglietti et al. (1998) argued that under wet conditions, the best correlation of soil moisture variability was with soil porosity and hydraulic conductivity, and under dry conditions, with relative elevation, aspect, and clay content.Western et al. (1999) found that during wet conditions the best predictor of the soil moisture spatial pattern was the specific area (through lateral redistribution), while during dry conditions the best predictor was the potential solar radiation index (through aspect and evapotranspiration).Lawrence and Hornberger (2007) argued that trends across climate zones are related to the wilting point and porosity.Albertson and Montaldo (2003) and Montaldo and Albertson (2003) presented a theoretical argument for the impact of various factors on the relationship between soil moisture mean and variance.They showed that covariances between anomalies of soil moisture, infiltration, drainage, and ET control the production and destruction of variance over time.Teuling and Troch (2005) applied a similar approach to study the impacts of vegetation, soil properties, and topography on the controls of soil moisture variance.
Building on these previous studies, we begin with a downscaling hypothesis that consistent relationships between the transient higher-order statistical moments and mean nearsurface soil moisture fields exist (i.e., a "downscaling" closure relationship).We leave the problem of upscaling these relationships and their impact on the coarse-resolution transient solution for further work.In particular, we used a fiveyear, high-resolution hydrological simulation of the Clinton River watershed in Michigan to characterize relationships between µ θ and σ 2 θ , s θ , and k θ .Although we expected these relationships to vary with depth, we only evaluated the depth interval 0-10 cm to make the analysis scope tractable; future work will address this shortcoming.We also tested the extent to which using discrete bins across the mean moisture range improved characterization of spatial soil moisture heterogeneity.We then applied these relationships to investigate hypothesized controllers of soil moisture heterogeneity as a function of soil properties, evapotranspiration, topography, etc.The value of using a model for this analysis, compared to observations alone, is that we have continuous and spatially explicit estimates of states and fluxes, and since we know the mechanisms included in the model, we can attribute patterns to individual processes.
In the Methods section we describe the Clinton River watershed, the numerical model we applied (PAWS-CLM), model forcing and surface characterization applied, simulations performed, and our approach to generating a surrogate, or reduced order (ROM), model of fine-scale soil moisture heterogeneity.In the "Results and discussion" section we discuss the surrogate model estimates, the value of using a binned approach to characterizing soil moisture variability, and predicted controls on the relationship between soil moisture mean and variance.The last section provides a brief summary and conclusions.

The Clinton River watershed
Our study domain is the Clinton River watershed (Fig. 1), an 1837 km 2 , humid continental-climate basin draining into Lake St. Clair that was described in detail by Shen et al. (2013).Precipitation is relatively uniformly distributed throughout the year but there is strong seasonal variation in solar radiation and air temperature that affects ET demands.This watershed is well suited for our study because of its varied topography and subsurface properties, heterogeneity of surface and subsurface lateral exchanges, and heterogeneity in vegetation.The basin has rugged hills on the highlands of the west and flat, low-lying plains toward the east.This contrast in topography, as shown later, impacts largescale groundwater flow and the differences between hilly and flat terrain soil moisture dynamics.Urban areas of varying intensity span the southern portion of the watershed, the northwest is largely forested, and the northeast is dominated by agriculture.Glacial drifts and lacustrine deposits in the southeast form the unconfined aquifer, underlain by shale rock that bears little water.High-resolution elevation (30 m), land use (30 m), soil (1 : 12 000 to 1 : 63 360 SSURGO), river hydrography (1 : 24 000), well-log-based aquifer characteristics (∼ 1000 m), land-based climate forcing data (12 stations; precipitation, temperature, humidity, and wind speeds), and simulated steady-state carbon and nitrogen states (220 m) are used as inputs to the model (Shen et al., 2013).

PAWS-CLM4 model description and simulations performed
We applied the PAWS + CLM model to generate watershedscale predictions for the analyses presented here.PAWS (Process-based Adaptive Watershed Simulator) (Shen et al., 2013;Shen and Phanikumar, 2010) is a computationally efficient, physically based hydrologic model that has recently been coupled with CLM4.0 (Lawrence et al., 2011).PAWS + CLM explicitly solves the physically based governing partial differential equations for overland flow, channel flow, subsurface flow, wetlands, and the dynamic two-way interactions among these components.The model evaluates the integrated hydrologic response of the surface-subsurface system using a novel noniterative method that couples runoff and groundwater flow to vadose zone processes approximating the three-dimensional (3-D) Richards equation.By reducing the dimensionality of the fully 3-D subsurface problem, the model significantly reduces the computational demand with little loss of physics representation.We run the model with hourly time steps, but aggregate the results to a diurnal time step for the analyses performed here.
The PAWS + CLM model has been tested extensively with analytical and 3D benchmarks and compares favorably with other physically based models (Maxwell et al., 2014).It has been applied in several US Midwest watersheds, including the 1140 km 2 Red Cedar River (Shen and Phanikumar, 2010), the 1837 km 2 Clinton River (Shen et al., 2013), the 4527 km 2 Upper Grand (Shen et al., 2014), the 5232 km 2 Kalamazoo River, the 14 430 km 2 Grand River, and the 22 260 km 2 Saginaw River basins.More recently, physically based reactive transport of nutrients and bacteria has been integrated and the model has been applied to a desert environment in southern California to evaluate groundwater sustainability.
We applied PAWS + CLM at 220 m × 220 m horizontal resolution across the Clinton River watershed.Although this resolution is coarser than the hyper-resolution called for in Wood et al. (2011) and proof-of-concept work in Kollet et al. (2010), it provides substantial resolution of topographic and land use variation across a horizontal 256 × 280 grid.Twenty vertical layers were used to discretize the subsurface between the land surface and bedrock top.The vertical spatial resolution therefore varies throughout the basin depending on the depth to bedrock.As described in Shen et al. (2014), to create a PAWS + CLM model for the Clinton River watershed, daily weather data were obtained from the National Climatic Data Center (NCDC, 2010).We obtained the 30 m resolution National Elevation Dataset (NED) to generate average cell elevation and lowland storage bottom elevation.The 30 m resolution IFMAP 2001 land use and land cover data (MDNR, 2010) were aggregated to provide land use information.Three dominant land use types (PFTs) were modeled in each horizontal cell.The soil color data are extracted from a global data set (GSDT, 2000).We obtained the spatial distribution of lateral conductivities of the unconfined aquifer (glacial drift) as well as depths to bedrock by interpolating well records from the WELLOGIC database (GWIM, 2006;Simard, 2007) using kriging.The bedrock has very low permeability as it is composed of shale and some limestone.The model was calibrated against USGS gaging station 04165500 (Clinton River at Mt. Clemens) using a parallel version of the differential evolution algorithm (Chakraborty, 2008).The simulations were performed from 2001 to 2008 with the first three years used as model "spinup" and 2004-2008 included in our analysis.We used daily averaged top 10 cm soil moisture (θ ) fields for the analyses presented here.To simplify our attempt at estimating quantitative relationships between the spatial properties of the fineresolution moisture fields and the mean moisture fields, we focused on the unfrozen periods during each year (days 130-300).The 2004-2008 temporal average soil moisture predicted in the 220 m simulation is shown in Fig. 1.There is a large-scale spatial pattern in the soil moisture field, being generally higher on the eastern lowland plains than on the western hills, due to basin-scale groundwater flow.However, contrary to the coarser-resolution soil moisture map provided previously (Fig. 10d in Shen et al., 2013), Fig. 1 shows finescale features, e.g., high surface moisture near channels and high moisture in clayey soils near the eastern boundary.

Developing surrogate models for surface soil moisture moments
We developed two classes of simple surrogate models to represent soil moisture spatial heterogeneity as a function of mean soil moisture in coarse-resolution grid cells.We chose a factor of 2 5 in resolution to define the coarse-resolution grid cells, resulting in thirty-four 7040 m coarse-resolution grid cells across the watershed.θ , s θ , and k θ for each coarseresolution grid cell.We tested the impact on the accuracy of the relationship for best-fit first-second-, and third-order polynomials.The second class of surrogate model represents the fraction of high-resolution grid cells in each coarseresolution grid cell that fall into a particular mean soilmoisture bin.A disadvantage of this latter approach is that it is not as easy to mathematically synthesize the patterns, while a potential advantage is that it represents the probability distribution function (PDF) of the moisture even in the case where the first few statistical moments do not fully capture its properties.

Relationships between soil moisture heterogeneity and system properties
We investigated the relationship between daily σ 2 θ and µ θ over the µ θ range where σ 2 θ decreases with increasing µ θ .As shown below, most of the µ θ predictions were above the ∼ 0.2 breakpoint (as often observed and predicted here) for the peak of a convex-up relationship between σ 2 θ and µ θ .Therefore, many of the coarse-resolution grid cells were relatively well characterized by a linear fit with a negative slope, although about 20 % of the grid cells were predicted to have a full convex-up relationship.For the latter grid cells, we evaluated the best-fit slope for the portion of the data to the wetter side of the peak of the convex-up relationship.
We investigated 16 hypothesized controllers of this slope (based on the literature cited in the Introduction), all of which are represented explicitly or implicitly in PAWS + CLM: specific upslope area, gradient, variance of the gradient in each coarse-resolution grid cell, aspect, soil characteristics (porosity, clay content, conductivity), temporal mean evapotranspiration (E (W m −2 )), temporal variance in Ê, bedrock topography, temporal mean groundwater depth (G w (m)), temporal variance in groundwater depth ( Ĝw (m)), elevation, mean surface roughness, variance in roughness, and stream drainage density.We used TopoToolbox (Schwanghart and Kuhn, 2010) to evaluate the topographic indices used in the analysis.

Comparing model predictions to observations
PAWS + CLM has been extensively tested and demonstrated favorable comparisons with various observations from several basins (Shen et al., 2013;Shen and Phanikumar, 2010;Niu and Phanikumar, 2012).In the Clinton River watershed, the model has been shown to satisfactorily reproduce streamflow observations both at the basin outlet and uncalibrated inner gages (Nash-Sutcliffe model efficiency coefficient ∼ 0.65), spatially distributed water table depths (R 2 = 0.66), soil temperature, and MODIS satellite-based observa- tions of leaf area index (LAI) and ET (Shen et al., 2013).In other basins, PAWS + CLM was able to match observed transient water table depths from a USGS monitoring well and water storage anomalies measured by the GRACE satellite.
In addition to the comparisons described above, we compared simulated versus observed soil moisture at a site in Romeo, MI, from an Enviro-weather automated weather station network (Fig. 2).Since maintenance records indicated problems with the soil moisture sensor installation in 2008, we only show comparisons in 2009.The winter freeze-up at the beginning of 2009, shown as a period of very low soil moisture, was well captured by the model.The subsequent large variations due to freeze and thaw were also closely reproduced, with some overestimation near the end of the freezing cycle (early April 2009).In May, soil moisture was over estimated during the recession periods after storms.In late spring, plants may preferentially increase rooting density near the surface when there is high moisture content, leading to a stronger recession of near-surface moisture (Sivandran and Bras, 2013).However, the current static rooting algorithm in CLM cannot reproduce this mechanism, and therefore may be partly responsible for this bias.From June to November the model accurately predicted the mean, range of fluctuations (0.25 ∼ 0.34), and general trend.However, toward the end of the year, the predicted freeze-up was not present in the observations.These mismatches may be attributed to differences between grid average moisture of a 220 m cell and the site-specific moisture measured by the probe or local variation and uncertainty in subsurface properties.

Predicted mean moisture
The range and dynamics of predicted mean moisture at the coarse resolution varied substantially across the watershed (Figs. 1, 3).The western upland grid cells tended to be drier overall with the mean moisture increasing toward the east and south, which are lower-elevation grid cells receiving both surface and subsurface water inputs.The low precipitation inputs in 2006 had proportionally larger impacts in the wetter, eastern grid cells, resulting in up to 25 % decreases in mean saturation.

Relationships between soil moisture mean and higher-order moments
Using the 0-10 cm soil moisture predictions from the 220 m resolution simulation, we evaluated µ θ , σ 2 θ , s θ , and k θ at every time point (daily) for each of the thirty-four 7040 m × 7040 m coarse-resolution grid cells (Fig. 4 shows representative transient profiles for one subregion over 90 days of the simulation).We used these temporally resolved values to build first-, second-, and third-order best-fit polynomial relationships, with the third-order fits having generally the best predictive power and therefore applied in the remainder of our analysis.Overall, these surrogate models accurately captured the relationships between µ θ and σ 2 θ , s θ , and k θ , with mean R 2 values of 0.73, 0.74, and 0.75, respectively.
Different types of σ 2 θ ∼ µ θ relationships were predicted among the 34 coarse-resolution grid cells across the watershed.Some grid cells exhibited large, negative slopes with little scattering, e.g., #10, #17, #40, and # 41, indicating that soil moisture heterogeneity in these cells was strongly controlled by mean moisture, and that the variability was smallest on the wettest days.Some cells have much smaller slopes, e.g., #6, #7, #22, and #23, suggesting that their variability was less sensitive to mean moisture.These cells tended to have very small spatial variability throughout the year.Most grid cells with monotonic σ 2 θ ∼ µ θ relationships did not experience mean moisture below ∼ 0.25.However, for some (e.g., grid cells #10, #17, and #24), this monotonic, approximately linear relationship extended down to µ θ ∼ 0.2.We did not observe any grid cells with a purely upward σ 2 θ ∼ µ θ slope.On the other hand, about 20 % of the grid cells had convex-upward σ 2 θ ∼ µ θ relationships (e.g., grid cells #19, #26, #32, and #3).These grid cells primarily reside in the large topographic gradient in the middle of the watershed that alternates between recharge and discharge across the year (Salvucci and Entekhabi, 1995;Shen et al., 2013) and correspond with relatively higher drainage densities (Fig. 6).Higher drainage density corresponds to larger topographic variation, and this region connects upland hills and lowland plains and is characterized by a sharp change in elevation.As a result it is also a transition zone over which the distance to the water table decreases strongly.Therefore the 7040 m cells in these regions all included large variations in soil moisture, and they shifted from high to low water table regimes seasonally.The differences in these relationships indicate that at the 7040 m × 7040 m scale, the σ 2 θ ∼ µ θ relationships are determined locally, a finding consistent with that by Mascaro et al. (2010), where coefficients in a predictive formula for scaling exponents were related to local attributes.
For a particular coarse-resolution grid cell, the scattering of the σ 2 θ ∼ µ θ points around the polynomial fit, or departure from a deterministic function, can be attributed to different hydrologic processes that similarly affect the mean but differently affect the spatial heterogeneity.For example, homogeneous precipitation increases surface moisture evenly across the domain, and therefore decreases the variance.This homogenizing effect acts as the major driver that sets the negative slope in the σ 2 θ ∼ µ θ curves.However, an increase in regional groundwater flow would create spatial heterogeneity that adds to the variance.This effect is clear in the transition zones (e.g., grid cells #19 and #26).A flood wave that inundates riparian zones (which are represented in PAWS + CLM) would increase the mean soil moisture and spatial heterogeneity in the grid cell by increasing soil moisture only in the riparian zones.Grid cells that are further to the west have smaller σ 2 θ ranges for particular values of µ θ , and have soil moisture that are less impacted by exfiltration.
As mentioned above, Famiglietti et al. (2008) used data from several ground-based measurement campaigns in the Southern Great Plains and Iowa to characterize relationships between µ θ , σ 2 θ , and s θ .They found convex-upward relationships between σ 2 θ and µ θ at 800 m and 50 km scales with a mean moisture in the range [∼ 0.05, 0.4], and fit an exponential function to the standard deviation: σ θ = k 1 µ θ exp (−k 2 µ θ ).For the range of mean moisture we predicted in the Clinton Watershed [∼ 0.2, 0.45] (i.e., a smaller range than used in the Famiglietti et al. (2008) study), the overall monotonically declining trend in σ 2 θ with µ θ qualitatively matched the trend they reported (lower left panel of Fig. 5).There were, however, grid cells with higher variance that did not fit this pattern (i.e., #19, #26, and #33).We calculated k 1 (1.3 ± 0.3) and k 2 (7.1 ± 1.0) across the 7040 m coarse-resolution grid cells and found a mean R 2 for all the grid cells of 0.48.These predicted values of k 1 and k 2 matched well those reported by Famiglietti et al. (2008) for their 1.6 km scale (1.2 and 7.1, respectively).We note also that the third-order polynomial fit explained more of the variance of the modeled relationships than did the exponential relationship, but we are not aware of a mechanistically based rationale for a choice of this relationship.
The relationships between µ θ and s θ also varied across the watershed (Fig. 7).For the grid cells toward the west (in the four western columns), which are typically drier than those to the east, s θ was predominantly positive across the mean moisture range, implying a consistently right-skewed PDF.The transition between positive and negative s θ occurred in several of these grid cells at µ θ of about 0.3-0.35(#4, #10,  and #17).For the wetter grid cells to the east (column 5-7), the µ θ distribution was predominantly left-skewed (i.e., s θ < 0), even though part of the µ θ range was drier than the transition values for the grid cells farther to the west.For the sixth and seventh column (farthest east), µ θ was predicted to be above 0.3 for the entire simulation period, and most of these grid cells (and those in the southern portion of column 5) showed a decreasing s θ with increasing µ θ .This pattern is consistent with the fact that there is a maximum µ θ value possible (corresponding to fully saturated), and as more of the 220 m grid cells reach this level, the PDF becomes more left skewed.
Comparing our predictions (lower left panel of Fig. 7) to the 800 m and 50 km s θ relationships with µ θ reported in Famiglietti et al. (2008) indicates good qualitative agreement: a monotonic decrease from positive s θ values between 0 and 1 at µ θ of ∼ 0.2 to a s θ value of between about −1 and −2 at µ θ of ∼ 0.4.In our predictions, and somewhat visible in the Famiglietti et al. (2008) observations, there is a divergence of s θ values toward the wetter end of the µ θ range.The best linear fit to these predictions had a slope of −13 and intercept of 4.2, which corresponded well to values inferred from their observations at the 1.6 km scale.However, the slope and intercepts inferred from their observations varied substantially and inconsistently across scale, making this comparison inconclusive.
The relationships between predicted k θ and µ θ (Fig. A1 in the Appendix) can be divided into a few characteristic shapes: (1) monotonically increasing (e.g., #11, #27, #34, and #41), (2) relatively constant (e.g., #18 and #23), and (3) convex down (e.g., #13, #14, and #20).To the extent that k θ represents an index of "peakedness" in the moisture distribution, an increase in k θ with increasing mean moisture is consistent with the limit in the range occurring at full saturation, and with s θ becoming more negative in this part of the range.The more strongly convex down shapes occur in grid cells where the mean soil moisture range extends toward fully saturated, so the relationships that are more constant with µ θ variations may simply be a result of that grid cell not experiencing periods with higher µ θ .
Finally, we also tested our results against the theoretical predictions of Montaldo and Albertson (2003), who concluded that the time derivative of the root-zone soil moisture variance ( ∂σ 2 θ ∂t ) would increase as the covariance between soil moisture and infiltration increased, or decrease as the covariance between soil moisture and either drainage or transpiration increased.We tested these potential dependencies by comparing our predicted values of ∂σ 2 θ ∂t to θ K , θ q r , and θ E , where K is the soil hydraulic conductivity (which affects infiltration), q r is the drainage flux, E is the evapotranspiration, the prime represents anomalies compared to the spatial mean of that variable, and the overbar represents a spatial average.We evaluated these relationships with daily, weekly, and monthly averaging periods using the model predictions and found very weak relationships.These results indicate that the creation and destruction of variance in a watershed model that represents a range of moisture redistribution mechanisms is more complex than can be represented by these inferred dependencies.

Predicted soil moisture PDF as a function of mean moisture
Because σ 2 θ , s θ , and k θ of the soil moisture field do not fully characterize the PDF, we also examined the dependence of the proportion of high-resolution grid cells in each coarseresolution grid cell occupying µ θ bins (e.g., seven bins are shown in Fig. 8).The advantages of this binning approach are that it more fully represents the heterogeneity in µ θ and allows for visualization of variation within coarse-resolution grid cells.
An interesting observation from Fig. 8 is that the coarseresolution grid cells that have a clear convex-upward shape for variance versus mean moisture have the peak of that distribution very close to where the third and fourth quartile bands have equal representation (not shown).Thus, when the coarse-resolution grid cell has more of its fine-scale soil moisture mean values occupying the wettest quartile, the system variance begins to decline as mean moisture increases.In the grid cells that have a monotonically decreasing relationship, the transition between third and fourth quartile mean moisture bands occurs to the drier end of the µ θ range.
It is also interesting to note the different behavior of the wettest bin (the gray line).In many of the drier cells in the upland area (e.g., #23, #24, and #10), this bin remains relatively constant across the µ θ range.This pattern likely occurs because these regions have larger topographic variation and are effective at redistributing moisture, therefore preventing the wettest areas (or source areas; Lyon et al., 2004;Dunne and Black, 1970;Frankenberger et al., 1999) from expanding in area.In many cells on the western plains (e.g., #27, #28, #40, and #47), there appears to be a threshold soil moisture value, around 0.35, above which the wettest bin suddenly grows very rapidly as mean moisture increases.This rapid change is due to the upper limit of saturation set by soil porosity and the flatter terrain.

Relationships between soil moisture heterogeneity and system properties
As mentioned in the Introduction, many of the convexupward relationships reported in the literature appear to have a peak in this relationship at µ θ of ∼ 0.2, although that value is not universal (e.g., Rosenbaum et al., 2012).This transition point represents the transition between system properties (e.g., roughness, hydraulic conductivity) and fluxes (e.g., evapotranspiration) that tend to homogenize soil moisture versus those that lead to more heterogeneity.For example, imagine a flat region with a distribution of plants of equal potential evapotranspiration but with drought tolerances that are different functions of soil moisture (Fig. A2 in the Appendix shows an example using CLM4.5'sestimate of water stress on photosynthesis (β t ) for soils with different sand composition).A precipitation event that occurs on a dry coarseresolution grid cell would tend to alleviate the drought stress in a fraction of the plants, thereby leading to a higher heterogeneity in soil moisture (and therefore σ 2 θ ).If the precipitation continued and µ θ increased to a level where none of the plants were stressed, the now relatively more homogeneous evapotranspiration would tend to reduce σ 2 θ .As discussed in the Methods section, many controllers of these tradeoffs have been inferred from observations, and include topographical features, evapotranspiration, and edaphic properties.Because most of the coarse-resolution grid cells in our computational domain did not experience µ θ below ∼ 0.2, we investigated the relationship between σ 2 θ and µ θ over the range where σ 2 θ decreases with increasing µ θ .In our predictions, ∼ 80 % of the coarse-resolution grid cells were relatively well characterized by a linear fit with a negative slope.For the remaining ∼ 20 %, we evaluated the best-fit slope for the portion of the data to the wetter side of the peak of the convex-up relationship.
Of the 16 hypothesized controllers of this slope (m) that we investigated (see Methods), 6 had independent linear best-fits with R 2 > 0.05: gradient (g, R 2 = 0.07), mean of evapotranspiration (E (W m −2 ), R 2 = 0.16), temporal mean of the spatial variance of evapotranspiration (R 2 = 0.05), porosity (R 2 = 0.08), mean of groundwater depth (R 2 = 0.06), and mean of stream density (R 2 = 0.05).Using a stepwise linear regression with these six variables and allowing for first-order interactions, the best-fit model explained 59 % of the variance in m and had the form C 1 + C 2 gE, where C 1 and C 2 are constants.Thus, over the five years of simulation, the rate at which σ 2 θ declined with increasing µ θ was controlled primarily by the elevation gradient convolved with the temporal mean of evapotranspiration.In this relationship, increases in this product lead to less negative values of m (i.e., Figure 6.Drainage density (length of streams per area) for streams of order 1 and higher.Areas with clear convex-upward shapes for soil moisture variance versus mean tend to be in grid cells with higher drainage density.less sensitive response of σ 2 θ to variations in µ θ ).The larger the gradient and higher the evapotranspiration in the grid cell, the lower the response of soil moisture spatial heterogeneity to mean soil moisture.This conclusion is consistent with the ideas that (1) high-evapotranspiration grid cells are more likely to be those with lower likelihood of partial water stress limitation and (2) the high-gradient grid cells more efficiently mix surface water, thereby reducing soil moisture gradients.

Applying the simple surrogate models to predict fine-scale heterogeneity
We also evaluated whether the simple polynomial surrogate models can be used to predict dynamic variations in σ 2 θ , s θ , and k θ given variations in µ θ .For this exercise, we used the first three years of the simulations to train the third-order polynomial surrogate model.We then applied those surrogates across the five years of simulation to evaluate estimates for σ 2 θ , s θ , and k θ within each coarse-resolution grid cell and compared those estimates to the moments calculated directly from the fine-resolution simulation.
The surrogate-estimated values of σ 2 θ over time corresponded well to those from the fine-scale solution, with an R 2 value of 0.78 and mean absolute bias of 0.00014 (Fig. 9).The estimates relatively accurately captured several of the dominant transients that occurred, including during the 2006 drought in, for example, grid cells #26, #27, and #28.These grid cells span a µ θ gradient from relatively drier to wetter, and that transition is apparent in the σ 2 θ gradient across these grid cells (from a value ∼ 0.001 to ∼ 0.006).The temporal dynamics during drying are also different between these grid cells; for example, during 2006 the response in grid cell #26 is a reduction in σ 2 θ , while in grid cells #27 and #28 the response is an increase in σ 2 θ .This differential response occurred because grid cells #27 and #28 had µ θ that was high enough before the drought that, even though they dried, remained to the right of the peak in the convex-upward relationship between σ 2 θ and µ θ .In contrast, grid cell #26 had a µ θ value before the drought that was close to the peak in that relationship and the reduction in µ θ therefore resulted in a reduction in σ 2 θ .The surrogate-estimated values of s θ over time corresponded well to those from the fine-scale solution, with an R 2 value of 0.74 and mean absolute bias of 0.11 (Fig. 10).The temporal variability in s θ was largest in the coarseresolution grid cells in the eastern (wetter) portion of the watershed and, in particular, during 2006 as the soils dried.During this period, for example, s θ in grid cells #27 and #28 increased rapidly in spring and then stabilized at a value near zero, indicating a relatively uniform distribution of µ θ .Interestingly, the relatively drier grid cell #26 did not have as strong a response in s θ , but all three grid cells stabilized at values indicating a more uniform µ θ distribution.
The polynomial surrogate model for k θ was less accurate than those for either s θ or µ θ , with an R 2 value of 0.51 and mean bias of 0.43 (Fig. A3 in the Appendix).In contrast to the s θ dynamics, k θ was relatively temporally variable in both the western (e.g., grid cells #15 and #29) and eastern portions of the watershed.In most years in many of the wetter southeastern grid cells there was a reduction in k θ (i.e., a flattening of the µ θ PDF) following the spring thaw followed by periodic increases associated with precipitation events.In contrast, during the 2006 dry period, the southeastern wetter grid cells showed a reduction that was sustained over the summer.

Limitations and future work
Many factors impact near-surface soil moisture heterogeneity in watersheds; characterizing this heterogeneity and its relationships with the mean moisture field, topographical features, vegetation, and climate forcing could improve regional to global-scale estimates of surface energy and greenhouse gas exchanges (which depend on soil moisture).The approach we applied here, using a tested high-resolution numerical model, showed promise in this regard, yet a number of limitations remain for future work.For example, we did not consider the temporal pattern of the relationships.We note, though, that we did not observe large hysteresis in the relationships between the mean moisture field and higherorder moments, as has been reported previously (Ivanov et al., 2010;Vivoni et al., 2010).The high-dimensional state space, especially the influence of vertical soil moisture profiles, also needs to be examined in detail.We also considered only one watershed over a relatively short (five-year) period.A longer study, covering several decades, would better capture the interannual range in precipitation and vegetation status (and therefore soil moisture) that the watershed experiences.Repeating the analyses described here for other watersheds with different topography, vegetation, bedrock features, soil properties, etc. could yield insights into the impacts these various properties have on soil moisture heterogeneity and its relationship with mean soil moisture.Such an analysis could also be used to test the extent to which relationships developed in one watershed could be used for other watersheds, and more specifically, for which watershed features such an extrapolation would be appropriate.We also note that soil moisture heterogeneity exists at scales much smaller than we simulated here (220 m), including down to the soil macropore scale, where the soil biogeochemical transformations that impact ecosystem function and climate occur.Further computational and data enhancement that examine variability at hyper-resolution (on the order of 10 m) are the next reasonable steps and are possible with current computational resources.Developing modeling structures that account for, at some level, this wide range of  scales will be important for consistently representing terrestrial ecosystem processes.
Finally, an important application of the relationships developed here would be to apply them with coarse-resolution simulations to substantially reduce computational costs for regional and global simulations.Comparisons between fine and coarse-resolution simulations of a particular watershed have been used to transfer nonlinearity from microscale to mesoscale models via nonstationary effective parameters (Barrios and Frances, 2012).The approach we envision here would combine that type of model calibration with a cost function that includes a larger suite of observations, including the ability to capture the fine-scale predicted soil moisture heterogeneity and its relationship with mean moisture.

Summary and conclusions
We applied a watershed-scale hydrological model (PAWS + CLM4) that has been previously tested in the Clinton River watershed in Michigan to investigate relationships between fine-scale, near-surface soil moisture mean and spatial heterogeneity.We used fine-resolution (220 m) simulations to calculate statistical properties of soil moisture at a resolution 2 5 times coarser (∼ 7 km), and then (1) developed and evaluated simple polynomial surrogate models relating soil moisture mean to its variance, skewness, and kurtosis during the nonfrozen portion of five years; (2) applied those surrogates over the time period to evaluate their accuracy; and (3) investigated the relationship between the predicted soil moisture mean and variance and topographic and hydrological system properties.The surrogate models accurately reproduced the relationships between the soil moisture mean and higher-order moments (R 2 ∼ 0.7-0.8).Driving the surrogate model with the mean coarse-resolution soil moisture predictions across the simulation period gave comparably accurate predictions for variance and skewness, and a less accurate (R 2 ∼ 0.5) prediction of kurtosis.This close correspondence between the surrogate and fine-resolution model predictions argues that these types of reduced-order models can be used to inform heterogeneity at scales below those explicitly represented at coarse resolution.It also argues that the surrogates can be effectively applied to understand controls on spatial heterogeneity of soil moisture, as discussed below.
In our predictions, and in many reported observations, there is typically a reduction in soil moisture variance with increasing mean past a particular intermediate value of the mean.Many possible controllers of this relationship have been hypothesized; in our predictions the approximately linear relationship was estimated (R 2 = 0.59) using only the elevation gradient convolved with the mean of evapotranspiration in the coarse-resolution grid cell.Increases in the elevation gradient and mean evapotranspiration each, and

Figure 1 .
Figure 1.Topography and predicted 2004-2008 temporal average soil moisture of the Clinton River watershed (the black line outlines the watershed).The 3-D elevation and shading represent the digital elevation model, which is enhanced by a 1 : 50 ratio, and the color represents the average soil moisture.

Figure 2 .
Figure 2. Comparison between 220 m resolution predicted and measured soil moisture for a site in Romeo, MI.The large differences in 9 December are caused by the model predicting freezing in the top 10 cm of soil, while the observations suggest unfrozen conditions.

Figure 3 .
Figure3.Fine-resolution (220 m) simulated 0-10 cm soil moisture mean (µ θ ).The individual plots are distributed in the same pattern as the coarse-resolution grid cell they represent in the watershed (see watershed boundary in Fig.1).

Figure 5 .
Figure 5.In each subplot (except for the two in the bottom left-hand corner), soil moisture variance (σ 2 θ ) is plotted versus mean (µ θ ) for that coarse-resolution grid cell based on the fine-resolution (220 m) model predictions (blue dots) and the best-fit third-order polynomial fits (green line).The individual 7040 m coarse-resolution grid cells are placed in their relative position in the watershed.Bottom left-hand corner: (a) soil moisture variance (σ 2 θ ) versus mean (µ θ ) for all grid cells combined; (b) the PDF of R 2 values referenced to the polynomial fit from each coarse-resolution grid cell.

Figure 7 .
Figure 7.In each subplot (except for the two in the bottom corner), soil moisture skewness (s θ ) is plotted versus mean (µ θ ) for that coarse-resolution grid cell based on the fine-resolution (220 m) model predictions (blue dots) and the best-fit third-order polynomial fits (green line).The individual 7040 m coarse-resolution grid cells are placed in their relative position in the watershed.Bottom left-hand corner: (a) soil moisture skewness (s θ ) versus mean (µ θ ) for all grid cells combined; (b) the PDF of R 2 values referenced to the polynomial fit from each coarse-resolution grid cell.

Figure 8 .
Figure 8. Relationships between mean coarse-resolution grid cell soil moisture (µ θ ) and the proportion of the grid cell in each of 10 moisture bins (the wettest 7 bins are shown) that span that coarse-resolution grid cell's µ θ range.The individual 7040 m coarse-resolution grid cells are placed in their relative position in the watershed.

Figure 9 .
Figure 9.Comparison over time between the fine-resolution (220 m) simulated soil moisture variance (σ 2 θ ) and that predicted by the surrogate model using the mean of the fine-resolution soil moisture (µ θ ).The individual 7040 m coarse-resolution grid cells are placed in their relative position in the watershed.Bottom left-hand corner: (a) the PDF of R 2 values between the surrogate model and fine-resolution model predictions of soil moisture variance across all the coarse-resolution grid cells; (b) mean absolute bias between the surrogate and fine-model predictions of soil moisture variance across all the coarse-resolution grid cells.

Figure A2 .
Figure A2.The water stress term on photosynthesis applied in CLM4 as a function of percent sand of the soil.