Model representation of the coupling between evapotranspiration and soil water content at different depths

Soil water content (θ ) influences the climate system by controlling the fraction of incoming solar and longwave energy that is converted into evapotranspiration (ET). Therefore, investigating the coupling strength between θ and ET is important for the study of land surface–atmosphere interactions. Physical models are commonly tasked with representing the coupling between θ and ET; however, few studies have evaluated the accuracy of model-based estimates of θ /ET coupling (especially at multiple soil depths). To address this issue, we use in situ AmeriFlux observations to evaluate θ /ET coupling strength estimates acquired from multiple land surface models (LSMs) and an ET retrieval algorithm – the Global Land Evaporation Amsterdam Model (GLEAM). For maximum robustness, coupling strength is represented using the sampled normalized mutual information (NMI) between θ estimates acquired at various vertical depths and surface evaporation flux expressed as a fraction of potential evapotranspiration (fPET, the ratio of ET to potential ET). Results indicate that LSMs and GLEAM are generally in agreement with AmeriFlux measurements in that surface soil water content (θs) contains slightly more NMI with fPET than vertically integrated soil water content (θv). Overall, LSMs and GLEAM adequately capture variations in NMI between fPET and θ estimates acquired at various vertical depths. However, GLEAM significantly overestimates the NMI between θ and ET, and the relative contribution of θs to total ET. This bias appears attributable to differences in GLEAM’s ET estimation scheme relative to the other two LSMs considered here (i.e., the Noah model with multi-parameterization options and the Catchment Land Surface Model, CLSM). These results provide insight into improved LSM model structure and parameter optimization for land surface–atmosphere coupling analyses.

Abstract. Soil water content (θ ) influences the climate system by controlling the fraction of incoming solar and longwave energy that is converted into evapotranspiration (ET). Therefore, investigating the coupling strength between θ and ET is important for the study of land surface-atmosphere interactions. Physical models are commonly tasked with representing the coupling between θ and ET; however, few studies have evaluated the accuracy of model-based estimates of θ / ET coupling (especially at multiple soil depths). To address this issue, we use in situ AmeriFlux observations to evaluate θ / ET coupling strength estimates acquired from multiple land surface models (LSMs) and an ET retrieval algorithm -the Global Land Evaporation Amsterdam Model (GLEAM). For maximum robustness, coupling strength is represented using the sampled normalized mutual information (NMI) between θ estimates acquired at various vertical depths and surface evaporation flux expressed as a fraction of potential evapotranspiration (fPET, the ratio of ET to potential ET). Results indicate that LSMs and GLEAM are generally in agreement with AmeriFlux measurements in that surface soil water content (θ s ) contains slightly more NMI with fPET than vertically integrated soil water content (θ v ). Overall, LSMs and GLEAM adequately capture variations in NMI between fPET and θ estimates acquired at various vertical depths. However, GLEAM significantly overestimates the NMI between θ and ET, and the relative contribution of θ s to total ET. This bias appears attributable to differences in GLEAM's ET estimation scheme relative to the other two LSMs considered here (i.e., the Noah model with

Introduction
Soil water content (θ ) modulates water and energy feedbacks between the land surface and the lower atmosphere by determining the fraction of incoming solar energy that is converted into evapotranspiration (ET; Seneviratne et al., 2010Seneviratne et al., , 2013. In water-limited regimes, θ exhibits a dominant control on ET, and therefore exerts significant terrestrial control on the Earth's water and energy cycles. Accurately representing θ / ET coupling in land surface models (LSMs) is therefore expected to improve our ability to project the future frequency of extreme climates (Seneviratne et al., 2013).
A key question is how the constraint of θ on ET and sensible heat (H ) varies as θ is vertically integrated over deeper vertical soil depths. Given the tendency for the timescales of θ dynamics to vary strongly with depth, the degree to which the ET is coupled with vertical variations in θ determines the temporal scale at which θ variations are propagated into the lower atmosphere. Therefore, in order to represent θ / ET coupling, and thus land-atmosphere interactions in general, LSMs must accurately capture the relationship between vertically varying θ values and ET. Unfortunately, their ability to do so remains an open question.
Recently, land surface-atmosphere coupling strength has been investigated by sampling mutual information proxies (e.g., correlation coefficient or other coupling indices) between time series of θ and ET observations (or air temperature proxies for ET). Results suggest that, even when confined to very limited vertical support (e.g., within the top 5 cm of the soil column), surface θ estimates retain significant information for describing overall θ control on local climate (Ford and Quiring, 2014;Qiu et al., 2014;Crow, 2018, 2019). These findings are in contrast with the common perception that ET is constrained only by θ values within deeper soil layers (Hirschi et al., 2014). Hence, it is necessary to examine whether LSMs can realistically reflect observed variations of θ / ET coupling strength within the vertical soil profile.
Previous studies examining the θ / ET relationship have generally been based on Pearson product-moment correlation (Basara and Crawford, 2002;Ford et al., 2014), which captures only the strength of a linear relationship between two variables. However, the coupling between θ and ET is generally nonlinear. Therefore, non-parametric mutual information measures are generally more appropriate. Nearing et al. (2018) used information theory metrics (transfer entropy, in particular) to measure the strength of direct couplings between different surface variables, including soil water content, and surface energy fluxes at short timescales in several LSMs. They found that the LSMs are generally biased as compared with the strengths of couplings in observation data, and that these biases differ across different study sites. However, they did not look specifically at the effect of vertical water content profiles or of subsurface soil water content on partitioning surface energy fluxes.
Here we apply the information theory-based methodology of Qiu et al. (2016) to examine the relationship between the vertical support of θ estimates and their mutual information (MI) with respect to ET. Our approach is based on analyzing the MI content between ET and θ time series acquired from both LSMs, and ET retrieval algorithm -the Global Land Evaporation Amsterdam Model (GLEAM) -and AmeriFlux in situ observations. MI values are then normalized by entropy in the corresponding ET time series to remove the effect of inter-site variations to generate estimates of normalized mutual information (NMI) between θ and ET. Both surface (roughly 0-10 cm) soil water content (θ s ) and vertically integrated (0-40 cm) soil water content (θ v ) are considered to capture the impact of depth on NMI results. AmeriFluxbased NMI results are then compared with analogous NMI results obtained from LSM-based and GLEAM-based θ and ET time series.

Data and methods
The AmeriFlux network provides temporally continuous measurements of θ, surface energy fluxes and related envi-ronmental variables for sites located in a variety of North American ecosystem types, e.g., forests, grasslands, croplands, shrublands and savannas (Boden et al., 2013). To minimize sampling errors, AmeriFlux sites lacking a complete 3-year summer months (June, July and August) daily time series between the years of 2003 and 2015 (i.e., 3×92 = 276 daily observations in total) of θ s , θ v and latent heat flux (LE) are excluded here, resulting in the 34 remaining eligible AmeriFlux sites listed in Table 1. These sites cover a variety of climate zones within the contiguous United States (CONUS). Table 1 gives background information on these 34 sites including local land-cover information. Hydro-climatic conditions in each site are characterized using the aridity index (AI), calculated using CRU (Climate Research Unit, v4.02) monthly precipitation and potential evaporation (PET) datasets.
As described above, θ / ET coupling assessments made using AmeriFlux observations are compared with those using state-of-the-art LSMs including the Noah model with multiparameterization options (NOAHMP) and Catchment Land Surface Model (CLSM). In addition, θ and ET retrievals provided by the Global Land Evaporation Amsterdam Model (GLEAM) are also considered. See below for details on all three approaches. To avoid any spurious correlations between θ and ET due to seasonality, all NMI analyses are performed on θ and ET time series anomalies acquired during the period 2003-2015. The θ and ET anomalies are calculated by removing the seasonal cycle -defined as 31 d window averages centered on each day of the year sampled across all years of the 2003-2015 historical data record -from the raw θ and ET time series data. The analysis is limited to the CONUS during summer months (June, July and August) when θ / ET coupling is expected to be maximized.

Ground-based AmeriFlux measurements
The Level 2 (L2) AmeriFlux LE and H flux observations are based on high-frequency (typically > 10 Hz) eddy covariance measurements processed into half-hourly averages by individual AmeriFlux investigators. LE and θ observations at a half-hour time step and without gap-filling procedures are collected from the AmeriFlux Site and Data Exploration System (see http://ameriflux.ornl.gov/, last access: November 2018). The LE and θ observations are further aggregated into daily (00:00 to 24:00 UTC) values, and daily LE is converted into daily ET using the latent heat of vaporization.
Daily ET values based on less than 30 % half-hourly coverage (i.e., < 15 half-hourly observations per day) are considered not representative at a daily timescale and are therefore excluded.
Soil water content measurements are generally available at two discrete depths that vary between the AmeriFlux sites (Table 1). Here, the top (i.e., closest to the surface) soil water content observation is always used to represent surface soil water content (θ s ). Since the depth of this top-layer measure- ment varies between 0 and 15 cm (see Table 1), we consider the surface-layer measurement θ s to be roughly representative of 0-10 cm (vertically integrated) θ . For more details on AmeriFlux sites utilized here, see Raz-Yaseef et al. (2015). Given variations in the depth of the lower AmeriFlux θ observations (see Table 1), we applied a variety of approaches for estimating vertically integrated soil water content (θ v ). Our first approach, hereinafter referred to as Case I, is based on the application of an exponential filter (Wagner et al., 1999;Albergel et al., 2008) to extrapolate θ s to a consistent 40 cm bottom-layer depth. Therefore, only θ s is used to derive θ v and the bottom-layer (or second-layer) AmeriFlux θ measurement is neglected in this case. The application of the exponential filter requires a single timescale parameter T . Since θ measurements from the United States Department of Agriculture's Soil Climate Analysis Network (SCAN) are taken at fixed soil depth, we utilized this dataset to determine the most appropriate parameter T at AmeriFlux sites. Following Qiu et al. (2014), first, we estimated the optimal parameter T (T opt ) for the extrapolation of θ measurements from 10 to 40 cm depth and established a global relationship between T opt and site-based NDVI (MOD13Q1 v006, 250 m, 16 d; T opt = 2.098 × exp(−1.895× (NDVI +0.6271)) +2.766). Then, this global relationship (goodness of fit R 2 : 0.85) is applied to AmeriFlux sites to extrapolate 0-10 cm θ s time series into 0-40 cm θ v .
Previous research has suggested that such a filtering approach does not significantly squander ET information present in actual measurements of θ v (Qiu et al., 2014(Qiu et al., , 2016. Nevertheless, since the quality of θ v estimates is important in our analysis, we also calculated two additional cases where 0-40 cm θ v is estimated using (1) the bottom-layer soil water content measurement acquired at each AmeriFlux site (hereinafter, Case II) and (2) linear interpolation of θ s , and the bottom-layer AmeriFlux soil water content measurement (hereinafter, Case III). The sensitivity of key results to these various cases is discussed below.

LSM-and GLEAM-based simulations
Simulations are acquired from the NOAHMP (Niu et al., 2011) and CLSM (Koster et al., 2000) LSMs embedded within the NASA Land Information System (LIS, Kumar et al., 2006) and the GLEAM ET retrieval algorithm (Miralles et al., 2011). Both NOAHMP and CLSM are set up to simulate 0.125 • θ profiles at a 15 min time step using North America Land Data Assimilation System, Phase 2 (NLDAS-2) forcing data. A 10-year model spin-up period (1992 to 2002) is applied for NOAHMP and CLSM.
NOAHMP numerically solves the one-dimensional Richards equation within four soil layers of thicknesses of 0-10, 11-30, 31-60 and 61-100 cm. Major parameterization options relevant to θ simulation include options for canopy stomatal resistance parameterization and schemes controlling the effect of θ on the vegetation stress factor β. Here we employed the Ball-Berry-type stomatal resistance scheme and Noah-type soil water content factor controlling the β factor. The specific expressions are as follows: where θ wilt and θ ref are, respectively, soil water content at the wilting point (m 3 m −3 ) and reference soil water content (m 3 m −3 ), which is set as field capacity during parameterization. θ i and z i are soil water content (m 3 m −3 ) and soil depth (cm) at ith layer, N root and z root are total number of soil layers with roots and total depth (cm) of root zone, respectively.
Following the Ball-Berry stomatal resistance scheme, the θ -controlled β factor and other multiplicative factors including temperature and foliage nitrogen simultaneously determine the maximum carboxylation rate V max as follows: where V max25 is maximum carboxylation rate at 25 • C (µmol CO 2 m −2 s −1 ), α vmax is a parameter sensitive to vegetation canopy surface temperature T v , f (N ) is a factor representing foliage nitrogen and f (T v ) is a function that mimics thermal breakdown of metabolic processes. Based on V max , photosynthesis rates per unit leaf area index (LAI) including carboxylase-limited (Rubisco limited, denoted by A C ) type and export-limited (for C 3 plants, denoted by A S ) type are calculated, respectively. The minimum of A C , A S and the light-limited photosynthesis rate determines stomatal resistance r s , and consequently affects ET over vegetated areas.
For the complete NOAHMP configuration, please see Table S1 in the Supplement. CLSM simulates the 0-2 and 0-100 cm soil water content and evaporative stress as a function of simulated θ and environmental variables. ET is then estimated based on the estimated evaporative stress and land-atmosphere humidity gradients. Energy and water flux estimates are iterated with soil state estimates (e.g., θ and soil temperature) to ensure closure of surface energy and water balances. For a detailed explanation of CLSM physics, please refer to Koster et al. (2000).
GLEAM is a set of algorithms dedicated to the estimation of terrestrial ET and root-zone θ from satellite data. In this study, the latest version of this model (v3.2a) is employed. In GLEAM, the configuration of soil layers varies as a function of the land-cover type. Soil stratification is based on three soil layers for tall vegetation (0-10, 10-100 and 100-250 cm), two layers for low vegetation (0-10 and 10-100 cm) and only one layer for bare soil (0-10 cm; Martens et al., 2017).
The cover-dependent PET (mm d −1 ) of GLEAM is calculated using the Priestley and Taylor (1972) equation based on observed air temperature and net radiation. Following this, estimates of PET are converted into actual transpiration or bare soil evaporation (depending on the land-cover type, ET (mm d −1 )), using a cover-dependent, multiplicative stress factor S (-), which is calculated as a function of microwave vegetation optical depth (VOD) and root-zone θ (Miralles et al., 2011). The related expressions are as follows: where E i is rainfall interception (mm), S essentially represents the fPET (see Sect. 2.3) estimated by GLEAM, θ c (m 3 m −3 ) is the critical soil water content and θ ω (m 3 m −3 ) is the soil water content of the wettest layer, assuming that plants withdraw water from the layer that is most accessible. Based on Eq. (4), GLEAM S (or fPET) tends to become more sensitive to θ in areas of low VOD seasonality (i.e., low differences between VOD and VOD max ). As for bare soil conditions, S is linearly related to surface soil water content (θ 1 ): To resolve variations in the vertical discretization of θ applied by each model, we linearly interpolated NOAHMP, CLSM and GLEAM outputs into daily 0-10 and 0-40 cm soil water content values using depth-weighted averaging.

Variable indicating soil water content and surface flux coupling
Soil water content-ET coupling can be diagnosed using a variety of different variables derived from ET, e.g., the fraction of PET (fPET, the ratio of ET and PET) or the evaporative fraction (EF, the ratio of LE and the sum of LE and sensible heat). Since ET is strongly tied to net radiation (R n ; Koster et al., 2009), both fPET and EF are advantageous in that they normalize ET by removing the impact of non-soil water content influences on ET (e.g., net radiation, wind speed and soil heat flux (G)). However, since sensible heat flux is not provided in the GLEAM dataset, we are restricted here to using fPET.
It should be noted that the applied meteorological forcing data for NOAHMP and CLSM are somewhat different from those used for GLEAM. Therefore, to minimize the impact of this difference, NOAHMP and CLSM fPET are computed from North American Regional Reanalysis (NARR) using the modified Penman scheme of Mahrt and Ek (1984), while GLEAM fPET is calculated using its own internal PET estimates. To examine the impact of the PET source on the results, AmeriFlux fPET calculations are duplicated using both GLEAM-and NARR-based PET values.

Information measures
Mutual information (MI; Cover and Thomas, 1991) is a nonparametric measure of correlation between two random variables. MI and the related Shannon-type entropy (SE; Shannon, 1948) are calculated as follows. Entropy about a random variable ζ is a measure of uncertainty according to its distribution p ζ and is estimated as the expected amount of information from p ζ sample: Likewise, MI between ζ and another variable ψ can be thought of as the expected amount of information about variable ζ contained in a realization of ψ and is measured by the expected Kullback-Leibler (KL) divergence (Kullback and Leibler, 1951) between the conditional and marginal distributions over ζ : In this context, the generic random variables ζ and ψ represent fPET and θ (soil water content), respectively. The observation space of the target random variable fPET is discretized using a fixed bin width. As bin width decreases, entropy increases but mutual information asymptotes to a constant value. On the other hand, increased bin width requires a greater sample size, which cannot always be satisfied. The trick is choosing a bin width where the NMI values stabilize with sample size. After a careful sensitivity analysis, we choose a fixed bin width of 0.25 [-] for fPET and make sure that each AmeriFlux site had enough samples to accurately estimate the NMI, and change of this constant bin width from 0.1 to 0.5 [-] will not significantly alter our conclusions. Following Nearing et al. (2016), a bin width of 0.01 m 3 m −3 (1 % volumetric water content) for θ is applied.
Integrations required for MI calculation in Eq. (7) are then approximated as summations over the empirical probability distribution function bins (Paninski, 2003). By definition, the MI between two variables represents the amount of entropy (uncertainty) in either of the two variables that can be reduced by knowing the other. Therefore, the MI normalized by the entropy of the AmeriFlux-based fPET measurements represents the fraction of uncertainty in fPET that is resolvable given knowledge of the soil water content state (Nearing et al., 2013). Unlike Pearson's correlation coefficient, MI is insensitive to the impact of nonlinear variable transformations. Therefore, it is well suited to describe the strength of the (potentially non-linear) relationship between θ and fPET.
Here, we applied this approach to calculate the MI content between soil water content representing different vertical depths (as reflected by θ s and θ v ) and fPET at each AmeriFlux site. All estimated site-specific MI are normalized by the entropy of the corresponding AmeriFlux-based fPET measurements to remove the effect of inter-site entropy variations on the magnitude of NMI differences. The resulting normalized MI calculations between both θ s and θ v and fPET are denoted as NMI(θ s , fPET) and NMI(θ v , fPET), respectively.
The underestimation of observed θ / ET coupling via the impact of mutually independent θ and ET errors in Ameri-Flux observations (Crow et al., 2015) is minimized by focusing on the ratio between NMI(θ S , fPET) and NMI (θ v , fPET). Therefore, relative comparisons between NMI(θ s , fPET) and NMI(θ v , fPET) are based on examining the size of their mutual ratio NMI(θ S , fPET) / NMI (θ v , fPET). To quantify the standard error of NMI differences between various soil water content products, we applied a nonparametric, 500-member bootstrapping approach and calculated the pooled average of sampling errors across all sites assuming spatially independent sampling error.
Finally, we also examined the impact of potential nonlinearity in the θ / ET relationship by comparing nonparametric NMI results with comparable inferences based on a conventional Pearson's correlation calculation. The correlation-based coupling strength between θ s and fPET is denoted as R(θ s , fPET) and between θ v and fPET as R(θ v , fPET).

Comparison of NMI(θ s , fPET) and NMI(θ v , fPET)
Figure 1 contains boxplots of modeled and observed NMI(θ s , fPET) and NMI(θ v , fPET), i.e., the relative magnitude of fPET information contained in surface soil water content and vertically integrated (0-40 cm) soil water content estimated from Case I, sampled across all the AmeriFlux locations listed in Table 1. According to the AmeriFlux ground mea- surements, median values of NMI(θ s , fPET) and NMI(θ v , fPET; across all sites) are near 0.3 [-]. This suggests that approximately 30 % of the uncertainty (i.e., entropy at this particular bin width of 0.25 [-]) in fPET can be eliminated given knowledge of either surface or vertically integrated soil water content state. This is consistent with earlier results in Qiu et al. (2016) who used similar metrics to evaluate θ / EF (evaporative fraction) coupling strength. The sampled medians of NMI(θ s , fPET) and NMI(θ v , fPET) estimated by the NOAHMP and CLSM models are similar to these (observation-based) AmeriFlux values. With the single exception that the CLSM predicts much larger site-to-site variation in NMI(θ s , fPET).
Using the 34 AmeriFlux site-collocated samples pixels for a paired t test, both LSMs and GLEAM overall exhibit significantly (at the 0.05 level) higher NMI(θ s , fPET) compared to NMI(θ v , fPET), implying the surface soil water content observations contain more fPET information than vertically integrated soil water content observations. However, the observed difference between NMI(θ s , fPET) and NMI(θ v , fPET) is less discernible in AmeriFlux measurements (Fig. 1a).
Here, AmeriFlux observations are used as a baseline for LSM and GLEAM evaluation. However, it should be stressed that random observation errors in θ and fPET will introduce a low bias into AmeriFlux-based estimates of both NMI(θ s , fPET) and NMI(θ v , fPET; Crow et al., 2015) and thus their difference as well. To address this concern, Fig. 1b plots the ratio of NMI(θ s , fPET) and NMI(θ v , fPET), which effectively normalizes (and therefore minimizes) the impact of random observation errors. As discussed above, these ratio results illustrate the general tendency for NMI(θ s , fPET) to exceed NMI(θ v , fPET). They also highlight the tendency for GLEAM to overvalue θ s (relative to θ v ) when estimating fPET. A second approach for reducing the random error of θ and fPET measurement errors is the correction based on triple collocation (TC) applied in Crow et al. (2015). However, this approach is currently restricted to linear correlations and cannot be applied to estimate NMI. Future work will examine extending the information-based TC approach of Nearing et al. (2017) to the examination of NMI.

Sensitivity of AmeriFlux-based NMI(θ s , fPET) / NMI(θ V , fPET)
As mentioned in Sect. 2.1, an important concern is the impact of interpolation errors used to estimate 0-40 cm θ v from AmeriFlux θ s observations acquired at non-uniform depths.
To ensure that different methods for calculating AmeriFlux θ v values do not affect the main conclusion of this study, we configured three cases for θ v calculation, and compared their NMI(θ S , fPET) / NMI(θ v , fPET) results in Fig. 2. Case I reflects the baseline use of the exponential filter described in Sect. 2.1. However, slight changes to AmeriFlux results are noted if alternative approaches are used. Specifically, AmeriFlux-based NMI(θ v , fPET) increases and closes the gap with NMI(θ s , fPET) if the bottom-layer soil water content measurements are instead directly used as θ v (Case II) or if 0-40 cm θ v is based on the linear interpolation of the two AmeriFlux θ observations (Case III); the impact of this modest sensitivity on key results is discussed below. In addition, switching from GLEAM-to NARR-based PET when calculating fPET for AmeriFlux-based NMI(θ s , fPET) and NMI(θ v , fPET) does not qualitatively change results and produces only a very slight (∼ 6 %) increase in the median NMI(θ s , fPET) / NMI(θ v , fPET) ratio.
3.3 Spatial distribution of NMI(θ s , fPET) and NMI(θ v , fPET) Figure 3 plots the spatial distribution of NMI(θ s , fPET) and NMI(θ v , fPET) results for each of the individual 34 Amer-iFlux sites listed in Table 1. The climatic regime is represented by AI (aridity index) values plotted as the background color in Fig. 3. It can be seen in Fig. 3 that NMI(θ s , fPET) estimates from LSMs and GLEAM are spatially related to hydro-climatic conditions, as NOAHMP and CLSM predict that θ s is moderately coupled with fPET (i.e., NMI(θ S , fPET) of 0.3-0.5 [-]) in the arid southwestern USA (AI < 0.2) and only loosely coupled with fPET in the relatively humid eastern USA. A similar decreasing trend of NMI(θ s , fPET) from the southwestern to eastern USA is also captured by GLEAM. However, as noted above, GLEAM generally overestimates NMI(θ s , fPET) and NMI(θ v , fPET) compared to NOAHMP, CLSM and AmeriFlux. In contrast, a relatively weaker spatial pattern emerges in AmeriFlux-based NMI(θ s , fPET) results. In addition, spatial patterns for NMI(θ s , fPET) are less defined than for NMI(θ v , fPET) in all four datasets. Scatterplots in Fig. 4 summarize the spatial relationship between LSM-and GLEAM-based NMI(θ s , fPET) and NMI(θ v , fPET) results versus AmeriFlux observations across different land use types. While observed levels of correlation in Fig. 4 are relatively modest, there is a significant level (p < 0.05) of spatial correspondence between modeled and observed NMI results only over forest sites, motivating the need to better understand processes responsible for spatial variations in NMI results. In addition, stratifying NMI(θ s , fPET) / NMI(θ v , fPET) ratio results according to vegetation type (Fig. A1 in the Appendix) confirms that NMI(θ s , fPET) slightly exceeds NMI(θ v , fPET) across all vegetation types (and thus all rooting depths characterizing each vegetation type). This suggests that our analysis is not severely affected by variations in the depth of θ measurements. For further discussion on the impact of land cover on NMI results, please see Appendix A.
3.4 Sensitivity of NMI(θ s , fPET) / NMI(θ v , fPET) ratio to climatic conditions Figure 5 further summarizes the NMI(θ s , fPET) / NMI(θ V , fPET) ratio as a function of AI for all four products (NOAHMP, CLSM, GLEAM and AmeriFlux). Error bars represent the standard deviation of sampling errors calculated from a 500-member bootstrapping analysis. With increasing AI, there is a significant decreasing trend in both NMI(θ S , fPET) and NMI(θ v , fPET) for all three simulations, with a goodness of fit above 0.5 (figure not shown). For all cases, the NMI(θ s , fPET) / NMI(θ v , fPET) ratios are consistently greater than unity under all climatic conditions. However, the estimated NMI(θ s , fPET) / NMI(θ v , fPET) ratios from all three simulations (NOAHMP, CLSM and GLEAM) exhibit quite different trends with respect to AI. The NMI(θ s , fPET) / NMI(θ V , fPET) ratio for CLSM decreases with increasing AI, with a moderate goodness-of-fit value of 0.28, while GLEAM estimates of NMI(θ s , fPET) / NMI(θ v , fPET) shows an opposite increasing trend with increasing AI. Conversely, there is relatively lower sensitivity of the NMI(θ s , fPET) / NMI(θ v , fPET) ratio to AI captured in the AmeriFlux measurements.
Connecting these findings to the spatial distribution of NMI(θ s , fPET) and NMI(θ v , fPET; Fig. 3) confirms that the relative magnitudes of NMI(θ s , fPET) and NMI(θ v , fPET) for both LSMs and GLEAM are spatially related to hydroclimatic regimes. In contrast, this link is weaker in the Amer-iFlux measurements which, except for a small fraction of very low AI sites, do not appear to vary as a function of AI. These conclusions are not qualitatively impacted by looking at NMI(θ s , fPET) and NMI(θ v , fPET) differences, as opposed to their ratio as in Fig. 5, or by looking at R(θ s , fPET) and R(θ v , fPET) instead of NMI.

Discussion and conclusion
Since transpiration dominates the global ET (Jasechko et al., 2013), deep-layer soil water content (θ v ) is generally considered to contain more ET information than that of surface soil water content (θ s ), given that plant transpiration is balanced by root water uptake from deeper soils (Seneviratne et al., 2010). However, this assumption is rarely tested using models and/or observations. Here, we apply normalized mutual   information (NMI) to examine how the vertical support of a soil water content product affects its relationship with concurrent surface ET.
Specifically, using AmeriFlux ground observations, we examine whether (NMI-based) estimates of LSMs and GLEAM θ s versus ET and θ v versus ET coupling strength accurately reflect observations acquired at a range of Ameri-Flux sites. In general, compared to the baseline case of exponential filter extrapolated 40 cm bottom-layer θ v , LSMs and GLEAM agree with AmeriFlux observations in that the overall fPET information contained in θ s is slightly higher than that of θ v (Fig. 1). However, the sensitivity analysis showed this difference between NMI(θ s , fPET) and NMI(θ v , fPET) diminishes when using different methods for calculating θ v using AmeriFlux observations (Fig. 2). As a result, this result should be viewed with caution.
While NOAHMP-and CLSM-derived NMI(θ s , fPET) and NMI(θ V , fPET) results are generally consistent with the AmeriFlux observations, GLEAM overestimates NMI(θ s , fPET), NMI(θ V , fPET) and the ratio NMI(θ s , fPET) / NMI(θ V , fPET) relative to observations. Although both LSMs and GLEAM are based on the same classical two-section (soil water content-limited and energy-limited) ET regimes framework (Sect. 2.2), they differ in two fundamental aspects. First, the evaporative stress factor S is represented as a more direct and strong function of soil water content in GLEAM -see Eqs. (4) and (5) -which leads to the overestimation of the θ / ET coupling strength. This is consistent with our results that GLEAM generally overestimates NMI(θ s , fPET) and NMI(θ v , fPET) consistently across all land covers, compared to AmeriFlux-based estimates. On the other hand, NOAHMP and CLSM approximate ET in the manner of biophysical models, and expresses biophysical control on ET through the stomatal resistance r s , which is a function of multiple limiting factors including θ . Therefore, the more complex ET scheme employed by NOAHMP and CLSM would seem to mitigate the overestimation of NMI(θ S , fPET) and NMI(θ v , fPET), as other relevant factors besides θ (such as temperature, foliage nitrogen) are also considered in determining maximum carboxylation rate V max and stomatal resistance r s , and consequently more realistic actual ET.
Second, the stress factor β in both LSMs considers the cumulative effects of θ conditions along different layers (Eq. 1), while the corresponding factor S in GLEAM only uses the wettest soil layer condition, which is top layer at most sites. This likely explains the overestimation of the NMI(θ s , fPET) / NMI(θ v , fPET) ratio by GLEAM.
Nevertheless, we would like to stress that all approaches considered in our paper contain (at their core) a parameterized relationship between θ and ET. While the implications of mis-parameterizing this relationship are arguably more severe for a land surface model, we argue that the issue remains relevant for any approach (such as GLEAM) that utilizes a water balance (and/or data assimilation system) approach to estimate θ and, in turn, uses θ to constrain ET. Regardless of the complexity that a given approach employs, failing to accurately describe the relationship between ET and (large number of potential) environmental constraints should eventually degrade the robustness of the model, whether it is employed as a retrospective, diagnostic or predictive manner. To examine this issue directly, Fig. 6 plots the relationship between GLEAMS bias in NMI(θ s , fPET) / NMI(θ v , fPET) ratio versus the RMSE of daily GLEAM ET simulations for a range of AmeriFlux sites. There is a positive correlation between the two quantities, which suggests that GLEAM overestimation of θ / ET coupling during the summer may undermine the accuracy of its daily ET retrievals. It should be noted that GLEAM simultaneously overestimates both NMI(θ s , fPET) and NMI(θ v , fPET); however, the impact of this mis-parameterization impact on GLEAM ET accuracy is most obvious when plotted against the ratio NMI(θ s , fPET) / NMI(θ v , fPET).
Although the median values of NMI(θ s , fPET) and NMI(θ V , fPET) predicted by NOAHMP and CLSM are generally in line with AmeriFlux observations, they are more spatially related to hydro-climatic conditions (as summarized by AI) than their counter parts acquired from AmeriFlux measurements. Seen from the plot of NMI(θ s , fPET) / NMI(θ v , fPET) ratio as a function of AI (Fig. 5), the modeled and observed median of NMI(θ s , fPET) / NMI(θ V , fPET) ratio decreases with increasing AI, and the decreasing trend is particularly clear when AI is lower than 1.0 [-]. In contrast, there is relatively lower sensitivity to aridity exhibited in the AmeriFlux measurements.
These results provide several key insights into future landatmosphere coupling analysis and LSMs, as well as ET algorithm development. First, all the datasets -both modelbased and ground-observed -indicate that θ s contains at least as much ET information as θ v . Hence, remote-sensing land surface soil water content datasets are suitable, and should be considered, for analyzing the general interaction between land and atmosphere, e.g., soil water content-air temperature coupling (Dong and Crow, 2019) and the interplay of soil water content and precipitation (Yin et al., 2014). Additionally, future generations of GLEAM may consider more sophisticated evaporation stress functions, which may improve its accuracy in representing the soil's control on local ET. This may, in turn, improve the accuracy of the GLEAM ET product. Finally, our results demonstrate that modeled θ / ET is more sensitive to hydro-climates than the observed relationship. Modifying the model structures to reduce such sensitivity might be necessary for accurately representing the interaction of land surface and atmosphere across different climate zones. This may lead to more realistic projections of future drought-induced heat waves, when coupled with general circulation models.
Author contributions. JQ and WTC conceptualized the study. JD helped preparing the LSMs simulation. GSN assisted in the mutual information analysis. JQ carried out the analysis and wrote the first draft manuscript and WTC refined the work. All authors contributed to the analysis, interpretation and writing.
Competing interests. The authors declare that they have no conflict of interest.
Financial support. This research has been supported by the National Natural Science Foundation of China (grant nos. 41971031, 41501450 and 51779278), the Natural Science Foundation of Guangdong Province (grant no. 2016A030310154).
Review statement. This paper was edited by Dominic Mazvimavi and reviewed by two anonymous referees.