the Creative Commons Attribution 4.0 License.

the Creative Commons Attribution 4.0 License.

# Using lagged dependence to identify (de)coupled surface and subsurface soil moisture values

### Coleen D. U. Carranza

### Martine J. van der Ploeg

### Paul J. J. F. Torfs

Recent advances in radar remote sensing popularized the mapping of surface soil moisture at different spatial scales. Surface soil moisture measurements are used in combination with hydrological models to determine subsurface soil moisture values. However, variability of soil moisture across the soil column is important for estimating depth-integrated values, as decoupling between surface and subsurface can occur. In this study, we employ new methods to investigate the occurrence of (de)coupling between surface and subsurface soil moisture. Using time series datasets, lagged dependence was incorporated in assessing (de)coupling with the idea that surface soil moisture conditions will be reflected at the subsurface after a certain delay. The main approach involves the application of a distributed-lag nonlinear model (DLNM) to simultaneously represent both the functional relation and the lag structure in the time series. The results of an exploratory analysis using residuals from a fitted loess function serve as a posteriori information to determine (de)coupled values. Both methods allow for a range of (de)coupled soil moisture values to be quantified. Results provide new insights into the decoupled range as its occurrence among the sites investigated is not limited to dry conditions.

Although recent decades have seen great advances in remote sensing applications for mapping surface soil moisture (Jackson, 1993; Njoku et al., 2003; Mohanty et al., 2017), most hydrological studies that make use of soil moisture data require integrated values over a certain soil depth (Brocca et al., 2017). Extrapolation of surface soil moisture from remote sensing techniques to depths beyond the sensor's capacity (up to 5 cm) is not a trivial task given the spatiotemporal variability of soil moisture. The vertical distribution of soil moisture, which determines integrated soil moisture content over a soil column, is rarely uniform, as more pronounced dynamics are expected closer to the surface compared to deeper in the soil (Hupet and Vanclooster, 2002). Currently, information derived from remote sensing are assimilated into hydrological models to obtain integrated soil moisture values (Houser et al., 1998; Das et al., 2008). However, Kumar et al. (2009) stressed that it is important to assess vertical variability, especially the strength of coupling between surface and subsurface soil moisture for improvement of data assimilation (DA) results. Analyses of vertical soil moisture distributions also have important implications for modeling studies, as they could be used for calibration or validation of model parameters (De Lannoy et al., 2006).

The amount of soil moisture at any given time is controlled by factors operating at different timescales. While prevailing atmospheric conditions directly affect surface layers and control the temporal dynamics of soil moisture (Albertson and Montaldo, 2003; Koster et al., 2004), it is the downward movement of water from the surface that dictates the amount of subsurface soil moisture at a given time (Belmans et al., 1983; Rodriguez-Iturbe et al., 1999). Flow rates to the subsurface are driven by hydraulic properties, which are in turn controlled by physical soil characteristics such as texture, bulk density, and structure. Relative to changes in atmospheric conditions, soil physical properties change over longer timescales. Vegetation further modifies vertical soil moisture distribution by root water uptake (Yu et al., 2007) and by changing soil structure (Angers and Caron, 1998).

Given the variability along a soil column, during which conditions does surface soil moisture reflect subsurface soil moisture? Several studies have investigated this relation to address the correspondence between surface and subsurface soil moisture. One of the earliest studies is by Capehart and Carlson (1997), in which they compared modeling outputs with remote sensing measurements. Using very shallow depths of 5 mm and 5 cm, they referred to decoupling as the deviation from a linear correlation between these depths due to variable drying rates. Further assessment of decoupling from model-generated time series soil moisture data have been investigated using cross-correlation values (Martinez et al., 2008; Mahmood et al., 2012; Ford et al., 2014). High correlation to the subsurface was obtained using lagged values of surface soil moisture. However, cross-correlation is limited to providing a single value throughout the range of soil moisture encountered per lag. Furthermore, cross-correlation generally aims to evaluate the strength of lagged linear dependence between two variables (Shumway and Stoffer, 2010). However, lagged dependence between surface and subsurface soil moisture may not be linear given that nonlinear processes determine water flow along the soil profile. Using in situ field measurements, Wilson et al. (2003) investigated spatial surface (0–6 cm) and subsurface (0–30 cm) soil moisture distribution by calculating statistical metrics and by means of a variogram. Decoupling between the two depths was observed, which they suggested was influenced by vegetation, especially root density at surface soil. Their results were also affected by the dry soil moisture range and emphasized the importance of distinguishing between surface and total soil moisture for future applications of remote sensing to atmospheric studies.

Based on previous studies, the term decoupling refers to a weak dependence
between soil moisture contents at the surface and subsurface. Recognition of
decoupling is important; however, most studies have been limited to providing
qualitative characterization of conditions when decoupling occurs (e.g., dry
period). Only Capehart and Carlson (1997) identified a mid-range soil
moisture (∼ 0.3 cm^{3} cm^{−3}) when the
surface and very near surface begin to decouple. Their results, however, are
limited to a thin layer of the soil column. In this paper, our main objective
is to quantitatively identify a range of surface soil moisture values that is
decoupled from the subsurface. Furthermore, we consider depths greater than
those investigated by Capehart and Carlson (1997). The ability to quantify
(de)coupled surface and subsurface soil moisture contents will contribute to
more effective estimation of depth-integrated soil moisture data using remote
sensing methods and improved data assimilation results in hydrological models.

We utilized in situ time series datasets at depths of 5 and 40 cm to represent surface and subsurface, respectively. Values outside the decoupled range are considered coupled since soil moisture is inherently bounded up to a maximum value equal to soil's porosity. The investigation of (de)coupling is based on the idea that surface conditions will be reflected at the subsurface after a certain delay, indicating strong coupling between the two zones, and vice versa. More focus is given to the decoupled soil moisture range since it has greater implications for extrapolation of surface soil moisture values to deeper soil layers. We applied statistical methods to identify conditions of decoupling with no prior assumptions on the type of functional relation between surface and subsurface. As an exploratory step, we first assessed the dependence without considering lags using regression and residual analysis. The main approach for assessing decoupling was application of distributed-lag nonlinear models (DLNMs; Gasparrini et al., 2010) to incorporate both the lag structure and the functional relation between surface and subsurface soil moisture. Applications of distributed-lag models to econometrics and environmental epidemiology have been well documented (Almon, 1965; Zanobetti et al., 2002; Bhaskaran et al., 2013; Wu et al., 2013). However, their application to hydrological studies have rarely been explored.

Four time series datasets from the Twente soil moisture and temperature monitoring network (Dente et al., 2011) were used in this study (Fig. 1). Datasets from 2014–2016 are available with only short periods of missing data. The stations are located in agricultural fields with sensors installed at 5, 10, 20, and 40 cm depths. To investigate decoupling, only the 5 and 40 cm depths were considered because the largest possible distance was desired. Each station consists of EC-TM ECH2O capacitance probes (Decagon Devices, Inc., USA) that logged soil moisture data every 15 minutes. A calibration procedure using gravimetric measurements was applied prior to analysis (Dente et al., 2011).

Land cover in the area varies from corn in one field (SM05), to grass in two fields (SM05 and SM13), to a forest area (SM20). Values at 40 cm capture the root zone of vegetation for each site. In reality, rooting depths vary and depend on species composition, climate, and plant growth rate. However, the depth considered would still allow for approximation of root zone conditions. The landscape is characterized by flat to slightly sloping terrain. It is important to note that SM20 is located at the eastern foot of a small hilly terrain. Throughout the study period, either land cover remained unchanged or the same crop was planted. The soil types for the stations range from coarse sandy soils to weakly silty soils (Wosten et al., 2013). A summary of the land cover and relevant characteristics of the stations are summarized in Table 1.

Soil moisture values were averaged into daily values to match the available daily rainfall data from the Dutch national weather service (KNMI). For SM13 and SM20, there are some missing data from the beginning of 2014. The datasets from SM13 begins on 25 April 2014 while SM20 begins on 2 May 2015 (Fig. 2).

## 3.1 Regression and residual analysis

As an exploratory step, the dependence between surface and subsurface soil
moisture was initially visualized using scatter plots. Conditional means for
every 0.01 cm^{3} cm^{−3} interval and vertical
bars representing ± standard deviation were added to show changes in
vertical variability across the soil moisture range. Longer standard
deviation bars indicate higher vertical variability (Fig. 3a). We referred to vertical variability as the unevenness or irregularity
in the soil moisture distribution within a certain depth of interest along
the vertical profile, in this case up to depths of 40 cm. For the rest of
this paper, variability will refer to vertical soil moisture variability,
unless otherwise stated. The plotted points were colored per month to show
any impacts of seasonality. The effect of rainfall was included by adjusting
the sizes of the points proportional to rainfall intensity measured from the
nearest KNMI stations. For the overall measure of dependence, Spearman's
rank correlation coefficient *R*_{s} was computed for every pair of ranked
values in the time series. This was chosen as the assumption of linear
dependence was not made.

A flexible nonparametric locally weighted regression function (commonly
called a loess function, Cleveland and Devlin, 1988) was fitted along the
soil moisture range. This was used to explore and identify trends across the
soil moisture range. A linear regression was also fitted only for comparison.
Residuals were analyzed further for variability not captured by the fitted
function (Fig. 3b). The residual variance for every
0.1 cm^{3} cm^{−3} interval as well as the
resulting cumulative residual variance were analyzed to examine variability
across the range. The degree of variability was related to the slope of the
cumulative variance line, with steep slopes indicating high variability. In
addition, a significant change in variance between two points was indicated
by a significant change in the slope of the line. The soil moisture value
where a significant change in slope occurred was marked by *θ*_{c}; this
divides the soil moisture range into two groups. The group with a steeper
slope was interpreted as the decoupled range, and vice versa. Since the
measured variance is sensitive to sample size, a correlation coefficient was
calculated to determine if there was significant dependence between the two
variables. Residual variance were first normalized from 0 to 1 because of the
varying soil moisture range encountered at each station.

Results of the exploratory methods were considered a posteriori knowledge for analysis of lagged dependence and interpretation of results.

## 3.2 Analysis of lagged dependence

### 3.2.1 Cross-correlation

Since decoupling is based on the strength of lagged dependence, the existence of lag between surface and subsurface soil moisture values was first determined. Cross-correlation is known to be a quick and easy method to apply for this objective. Lagged values of surface soil moisture were correlated with instantaneous values at the subsurface. A maximum cross-correlation at negative lags indicated that surface soil moisture is leading subsurface soil moisture, and vice versa (Shumway and Stoffer, 2010). A 10-day lag was deemed long enough to show the presence of lag–lead relations in the time series since the maximum correlation occurred within this period.

### 3.2.2 Distributed-lag nonlinear model

We incorporated delayed or lagged effects in evaluating the relation between surface and subsurface values, and eventually in determining the (de)coupled values. It should be emphasized that the analysis was primarily focused on examining the trends and relation between surface and subsurface soil moisture. Moreover, it was not intended to replace other existing models for estimating soil moisture or examining its patterns.

A distributed-lag nonlinear model developed by Gasparrini et al. (2010) was applied to the 5 and 40 cm time series datasets at the study sites. Briefly, the model is capable of simultaneously representing both functional the dependence and delayed response between exposure and response values. We considered surface soil moisture as the exposure values that produced delayed effects to the response values at the subsurface. A nonlinear model was selected in order to capture the nonlinear dynamics of flow and transport along the soil profile (Mohanty and Skaggs, 2001; Kim and Barros, 2002). Furthermore, DLNM offered enough flexibility to model a variety of dependencies in the time series dataset by selecting a suitable basis function. As an analogy, a DLNM is to a linear time series model (e.g., autoregressive model) just as a generalized linear model is to a linear model, as can be seen in Eq. (1).

In assessing lagged dependence, event-scale patterns were of interest rather than large-scale trends within the time series (Wilson et al., 2004). This required seasonal patterns to be addressed prior to applying the DLNM. This was done by fitting a loess function to the time series and then subtracting it from the original soil moisture values (Cleveland et al., 1990). Removal of seasonality was further justified by the scatter plot results (see Sect. 4.1). The influence of seasonality on the vertical soil moisture variability is indicated by clustering of observation points occurring within the same months (Fig. 4). Deseasonalized soil moisture values were used for identifying (de)coupled soil moisture conditions.

For consistency in modeling, the range of surface soil moisture values used
was from 0–0.50 cm^{3} cm^{−3}. This was based
on the highest surface soil moisture value encountered among the four sites.
A lag value of up to 30 days was considered long enough to investigate
delayed effects. This period also approximated the recurrence of heavy
rainfall within the study sites. A spline function was the basis function
chosen to represent the functional dependence and delayed effects as it
offered flexibility to capture nonlinearities. In addition, contributions
from daily rainfall data were used to incorporate current and past
meteorological conditions. This was applied as a covariate and was
represented with an additional basis function. We only considered delayed
effects in vertical flow as lateral movement is deemed negligible in a flat
to slightly sloping terrain (Table 1). The analysis was
performed in R software using dlnm (Gasparrini, 2011)
and mgcv (Wood, 2006a) packages.

The following section concisely describes the mathematical formulation of a DLNM. However, the reader may choose to skip this section as the general description of the methods applied has already been given in the text above. For a more detailed explanation, readers are referred to Gasparrini et al. (2010) and Gasparrini et al. (2017).

To more formally describe a DLNM, let us first consider a general time series
model, where outcomes *Y*_{t} with *t* = 1, ⋯, *n* can be described by

where *μ* ≡ *E*(*Y*), assumed to be derived from a Poisson distribution,
and *g* is a monotonic link function. The functions *s*_{j} denote
relationships between the variables *x*_{j} and vector parameters *β*_{j}.
Other *u*_{k} variables with predictors are included in
coefficients *γ*_{k} to specify their related effects. The relation
between *x* and *g*(*μ*) is represented by *s*(*x*) through a basis function.
The complexity of this estimated relationship depends on the type basis
function chosen and its dimensions. In the presence of delayed effects, the
outcome *Y* at any time *t* is explained by the past exposures *x*_{t−l} with
*l* as the “lag” representing the elapsed time between exposure and
response. The final goal of a DLNM is to simultaneously describe the
dependency along both the predictor space and lag dimension. This is achieved
by selecting two sets of basis functions that are combined to obtain the
cross-basis functions (Gasparrini et al., 2010).

Within the DLNM framework, a response *Y*_{t} at time *t* = 1 is based on
lagged occurrences of predictor *x*_{t}, which is represented by vector
*q*_{t} = $[{x}_{t-{l}_{\mathrm{0}}}$; ⋯; *x*_{t−L}]^{T}. The minimum and maximum lags are given
by *l*_{0} and *L*_{T}, respectively. The function represents dependence through

where *f* ⋅ *w*(*x*_{t−L}, *l*) represents the exposure–lag–response function,
which is composed of two marginal functions: the exposure–response function
*f*(*x*) and lag–response function *w*(*l*) in the space of the lag.
Parameterization of *f* and *w* is achieved by application of the known basis
functions to the vectors *q*_{t} and ** l**. The result can be expressed as
matrices

**R**and

**C**with dimensions (

*L*−

*l*

_{0}+ 1) ×

*v*

_{x}and (

*L*−

*l*

_{0}+ 1) ×

*v*

_{l}, respectively.

The cross basis function *s* and parameterized coefficients ** η** are given by

The values of ** w** are derived from

**A**

_{t}, which is computed from the row-wise Kronecker product between matrices

**R**and

**C**. The dependence is expressed through

**and parameters**

*w***. The cross-basis function represents the integral of**

*η**s*(

*x*,

*t*) over the interval [

*l*

_{0},

*L*], summing the contributions from the exposure history. The estimated dependence to specific exposure values is determined by prediction of $\widehat{\mathit{\beta}}$, called lag coefficients. The estimated $\widehat{\mathit{\beta}}$ and covariance matrix $\mathbf{V}\left(\widehat{\mathit{\beta}}\right)$ is given by

A further extension to DLNM is the application of penalties for smoothness of the lag structure and shrinkage of lag coefficients to null at very high lags. These penalties were applied in the analysis using a second-order difference (Wood, 2006b) and varying ridge penalties (Obermeier et al., 2015; Gasparrini et al., 2017), respectively. Application of penalties was based on the assumption that, at higher lags, the lag coefficients become smaller and approach the null value.

## 3.3 Evaluating (de)coupled soil moisture values

Application of a DLNM resulted in the estimation of parameter $\widehat{\mathit{\beta}}$ for each surface soil moisture value (Eqs. 4 and 5). This indicated the strength of dependence between surface and subsurface soil moisture. Higher $\widehat{\mathit{\beta}}$ values indicated stronger dependence or coupling between the two. Hence, we referred to $\widehat{\mathit{\beta}}$ as the relative influence of surface soil moisture on subsurface values.

## 4.1 Regression and residual analysis

The overall dependence between surface and subsurface given by the Spearman's
rank coefficient (*R*_{s}) range from 0.746 to 0.866 (Fig. 4).
However, even with a high overall dependence, variability is not uniform
across the soil moisture range (Fig. 4). Except for SM13,
increased variability is observed towards drier soil moisture values.
Furthermore, the degree of variability also differs among the four sites. The
most pronounced variability is observed at SM13 and the least at SM05.
Clustering of observation points occurring within the same months indicate
that seasonality dictates soil moisture values and impacts soil moisture
variability. Rainfall events measured on the same day do not show a clear
effect on surface and subsurface soil moisture dependence. Observations with
higher rainfall intensities appear scattered in the plots
(Fig. 4). In addition, the said observation points do not
necessarily fall along the fitted functions or at the wet soil moisture
region of the scatter plots. As lag is not considered, the impact of rainfall
on variability is not fully captured in the scatter plots alone.

Assessment of the regression fit quality was performed by comparison using residual standard errors (RSEs). The results for both linear and loess functions show highly similar values (Fig. 4). This indicates that, in this case, a linear function captures the relation between surface and subsurface values. Nevertheless, the more flexible loess function was preferred for further residual analysis because of its slightly better model fit and, using only visual inspection of Fig. 4, it more closely approximates the calculated conditional mean.

Figure 5 shows the residual plots with lines of the cumulative
residual variance. The change in slope of the line is a feature consistent
for all sites regardless of the magnitude of residual variance. The changes
in variability are more clearly observed from the residuals than from the
standard deviation bars in the scatter plots. The change in slope at
*θ*_{c} is highlighted by the vertical dashed line. The decoupled soil
moisture range corresponds to the section of cumulative residual variance
line with a steeper slope. Specifically, the range of decoupled surface soil
moisture values (in cm^{3} cm^{−3}) is 0.08–0.21 for SM05, 0.12–0.27 for
SM09, 0.30–0.39 for SM13, and 0.08–0.12 for SM20.
Except for SM13, the decoupled values are within the dry to intermediate soil
moisture range. The cumulative residual variance line for SM13 appears to
increase exponentially with increasing surface soil moisture. This differs
from the other three sites which show a distinct decrease in slope at
increasing soil moisture values. For SM20, a second point is identified with
a change in slope. The flat line starting from 0.24–0.28 cm^{3} cm^{−3} indicates there is
still lowered variance at the very wet soil moisture range.

The correlation between normalized variance and sample size yielded a value of −0.24 (Fig. 6). This low-correlation magnitude confirms that the variance obtained for the soil surface moisture intervals was not strongly influenced by the sample size used.

## 4.2 Cross-correlation

Figure 7 shows cross-correlation values at the four sites. Maximum correlation occurs at −1 to −2 days of lag, except at SM20. This translates to a 1–2-day lead of surface soil moisture values. For SM20, the maximum correlation occurs at positive lags. Correlation values from lag = 0 to lag = 10 are almost equal at SM20. Although this indicates leading subsurface values, it does not eliminate the possibility of having a lag between surface and subsurface values (see Sect. 5.2). Other factors may play a role in having leading subsurface values in the cross-correlation plots. Hence, SM20 was still analyzed for decoupling using DLNM.

## 4.3 Distributed-lag model

Figure 8 shows the overall $\widehat{\mathit{\beta}}$ for each surface
soil moisture value with 5 and 95 % confidence intervals in shaded gray
regions. In order to identify a range that is decoupled, a threshold value (${\widehat{\mathit{\beta}}}_{\mathrm{c}}$)
must be specified. This value is comparable to the
intermediate soil moisture *θ*_{c} identified from Sect. 4.1.
The values of *θ*_{c} provided a suitable guide for identifying
a threshold common to all four sites (Table 2). The
corresponding $\widehat{\mathit{\beta}}$ values obtained at *θ*_{c} were very close
to 1; therefore, setting the threshold ${\widehat{\mathit{\beta}}}_{\mathrm{c}}$ = 1 seemed a reasonable
choice. This was preferred over the exact $\widehat{\mathit{\beta}}$ at each *θ*_{c}
since the latter was defined using exploratory methods at lag = 0. Using the
chosen ${\widehat{\mathit{\beta}}}_{\mathrm{c}}$ = 1, surface soil moisture values with $\widehat{\mathit{\beta}}$ < 1
are considered decoupled while those with $\widehat{\mathit{\beta}}$ ≥ 1 are coupled.

Based on ${\widehat{\mathit{\beta}}}_{\mathrm{c}}$, the identified decoupled values are generally in
the dry to intermediate soil moisture range (Fig. 8), except
for SM13 where decoupled values are at the wet range. Table 2
shows the decoupled values identified based on the selected ${\widehat{\mathit{\beta}}}_{\mathrm{c}}$.
The behavior and trends of $\widehat{\mathit{\beta}}$ also differ for each station. For
instance, at SM05 and SM09, there is a general increase in $\widehat{\mathit{\beta}}$ from
dry towards wet surface soil moisture values. SM20 also shows increasing $\widehat{\mathit{\beta}}$
over a limited soil moisture range (0.1–0.25 cm^{3} cm^{−3}). Outside this range,
the estimated $\widehat{\mathit{\beta}}$ values for SM20 were less than 1 and have very
broad confidence intervals. Recall that the range used for DLNM was only for
uniformity among the four study sites. The lack of or very few observations
for very dry or very wet soil moisture conditions led to wider confidence
intervals not only for SM20 but also for the other three sites. Compared to
the three sites, the estimated $\widehat{\mathit{\beta}}$ values for SM13 show decreasing
values towards the wet soil moisture range (∼ > 0.3 cm^{3} cm^{−3}). From the intermediate
to dry soil moisture conditions, the values fluctuate around the designated ${\widehat{\mathit{\beta}}}_{\mathrm{c}}$.

## 5.1 Decoupled soil moisture values

Regression and residual analyses show that there is an inherent vertical variability between surface and subsurface soil moisture values based on the lack of 1 : 1 correspondence between the two (Fig. 4). This inherent variability is also not uniform as higher variability is observed at certain soil moisture ranges. The cumulative residual variance plots (Fig. 5) clearly indicate the soil moisture values where vertical variability starts to become consistently larger. The increase in variability further translates to weak lagged dependence which we observe as low $\widehat{\mathit{\beta}}$ values from DLNM. The increase in vertical variability and weakening of lagged dependence is what we considered as decoupling between the surface and subsurface soil moisture.

Both residual analysis and DLNM were successful in identifying a decoupled
soil moisture range, and there is good agreement between the results from
both. Three out of four sites show decoupled values in the dry to
intermediate soil moisture range (Fig. 5 and Table 2).
These results agree with the known range where decoupling is expected
(Capehart and Carlson, 1997; Hirschi et al., 2014; Wilson et al., 2003). For SM05
and SM09, the intermediate soil moisture value *θ*_{c} that marks when
decoupling begins (Table 2) is close to that identified by
Capehart and Carlson (1997). They obtained a value of 0.3 cm^{3} cm^{−3}
as the point below which decoupling begins. However, results for SM13 do not conform to the
traditional concept of decoupling. This result is significant as it implies
that decoupling may occur at any value and is not confined to dry soil moisture range.

The vegetation type at each site exerts some influence on the soil moisture variability and the resulting (de)coupled values. First, the vegetation type affects how much ground surface is directly exposed to atmospheric conditions. Forested areas and grass fields are almost fully covered by vegetation compared to a corn field where the crops are organized in equidistant rows. Vegetation or canopy cover will determine how atmospheric conditions affect the soil moisture values. For instance, the amount of intercepted precipitation and evaporation are both dependent on vegetation cover. This in turn will have direct impacts to the surface soil moisture dynamics at each of the sites. For comparison, the variability given by the standard deviation bars in Fig. 4 and variance in Fig. 5 at the cornfield (SM09) is higher compared to that of the grass field (SM05) or the forested area (SM20). In addition, the forested area (SM20) has the smallest range of soil moisture values among the four sites. This may be due to the large intercepted rainfall by the forest canopy. Root water uptake (RWU) is another method by which vegetation affects soil moisture variability. RWU can have significant influence on the subsurface dynamics. The influence of RWU may vary for different vegetation types as it can be exerted over a range of depths, leading to differences in the resulting (de)coupled values.

Among the four sites, the subsurface trends observed for the 40 cm values at SM13 show consistently high values, which can be more pronounced during winter months. This resulted in decoupling during wet soil moisture conditions (Fig. 8). This trend is different from the other three sites, which only show a slight increase in the subsurface values. Further inspection of the time series data at SM13 reveals no sudden disturbance in the signal which could be attributed to errors in the sensor. Field investigation confirmed an increase in silt content at 40 cm compared to the upper layers. The increase in silt content promotes a decrease in hydraulic conductivity over depth that results in a slower vertical flow towards deeper layers. The presence of burrowing and hibernating animals was also observed at the site during winter. These create macropores which eventually alter the hydraulic properties of the soil (Kodešová et al., 2006; Beven and Germann, 2013). We infer that, at the measurement domain of the sensor, these burrows or macropores facilitated faster vertical flow to the subsurface. Alternatively, if the burrows produced voids around the measurement domain, this would result in lowered soil moisture or data gaps due to the loss of sensor-to-soil-media contact. However, there were no gaps observed that coincided with the burrowing animals' period of hibernation. During precipitation events, soil moisture flowing from upper layers arrived more rapidly at 40 cm depths due to the presence of macropores. There it accumulated and flowed more slowly to deeper layers because of the low hydraulic conductivity promoted by the increase in silt content. The overall effect of these factors was the pronounced increase in soil moisture values at 40 cm compared to those at 5 cm during winter periods, as observed from the time series dataset (Fig. 2).

Site-specific characteristics at each station control the magnitude of variability as well as the range at which decoupling is observed. However, the occurrence of decoupling is independent of the magnitude of variability since it was observed from SM05, where variability is least up to that of SM13 where it is greatest. The methods applied in this study only identify conditions when decoupling occurs but do not explicitly determine its controls. Identification of controls for decoupling requires a separate analysis where mechanistic models or statistical approaches can be applied.

## 5.2 Assessing the use of lagged dependence for identifying decoupled conditions

To assess the applicability of the methods applied, we further discuss their strengths and weaknesses. We also present opportunities for further studies as well as foreseen limitations for other sites.

### 5.2.1 Strengths

The residual analysis and DLNM methods allow quantification of a range of soil moisture values where decoupling occurs. This provides further extension to previous studies where decoupling is only described qualitatively. As seen from the results at the four sites, decoupling can occur at any soil moisture value, and is not confined to dry periods or ranges. Furthermore, by making no initial assumptions on data distributions and the type of functional relation and lag structure, the methods applied were considered robust. Nonlinear functions were applied as they conform to the nonlinearity of water flow in the unsaturated zone. They can also handle a variety of bivariate dependence, even in cases where the relation is linear, as shown by the highly similar fit of the loess and linear functions in Sect. 4.1.

### 5.2.2 Weaknesses

The first aspect that needs to be further investigated is the selected
${\widehat{\mathit{\beta}}}_{\mathrm{c}}$ value for identifying the decoupled soil moisture range.
Although the selection in this study was based on trends identified from time
series datasets, the methods applied should be tested further using other
datasets to confirm the suitability of ${\widehat{\mathit{\beta}}}_{\mathrm{c}}$ = 1 for other depths and
soil types. The choice of *β*_{c} is crucial as it dictates which soil
moisture values are expected to be decoupled. For instance, at the sites
where decoupling occurs during dry conditions, a higher *β*_{c} value would
enlarge the decoupled range. A similar effect would be expected for the site
with decoupling during wet conditions. However, a lower *β*_{c} value could
result to decoupling only during extreme soil moisture conditions (e.g., very
wet or very dry).

Another aspect to further examine is the use of cross-correlation for confirming the presence of leading surface soil moisture values. Results from SM20 show maximum correlation at positive lags which indicate leading subsurface values (Fig. 7). The weakness of using cross-correlation as a test for the presence of lag can be two-fold. First, cross-correlation can also capture the effect of subsurface dynamics such as groundwater influence and lateral flow. We infer that in SM20, subsurface dynamics dominates and masks the lag relation sought. An additional covariate representing subsurface dynamics was not included in the DLNM analysis since a dominant downward vertical flow was assumed. This assumption was based on the flat slopes encountered at SM20 (Table 1). Therefore, the occurrence of subsurface lateral flow or groundwater influence pose limitations to the applicability of DLNM for assessing decoupling. Second, cross-correlation is limited to evaluating linear lagged dependence and, in incorporating nonlinear lagged dependence, can make the test more robust. Equivalent methods exist (e.g., mutual information content; Qiu et al., 2014) but they are much more computationally demanding when the goal is simply to check for the existence of lag–lead relation.

### 5.2.3 Opportunities

In relation to utilizing remote sensing techniques, our results imply that the accuracy of estimating subsurface values from surface soil moisture can be greatly affected by vertical coupling. Lower variability and hence lower uncertainties are expected in the coupled soil moisture range. Assessment of decoupling can be used in combination with modeling studies as a preliminary method to determine the range where variability is expected to be higher. Furthermore, it can be helpful in assessing whether simulation results capture the variabilities observed in both the coupled and decoupled ranges. Taking decoupling into account can also assist in evaluating the necessity of complex models for simulating vertical soil moisture content.

For data assimilation applications, (de)coupling methods can be used for cross-comparison of the vertical coupling derived from DA model outputs with those observed from long term in situ measurements. This can aid in examining the adequacy of the assumed inherent connection between surface and subsurface values. As Kumar et al. (2009) pointed out, land surface models vary in their representation of the strength of this connection (e.g., weak or strong connection), which contributes to the degree to which modeling results are improved. They also suggested that strong coupling is a more robust choice unless independent information suggests that a more decoupled surface–subsurface representation is more realistic. In this aspect, the analysis applied in this study could be a valuable tool in determining which type of surface–subsurface coupling is the more optimal choice. Furthermore, the assumed connection strength is adopted for the whole range of soil moisture values. The results of our analysis show that at any given site, decoupling will occur regardless of the degree of soil moisture variability. A variable coupling strength could be adopted based on the soil moisture range where decoupling is likely to occur as an alternative to the single value for the whole range.

Although the study focused on vertically discrete values, the results are also applicable for depth-average values commonly used in remote sensing and DA applications. This requires that the vertically discrete values adequately capture the overall dynamics within zone being investigated. In such a case, we infer that the translation to depth-averaged values would result in (de)coupled values that are close, but not identical, to the values obtained when only comparing two discrete depths. As an illustration, we calculated the depth-average values using all the available measurements at each site (i.e., 5, 10, 20, and 40 cm depth) following the formula from Qiu et al. (2014). Figure 9 (left) reveals highly similar dynamics for both discrete and depth-average values. Therefore, it can be expected that the results from a regression and DLNM analyses using depth-average values would be highly similar to the original results in Figs. 5 and 8. However, if the vertically discrete values insufficiently represent the subsurface dynamics, larger deviations in the resulting decoupled values can be expected.

### 5.2.4 Limitations

In this study, only meteorological factors were incorporated into the DLNM analysis since vertical movement was assumed to be the dominant flow mechanism. However, the subsurface can also be influenced by lateral movement or groundwater by capillary rise. In such scenarios, decoupling will not be limited to changes in surface conditions. For this, SM20 provides an excellent example. This station is located at the foot of a small hill (Fig. 2) where the occurrence of lateral subsurface movement is highly probable. This shows that although the analysis would be limited to smaller scales, or even a single point, recognition of regional setting is important for interpretation of results. In addition, subsurface dynamics can also be affected by capillary rise in areas with shallow groundwater. For future applications, the effect of both capillary rise and lateral movements to subsurface dynamics should be assessed and included in the DLNM analysis, but caution should be exercised when interpreting the results. Assessment of decoupling with DLNM is deemed more applicable to areas where the subsurface has insignificant groundwater influence and where vertical downward movement is the dominant flow mechanism.

The methods applied in this study allow for investigation of vertical soil moisture variability. More importantly, application of DLNM allowed for a decoupled soil moisture range to be quantitatively identified. The results also reveal that decoupling is not confined to dry soil moisture range as implied by previous studies. The reasons for decoupling are manifold, and controls for the dry soil moisture range may differ from those for the wet range. The results of this study have implications for remote sensing and data assimilation methods, especially for uncertainties related to the use of surface soil moisture to obtain integrated soil moisture values.

The datasets for soil moisture were obtained from the Water Resource Department of ITC-Twente University. At the moment, the datasets are not publicly available. Access to the datasets may be granted upon request from the institute through Rogier van der Velde, PhD (r.vandervelde@utwente.nl).

CDUC and MPvdP initially conceptualized the idea for investigating the relation between surface and subsurface soil moisture values. PJJFT provided significant contributions to the statistical analysis applied. All three authors contributed to writing and editing of the paper.

The authors declare that they have no conflict of interest.

The authors are grateful for the Water Resources Department of ITC-Twente
University, the Netherlands, for sharing the datasets from their network. We
thank the three anonymous reviewers for providing critical insights that
improved the paper. This work is part of the research programme
Optimizing Water Availability through Sentinel-1 Satellites (OWAS1S) with
project number 13871 which is (partly) financed by the Netherlands
Organisation for Scientific Research (NWO). We also thank colleagues in the
lab meeting group for insightful discussions as well as Demi Moore for
proofreading earlier versions of the text.

Edited by: Nunzio Romano

Reviewed by: three anonymous referees

Albertson, J. D. and Montaldo, N.: Temporal dynamics of soil moisture variability: 1. Theoretical basis, Water Resour. Res., 39, 1274, https://doi.org/10.1029/2002WR001616, 2003. a

Almon, S.: The distributed lag between capital appropriations and expenditures, Econometrica, 33, 178–196, 1965. a

Angers, D. A. and Caron, J.: Plant-induced changes in soil structure: processes and feedbacks, in: Plant-induced soil changes: processes and feedbacks, Springer Netherlands, 55–72, 1998. a

Belmans, C., Wesseling, J., and Feddes, R. A.: Simulation model of the water balance of a cropped soil: SWATRE, J. Hydrol., 63, 271–286, 1983. a

Beven, K. and Germann, P.: Macropores and water flow in soils revisited, Water Resour. Res., 49, 3071–3092, 2013. a

Bhaskaran, K., Gasparrini, A., Hajat, S., Smeeth, L., and Armstrong, B.: Time series regression studies in environmental epidemiology, Int. J. Epidemiol., 42, 1187–1195, 2013. a

Brocca, L., Ciabatta, L., Massari, C., Camici, S., and Tarpanelli, A.: Soil Moisture for Hydrological Applications: Open Questions and New Opportunities, Water, 9, 1–20, 2017. a

Capehart, W. J. and Carlson, T. N.: Decoupling of surface and near-surface soil water content: A remote sensing perspective, Water Resour. Res., 33, 1383–1395, 1997. a, b, c, d, e

Cleveland, R. B., Cleveland, W. S., and Terpenning, I.: STL: A seasonal-trend decomposition procedure based on loess, J. Offic. Stat., 6, 3–73, 1990. a

Cleveland, W. S. and Devlin, S. J.: Locally weighted regression: an approach to regression analysis by local fitting, J. Am. Stat. Assoc., 83, 596–610, 1988. a

Das, N., Mohanty, B., Cosh, M., and Jackson, T.: Modeling and assimilation of root zone soil moisture using remote sensing observations in Walnut Gulch Watershed during SMEX04, Remote Sens. Environ., 112, 415–429, 2008. a

De Lannoy, G. J., Verhoest, N. E., Houser, P. R., Gish, T. J., and Van Meirvenne, M.: Spatial and temporal characteristics of soil moisture in an intensively monitored agricultural field (OPE 3), J. Hydrol., 331, 719–730, 2006. a

Dente, L., Verkedy, Z., Su, Z., and Ucer, M.: Twente Soil Moisture and Soil Temperature Monitoring Network, Tech. rep., ITC – University of Twente, Enschede, the Netherlands, http://www.itc.nl/library/papers_2011/scie/dente_twe.pdf (last access: 11 April 2018), 2011. a, b, c

Ford, T. W., Harris, E., and Quiring, S. M.: Estimating root zone soil moisture using near-surface observations from SMOS, Hydrol. Earth Syst. Sci., 18, 139–154, https://doi.org/10.5194/hess-18-139-2014, 2014. a

Gasparrini, A.: Distributed lag linear and non-linear models in R: the package dlnm, J. Stat. Softw., 43, 1–20, 2011. a

Gasparrini, A., Armstrong, B., and Kenward, M.: Distributed lag non-linear models, Stat. Med., 29, 2224–2234, 2010. a, b, c, d

Gasparrini, A., Scheipl, F., Armstrong, B., and Kenward, M.: A penalized framework for distributed lag non-linear models, Biometrics, 73, 938–948, https://doi.org/10.1111/biom.12645, 2017. a, b

Hirschi, M., Mueller, B., Dorigo, W., and Seneviratne, S.: Using remotely sensed soil moisture for land–atmosphere coupling diagnostics: The role of surface vs. root-zone soil moisture variability, Remote Sens. Environ., 154, 246–252, 2014. a

Houser, P. R., Shuttleworth, W. J., Famiglietti, J. S., Gupta, H. V., Syed, K. H., and Goodrich, D. C.: Integration of soil moisture remote sensing and hydrologic modeling using data assimilation, Water Resour. Res., 34, 3405–3420, 1998. a

Hupet, F. and Vanclooster, M.: Intraseasonal dynamics of soil moisture variability within a small agricultural maize cropped field, J. Hydrol., 261, 86–101, 2002. a

Jackson, T. J.: III. Measuring surface soil moisture using passive microwave remote sensing, Hydrol. Process., 7, 139–152, 1993. a

Kim, G. and Barros, A. P.: Space–time characterization of soil moisture from passive microwave remotely sensed imagery and ancillary data, Remote Sens. Environ., 81, 393–403, 2002. a

Kodešová, R., Kodeš, V., Žigová, A., and Šimúnek, J.: Impact of plant roots and soil organisms on soil micromorphology and hydraulic properties, Biologia, 61, S339–S343, 2006. a

Koster, R. D., Dirmeyer, P. A., Guo, Z., Bonan, G., Chan, E., Cox, P., Gordon, C., Kanae, S., Kowalczyk, E., Lawrence, D., et al.: Regions of strong coupling between soil moisture and precipitation, Science, 305, 1138–1140, 2004. a

Kumar, S. V., Reichle, R. H., Koster, R. D., Crow, W. T., and Peters-Lidard, C. D.: Role of subsurface physics in the assimilation of surface soil moisture observations, J. Hydrometeorol., 10, 1534–1547, 2009. a, b

Mahmood, R., Littell, A., Hubbard, K. G., and You, J.: Observed data-based assessment of relationships among soil moisture at various depths, precipitation, and temperature, Appl. Geogr., 34, 255–264, 2012. a

Martinez, C., Hancock, G., Kalma, J., and Wells, T.: Spatio-temporal distribution of near-surface and root zone soil moisture at the catchment scale, Hydrol. Process., 22, 2699–2714, 2008. a

Mohanty, B. P. and Skaggs, T.: Spatio-temporal evolution and time-stable characteristics of soil moisture within remote sensing footprints with varying soil, slope, and vegetation, Adv. Water Resour., 24, 1051–1067, 2001. a

Mohanty, B. P., Cosh, M. H., Lakshmi, V., and Montzka, C.: Soil Moisture Remote Sensing: State-of-the-Science, 16, 1–9, https://doi.org/10.2136/vzj2016.10.0105, 2017. a

Njoku, E. G., Jackson, T. J., Lakshmi, V., Chan, T. K., and Nghiem, S. V.: Soil moisture retrieval from AMSR-E, IEEE T. Geosci. Remote, 41, 215–229, 2003. a

Obermeier, V., Scheipl, F., Heumann, C., Wassermann, J., and Küchenhoff, H.: Flexible distributed lags for modelling earthquake data, J. Roy. Stat. Soc. C, 64, 395–412, 2015. a

Qiu, J., Crow, W. T., Nearing, G. S., Mo, X., and Liu, S.: The impact of vertical measurement depth on the information content of soil moisture times series data, Geophys. Res. Lett., 41, 4997–5004, 2014. a, b

Rodriguez-Iturbe, I., Porporato, A., Ridolfi, L., Isham, V., and Cox, D.: Probabilistic modelling of water balance at a point: the role of climate, soil and vegetation, P. Roy. Soc. Lond. A, 455, 3789, https://doi.org/10.1098/rspa.1999.0477, 1999. a

Shumway, R. H. and Stoffer, D. S.: Time series analysis and its applications: with R examples, Springer Science & Business Media, New York, 2010. a, b

Wilson, D. J., Western, A. W., Grayson, R. B., Berg, A. A., Lear, M. S., Rodell, M., Famiglietti, J. S., Woods, R. A., and McMahon, T. A.: Spatial distribution of soil moisture over 6 and 30 cm depth, Mahurangi river catchment, New Zealand, J. Hydrol., 276, 254–274, 2003. a, b

Wilson, D. J., Western, A. W., and Grayson, R. B.: Identifying and quantifying sources of variability in temporal and spatial soil moisture observations, Water Resour. Res., 40, W02507, https://doi.org/10.1029/2003WR002306, 2004. a

Wood, S. N.: Generalized Additive Models: An Introduction with R, CRC Press, Wageningen, the Netherlands, 2006a. a

Wood, S. N.: Low-Rank Scale-Invariant Tensor Product Smooths for Generalized Additive Mixed Models, Biometrics, 62, 1025–1036, https://doi.org/10.1111/j.1541-0420.2006.00574.x, 2006b. a

Wosten, J., de Vries, F., Hoogland, T., Massop, H., Veldhuizen, A., Vroon, H., Wesseling, J., Heijkers, J., and Bolman, A.: BOFEK2012, de nieuwe bodemfysische schematisatie van Nederland, Tech. rep., Alterra, Wageningen, the Netherlands, 2013. a, b

Wu, W., Xiao, Y., Li, G., Zeng, W., Lin, H., Rutherford, S., Xu, Y., Luo, Y., Xu, X., Chu, C., et al.: Temperature–mortality relationship in four subtropical Chinese cities: a time-series study using a distributed lag non-linear model, Sci. Total Environ., 449, 355–362, 2013. a

Yu, G.-R., Zhuang, J., Nakayama, K., and Jin, Y.: Root water uptake and profile soil water as affected by vertical root distribution, Plant Ecol., 189, 15–30, https://doi.org/10.1007/s11258-006-9163-y, 2007. a

Zanobetti, A., Schwartz, J., Samoli, E., Gryparis, A., Touloumi, G., Atkinson, R., Le Tertre, A., Bobros, J., Celko, M., Goren, A., Forsberg, B., Michelozzi, P., Rabczenko, D., Aranguez Ruiz, E., and Katsouyanni, K.: The temporal pattern of mortality responses to air pollution: a multicity assessment of mortality displacement, Epidemiology, 13, 87–93, 2002. a