Land surface models (LSMs) use a large cohort of parameters and state variables to simulate the water and energy balance at the soil–atmosphere interface. Many of these model parameters cannot be measured directly in the field, and require calibration against measured fluxes of carbon dioxide, sensible and/or latent heat, and/or observations of the thermal and/or moisture state of the soil. Here, we evaluate the usefulness and applicability of four different data assimilation methods for joint parameter and state estimation of the Variable Infiltration Capacity Model (VIC-3L) and the Community Land Model (CLM) using a 5-month calibration (assimilation) period (March–July 2012) of areal-averaged SPADE soil moisture measurements at 5, 20, and 50 cm depths in the Rollesbroich experimental test site in the Eifel mountain range in western Germany. We used the ensemble Kalman filter (EnKF) with state augmentation or dual estimation, respectively, and the residual resampling particle filter (PF) with a simple, statistically deficient, or more sophisticated, MCMC-based parameter resampling method. The performance of the “calibrated” LSMs was investigated using SPADE water content measurements of a 5-month evaluation period (August–December 2012). As expected, all data assimilation methods enhance the ability of the VIC and CLM models to describe spatiotemporal patterns of moisture storage within the vadose zone of the Rollesbroich site, particularly if the maximum baseflow velocity (VIC) or fractions of sand, clay, and organic matter of each layer (CLM) are estimated jointly with the model states of each soil layer. The differences between the soil moisture simulations of VIC-3L and CLM are much larger than the discrepancies among the four data assimilation methods. The EnKF with state augmentation or dual estimation yields the best performance of VIC-3L and CLM during the calibration and evaluation period, yet results are in close agreement with the PF using MCMC resampling.
Overall, CLM demonstrated the best performance for the Rollesbroich site. The large systematic underestimation of water storage at 50 cm depth by VIC-3L during the first few months of the evaluation period questions, in part, the validity of its fixed water table depth at the bottom of the modeled soil domain.

Land surface models (LSMs) are used widely to simulate and predict the
exchanges of momentum, energy, and mass between the terrestrial biosphere and
overlying atmosphere at local, regional, and global scales. These models also
play a key role in assessing impacts of environmental changes (climate, land
use, and land cover) on energy, water, and biogeochemical fluxes
(e.g., CO

Many of the parameters of a LSM are model dependent and therefore difficult to transfer between different land surface schemes. Nevertheless, all LSMs use soil hydraulic, vegetation, and thermal parameters to describe heat transport, water flow, and root water uptake (canopy transpiration) in the variably saturated soil domain, and share a reflection coefficient (also known as surface albedo) to calculate the reflected shortwave radiation. Two main approaches exist to determine the hydraulic and thermal properties of the considered soil domain. Some LSMs such as the Community Land Model (CLM) use basic soil data (soil texture and organic matter fraction) to estimate hydraulic and thermal parameters via pedotransfer functions (Oleson et al., 2013; Han et al., 2014; Vereecken et al., 2016). Other land surface schemes such as the Variable Infiltration Capacity Model (VIC) (Liang et al., 1994; Gao et al., 2010) expect users to specify values for the hydraulic and thermal parameters. Pedotransfer functions are particularly useful in large-scale application of CLM as they simplify tremendously soil hydraulic characterization. Nevertheless, soil hydraulic parameter values derived from pedotransfer functions are subject to considerable uncertainty, and might therefore not accurately describe soil water movement and storage, particularly at larger spatial scales. What is more, (measurement) errors of the atmospheric forcing (e.g., wind speed, temperature, radiation, vapor pressure deficit, and precipitation) and errors in the auxiliary model input (e.g., topographic properties, vegetation characteristics) further enhance LSM prediction uncertainty.

In the past decades, many different search and optimization methods have been developed for automatic calibration of dynamic system models. Of these, Bayesian methods have found widespread application and use in Earth systems modeling due to their innate ability to treat, at least in principle, model input (forcing), output (forecast), parameter, and structural errors. The Bayesian approach relaxes the assumption of a single optimum parameter value in favor of a posterior parameter and forecast distribution which summarizes the coordinated impact of different uncertainties on the modeling results. Yet, general-purpose methods such as DREAM (Vrugt et al., 2008, 2009; Vrugt, 2016) require a relatively large number of LSM evaluations to estimate parameter and forecast uncertainty. This can pose significant computational challenges for CPU-intensive and parameter-rich LSMs, and complicates treatment of input data uncertainty via latent variables (e.g., Vrugt et al., 2008).

Data assimilation offers an attractive alternative: a general framework that accounts for uncertainty in LSM parameters, input, output, and other sources, and that takes advantage of all available ground-based, airborne, or spaceborne observations to improve the agreement between numerical models and corresponding data. This approach enables joint estimation of model state variables and parameters and simplifies treatment of forcing data errors (Liu and Gupta, 2007). Many different studies published in the hydrologic literature have demonstrated the benefits of parameter estimation in the context of data assimilation for soil moisture characterization (e.g., Montzka et al., 2011; Lee et al., 2014), rainfall–runoff (e.g., Moradkhani et al., 2005b; Vrugt et al., 2005a) and land surface modeling (e.g., Pauwels et al., 2009).

Data assimilation methods merge uncertain observations with predictions (output) of imperfect models to optimally estimate the state of a dynamical system. The prototype of this method, the Kalman filter (KF), was developed in the 1960s by Rudy Kalman for optimal control of linear dynamical systems (Kalman, 1960). The KF is a maximum likelihood estimator of the dynamic state of the system if the model error and measurement error distributions are (multivariate) normal. For nonlinear dynamical models this Gaussian assumption is not generally valid, and the KF will not give a maximum likelihood state estimate. The ensemble Kalman filter, or EnKF, is a stochastic generalization of the KF to nonlinear system models, in which the evolution of the model error covariance matrix is derived from a finite set of state realizations (Evensen, 1994). The use of this Monte Carlo ensemble not only makes possible state estimation for complex system models, but also enables the explicit treatment of different sources of modeling error. Two decades on from its inception, the EnKF has received operational status in real-time weather, tsunami, and flood prediction systems (amongst others) due to its proven ability to enhance a model's forecast skill and characterize accurately prediction uncertainty.

State estimation via the EnKF advances significantly the capabilities of hydrologic and land surface models to predict spatiotemporal dynamics of water movement and storage in soils, lakes, and reservoirs, and fluxes of mass, energy, and momentum between the soil and the atmosphere. The predictive skill of these models is, however, determined in large part by their parameterization. This has led hydrologists and hydrometeorologists to develop data assimilation approaches that permit the simultaneous inference of model state variables and parameter values. The power and usefulness of such joint state and parameter estimation methods have been investigated by many different authors in the water resources literature. Most of these publications use synthetic (or twin) experiments with assimilation of artificially generated data. Examples include studies with simulated measurements of the groundwater table depth or hydraulic head (Franssen and Kinzelbach, 2008; Bailey and Baù, 2012; Kurtz et al., 2014; Shi et al., 2014; Song et al., 2014; Tang et al., 2015), discharge/streamflow (Bailey and Baù, 2012; Moradkhani et al., 2012; Vrugt et al., 2013; Rasmussen et al., 2015), groundwater temperature (Kurtz et al., 2014), soil moisture (Wu and Margulis, 2011; Plaza et al., 2012; Erdal et al., 2014; Shi et al., 2014; Song et al., 2014; Pasetto et al., 2015), brightness temperature from passive remote sensing (Montzka et al., 2013; Han et al., 2014), and contaminant concentration (Gharamti et al., 2013).
These studies use a variety of different methods for joint parameter and state estimation, including the EnKF (Franssen and Kinzelbach, 2008; Wu et al., 2011; Gharamti et al., 2013; Erdal et al., 2014; Kurtz et al., 2014; Shi et al., 2014; Pasetto et al., 2015), the iterative EnKF (Song et al., 2014), the extended KF (Pauwels et al., 2009), the local ensemble transform KF (Han et al., 2014), the ensemble transform KF (Rasmussen et al., 2015), and the normal score EnKF (Tang et al., 2015).

The overarching conclusion from the body of synthetic experiments is that the joint estimation of parameters and state variables via data assimilation enhances significantly the predictive capabilities of hydrologic and land surface models. This finding is corroborated by results for real-world assimilation studies documented in a rapidly growing list of publications and involving model structural inadequacies, measurement errors of the atmospheric forcing variables and calibration (assimilation) data, inadequate characterization of the lower boundary condition (aquifer), and uncertainty of other, auxiliary, model inputs. This includes assimilation of measurements of the electrical conductivity (Wu and Margulis, 2013), hydraulic head in wells (Kurtz et al., 2014; L. Shi et al., 2015), groundwater temperature (Kurtz et al., 2014), streamflow and discharge (Moradkhani et al., 2012; Y. Shi et al., 2015), active remote sensing (Pauwels et al., 2009), passive brightness temperature (Qin et al., 2009), soil moisture from lysimeters (Lue et al., 2011; Wu and Margulis, 2013; Erdal et al., 2014; L. Shi et al., 2015), land surface temperature (Bateni and Entekhabi, 2012), and sensible and latent heat fluxes (Y. Shi et al., 2015) using methods such as the PF (Qin et al., 2009), PMCMC (Moradkhani et al., 2012), EnKF (Bateni and Entekhabi, 2012; Wu and Margulis, 2013; Erdal et al., 2014; Kurtz et al., 2014; Y. Shi et al., 2015), and the extended KF (Pauwels et al., 2009; Lue et al., 2011). Despite this growing body of applications, relatively few studies (e.g., Lue et al., 2011; Y. Shi et al., 2015) have focused on an accurate characterization of soil moisture dynamics simulated by LSMs. This is particularly surprising, as root zone moisture storage modulates spatiotemporal variations in climate and weather, and governs the production and health status of crops and the organization of natural ecosystems and biodiversity (Vereecken et al., 2008).

In this paper, we evaluate the usefulness and applicability of four different data assimilation methods for joint parameter and state estimation of VIC-3L and CLM using a 5-month calibration (assimilation) period of soil moisture measurements at 5, 20, and 50 cm depths in the Rollesbroich experimental test site in the Eifel mountain range in western Germany. This grassland site is part of the TERENO network of observatories and has been extensively monitored since 2011 to catalog long-term ecological, social, and economic impacts of global change at regional level. We used the EnKF with state augmentation (Chen and Zhang, 2006) or dual estimation (Moradkhani et al., 2005b), respectively, and the residual resampling PF (Douc et al., 2005) with a simple, statistically deficient (Moradkhani et al., 2005a), or more sophisticated, MCMC-based (Vrugt et al., 2013) parameter resampling method. The “calibrated” LSMs were tested using SPADE water content measurements from a 5-month evaluation period. To the best of our knowledge, this is only the second study after Chen et al. (2015) that compares sequential data assimilation methods for joint parameter and state estimation of a LSM. Related work by DeChant and Moradkhani (2012) and Dumedah and Coulibaly (2013) considers application to the rainfall–runoff transformation of a watershed.

The three main objectives of our study may be summarized as follows: (1) to evaluate the usefulness and applicability of joint parameter and state estimation for soil moisture characterization with LSMs, (2) to compare the performance of four commonly used parameter and state estimation methods in their ability to predict soil moisture dynamics at different depths in the Rollesbroich experimental test site, and (3) to compare, contrast, and juxtapose the soil moisture simulations and predictions of CLM and VIC.

The remainder of this paper is organized as follows. Section 2 discusses briefly VIC-3L and CLM, which are used as our LSMs to characterize soil moisture dynamics of the Rollesbroich experimental site in Germany. In this section, we contrast the numerical approaches, boundary conditions, and spatial discretization (soil layers) that are used by VIC-3L and CLM to describe water flow and storage in the modeled soil domain, and are particularly concerned with selection of their calibration parameters. Section 3 then reviews the basic concepts and theory of the four different data assimilation algorithms used herein. This is followed in Sect. 4 with a detailed discussion of the Rollesbroich experimental site, and the numerical implementation and setup of each data assimilation method. Section 5 introduces the results of the different parameter and state estimation methods and two LSMs, and Sect. 6 discusses the main findings of our numerical experiments and assimilation studies. Finally, this paper concludes in Sect. 7 with a summary of our main findings.

LSMs simulate terrestrial biosphere fluxes of matter and energy via numerical solution of the water, energy, and carbon balance of the land surface. This includes hydrologic processes such as soil evaporation, infiltration, surface runoff, canopy interception and transpiration, aquifer discharge, groundwater recharge, and precipitation (Schaake et al., 1996), and energy fluxes such as latent and sensible heat from soil, snow, surface water, and vegetated surfaces (Bertoldi, 2004). Their respective equations contain parameters whose values depend on global or regional distributions of vegetation and soil properties (Milly and Shmakin, 2002).

The Rollesbroich site investigated herein covers an area of about
270 000 m²

The assumption of homogeneity simplifies considerably the model definition in Eq. (1). Yet, this lumped topology might not characterize adequately real-world soil land surface systems that exhibit considerable spatial variations in soils, vegetation, and land properties. Such systems might necessitate distributed application of Eq. (1) via spatial discretization of the considered land surface domain into different grid cells. This discretization should honor spatial variations in vegetation and soil properties, and could account for small-scale (within-grid-cell) variability. Nevertheless, in our present application of LSMs we treat the Rollesbroich site as a single grid cell with grassland vegetation and homogeneous, but layered, soil (details to follow).

We now discuss briefly two different land surface schemes, VIC-3L and CLM, which are used to describe temporal variations in soil water storage at different depths in the Rollesbroich experimental site in Germany.

The VIC model is a macro-scale semi-distributed hydrological model which
solves for the water and energy balance of each grid cell using explicit
consideration of within-grid-cell vegetation variations. Accordingly, each
grid cell is divided into land cover tiles (Liang et al., 1994, 1996;
Cherkauer and Lettenmaier, 1999) and assumes constant values of the soil
properties (e.g., soil texture, hydraulic conductivity, thermal
conductivity). The total evapotranspiration, sensible heat flux, effective
land surface temperature, and runoff are then obtained for each grid cell by
summing over all the land cover tiles (vegetation types and bare soil)
weighted by their respective fractional coverage (Gao et al., 2010). The VIC
model can either be executed in a water balance mode or a water-and-energy
balance mode. In this paper, we assume the latter and use a 70 cm deep soil
composed of a 10 cm surface layer followed by middle and bottom layers of 20
and 40 cm, respectively. The relatively thin surface layer is used to
capture rapid fluctuations in soil moisture due to rainfall and bare soil
evaporation, and the deepest and thickest layer summarizes seasonal water
content dynamics and baseflow. We use herein VIC-3L and force the model
with atmospheric boundary conditions (e.g., precipitation, wind speed, air
temperature, longwave and shortwave radiation, and relative humidity) for the
Rollesbroich experimental site in Germany. In the absence of detailed
information about the hydraulic properties of the considered soil domain, we
treat each layer's saturated hydraulic conductivity, log

CLM is the land model for the Community Earth System Model (CESM) (Oleson et al., 2013), and is made up of multiple different building blocks, or modules, which resolve processes related to land biogeophysics, the hydrological cycle, biogeochemistry, and dynamic vegetation composition, structure, and phenology. The model recognizes explicitly surface heterogeneity by dividing each individual grid cell into multiple subgrid levels. For example, a grid cell can be made up of different land cover types, each with its own respective patches of plant functional types (PFTs) and associated stem area index and canopy height. The first subgrid level is defined by land units (vegetated, lake, urban, glacier, and crop), each composed of a number of different columns (second subgrid level) for which separate energy and water calculations are made. Vegetated land units, as well as lakes and glaciers, use one column. Urban land uses five separate columns, and for crop land there is a distinction between irrigated and unirrigated columns, with one single crop occupying each column. The third subgrid level is composed of PFTs and includes bare soil. The vegetated column has 16 possible PFTs besides bare soil. For the crop column, several crop types are available. Processes such as canopy evaporation and transpiration are calculated for each individual PFT, whereas soil and snow processes are calculated at the column level using areal-weighted values of the properties of the PFTs of individual patches. Note that a similar aggregation approach is used by VIC-3L.
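The areal-weighted aggregation described above can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: the function name `area_weighted_mean`, the flux values, and the fractional coverages are hypothetical and are not taken from CLM or VIC-3L.

```python
def area_weighted_mean(values, fractions):
    """Aggregate per-patch (or per-tile) values using fractional coverage as weights."""
    total = sum(fractions)
    if total <= 0.0:
        raise ValueError("fractional coverages must sum to a positive number")
    # Dividing by the total keeps the result a proper average even if the
    # supplied fractions do not sum exactly to one.
    return sum(v * f for v, f in zip(values, fractions)) / total


# Hypothetical example: three patches within one column.
latent_heat = [120.0, 80.0, 40.0]   # W m-2, illustrative values only
coverage = [0.5, 0.3, 0.2]          # fractional area of each patch
column_flux = area_weighted_mean(latent_heat, coverage)
```

For a single-grid-cell grassland setup such as the one used here, the aggregation reduces to the value of the one dominant patch.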

In our application of CLM to the Rollesbroich experimental site in Germany,
we calculate soil temperature for 15 different soil layers, and simulate
hydrological states and fluxes for the top 10 soil layers only. Appendix B
presents a brief description of the soil module of CLM, and discusses the
main parameters used. CLM is forced with atmospheric conditions (e.g.,
precipitation, vapor pressure deficit, wind speed, incoming shortwave and
longwave radiation) using values for the model parameters and initial states,
and land surface data and other physical constants and/or variables as
auxiliary input. The soil hydraulic (e.g., saturated hydraulic conductivity)
and thermal parameters of CLM are derived from built-in pedotransfer
functions (see Eqs. B1–B4 of Appendix B) using as inputs the auxiliary list

Description of the soil parameters of VIC-3L and CLM that are
subject to inference with the different data assimilation methods using the
5-month soil moisture calibration data period of the Rollesbroich site. We
list the symbol, unit, feasible range, perturbation, and domain of
application of each parameter of VIC-3L and CLM. The column with the
header “Perturbation” lists the statistical distributions that are used to
create the initial parameter ensemble for each data assimilation algorithm.
The notation

Before we proceed, we first summarize the main differences of VIC-3L and CLM in their calculations of the water and energy balance of the land surface. In the first place, VIC-3L treats the vadose zone as a multi-layer bucket with variable infiltration capacity, whereas CLM uses a more physics-based description of soil water movement, storage, and associated hydrological fluxes (e.g., root water uptake) by numerical solution of a modified form of Richards' equation (Zeng and Decker, 2009). A bucket model is computationally convenient, but sacrifices important detail regarding the vertical distribution of soil water storage. The latter is a prerequisite for characterizing accurately processes such as infiltration, redistribution, root-water uptake, drainage, and groundwater recharge. We refer the interested reader to Romano et al. (2011) for a detailed comparison of bucket type and Richards-based vadose zone flow models.

Second, VIC-3L treats the saturated and variably saturated soil domain as two separate, lumped, control volumes which are decoupled from the underlying groundwater reservoir. In other words, a fixed lower boundary condition is imposed. CLM, by contrast, simulates interactions between the modeled soil domain and an unconfined aquifer. The resulting water table variations of the aquifer affect soil water movement in the unsaturated zone via a variable recharge flux. In our application of CLM, this recharge flux emanates at the bottom of the tenth soil layer. The calculation of this recharge flux may be best explained via the use of a virtual soil layer, say layer 11, whose depth extends from the bottom of layer 10 to the groundwater table. If we assume hydrostatic conditions in layer 11, then we can calculate the recharge flux from layer 10 using Eq. (B9) in Appendix B. This recharge flux then changes the depth of the water table according to Eq. (B11). This equation also takes into consideration drainage from the water table due to topographic gradients. If the groundwater table is within the upper 10 soil layers, a drainage flux emanates from the uppermost saturated layer according to Eq. (B10).

Third, VIC-3L expects the user to specify values for the soil hydraulic
(e.g., saturated hydraulic conductivity), thermal, and baseflow parameters of
the first, second, and third layers of each grid cell, respectively, whereas
CLM derives their counterparts (e.g., hydraulic conductivity at
saturation, matric head at saturation, Clapp–Hornberger exponent

Finally, VIC-3L allows the user to determine freely the number and thickness of the soil layers in the bucket model (the default is three layers), whereas CLM assumes a fixed thickness of each soil layer.

LSMs contain a large number of parameters whose values can be adjusted by fitting model output to observed data. Yet, only a few of those parameters will affect noticeably model performance. Various authors have investigated the parameter sensitivity of VIC-3L via Monte Carlo simulation, generalized likelihood uncertainty estimation (GLUE), or model calibration methods (Demaria et al., 2007; Xie et al., 2007; Troy et al., 2008). These studies demonstrated a strong dependency of parameter sensitivity on climatic conditions. Table 1 lists VIC-3L and CLM parameters that have been selected for calibration via data assimilation, and reports their units, feasible ranges, perturbation, and spatial configuration. To honor prior information (e.g., soil textural data), we do not draw the model parameters from their feasible ranges, but rather sample their initial values around some best-guess VIC-3L and CLM parameterization using the normal and uniform distributions listed under the header “Perturbation”. This makes up the prior parameter distribution and is further explained in Sect. 4.2.
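The construction of such a prior ensemble might be sketched as follows. The perturbation types mirror the normal and uniform distributions mentioned above, but the function name, the numerical values, and the clipping of samples to the feasible range are assumptions of this sketch; the actual settings are those of Table 1 and Sect. 4.2.

```python
import random


def sample_prior_ensemble(best_guess, perturbation, bounds, n_members, seed=0):
    """Draw initial parameter values around a best-guess value.

    perturbation is ("normal", std) or ("uniform", half_width); the draw is
    added to the best guess. Clipping to the feasible range is an assumption
    of this sketch, not something stated in the text.
    """
    rng = random.Random(seed)
    kind, scale = perturbation
    lower, upper = bounds
    ensemble = []
    for _ in range(n_members):
        if kind == "normal":
            delta = rng.gauss(0.0, scale)
        elif kind == "uniform":
            delta = rng.uniform(-scale, scale)
        else:
            raise ValueError(f"unknown perturbation type: {kind}")
        ensemble.append(min(max(best_guess + delta, lower), upper))
    return ensemble


# Hypothetical sand fraction (%) of one soil layer; numbers are illustrative.
sand = sample_prior_ensemble(best_guess=20.0, perturbation=("uniform", 10.0),
                             bounds=(0.0, 100.0), n_members=100)
```

Sampling around a best guess, rather than across the full feasible range, keeps the initial ensemble consistent with the measured soil textural data.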

Appendices A (VIC-3L) and B (CLM) summarize the main variables, processes, and equations which are used by both models to describe the storage and vertical and/or horizontal movement of water in the variably saturated soil domain of the Rollesbroich site. These two appendices help to better understand the role of the different calibration parameters of Table 1, and will be most beneficial to readers who are rather unfamiliar with both models. Note that CLM estimates the hydraulic and thermal parameters of each soil layer from built-in pedotransfer functions (Oleson et al., 2013; Han et al., 2014) using as input the sand, clay, and organic matter fractions of each soil layer.

Data assimilation methods merge uncertain observations with predictions (output) of imperfect models to optimally estimate the state and/or parameters of a dynamical system. This includes the use of four-dimensional variational data assimilation (4D-Var), EnKF, PF, and related assimilation schemes. These methods have been applied successfully to a large number of different fields for model–data fusion in the atmospheric, oceanic, biogeochemical, and hydrological sciences. We now briefly discuss the theory of four different data assimilation methods which are used herein with VIC-3L and CLM to characterize spatiotemporal soil moisture dynamics at our experimental site.

The EnKF was proposed by Evensen (1994) as a generalization of the Kalman
filter to nonlinear system models with many state variables. This method uses
a Monte Carlo approach to generate an ensemble of different model
trajectories from which the time evolution of the probability density of the
model states and related error covariances are estimated. The EnKF uses a
state-space implementation of the dynamic system model of Eq. (1) and
implements the following steps (Burgers et al., 1998):

We can now update the predicted state values of each ensemble member as
follows:

In some cases it might be appropriate to estimate the model parameters along
with the state variables. This requires a slight modification to the
state-space formulation of Eq. (2) as the

In state augmentation, the

This results in the following equation for the updated states and parameter
values:
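As a hedged illustration of the idea, and not a reproduction of the paper's own notation, the sketch below applies a stochastic EnKF update with perturbed observations (Burgers et al., 1998) to augmented vectors that hold states followed by parameters. It is simplified to a single scalar observation so that no matrix inversion is needed, whereas the study assimilates three depths jointly. The function names and the toy linear forecast are hypothetical.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def covariance(xs, ys):
    """Unbiased sample covariance between two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def enkf_augmented_update(members, obs_index, y_obs, obs_std, rng):
    """Update augmented vectors [states..., parameters...] with one observation.

    members   : list of lists, one augmented vector per ensemble member
    obs_index : position of the observed state inside each vector
    y_obs     : measured value; obs_std is its error standard deviation
    """
    predicted = [m[obs_index] for m in members]
    innovation_var = covariance(predicted, predicted) + obs_std ** 2
    n_vars = len(members[0])
    # Gain of each element: its cross-covariance with the predicted
    # observation divided by the innovation variance.
    gain = [covariance([m[j] for m in members], predicted) / innovation_var
            for j in range(n_vars)]
    updated = []
    for m, y_pred in zip(members, predicted):
        # Each member is updated with its own perturbed observation.
        y_pert = y_obs + rng.gauss(0.0, obs_std)
        updated.append([m[j] + gain[j] * (y_pert - y_pred) for j in range(n_vars)])
    return updated


rng = random.Random(42)
members = []
for _ in range(200):
    theta = rng.gauss(0.5, 0.1)                         # hypothetical parameter
    state = 0.2 + 0.4 * theta + rng.gauss(0.0, 0.005)   # toy forecast, not an LSM
    members.append([state, theta])
analysis = enkf_augmented_update(members, 0, y_obs=0.44, obs_std=0.01, rng=rng)
```

The parameter is never observed directly; it moves only because the ensemble shows a correlation between the parameter and the observed state.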

In the dual estimation approach, the state variables and model parameters are
stored in two separate vectors and updated using their own individual steps
(Moradkhani et al., 2005b). The parameter values of each ensemble member are
first updated according to

The EnKF suffers from filter inbreeding; that is, the ensemble spread
degrades after several data assimilation steps. In extreme cases, the
covariance matrix

The PF was first suggested in the research area of object recognition,
robotics and target tracking (Gordon et al., 1993) and was introduced to
hydrology by Moradkhani et al. (2005b). The PF differs from the EnKF in that
it describes the evolving probability density function (PDF) of the LSM state
variables by a set of

If we assume the parameters to be known, then we can write the evolving
posterior distribution,

We conveniently assume herein a Gaussian likelihood function:
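A minimal sketch of how such a likelihood could be turned into normalized importance weights is given below. It assumes independent, zero-mean measurement errors with a common standard deviation and equal prior weights for all particles (as is the case directly after resampling). The function names, the error standard deviation, and the soil moisture values are hypothetical.

```python
import math


def gaussian_log_likelihood(observed, simulated, obs_std):
    """Log of an independent, zero-mean Gaussian likelihood."""
    log_l = 0.0
    for y, h in zip(observed, simulated):
        log_l += -0.5 * math.log(2.0 * math.pi * obs_std ** 2)
        log_l += -0.5 * ((y - h) / obs_std) ** 2
    return log_l


def normalized_weights(log_likelihoods):
    """Turn log-likelihoods into importance weights that sum to one."""
    shift = max(log_likelihoods)   # subtract the maximum to avoid underflow
    raw = [math.exp(value - shift) for value in log_likelihoods]
    total = sum(raw)
    return [r / total for r in raw]


# Illustrative soil moisture values (cm3 cm-3) at three depths.
observed = [0.30, 0.35, 0.40]
particles = [[0.31, 0.34, 0.41], [0.25, 0.30, 0.45], [0.40, 0.45, 0.50]]
log_l = [gaussian_log_likelihood(observed, p, obs_std=0.02) for p in particles]
weights = normalized_weights(log_l)
```

Working with log-likelihoods is a numerical convenience: with many observations the likelihood itself can become too small to represent in floating point.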

The PF makes use of the following identity of Eq. (14) to approximate the
evolving state PDF:

Before we can implement the PF in practice, we need to specify the importance
density,

To combat particle degeneracy, we monitor the effective sample size (ESS)
after assimilation of each new observation:
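For illustration, the commonly used definition ESS = 1 / Σ w_i² and a basic residual resampling step can be sketched as follows. The function names are hypothetical, and the threshold below which resampling is triggered is left to the user because it is not specified in this part of the text.

```python
import math
import random


def effective_sample_size(weights):
    """ESS = 1 / sum(w_i^2) for normalized importance weights."""
    return 1.0 / sum(w * w for w in weights)


def residual_resample(weights, rng):
    """Return the indices of the particles selected by residual resampling."""
    n = len(weights)
    # Deterministic part: particle i is copied floor(n * w_i) times.
    counts = [math.floor(n * w) for w in weights]
    indices = [i for i, c in enumerate(counts) for _ in range(c)]
    n_left = n - len(indices)
    if n_left > 0:
        residuals = [n * w - c for w, c in zip(weights, counts)]
        # Stochastic part: fill the remaining slots by multinomial draws
        # with probabilities proportional to the residual weights.
        indices += rng.choices(range(n), weights=residuals, k=n_left)
    return indices


weights = [0.5, 0.3, 0.1, 0.1]
ess = effective_sample_size(weights)
indices = residual_resample(weights, random.Random(1))
```

With uniform weights the ESS equals the ensemble size, and it approaches one as a single particle comes to dominate the weights.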

In the present application of the RRPF, we not only estimate the LSM states
but also jointly infer the values of the model parameters. We use state
augmentation and add the model parameters to the vector of LSM state
variables. Yet, this approach requires definition of an importance density
for the parameters to avoid parameter impoverishment after several successive
assimilation steps. This has been demonstrated numerically by Plaza et
al. (2012) using a series of data assimilation experiments. In principle, we
could corrupt the posterior parameter distribution using the ensemble
inflation method of Whitaker and Hamill (2012) detailed in Eq. (13). This
approach was used by Qin et al. (2009) to avoid degeneracy of the parameter
values. Instead, we use the approach described by Plaza et al. (2012) and
perturb the parameter values of the resampled particles using draws from a
zero-mean

The RR procedure produces a sample with more evenly distributed weights, but
many of the particles are exact copies of one another. To enhance sample
diversity, we therefore evaluate another resampling step using Markov chain
Monte Carlo (MCMC) simulation. We follow herein the MCMC resampling method of
Vrugt et al. (2013) and create candidate particles after RR using a discrete
proposal distribution with state and parameter jumps equal to a multiple of
the difference of two or more pairs of resampled particles. Each candidate
particle is then re-evaluated between

Before we proceed with application of the EnKF-AUG, EnKF-DUAL, RRPF and PMCMC
data assimilation methods, we revisit the key differences between the EnKF
and PF. These differences are often overlooked and misunderstood but of
crucial importance to help understand the two filters, and analyse and
interpret our findings (see Vrugt et al., 2013). Most critically, the EnKF
uses the measured values of the state variables (via measurement operator, if
appropriate) to correct (update) the forecasted states of each ensemble
member. The state PDF at each time is approximated by a weighted average of
the distributions of the measured and forecast states. The PF on the other
hand does not use a state analysis step, but rather assigns a likelihood to
each particle. This likelihood is a dimensionless scalar which measures in a
probabilistic sense the distance between the measured and forecasted state
variables. The state PDF at each time is then constructed via the likelihoods
(normalized importance weights) of the particles. Resampling is required to
rejuvenate the ensemble, but this step is rather inefficient compared to the
state analysis step of the EnKF as the measured states are only used
indirectly in the PF via calculation of the likelihood. What is more, a
single resampling step in RRPF or PMCMC does not guarantee a good
approximation of the actual state PDF, as the particles' forecasted states
may be systematically biased. Consequently, the PF may need a very large
ensemble and/or many resampling steps to characterize properly the state PDF.
By contrast, the state analysis step of the EnKF resurrects rapidly a biased
ensemble by migrating the members' forecasted states in closer vicinity of
their measured values. This crucial difference between the EnKF and PF is the
result of their dichotomous design, as is also evident from our mathematical
notation. The EnKF estimates separately at each time the state PDF via
Eq. (5), whereas the PF is designed to estimate the posterior distribution of
the entire state trajectory via the recursion of Eq. (18). This latter task
is much more difficult in practice, and requires use of the laws of
probability to ensure that each particle's state trajectory constitutes a
plausible realization from the transition density,

We apply the four data assimilation approaches to characterize soil moisture
dynamics of the 27 ha Rollesbroich experimental test site
(50

Aerial photograph of the 270 000 m²

The atmospheric LSM forcing data in this study were measured at the eddy covariance tower and include hourly measurements of air temperature, air pressure, relative humidity, wind speed, and incoming shortwave and longwave radiation. Precipitation was measured by a tipping bucket located in close proximity to the eddy covariance station. Soil texture was determined using 273 soil samples taken from three depth intervals: 5–11, 11–35, and 35–65 cm. The sample locations coincided exactly with the location of the SoilNet sensors. The soil textural composition, organic carbon content, and bulk density were determined for each sample using standard laboratory experiments. These values were averaged to obtain mean values for the listed depths. Soil hydraulic parameters were then estimated for each of these three measurement depths from pedotransfer functions using as input data the basic soil measurements.

In this work, we conveniently assume the soil land surface domain of the
Rollesbroich site to be homogeneous and characterized by areal-average values
of soil moisture content at 5, 20, and 50 cm depths. In other words, we
consider only vertical variations in soil water storage. Common LSM data
assimilation experiments published in the literature usually involve
application to much larger spatial scales, especially when remote sensing
data are used. Hence, it is important to evaluate the LSM performance for a
site where heterogeneities are neglected. Qu et al. (2014) investigated the
geostatistical properties of the soils of the Rollesbroich test site. This
work demonstrated a rather small spatial variability of the soil texture.
This does not suggest however that we can ignore spatial variations in the
measured soil moisture values. Indeed, the standard deviations of soil
moisture vary between 0.04 and 0.07 cm

A total of

Soil moisture contents measured at 5, 20, and 50 cm depths were assimilated jointly. The three (default) soil layers in VIC-3L (0–10, 10–30, and 30–70 cm) were synchronized to match the three measurement depths. Soil parameters were defined separately for all individual layers, measured or not. In CLM, we used 10 (default) soil layers with increasing thickness downwards (see Table 2). The 5, 20, and 50 cm measurement depths correspond to the third, fifth, and sixth layers in CLM. Spatial relationships (covariance matrices) between the soil parameters of the measured layers and their values of the unmeasured layers were used in the EnKF to update the parameterization of layers 1, 2, 4, 7, 8, 9, and 10. A slightly different approach was followed in RRPF and PMCMC, in which the soil parameters of the unmeasured moisture layers in CLM were updated to their weighted-average values of the resampled particles using the vector of normalized importance weights.
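To illustrate how cross-covariances allow the EnKF to update parameters of layers without measurements, the following minimal sketch implements a generic perturbed-observation EnKF analysis step with state augmentation. It is not the implementation used in this study: the function name `enkf_augmented_update`, the array layout, and the use of a diagonal measurement error covariance are assumptions made for illustration only.

```python
import numpy as np

def enkf_augmented_update(states, params, obs, obs_idx, obs_std, rng):
    """Generic EnKF analysis step on an augmented [state, parameter] ensemble.

    states : (n_ens, n_state) forecasted soil moisture of each member
    params : (n_ens, n_par) parameter values of each member
    obs    : (n_obs,) measured soil moisture
    obs_idx: indices of the state entries that are observed
    obs_std: measurement error standard deviation (assumed equal for all depths)
    """
    n_ens, n_state = states.shape
    z = np.hstack([states, params])             # augmented ensemble
    y_sim = states[:, obs_idx]                  # simulated measurements

    z_anom = z - z.mean(axis=0)
    y_anom = y_sim - y_sim.mean(axis=0)
    c_zy = z_anom.T @ y_anom / (n_ens - 1)      # cross-covariance with measured layers
    c_yy = y_anom.T @ y_anom / (n_ens - 1)      # covariance of simulated measurements
    r = np.diag(np.full(len(obs), obs_std**2))  # diagonal error covariance (assumption)

    gain = c_zy @ np.linalg.inv(c_yy + r)       # Kalman gain for states and parameters
    y_pert = obs + rng.normal(0.0, obs_std, size=y_sim.shape)  # perturbed observations
    z_new = z + (y_pert - y_sim) @ gain.T

    return z_new[:, :n_state], z_new[:, n_state:]
```

Because the gain is built from the sample covariance between every entry of the augmented vector and the simulated measurements, unobserved states and parameters are adjusted only to the extent that they covary with the measured layers in the ensemble.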

Nodal depth,

The measurement errors of the soil moisture observations are assumed to be
zero-mean Gaussian with standard deviation,

The hyetograph of each ensemble member is derived by multiplying the measured hourly precipitation rates of the tipping bucket by multipliers drawn from a unit-mean normal distribution with a standard deviation of 0.10. This is equivalent to a heteroscedastic error of 10 % of the observed precipitation (Hodgkinson et al., 2004). Forcing variables which govern evapotranspiration (incoming shortwave and longwave radiation, air temperature, relative humidity, and wind speed) were not corrupted.
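The precipitation perturbation described above can be sketched in a few lines. The function name `perturb_precipitation` is hypothetical, and two details are assumptions not stated in the text: an independent multiplier is drawn for every hour and ensemble member, and multipliers are clipped at zero to rule out negative rainfall.

```python
import numpy as np

def perturb_precipitation(precip, n_ens, rel_std=0.10, rng=None):
    """Return an (n_ens, n_hours) array of perturbed hourly precipitation."""
    rng = np.random.default_rng() if rng is None else rng
    precip = np.asarray(precip, dtype=float)
    # Unit-mean normal multipliers with standard deviation rel_std;
    # one draw per member and hour (assumption).
    multipliers = rng.normal(1.0, rel_std, size=(n_ens, precip.size))
    # Clipping at zero is an assumption; with rel_std = 0.10 it is almost never active.
    return np.clip(multipliers, 0.0, None) * precip
```

Since the multiplier scales the measured rate, the absolute error grows with rainfall intensity (heteroscedastic), and dry hours remain dry in every ensemble member.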

The initial values of VIC-3L and CLM parameters are sampled at random
using a simple two-step procedure. This approach honours soil textural data
and is consistent with related results published in the literature. First, we
draw

One may debate our best-guess parameter values of VIC-3L and CLM and their respective marginal distributions. Nevertheless, the prior parameter distribution used herein introduces more than sufficient dispersion into the best-guess parameter values to rapidly overcome a possibly deficient initial model parameterization. Note that the prior uncertainty of the two texture parameters (sand and clay fraction) in CLM is much larger than their spread derived from the texture measurements of each soil layer. This inflation of the prior distribution is done purposely to account indirectly for the epistemic uncertainty of the pedotransfer functions that are used to predict the soil hydraulic parameters. Indeed, the prior parameter uncertainty of the sand and clay fraction should be large enough to guarantee a sufficient soil moisture spread of the ensemble, which is of crucial importance for an adequate performance of the different data assimilation methods.

Figure 2 shows the measured records of daily precipitation and daily air
temperature for the 10-month measurement period used herein. The measurement
period is rather wet, with several intensive precipitation events during the
summer. For example, notice the event on 27 July 2012 in which 31 mm of
precipitation fell in just 1 h. Our experience suggests that such extreme
rainfall events corrupt the parameter estimates, in large part due to an
inadequate description and/or characterization of surface runoff. What is
more, the correlation between the hydraulic parameters of the different
layers of our soil domain and the moisture state deteriorates rapidly close
to saturation. Therefore, on days with rainfall in excess of 20 mm we resort
to state estimation only, and continue to do so for the next 2 consecutive days
to give VIC-3L and CLM sufficient opportunity to remove, via deficient surface
transport or state updating, the excess water. On the third day after each
20 mm

Historical records of daily mean air temperature (solid black line;
left

To evaluate the four joint state-parameter estimation algorithms with the two LSMs, we carried out the following three numerical experiments for VIC-3L and CLM (see also Table 3).

Open-loop simulation. We evaluate the LSMs from 1 March 2012 to 31 December 2012 with time-invariant parameters via Monte Carlo simulation using a large number of draws from the prior parameter distribution summarized in Table 1 and Sect. 4.2.

State updating with EnKF. The soil moisture state variables were updated during the 5-month calibration period using the SPADE moisture content measurements. In theory, soil moisture assimilation should improve our estimates of the initial states of the evaluation period. We posit that this enhanced state-value characterization should improve the accuracy of the LSM simulated (predicted) soil moisture values during the first few days/weeks of the evaluation period, after which the model performance deteriorates rapidly over time in the absence of recursive state adjustments.

Joint state-parameter estimation using RRPF, PMCMC, and EnKF with state augmentation and dual estimation. The soil moisture state variables and model parameters are estimated during the 5-month calibration period using the SPADE soil moisture measurements. The parameter values and state variables at the end of the calibration data period are used for the evaluation period.

Summary of the different numerical experiments used in this paper for CLM and VIC-3L and their respective abbreviations used in the subsequent tables and figures.

We used the Nash–Sutcliffe model efficiency (NSE) and the root mean square
error (RMSE) to evaluate the quality-of-fit of VIC-3L and CLM predicted
(simulated) soil moisture values during the calibration (assimilation) and
evaluation periods. These two metrics are computed separately for the 5, 20,
and 50 cm measurement depths as follows:
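In code, the two metrics can be sketched as below. The functions follow the standard textbook definitions of the RMSE and the Nash–Sutcliffe efficiency, which we assume correspond to the ones used here; the function names are illustrative, and the metrics would be applied separately to the series of each measurement depth.

```python
import numpy as np

def rmse(obs, sim):
    """Root mean square error between observed and simulated values."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return float(np.sqrt(np.mean((sim - obs) ** 2)))

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    residual = np.sum((sim - obs) ** 2)
    variance = np.sum((obs - obs.mean()) ** 2)
    return float(1.0 - residual / variance)
```

A negative NSE indicates that the simulation describes the data worse than the mean of the observations would.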

In this section we present the results of our numerical experiments. We first discuss our findings for VIC-3L, followed by the results of CLM. Section 6 proceeds with a discussion of the main findings.

Figure 3 displays the observed (blue dots) and VIC-3L predicted soil moisture
values (solid lines) at (a) 5, (b) 20, and (c) 50 cm depths using PMCMC
(black), RRPF (red), EnKF-AUG (green), and EnKF-DUAL (cyan). As the
Rollesbroich test site experiences a yearly average precipitation of more
than about 1000 mm, it is no surprise that the upper soil layer at 5 cm is
rather wet with volumetric soil moisture contents that vary dynamically
between 0.3 and 0.5 cm

Calibration period: values of the NSE and RMSE summary statistics of the quality-of-fit of VIC-3L for the Rollesbroich soil moisture observations at 5, 20, and 50 cm depths using the PMCMC, RRPF, EnKF-AUG, and EnKF-DUAL data assimilation methods. For completeness, we also list the performance of the EnKF for state estimation only (noParamUpdate) using VIC-3L parameter values drawn randomly from the prior parameter distribution, and the performance of an open-loop run of VIC-3L (OpenLoop) using the mean simulation of many different VIC-3L parameterizations drawn randomly from the prior parameter distribution (see Table 1 and Sect. 4.2).

Assimilation period: observed (blue dots) and VIC-3L predicted time
series (solid lines) of soil moisture content at depths of

The different data assimilation methods demonstrate a rather similar performance with VIC-3L predicted moisture contents that track reasonably well the three different layers. Note however that RRPF does not reproduce well the measured data at 50 cm depth in the period from March (week 1) to June (week 17). This might be caused by filter inbreeding of the states, and will be discussed later (see also Fig. 9b). Nevertheless, RRPF recovers the observed soil moisture data in week 18. Although difficult to see, the EnKF produces the best results at 50 cm depth (state augmentation and dual estimation).

Table 4 summarizes the NSE and RMSE values of PMCMC, RRPF, EnKF-DUAL, and
EnKF-AUG for the calibration (assimilation) period. We also list the
performance of VIC-3L without data assimilation (OpenLoop) using the mean
soil moisture time series of many different realizations of the prior
parameter distribution, and include RMSE and NSE values of the EnKF for state
estimation only (noParamUpdate) using VIC-3L parameterizations drawn randomly
from its prior parameter distribution. The open loop deviates most from the
measured values, with RMSE values of 0.036, 0.037, and
0.129 cm

Figure 4 presents trace plots of VIC-3L parameters during the 5-month
calibration period using the PMCMC (black), RRPF (red), EnKF-AUG (green), and
EnKF-DUAL (cyan) data assimilation methods. We display the ensemble mean
saturated hydraulic conductivity (log

Trace plots (solid lines) of VIC-3L parameters. Saturated
hydraulic conductivity (log

Sampled trajectories of the

To provide a better understanding of the ensemble spread of VIC-3L
parameters, please consider Fig. 5, which presents trace plots of the sampled
log

Figure 6 displays VIC-3L simulated soil moisture time series for the independent 5-month evaluation period at (a) 5, (b) 20, and (c) 50 cm depths using initial states and parameter values derived from PMCMC (black), RRPF (red), EnKF-AUG (green), and EnKF-DUAL (cyan). The observed soil moisture values are separately indicated with the solid blue dots. The water content simulations of VIC-3L are hardly distinguishable, except for the deepest soil layer at 50 cm depth. Apparently, it does not matter which data assimilation method is used to estimate VIC-3L parameter values and initial states of the evaluation period. VIC-3L tracks very well the soil moisture data at 20 cm depth, but does not do a particularly good job of describing water content dynamics at 5 and 50 cm depths. In particular, the model systematically underestimates the observed storage of the bottom soil layer between weeks 25 and 36. This might be a consequence of the use of a fixed lower boundary condition (no connection with the underlying aquifer) and/or the relatively simple baseflow parameterization. Although not further shown herein, a separate VIC-3L run using state estimation only (noParamUpdate) produces results similar to an open-loop simulation after a few days.

Evaluation period: observed (blue dots) and VIC-3L simulated time
series (solid lines) of soil moisture content at depths of

We summarize in Table 5 the NSE and RMSE values of PMCMC, RRPF, EnKF-DUAL, and EnKF-AUG during the 5-month evaluation period. We also list the performance of VIC-3L without data assimilation (OpenLoop) using the mean soil moisture time series of many different realizations of the prior parameter distribution, and include RMSE and NSE values of the EnKF for state estimation only (noParamUpdate) using VIC-3L parameterizations drawn randomly from its prior parameter distribution. In general, the RMSE values of the evaluation period are much higher than their counterparts of the assimilation period, and noParamUpdate produces RMSE values similar to those of an open-loop simulation. VIC-3L parameter estimation is productive, as it substantially reduces the RMSE values of the 20 and 50 cm measurement depths compared to a model run with state estimation only (noParamUpdate) and parameters drawn randomly from their prior distribution. More specifically, PMCMC, RRPF, EnKF-AUG, and EnKF-DUAL show an RMSE improvement of about 54 and 42 % for the second and third measurement depths compared to OpenLoop and noParamUpdate. The NSE values of VIC-3L for the 50 cm depth are negative for all six methods, conclusively demonstrating an inferior performance of the model for this soil layer.

Evaluation period: values of the NSE and RMSE summary statistics of the quality of fit of VIC-3L for the Rollesbroich soil moisture observations at 5, 20, and 50 cm depths using the calibrated parameter values and initial states derived from the PMCMC, RRPF, EnKF-AUG and EnKF-DUAL data assimilation methods. For completeness, we also list the performance of the EnKF using state estimation only (noParamUpdate) using VIC-3L parameter values drawn randomly from the prior parameter distribution, and the performance of an open-loop run of VIC-3L (OpenLoop) using the mean simulation of many different VIC-3L parameterizations drawn randomly from the prior parameter distribution.

We now investigate in more detail the effect of MCMC resampling with the PF as Fig. 4 has demonstrated that PMCMC produces rather dynamic trajectories of the sampled parameter values. Nevertheless, the parameters converge to stable values at the end of the assimilation period. This suggests that the choice of the length of the calibration period is crucially important in determining the performance of PMCMC during the evaluation period. To investigate this in more detail we use 11 June, 30 June, 20 July, and 31 July 2012 as end dates of the PMCMC calibration period and verify VIC-3L performance for the same 5-month evaluation period. The different end dates are conveniently referred to as PMCMC_0611, PMCMC_0630, PMCMC_0720, and PMCMC_0731 in Fig. 7. The simulated soil moisture trajectories of PMCMC_0630, PMCMC_0720, and PMCMC_0731 are in excellent agreement, but deviate from PMCMC_0611. Thus, a 4-month calibration period would have led to the same results of PMCMC.

The effect of initial uncertainties on the performance of EnKF with the ensemble inflation method is also tested with the VIC-3L model. Table 6 compares the RMSE values of EnKF-AUG and EnKF-DUAL for the calibration and evaluation period using heteroscedastic precipitation data errors equivalent to 10 % (default) and 20 % of their measured hourly rates plotted in Fig. 2. We list separate RMSE values for each soil moisture measurement depth. In short, the results are equivalent for both EnKF implementations.

RMSE values of VIC-3L for the Rollesbroich soil moisture measurements at 5, 20, and 50 cm depths using the EnKF with state AUGmentation or DUAL estimation during the calibration period. We also summarize the subsequent performance of the VIC-3L model using the calibrated parameter values and initial states derived from AUG and DUAL. The subscripts 10 and 20 % signify the standard deviations of the measurement errors that are used to corrupt the hourly precipitation data.

RMSE values of VIC-3L for the Rollesbroich soil moisture
observations at 5, 20, and 50 cm depths using data assimilation with RRPF
during the calibration period. We also summarize the subsequent performance
of the VIC-3L model using the calibrated parameter values and initial states
derived from RRPF. The subscripts 0.01, 0.1, and 0.5 signify the value of the
scaling factor

Next, we evaluate the effect of the choice of the scaling factor

Figure 8 displays trace plots of the sampled

Evaluation period: VIC-3L simulated volumetric moisture contents at

Sampled trajectories of the

Figure 10 shows the observed (blue dots) and ensemble mean predicted soil
moisture values by CLM (solid lines) at (a) 5, (b) 20, and (c) 50 cm
depths during the assimilation period using PMCMC (black), RRPF (red), EnKF-AUG
(green), and EnKF-DUAL (cyan). The most important results are as follows.
First, the ensemble mean soil moisture time series of CLM exhibit a
larger spread than VIC-3L depicted previously in Fig. 3. Second, the EnKF-AUG
and EnKF-DUAL exhibit a superior performance with ensemble mean CLM
simulations that closely track the soil moisture observations at
each depth. Third, the moisture time series (and data) are most
dynamic at the 5 cm depth in response to the variable atmospheric boundary
conditions. Fourth, the worst performance is observed for RRPF, as evidenced
by systematic deviations of this filter's soil moisture predictions from the
observed data between weeks 3–6 and 18–21 for the 5 cm depth, weeks 1–14
and weeks 18–21 for the 20 cm depth, and weeks 1–15 and 19–22 for the
50 cm measurement depth. Fifth, the initial soil moisture values of CLM
at 50 cm depth appear positively biased with a distance of approximately
0.05 cm

Soil moisture trajectories of the

CLM predicted time series of soil moisture content at

Table 8 lists the NSE and RMSE values of PMCMC, RRPF, EnKF-DUAL, and EnKF-AUG for the CLM calibration (assimilation) period. We also list the performance of CLM without data assimilation (OpenLoop) using the mean soil moisture time series of many different realizations of the prior parameter distribution, and list in the column with header “noParamUpdate” the RMSE and NSE values of the EnKF using state estimation only with CLM parameterizations drawn randomly from the prior parameter distribution. These results demonstrate that soil moisture assimilation enhances considerably the ability of CLM to predict the observed data. Compared to the open-loop CLM simulation, the RMSE is reduced from 0.051, 0.031, and 0.069 to values of about 0.020, 0.012, and 0.016 (average) for the different data assimilation methods, respectively. Yet, the RMSE and NSE values of a CLM run with state estimation only (noParamUpdate) appear as good as those derived from joint parameter and state estimation using PMCMC, RRPF, EnKF-AUG, and EnKF-DUAL. Overall, the best performance is observed for EnKF-AUG and EnKF-DUAL, followed by PMCMC and RRPF.

Calibration period: values of the NSE and RMSE summary statistics of the quality of fit of CLM for the Rollesbroich soil moisture measurements at 5, 20, and 50 cm depths with the PMCMC, RRPF, EnKF-AUG and EnKF-DUAL data assimilation methods. For completeness, we also list the performance of the EnKF for state estimation only (noParamUpdate) using CLM parameter values drawn randomly from the prior parameter distribution, and the performance of an open-loop run of CLM (OpenLoop) using the mean simulation of many different CLM parameterizations drawn randomly from the prior parameter distribution.

We proceed in Fig. 11 with trace plots of the

Sampled trajectories of the

Figure 12 displays the observed (blue dots) and ensemble mean predicted soil moisture values by CLM (solid lines) at (a) 5, (b) 20, and (c) 50 cm depths during the evaluation period using PMCMC (black), RRPF (red), EnKF-AUG (green), and EnKF-DUAL (cyan). The soil moisture time series of the different data assimilation methods appear rather similar, with the largest differences observed at the 50 cm depth. In general, the PMCMC, RRPF, EnKF-AUG, and EnKF-DUAL methods do not do a particularly good job of tracking the soil moisture observations of the top soil layer. Indeed, the CLM soil moisture predictions derived from the different data assimilation methods are systematically biased, either underestimating (weeks 35–41 and 43–44) or overestimating (weeks 24–31 and 42) the observed soil moisture data during large parts of the evaluation data set. CLM tracks much better the soil moisture data of the 20 and 50 cm depths.

Trace plots of soil moisture contents simulated by CLM during
the evaluation period at

Finally, Table 9 presents the NSE and RMSE values of PMCMC, RRPF, EnKF-AUG and EnKF-DUAL during the 5-month evaluation period. We also list the performance of CLM without data assimilation (OpenLoop) using the mean soil moisture time series derived from many different realizations of the prior parameter distribution, and display NSE and RMSE values of the EnKF using state estimation only (noParamUpdate) with CLM parameterizations drawn randomly from the prior parameter distribution. The results of this table are in agreement with our findings for VIC-3L. Indeed, the RMSE values of the evaluation period are much higher than their counterparts of the assimilation period. This is particularly evident for the 5 cm measurement depth where RMSE values have increased from 0.017–0.027 to 0.054–0.058. The deeper measurement depths do not appear to be as much affected, consistent with our findings from Fig. 12. The results also highlight the importance of joint CLM parameter and state estimation as state estimation alone (column noParamUpdate) results in significantly larger RMSE values during the evaluation period. This is most evident for the 50 cm measurement depth, where the RMSE value of 0.050 of noParamUpdate is much larger than its value of 0.016–0.025 derived from PMCMC, RRPF, EnKF-AUG and EnKF-DUAL. Altogether, RRPF achieves the worst performance of all four parameter-state estimation methods during the evaluation period. PMCMC, EnKF-AUG and EnKF-DUAL provide rather similar RMSE and NSE values.

Evaluation period: NSE and RMSE values for the Rollesbroich soil moisture measurements at 5, 20, and 50 cm depths using CLM. The initial states and parameter values used by the PMCMC, RRPF, EnKF-AUG, and EnKF-DUAL data assimilation methods originate from the 5-month calibration data period. For completeness, we also list the performance of the EnKF with state estimation only (noParamUpdate) using CLM parameter values drawn randomly from the prior parameter distribution, and the performance of an open-loop run of CLM (OpenLoop) using the mean simulation of many different CLM parameterizations drawn randomly from the prior parameter distribution.

In this study, we have evaluated the usefulness and applicability of four different data assimilation methods for joint parameter and state estimation of the VIC-3L and CLM land surface models using a 5-month calibration (assimilation) data set of distributed SPADE soil moisture measurements at the 5, 20, and 50 cm depths in the Rollesbroich test site in the Eifel mountain range in western Germany. We used the EnKF with state augmentation or dual estimation, respectively, and the PF with a simple, statistically deficient, or more sophisticated, MCMC-based parameter resampling method. The “calibrated” LSM models were tested using water content data from a 5-month evaluation period. The uniqueness of the present work resides in the application of these four joint or dual parameter and state estimation methods to real-world data.

Our results demonstrated that joint inference of VIC-3L and CLM soil
parameters improved considerably soil moisture characterization during the
evaluation period compared to the mean water content predictions of an
open-loop run derived via averaging of simulations of many different
realizations drawn randomly from the prior parameter distribution. This is
particularly true for CLM, the two deeper soil layers, and the EnKF-AUG
and EnKF-DUAL methods (but followed closely by PMCMC). Despite this
improvement in model performance over an open-loop simulation, VIC-3L and
CLM do not adequately characterize soil moisture dynamics of the top layer
(5 cm measurement depth) during the evaluation period (RMSE values of about
0.05 cm

The improvement in quality-of-fit of the VIC-3L and CLM models compared to an open-loop run does not necessarily imply that the estimated parameter values of VIC-3L and CLM characterize better the hydraulic properties and maximum baseflow velocity of the soils of the Rollesbroich experimental test site. Assimilation studies with synthetically generated data help to ascertain whether the model parameters converge properly to their “true” values, yet this is difficult to confirm with real-world measurements. State estimation will, without doubt, help reduce the impact of epistemic errors and systematic biases of LSM input and forcing data on parameter inference during the assimilation period (e.g., Vrugt et al., 2005b). But the calibrated parameter values derived with state estimation do not necessarily guarantee a consistent and adequate model performance during an independent evaluation period without state estimation. Indeed, without assimilation the simulated states may diverge from their “measured” values and deteriorate model performance in an evaluation period. This raises the question of which parameter values we should use to predict future system behavior outside an assimilation period. Should we use parameter estimates derived with state estimation or should we use their values derived via batch calibration (optimization) without recursive adjustments to the state variables? This dilemma is illustrated further in Vrugt et al. (2005b) by modeling of a subsurface tracer test using data from Yucca Mountain, Nevada, USA. We conclude that the enhanced performance of VIC-3L and CLM during the evaluation period compared to our open-loop simulation is due to improved estimates of the initial states and the soil parameters.

In our implementation of the EnKF and PF, VIC-3L and CLM parameters were
assumed to be time-variant and their values updated jointly with the model
states at each assimilation time step. The 5-month calibration period we used
herein involves several large precipitation events, and as a consequence, the
soil profile is rather wet. The resulting parameter estimates might therefore
not be representative of dry periods with much lower moisture values of the
soil profile. What is more, the assumption of spatial homogeneity might not
characterize adequately the distributed soil properties of the Rollesbroich
site and induce temporal variability in VIC-3L and CLM parameters. Bias
in model input and measurement errors of the forcing data also contribute to
the temporal fluctuations of the estimated parameter values. These temporal
parameter variations are meaningful in some cases as they can help diagnose
structural model inadequacies and/or biases in model input and forcing data.
Kurtz et al. (2012) successfully estimated a temporally variant parameter
with the EnKF, but these authors concluded that the algorithm needs a
considerable spin-up period to “warm-up” to new parameter values. Vrugt et
al. (2013) found considerable temporal non-stationarity in the parameters
estimated by PMCMC as a result of the small time period used to calculate the
acceptance probability of candidate particles. This finding is in agreement
with the results of PMCMC in our paper. Of course, we could have assumed
time-invariant parameters via a method such as SODA, but this would have
significantly increased the computational requirements. Fortunately, parameters
estimated via our implementation of the EnKF exhibit asymptotic properties
during the assimilation period (e.g., see Y. Shi et al., 2015). This is
particularly true for highly sensitive parameters. An example of this was
parameter

It is difficult to assess whether the inferred VIC-3L and CLM parameter values will do a good job of predicting soil moisture dynamics at the different measurement depths during a much longer evaluation period with wet and dry conditions. As the estimated parameters represent apparent properties of the Rollesbroich site, one may expect their calibrated values not to change too much over time. We would need additional soil moisture data and/or other types of measurements to corroborate this. Nevertheless, the apparent parameter values derived herein improve characterization of soil moisture dynamics at the Rollesbroich site compared to a separate state estimation run with VIC-3L and CLM using parameters drawn randomly from the prior distribution, or open-loop simulation using the ensemble mean model output of a large cohort of parameter vectors drawn randomly from the prior parameter distribution (initial parameter ensemble).

The different data assimilation methods (EnKF-AUG, EnKF-DUAL, RRPF, and
PMCMC) led to a rather similar performance of VIC-3L during the
calibration and evaluation periods. The only exception to this was the
anomalous RMSE value of RRPF at the 50 cm measurement depth during the
calibration period. This was explained by the slow convergence of the maximum
baseflow velocity in RRPF. Our results for VIC-3L further demonstrated
that the results of EnKF-AUG and EnKF-DUAL were equivalent for 10 and
20 % rainfall errors. Moreover, the use of a larger value of the scaling

For CLM, larger differences were observed in the performance of the
different data assimilation methods. This larger disparity among the methods
is explained by the considerably larger number of soil layers (10) used by
CLM. This increased significantly the dimensionality of the parameter
estimation problem. The overall best results at the 5, 20, and 50 cm
measurement depths were observed for EnKF-AUG and EnKF-DUAL, with RMSE values
that were somewhat smaller than their counterparts derived from PMCMC. This
was true for both the calibration and evaluation periods. The RRPF exhibited
the worst performance, in part determined by the use of a relatively small
ensemble of

Finally, our results demonstrated that the differences between the soil moisture simulations of VIC-3L and CLM are much larger than the discrepancies between the four data assimilation methods. Overall, CLM performed better than VIC-3L, especially at 50 cm measurement depth. Of course, we cannot generalize this finding to other sites, but VIC-3L's rather poor characterization of soil moisture dynamics at 50 cm depth (systematic underestimation during the first 2–3 months) warrants investigation into the use of a variable water table depth in this model to account for interactions between the variably saturated soil domain and the groundwater reservoir of the Rollesbroich site. CLM simulates such interactions and the resulting variations in the water table depth affect soil water movement in the unsaturated zone.

In this study, we have evaluated the usefulness and applicability of four different data assimilation methods for joint parameter and state estimation of the Variable Infiltration Capacity Model (VIC-3L) and the Community Land Model (CLM) using a 5-month calibration (assimilation) period (March–July 2012) of areal-averaged SPADE soil moisture measurements at 5, 20, and 50 cm depths in the Rollesbroich experimental test site in the Eifel mountain range in western Germany. This watershed is part of the TERENO observatories and has been extensively monitored since 2011 to catalog long-term ecological, social, and economic impacts of global change at regional level. We used the EnKF with state augmentation or dual estimation, respectively, and the PF with a simple, statistically deficient, or more sophisticated, MCMC-based parameter resampling method. The “calibrated” LSM models were tested using SPADE water content measurements from a 5-month evaluation period (August–December 2012). The performance of the four different state and parameter estimation methods appeared rather similar during the evaluation period, with a slightly better performance of the augmentation and dual estimation methods, followed closely by PMCMC and then RRPF. The differences between the soil moisture simulations of VIC-3L and CLM are much larger than the discrepancies between the four data assimilation methods. Overall, the best performance was observed for CLM. The large systematic underestimation of water storage at 50 cm depth by VIC-3L during the first few months of the evaluation period questions, in part, the validity of its fixed lower boundary condition at the bottom of the modeled soil domain. This approach ignores the movement of water into and out of the groundwater reservoir of the Rollesbroich site. CLM simulates interactions of the modeled soil domain with the Rollesbroich aquifer via the use of a variable water table depth at the lower boundary.

The integrated water balance in VIC-3L can be written as follows:

The direct runoff in Eq. (A2) is not only a function of the water saturation
of the first layer, but also depends on the moisture content of the second
underlying soil layer. To track adequately the large storage variations of
the topsoil observed in the experimental data, the first layer of VIC-3L
must be rather thin. Consequently, this top layer will saturate quickly in
response to rainfall as it has only a small water holding capacity. Hence,
VIC-3L uses the available storage of the
first and second layer to determine the excess precipitation, which is set
equivalent to

Having discussed the different fluxes from the soil domain simulated by
VIC-3L, we can now write differential equations of the moisture dynamics in
the individual soil layers (see also Liang et al., 1996).
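As a generic illustration of such layer-wise bookkeeping, the sketch below advances the storage of a stack of soil layers by one explicit Euler step. It is a mass balance skeleton only and does not reproduce the VIC-3L flux formulations, for which the reader is referred to Liang et al. (1996). The function name, the argument names, and the numbers in the usage lines are hypothetical.

```python
def step_layer_storage(storage, infiltration, drainage, sinks, baseflow, dt):
    """Advance layer water storages by one explicit Euler step.

    Illustrative sketch only; all fluxes are supplied by the caller.

    storage      : current storage of each layer, top to bottom (mm)
    infiltration : water entering the top layer (mm/day)
    drainage     : flux from layer i to layer i+1 (mm/day), length n - 1
    sinks        : evapotranspiration withdrawn from each layer (mm/day), length n
    baseflow     : flux leaving the bottom layer (mm/day)
    dt           : time step (day)
    """
    n = len(storage)
    updated = []
    for i in range(n):
        # The top layer receives infiltration; deeper layers receive drainage
        # from the layer directly above.
        inflow = infiltration if i == 0 else drainage[i - 1]
        # The bottom layer loses baseflow; other layers drain to the layer below.
        outflow = baseflow if i == n - 1 else drainage[i]
        updated.append(storage[i] + dt * (inflow - outflow - sinks[i]))
    return updated


# Hypothetical three-layer example (values chosen for illustration only).
new_storage = step_layer_storage(
    storage=[20.0, 100.0, 300.0],
    infiltration=5.0,
    drainage=[3.0, 1.0],
    sinks=[1.0, 0.5, 0.0],
    baseflow=0.2,
    dt=1.0,
)
```

Because every inter-layer flux leaves one layer and enters the next, the change in total storage equals infiltration minus evapotranspiration and baseflow, which offers a simple consistency check on any implementation.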

The use of three soil layers by VIC-3L makes it difficult to describe
accurately the vertical moisture distribution in the vadose zone. Indeed,
VIC-3L cannot distinguish between saturated and partially saturated areas in
a given soil layer. As a consequence, the baseflow flux,

This Appendix summarizes the main equations of CLM which are used to describe variably saturated water flow in the soil domain of our experimental catchment. The model uses a water balance formulation similar to Eq. (A1) of Appendix A to simulate moisture storage and movement in the soil of each grid cell of the application domain of interest. Yet, CLM includes a more exhaustive description of all the different processes that determine the water storage of the land surface. This includes canopy water, surface water, snow water, soil water, soil ice, and water stored in the unconfined aquifer. In addition to surface and subsurface runoff, CLM also considers runoff from glaciers, wetlands, and lakes.

Fluxes,

We use 10 soil layers (see Table 2) in CLM to solve for the vertical
storage and movement of water. Whenever the index

Vertical flow in the unsaturated zone is governed by rainfall infiltration,
surface and subsurface runoff, root water uptake (canopy transpiration), and
groundwater interactions. A modified Richards equation is used to predict
water storage and movement in the variably saturated soils of the
Rollesbroich site:

The matrix head,

The bottom boundary condition of Eq. (B6) depends on the depth of the water
table. This depth,

The authors declare that they have no conflict of interest.

We would like to thank the Terrestrial Environmental Observatories (TERENO) community for freely sharing with us the measurement data of the Rollesbroich experimental test site. The supercomputing center of Forschungszentrum Jülich is acknowledged for its computational support and our access to the JUROPA cluster. The first author of this paper was funded by a stipend from the government of China. We are grateful to the two anonymous reviewers and editor Kurt Roth for the very careful and detailed evaluation of this paper. The article processing charges for this open-access publication were covered by a Research Centre of the Helmholtz Association.

Edited by: Kurt Roth
Reviewed by: two anonymous referees