Articles | Volume 28, issue 4
Research article
28 Feb 2024

Evapotranspiration prediction for European forest sites does not improve with assimilation of in situ soil water content data

Lukas Strebel, Heye Bogena, Harry Vereecken, Mie Andreasen, Sergio Aranda-Barranco, and Harrie-Jan Hendricks Franssen

Land surface models (LSMs) are an important tool for advancing our knowledge of the Earth system. LSMs are constantly improved to represent the various terrestrial processes in more detail. High-quality data, freely available from various observation networks, are being used to improve the prediction of terrestrial states and fluxes of water and energy. To optimize LSMs with observations, data assimilation methods and tools have been developed in the past decades. We apply the coupled Community Land Model version 5 (CLM5) and Parallel Data Assimilation Framework (PDAF) system (CLM5-PDAF) for 13 forest field sites throughout Europe covering different climate zones. The goal of this study is to assimilate in situ soil moisture measurements into CLM5 to improve the modeled evapotranspiration fluxes. The modeled fluxes are evaluated against evapotranspiration fluxes measured with eddy covariance (EC) systems. Most of the sites use point-scale measurements from sensors placed in the ground; however, for three of the forest sites we use soil water content data from cosmic-ray neutron sensors, which have a measurement scale closer to the typical land surface model grid scale and EC footprint. Our results show that while data assimilation reduced the root-mean-square error for soil water content on average by 56 % to 64 %, the root-mean-square error for the evapotranspiration estimation increased by 4 %. This finding indicates that only improving the soil water content (SWC) estimation of state-of-the-art LSMs such as CLM5 is not sufficient to improve evapotranspiration estimates for forest sites. To improve evapotranspiration estimates, it is also necessary to consider the representation of leaf area index (LAI) in magnitude and timing, as well as uncertainties in water uptake by roots and vegetation parameters.

1 Introduction

Land surface models (LSMs) are important tools to improve our understanding of the Earth system. LSMs cover a broad range of land surface processes like the partitioning of incoming energy at the land surface, mass exchange between the land and atmosphere, and hydrological and ecological processes. They use sophisticated parameterizations and are constantly improved to achieve a more accurate representation of land surface processes (e.g., Arora et al., 2020, and references therein). However, there are still many sources of uncertainty that introduce systematic biases in LSMs (e.g., initial conditions, atmospheric forcings, parameters, and parameterization). One approach to improve model predictions is to assimilate observational data. Improved estimates of evapotranspiration (ET) by LSMs are of particular interest, as ET is a major driver of climate and weather, an important component of the water and energy cycles, and closely coupled to the carbon cycle through the photosynthesis process (Jung et al., 2011). Fine-spatial-scale ET estimations are important to estimate water use and plant stress (Wurster et al., 2020). The flux of ET is, however, influenced by multiple factors, including soil water content (SWC), soil properties, ecophysiological processes, and vegetation characteristics (Wilson et al., 2004), so it is more common to assimilate these prognostic variables rather than ET itself.

Many studies assimilate soil moisture products into LSMs (e.g., Hung et al., 2022; Mahmood et al., 2019; Naz et al., 2019; Liu and Mishra, 2017; Han et al., 2015a) and report on the impact on hydrological variables like root-zone moisture and runoff. Some studies use assimilation of soil water content or related variables to evaluate ET estimation of LSMs. For example, Girotto et al. (2017) assimilated terrestrial water storage from the Gravity Recovery and Climate Experiment (GRACE) into a land surface model and evaluated results over India. They found that the assimilation decreased the accuracy of ET estimation compared to observations due to model limitations in representing irrigation. Peters-Lidard et al. (2011) assimilated two different remotely sensed soil water content products into the Noah land surface model over North America and found mixed results regarding the improvement of latent heat flux estimates. The domain-averaged root-mean-square error of the latent heat flux reduced from 27.6 to 25.6 W m−2 or increased to 29.4 W m−2 depending on the assimilated soil water content product. Additionally, they show that the improvements and degradation vary spatially across their study domain, with land cover type, and as a function of the season, and they note that the most significant improvements occur for cropland and grassland. Liu and Mishra (2017) assimilated surface soil water content data from the Advanced Microwave Scanning Radiometer-Earth Observing System in the global Community Land Model version 4.5 and found ET bias reductions of up to 2.5 mm d−1 compared to the Global Land Data Assimilation System (GLDAS) data product.

For our study, we chose the latest version (version 5) of the widely used Community Land Model (CLM5) (Lawrence et al., 2019) as various land surface process representations have been improved in CLM5 compared to earlier versions. For instance, Kennedy et al. (2019) added a plant hydraulic stress parameterization to improve the accuracy of simulated transpiration and soil water content. Lawrence et al. (2019) demonstrated the improvements of CLM5 over its precursor CLM4 in terms of ET using two study sites as examples and highlighted the better representation of the effects of soil depth on ET prediction in CLM5. On the other hand, Cheng et al. (2021) found that CLM5 predicts lower ET compared to older CLM versions and various observational data, likely due to low photosynthetic rate and leaf area index (LAI), which is consistent with their finding of low gross primary production (GPP) compared to reference data in the same simulations. In addition to these regional to global validation studies, CLM was used in several single-point setups, i.e., simulations for a single grid cell, to evaluate the performance of various LSM components. For example, Hudiburg et al. (2013) used CLM4.0 to estimate net primary production (NPP) and GPP of a forested site and compared them with eddy covariance (EC) measurements. Another study (Zhang et al., 2019) reduced an overestimation of growing-season LAI and annual GPP of a grassland site for a CLM4.5 single-point setup. More recently, CLM5 was extended to consider both cover crop management, with improvements to ET estimation of up to 57 % (Boas et al., 2021), and fruit tree cultivation using extensive field measurements, with a high correlation between observed and modeled ET (Dombrowski et al., 2022). Other studies have used manual tuning of parameters to improve CLM simulations for forests. For instance, Duarte et al. (2017) calibrated CLM4.5 for an old-growth coniferous forest and found good agreement between the simulated and observed response of canopy conductance to atmospheric vapor pressure deficit and soil water content. Raczka et al. (2016) used CLM4.5 and implemented a seasonally varying calibration of vegetation parameters and accurately simulated net carbon exchange, latent heat exchange, and biomass.

In this study, we investigate whether assimilating high-quality, in situ soil water content measurements can improve the evapotranspiration estimates of LSMs. We focus on one specific land cover type, namely forests. In a previous study (Strebel et al., 2022), we investigated the potential for data assimilation of in situ SWC measurements to improve model estimation for a single forest site. This study expands this method to more forest sites and investigates the effect of improved SWC estimation on ET. Investigating the method for a large number of sites is the important contribution of this study and necessary to show that the conclusions from Strebel et al. (2022) are not just a characteristic of the one study site but apply more broadly to forest sites simulated with CLM5. To investigate this, we use point- and plot-scale in situ soil water content measurements. For most sites we use point measurements provided by FLUXNET (Baldocchi et al., 2020) and eLTER Europe. The FLUXNET data have been used in various studies to verify or compare model results. For example, Dirmeyer et al. (2018) used FLUXNET data to compare four model systems, including CLM4.5, in three configurations and found for annual averaged ET that correlations range from 0.28 to 0.43 and for sensible heat from 0.14 to 0.54. The point-scale measurements use invasive equipment, and the specific measurement volume, exact depth of the sensors, number of sensors, and number of stations vary from site to site. For a few sites we use soil water content measurements from cosmic-ray neutron sensors (CRNSs) from the COSMOS-Europe dataset (Bogena et al., 2022). The CRNSs provide continuous and non-invasive soil water content measurements over a spatial footprint of hundreds of meters and integrate vertically from the surface to a depth of 10–70 cm in the soil (Zreda et al., 2008; Köhli et al., 2015).
The CRNSs use neutrons as a proxy for SWC, and the vertical measurement depth varies with the soil moisture conditions. Additionally, the uncertainty of CRNS-derived soil moisture varies not only with the different neutron detectors but also with the number of counts in a time period, and therefore results under lower soil moisture conditions are more accurate (Bogena et al., 2022). The spatial footprint area is similar to the footprint of the EC flux tower. We use the final processed data on soil water content and vertical penetration (measurement) depth provided by the COSMOS-Europe dataset (Bogena and Ney, 2021). In this study, we use the ensemble Kalman filter to assimilate in situ soil water content measurements into CLM5 simulations, and the effect on the modeling results is quantified by comparing the modeled ET against the observed ET obtained from EC flux towers. We also analyze the effects on other land–atmosphere exchange fluxes, i.e., net ecosystem exchange (NEE) and gross primary production (GPP). The paper is structured as follows: first, we introduce the model and data assimilation framework used. The sites selected for this study and the observational data used for data assimilation and model–observation comparison are then described. Subsequently, the results for each variable of interest are shown and analyzed. Finally, we end with a discussion of the obtained results and conclusions.

2 Methods and materials

2.1 Study sites

In our study, we are interested in the characterization of water, energy and carbon exchange between (European) forest ecosystems and the atmosphere, and whether soil water content assimilation can improve the characterization of these processes. Therefore, we selected European sites with different forest types (see Table 1) covering different climate zones in Europe. Another important constraint was the availability of soil water content data and evapotranspiration measurements for the period from 2009 to 2018. The selected sites are mostly part of FLUXNET (Baldocchi et al., 2020) or the European Long-Term Ecological Research network (eLTER-Europe) (Parr et al., 2002). In addition to the sites from these observation networks, we included three sites from the COSMOS-Europe network (Bogena et al., 2022) where CRNSs are installed to estimate the soil water content of the forested sites. Table 1 gives an overview of all selected sites for this study, and Fig. 1 shows the distribution on the map.

Table 1. Overview of the study sites. Classification uses the International Geosphere-Biosphere Program (IGBP) code, as is used for FLUXNET: MF for mixed forests, ENF for evergreen needle leaf forests, DBF for deciduous broad leaf forests, EBF for evergreen broad leaf forests, and WSA for woody savannah. LONG is longitude and LAT latitude.


Figure 1. Map showing the location of the selected study sites of the FLUXNET (F), eLTER (L), and COSMOS-Europe (C) networks.

In this study, daily average soil water content data are assimilated (see Sect. 2.4.3 for more details), and the model is verified using daily average evapotranspiration and sensible heat flux data. Since the observational data were already quality controlled by the providers, we did not filter out any data. We only assimilated daily mean soil water content observations when measurements were available for a given day. The daily means were calculated independently of the observation frequency at the different sites. Similarly, simulated evapotranspiration was only compared with observations when data were available, on the basis of daily means.
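The daily averaging described above can be sketched as follows. This is a minimal illustration, not the processing code used in the study: the function name `daily_means` and the record layout (timestamp, value, with `None` for missing values) are assumptions made for the example.

```python
from collections import defaultdict
from datetime import date, datetime
from statistics import fmean


def daily_means(records):
    """Average observations per calendar day, whatever the sampling frequency.

    records: iterable of (datetime, value) pairs; value is None when missing.
    Days without any valid measurement are left out, so nothing is
    assimilated or compared on those days.
    """
    by_day = defaultdict(list)
    for timestamp, value in records:
        if value is not None:
            by_day[timestamp.date()].append(value)
    return {day: fmean(values) for day, values in sorted(by_day.items())}


# Hypothetical soil water content readings (cm3 cm-3)
records = [
    (datetime(2015, 6, 1, 0, 30), 0.20),
    (datetime(2015, 6, 1, 12, 0), 0.30),
    (datetime(2015, 6, 2, 6, 0), None),   # gap: no value for 2 June
    (datetime(2015, 6, 3, 6, 0), 0.25),
]
swc_daily = daily_means(records)
```

In this sketch, 2 June does not appear in `swc_daily`, which mirrors the rule that days without measurements are skipped.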

2.2 Model description

For our study, we used the Community Land Model version 5.0 (CLM5) that can be applied in various configurations (Lawrence et al., 2019). We use CLM5-BGC, i.e., CLM5 with the biogeochemistry module active, as opposed to CLM5 with fixed phenology. The biogeochemistry module enables a fully prognostic treatment for carbon and nitrogen in the land surface model and has a significant impact on the modeled water and energy budgets.

CLM5 uses a sub-grid hierarchy of various plant functional types (PFTs) to characterize the land use and vegetation type within every grid cell, e.g., evergreen needle leaf or deciduous broad leaf forests. CLM5 contains a spatially variable soil depth with an underlying, impermeable bedrock instead of the unconfined aquifer parameterization used in the former CLM4 versions. To estimate the soil water content, CLM5 solves the Richards equation using the Brooks–Corey parameters derived from the pedotransfer functions of Clapp and Hornberger (1978), with a finite-difference approximation to represent the vertical discretization and temporal evolution of soil water content.

The sensible and latent heat flux estimation in CLM5 is derived from the Monin–Obukhov similarity theory and differentiated for vegetated and non-vegetated surfaces. CLM5 simulates sensible and latent heat fluxes from both the vegetation and the ground. For the vegetation part, the contributions from the leaf boundary layer and the sunlit and shaded stomatal resistances affect the total resistance to the modeled water vapor transfer. The water vapor transfer includes transpiration from dry leaf surfaces, and the transpiration removes water from the soil based on the root fraction for a given soil layer. Interception, throughfall, and canopy drip are explicitly modeled in CLM5, and canopy evaporation is represented as the sum of stem and leaf surface evaporation as a function of temperature. The ground fluxes, e.g., from bare soils or soil beneath a canopy, are dependent on the ground surface temperature. The ground latent heat flux is reduced if not enough soil moisture is available, and the excess energy is redistributed to the sensible heat flux. The detailed procedure and equations are documented in Lawrence et al. (2018).

2.3 Data assimilation

2.3.1 Ensemble Kalman filter

In this work, assimilation of soil water content measurements is performed with the ensemble Kalman filter (EnKF) (Evensen, 1994; Burgers et al., 1998). The EnKF uses an ensemble modeling approach, with various simultaneous model runs, to approximate the model uncertainty. The ensemble members have different input model parameters and atmospheric forcings (see Sect. 2.4 for details). We define a state vector x and an observation vector y, e.g.,

(1) $x^i = \begin{pmatrix} \theta_{1,1}^i \\ \theta_{1,2}^i \\ \vdots \\ \theta_{1,m}^i \\ \vdots \\ \theta_{n,m}^i \end{pmatrix}$,

where n is the number of layers, m is the number of grid cells, $\theta_{j,l}^i$ is the soil water content for layer j and grid cell l of the model, and the superscript i refers to ensemble member i. In this study we use an ensemble of 96 members to sample the model uncertainty.

(2) $y = o + e$,

where o is a vector of the observational data and e represents a perturbation vector with mean zero and covariance according to the observational error covariance matrix. This perturbation vector is used to correct the error statistics as described in Burgers et al. (1998).

The update step of the ensemble Kalman filter is

(3) $x_a^i = x_f^i + K \left( y - H x_f^i \right)$,

where the superscript i refers to ensemble member i, $x_a^i$ is the updated state vector after the analysis, $x_f^i$ is the forecasted model state vector, K is the Kalman gain, and H is the measurement operator that transforms between model and observational states. In this study, the measurement operator H consists of a simple mapping of the observations to the corresponding model layers in the state vector for simulations with point measurements. For FLUXNET sites, measured soil water content is provided for up to three depths described as superficial, medium, and deep. Since data assimilation in CLM5-PDAF requires a specific vertical layer, we assigned 5, 20, and 50 cm to the respective FLUXNET SWC layers. For the CRNS sites, the measurement depth for each individual measurement is calculated following Schrön et al. (2017) and is included in the dataset from Bogena et al. (2022). For simulations assimilating the CRNS, H assigns the mean observed SWC to all the layers down to the measurement depth. This is a simplified approach and will be improved in further studies to take the weighting function from Schrön et al. (2017) into account. The Kalman gain is calculated as follows:

(4) $K = P H^T \left( R + H P H^T \right)^{-1}$,

where the superscript T is used for transposed matrices; R is the observational error covariance matrix; and P is the model error covariance matrix, which is approximated through ensemble statistics, specifically as follows:

(5) $P = \frac{1}{N-1} \sum_{i=1}^{N} \left( x_f^i - \bar{x}_f \right) \left( x_f^i - \bar{x}_f \right)^T$,

where N is the number of ensemble members, and $\bar{x}_f$ is the ensemble mean of the forecasted state vectors.

In this study, the state vector depends on the simulation scenario (explained in more detail in Sect. 2.3.2), and R is based on the measurement errors which are assumed to be constant and independent with a root-mean-square error of 0.02 cm3 cm−3.
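The analysis step of Eqs. (2)–(5) can be sketched in a few lines of NumPy. This is a toy illustration under stated assumptions, not the PDAF implementation: the function name `enkf_update`, the three-layer state, and the ensemble values are hypothetical, and only the ensemble size (96) and the observation error (0.02 cm3 cm−3) are taken from the text.

```python
import numpy as np


def enkf_update(X_f, obs, H, obs_std, rng):
    """Perturbed-observation EnKF analysis step, following Eqs. (2)-(5).

    X_f     : (n_state, N) forecast ensemble, one column per member
    obs     : (n_obs,) observed soil water content
    H       : (n_obs, n_state) measurement operator
    obs_std : observation error standard deviation (same units as obs)
    """
    n_obs, N = len(obs), X_f.shape[1]
    R = obs_std ** 2 * np.eye(n_obs)              # constant, independent errors
    # Eq. (5): ensemble approximation of the model error covariance
    A = X_f - X_f.mean(axis=1, keepdims=True)
    P = A @ A.T / (N - 1)
    # Eq. (4): Kalman gain
    K = P @ H.T @ np.linalg.inv(R + H @ P @ H.T)
    # Eq. (2): one perturbed observation vector per ensemble member
    Y = obs[:, None] + rng.normal(0.0, obs_std, size=(n_obs, N))
    # Eq. (3): update every member
    return X_f + K @ (Y - H @ X_f)


# Hypothetical example: three soil layers, 96 members, layer 2 observed
rng = np.random.default_rng(0)
prior_mean = np.array([0.30, 0.32, 0.35])
X_f = prior_mean[:, None] + rng.normal(0.0, 0.05, size=(3, 96))
H = np.array([[0.0, 1.0, 0.0]])
obs = np.array([0.25])
X_a = enkf_update(X_f, obs, H, obs_std=0.02, rng=rng)
```

Because the prior spread in this toy case (0.05) is much larger than the observation error (0.02), the analysis mean of the observed layer moves most of the way towards the observation and its spread shrinks.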

To enable data assimilation with CLM5, we use the Parallel Data Assimilation Framework (PDAF) (Nerger et al., 2005), which was recently coupled to CLM5 (Strebel et al., 2022). This coupling (CLM5-PDAF) also supports the assimilation of soil water content measurements.

2.3.2 Parameter updating

In addition to the use of data assimilation for state updating, we also perform parameter updating based on the state augmentation approach (Friedland, 1969; Fertig et al., 2009). Here, model parameters are attached to the state vector and updated based on the Kalman gain calculations without observations of the model parameters. By default, CLM5-PDAF updates soil hydraulic parameters through changes to fractions of sand, clay, and organic matter and the pedotransfer function of Clapp and Hornberger (1978). In this indirect approach the state vector for the EnKF is defined as follows:

(6) $x^i = \begin{pmatrix} \theta^i \\ \%\mathrm{sand}^i \\ \%\mathrm{clay}^i \\ \%\mathrm{organic}^i \end{pmatrix}$,

where the superscript i refers to ensemble member i. The components θ, %sand, %clay, and %organic each represent a vector containing the respective variable for each soil layer of each grid cell of the model. A damping factor of 0.1 is used on the parameter updates to avoid filter inbreeding and keep the ensemble spread larger so that the model error covariance matrix is a good approximation for model uncertainty.
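As a rough illustration of state augmentation, the sketch below stacks soil water content and a texture parameter into one vector (Eq. 6) and updates both from a single SWC observation. Several points are assumptions made for the example and are not confirmed by the text: the function name `augmented_update`, the way the damping factor is applied (scaling the parameter increment by 0.1), and the synthetic linear relation between sand fraction and SWC.

```python
import numpy as np


def augmented_update(theta_f, params_f, obs, obs_layer, obs_std, damping, rng):
    """Joint state-parameter EnKF update with a damped parameter increment.

    theta_f  : (n_layers, N) forecast soil water content
    params_f : (n_params, N) forecast parameters (e.g. %sand, %clay, %organic)
    """
    n_layers, N = theta_f.shape
    X_f = np.vstack([theta_f, params_f])           # Eq. (6): augmented state
    H = np.zeros((1, X_f.shape[0]))
    H[0, obs_layer] = 1.0                          # only SWC is observed
    A = X_f - X_f.mean(axis=1, keepdims=True)
    P = A @ A.T / (N - 1)
    # Kalman gain for a single scalar observation
    K = P @ H.T / (obs_std ** 2 + (H @ P @ H.T)[0, 0])
    y = obs + rng.normal(0.0, obs_std, size=N)     # perturbed observations
    increment = K @ (y[None, :] - H @ X_f)
    increment[n_layers:] *= damping                # assumed form of the damping
    X_a = X_f + increment
    return X_a[:n_layers], X_a[n_layers:]


# Synthetic ensemble: sandier members are drier (hypothetical relation)
rng = np.random.default_rng(1)
N = 96
sand = rng.normal(50.0, 10.0, size=N)                        # % sand
theta = 0.45 - 0.003 * sand + rng.normal(0.0, 0.01, size=N)  # toy SWC
theta_a, sand_a = augmented_update(theta[None, :], sand[None, :], obs=0.25,
                                   obs_layer=0, obs_std=0.02, damping=0.1,
                                   rng=rng)
```

The parameters are never observed directly; they change only through their ensemble covariance with the observed SWC. Here the observation is drier than the forecast, so the sand fraction is nudged upwards, and the damping keeps that nudge small.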

In previous studies, parameters were updated indirectly (Naz et al., 2019; Han et al., 2014; Baatz et al., 2017). We tested directly updating saturated hydraulic conductivity, porosity, the hydraulic conductivity exponent B, and saturated soil matric potential, but this resulted in less stable estimates and simulations than indirectly updating the soil hydraulic parameters. The pedotransfer function used for the indirect updating keeps the soil hydraulic parameters reasonably correlated with each other. In this study, the parameters are chosen to optimize the SWC estimation and not the ET estimation, in order to study the effects of SWC improvements on ET. To improve the ET estimation more directly, parameters that are critical to the ET process should be added, e.g., vegetation hydraulic parameters related to the transfer of water from root to leaf or parameters related to stomatal conductance.

2.4 Model setup

2.4.1 Domain setup

Since we only use local field measurements, we represent each study site as a single grid cell in CLM5. This approach is also consistent from the viewpoint of larger regional-scale models, where each of these sites would only be part of a grid cell. The CLM5 grid cells are vertically divided into 25 layers from the surface down to 50 m depth of which the first 20 layers (until 8.6 m depth) may be hydrologically and biogeochemically active depending on the variable soil depth for each site (Lawrence et al., 2018). For the more than 70 different surface parameters of CLM5, we used the default values generated by the tools provided with CLM5 (e.g., soil depth to bedrock, sand, clay, and organic matter fractions, PFTs). These default values are generated from remapping various global files (Lawrence et al., 2019). Only the PFTs were manually assigned for each site. For the ensemble creation, the fractions of sand, clay, and organic matter are modified for each ensemble member. The perturbations are normally distributed with mean zero and a standard deviation of 10 %.
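The texture perturbation used for the ensemble creation might look like the sketch below. The text gives only the distribution (normal, mean zero, standard deviation 10 %), so the details here are assumptions: the perturbation is applied additively in percentage points, values are clipped to 0–100 %, sand and clay are rescaled where their sum would exceed 100 %, and the function name `perturb_texture` and the profile values are hypothetical.

```python
import numpy as np


def perturb_texture(sand, clay, organic, n_members, std=10.0, seed=0):
    """Create perturbed soil texture profiles for each ensemble member.

    sand, clay, organic : per-layer fractions in percent
    Returns an array of shape (n_members, 3, n_layers).
    """
    rng = np.random.default_rng(seed)
    base = np.array([sand, clay, organic], dtype=float)     # (3, n_layers)
    noise = rng.normal(0.0, std, size=(n_members,) + base.shape)
    members = np.clip(base[None] + noise, 0.0, 100.0)       # assumed clipping
    # Assumed safeguard: keep sand + clay at or below 100 %
    total = members[:, 0] + members[:, 1]
    scale = 100.0 / np.maximum(total, 100.0)
    members[:, :2] *= scale[:, None]
    return members


# Hypothetical three-layer profile, 96 members as in the study
ensemble_texture = perturb_texture(sand=[40.0, 45.0, 50.0],
                                   clay=[20.0, 22.0, 25.0],
                                   organic=[5.0, 3.0, 1.0],
                                   n_members=96)
```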

2.4.2 Atmospheric forcings

Meteorological observations were also available at the selected study sites and were used to force CLM5. Gaps in the observation time series were filled with data from the COSMO-REA6 reanalysis data product (Bollmeyer et al., 2015). For the ensemble generation, precipitation (PR), shortwave radiation (SW), longwave radiation (LW), and air temperature (TA) were perturbed, taking into account cross-correlations between variables according to Reichle et al. (2007). The perturbations are multiplicative for PR ∼ logN(1, 0.5) and SW ∼ logN(1, 0.3) and additive for LW ∼ N(0, 20) (W m−2) and TA ∼ N(0, 1) (K). The following cross-correlation coefficients between variables were used: PR–SW, 0.8; PR–LW, 0.5; PR–TA, 0; SW–LW, 0.5; SW–TA, 0.4; and LW–TA, 0.4.
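One way to draw such cross-correlated perturbations is sketched below. The coefficients are copied from the text as given; everything else is an assumption made for illustration: logN(1, s) is read as a lognormal factor with mean 1 and standard deviation s, the correlations are imposed on the underlying standard normal variables via a Cholesky factor, and the names `lognormal_factor` and `forcing_perturbations` are hypothetical.

```python
import numpy as np

# Variable order: PR, SW, LW, TA; coefficients as listed in the text
CORR = np.array([
    [1.0, 0.8, 0.5, 0.0],
    [0.8, 1.0, 0.5, 0.4],
    [0.5, 0.5, 1.0, 0.4],
    [0.0, 0.4, 0.4, 1.0],
])


def lognormal_factor(z, mean, std):
    """Map standard normal z to a lognormal factor with the given mean and std."""
    sigma2 = np.log(1.0 + (std / mean) ** 2)
    mu = np.log(mean) - 0.5 * sigma2
    return np.exp(mu + np.sqrt(sigma2) * z)


def forcing_perturbations(n_members, rng):
    """Draw one set of correlated forcing perturbations per ensemble member."""
    z = np.linalg.cholesky(CORR) @ rng.standard_normal((4, n_members))
    return {
        "PR_factor": lognormal_factor(z[0], 1.0, 0.5),   # multiplicative
        "SW_factor": lognormal_factor(z[1], 1.0, 0.3),   # multiplicative
        "LW_offset": 20.0 * z[2],                        # additive, W m-2
        "TA_offset": 1.0 * z[3],                         # additive, K
    }


perturbations = forcing_perturbations(96, np.random.default_rng(0))
```

With this construction the stated coefficients hold exactly for the underlying normal variables; the correlations between the lognormal factors themselves are slightly different.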

2.4.3 Data assimilation experimental setups

Three different simulation scenarios were considered: (1) open-loop (OL) simulations without data assimilation, (2) data assimilation with updating of soil water content (DAS), and (3) data assimilation with soil water content updating and parameter updating (DASP). For the DAS and DASP scenarios, data assimilation is performed at a daily frequency and with daily averages from the observations. The observation error is assumed to be constant and set to a root-mean-square error of 2 % volumetric soil water content (0.02 cm3 cm−3; see Sect. 2.3.1).

2.5 Statistical metrics

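For the comparison of simulation results with observations, we use four statistical metrics: the squared correlation coefficient (R2), the mean bias error (MBE), the root-mean-square error (RMSE), and the unbiased root-mean-square error (ubRMSE):

(7) $R^2 = \frac{\left[ \sum_{t=1}^{N_t} \left( m_t - \overline{m} \right) \left( o_t - \overline{o} \right) \right]^2}{\sum_{t=1}^{N_t} \left( m_t - \overline{m} \right)^2 \sum_{t=1}^{N_t} \left( o_t - \overline{o} \right)^2}$,

(8) $\mathrm{MBE} = \frac{1}{N_t} \sum_{t=1}^{N_t} \left( m_t - o_t \right)$,

(9) $\mathrm{RMSE} = \sqrt{\frac{1}{N_t} \sum_{t=1}^{N_t} \left( m_t - o_t \right)^2}$,

(10) $\mathrm{ubRMSE} = \sqrt{\frac{1}{N_t} \sum_{t=1}^{N_t} \left[ \left( m_t - \overline{m} \right) - \left( o_t - \overline{o} \right) \right]^2}$,

where o stands for observations, m represents the ensemble average of the simulated values, t is the time step, $N_t$ is the total number of time steps, and the overbar represents the average over all time steps.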
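The four metrics named above can be computed as in the following sketch, which uses their standard definitions. Two choices are assumptions: the MBE is taken as simulation minus observation (so that an overestimation gives a positive bias), and days with missing values are dropped before averaging. The function name `metrics` and the sample values are hypothetical.

```python
import numpy as np


def metrics(o, m):
    """Squared correlation, mean bias, RMSE and unbiased RMSE of m against o."""
    o, m = np.asarray(o, dtype=float), np.asarray(m, dtype=float)
    valid = ~np.isnan(o) & ~np.isnan(m)       # use only days with both values
    o, m = o[valid], m[valid]
    mbe = np.mean(m - o)                      # assumed sign: model minus obs
    rmse = np.sqrt(np.mean((m - o) ** 2))
    ubrmse = np.sqrt(np.mean(((m - m.mean()) - (o - o.mean())) ** 2))
    r2 = np.corrcoef(o, m)[0, 1] ** 2
    return {"R2": r2, "MBE": mbe, "RMSE": rmse, "ubRMSE": ubrmse}


# Hypothetical daily SWC values; the model is offset by a constant 0.05
observed = [0.20, 0.25, 0.30, np.nan]
simulated = [0.25, 0.30, 0.35, 0.40]
scores = metrics(observed, simulated)
```

With these definitions the squared RMSE equals the sum of the squared MBE and the squared ubRMSE, so a pure offset, as in this example, shows up entirely in the MBE and leaves the ubRMSE at zero.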

3 Results

3.1 Soil water content and related parameters

Figures 2 and 3 show the results of the soil water content simulations at 20 cm depth of the OL, DAS, and DASP simulations compared to the soil water content observed at the nine sites. Figure 2 compares the OL and DAS results, and Fig. 3 compares the OL and DASP results. The corresponding scatter diagrams for the depths 5, 20, and 50 cm can be found in the Appendix (Figs. A1–A7). Overall, the results show the expected improvements from assimilating observed soil water content. For the OL simulations, Fig. 2 shows particularly large RMSE values for CZ-BK, DE-Obe, FI-Sod, and NL-Loo. Figure 2 also illustrates the improved performance achieved by DAS, with an RMSE reduction from 29.3 to 6.25 cm3 cm−3 and an MBE reduction from 28.06 to 2.94 cm3 cm−3 for FI-Sod. Parameter updating, as shown in Fig. 3, further improves the simulation results, but the improvement from DAS to DASP is considerably smaller than that from OL to DAS.

Figure 2. Scatter plots of observed soil water content at 20 cm depth at nine study sites versus OL- and DAS-simulated soil water content. The points represent daily averages for the days on which observation data are available. Green points are OL, and blue points are DAS results.


Figure 3. Scatter plots of observed soil water content at 20 cm depth at nine study sites versus OL- and DASP-simulated soil water content. The points represent daily averages for the days on which observation data are available. Green points are OL, and purple points are DASP results.


The results of the three COSMOS-Europe sites are shown in Fig. 4, in which the observed SWC values are compared with the weighted SWC mean of the model layers corresponding to the measurement depth of the CRNS. This comparison again shows the large improvement from OL to DAS and a smaller improvement or even a small deterioration from DAS to DASP.

Figure 4. Scatter plots of observed soil water content at three CRNS study sites (DE-HoH left column, DE-Wue middle column, DK-Glu right column) versus simulation results (OL results in the top row, DAS results in the middle row, and DASP results in the bottom row). The points represent daily averages for the days on which observation data are available.


Figure 5. Profile plots for the first 10 layers, showing the root fraction and the time-averaged SWC per depth for each site. In the SWC profiles, the red and green lines represent the SWC from the open-loop simulations (OL) and DASP simulations, respectively.


Figure 6. Time series of the saturated soil hydraulic conductivity for each site in the DASP simulation. The grey line is the value at 5 cm depth, the blue line at 20 cm depth, and the green line at 50 cm depth.


Figure 5 shows the depth profile for the root fraction and the SWC average of the OL and DASP simulations for the first 1.2 m (10 layers) for each site. The SWC is updated for all layers, including the layers with the largest root fraction, but depending on the site the magnitude of the update varies with depth. For most sites the data assimilation shifts the SWC values while keeping the profile similar to the OL results. FI-Hyy and FI-Sod are the exception and show a decrease in SWC in the first 25 to 50 cm and an increase in SWC in the deeper layers for DASP.

Figure 6 shows time series of the estimated saturated soil hydraulic conductivity for each of the sites and the three observation layer depths. In the DASP scenario, the parameters change when the first observations become available and then converge to a new value over the course of the simulation. The corresponding time series for the other soil hydraulic parameters can be found in the Appendix (Figs. A8, A9, and A10).

Figures 7, 8, and 9 show the initial (prior) and the updated (posterior) vertical profiles for the sand, clay, and organic matter fractions for the upper 1.2 m (10 soil layers). The updated parameters often keep the profile distribution but have reduced or increased values throughout the layers compared to the prior.

Figure 7. Profile plot showing the sand fractions for the first 10 layers of all 13 sites.


Figure 8. Profile plot showing the clay fractions for the first 10 layers of all 13 sites.


Figure 9. Profile plot showing the organic matter fractions for the first 10 layers of all 13 sites.


3.2 Evapotranspiration

The impact of the data assimilation on the ET flux is shown in Figs. 10 and 11. Notably, the difference between the OL and the DASP results is smaller for ET than for SWC. While the data assimilation improves the model results for SWC for all sites, both improvement and deterioration occur for modeled ET. Figure 12 compares the improvements in SWC achieved by data assimilation with the positive and negative effects on the ET estimation. The average RMSE reduction for the DASP SWC prediction is between 56 % and 64 % compared to OL. Comparing the OL and DASP results for ET shows an average reduction of the MBE of 0.06 mm d−1 but an increase in RMSE for the DASP ET predictions of 4 % on average, with 8 of the 13 sites showing a relative change in ET of only ±1 %. Two outliers (the FI-Sod and NL-Loo sites) reduce the average model improvement. These sites show both a large overestimation in SWC in the OL (see Fig. 2) and a large underestimation of ET in the DASP simulation (see Fig. 11). This could be caused by the mismatch of simulated and actual LAI for these sites. To investigate this, we repeated the simulations using CLM5 with satellite-derived phenology (CLM5-SP), and the results are shown in Fig. 13. The CLM5-SP OL and DASP simulations do not use any information from the CLM5-BGC simulations, which implies that for the CLM5-SP DASP simulations parameters are estimated independently from the CLM5-BGC simulations. For CLM5-SP we observe an average improvement in the RMSE of SWC between 57.6 % and 64.3 % and an average RMSE reduction of 5.8 % for the ET estimation. Because the focus of this study is on CLM5-BGC and in situ LAI measurements for these sites were not available, the CLM5-SP simulations use the default datasets from CLM5 without site-specific calibration of the timing or magnitude of the seasonal phenology of LAI. Therefore, even for the CLM5-SP simulations, there is a mismatch between simulated and actual LAI. However, in this case too, there are sites with a large improvement in SWC estimation that show deterioration for ET estimation.
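The relative RMSE changes quoted in this section can be reproduced with simple arithmetic. The sketch below assumes that the relative change is computed per site against the open-loop RMSE; the function name `relative_rmse_change` is hypothetical, and the example uses the FI-Sod soil water content values quoted in Sect. 3.1.

```python
def relative_rmse_change(rmse_ol, rmse_da):
    """Relative RMSE change in percent; negative values mean the RMSE dropped."""
    return 100.0 * (rmse_da - rmse_ol) / rmse_ol


# FI-Sod soil water content at 20 cm: RMSE of 29.3 before and 6.25 after
fi_sod_swc = relative_rmse_change(29.3, 6.25)
print(f"{fi_sod_swc:.1f} %")   # -78.7 %
```

A site whose ET RMSE rises by a few percent therefore appears with a small positive value, while a large SWC improvement such as this one appears as a strongly negative value.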

Figure 10. Scatter plots of observed evapotranspiration at 13 study sites versus OL simulation results. The points represent daily averages for the days on which observation data are available.


Figure 11. Scatter plots of observed evapotranspiration at 13 study sites versus DASP simulation results. The points represent daily averages for the days on which observation data are available.


Another possible explanation for the improvement in SWC estimation but no improvement in ET estimation is the underestimation of root water uptake from deeper soil layers for forest sites, as also suggested by Shrestha et al. (2018). Figure 12 shows that the quality of the model results is not dependent on the forest type; i.e., the evergreen needle leaf forest (ENF) sites show both strong and average relative changes in SWC RMSE and ET RMSE. This suggests that the strong deviations in the model results of the FI-Sod and NL-Loo sites are due to other local conditions, e.g., soil properties.

Figure 12. Comparison of the SWC and ET characterization for the OL and DASP simulations. Each point represents the overall average RMSE change for one site. The color of the points indicates the classification code for the different forest types (MF – mixed forest, ENF – evergreen needle leaf forest, DBF – deciduous broad leaf forest, EBF – evergreen broad leaf forest, AVG – average over all forest types).


Figure 13. Comparison of the SWC and ET characterization for the OL and DASP simulations using CLM5-SP. Each point represents the overall average RMSE change for one site. The color of the points indicates the forest type (MF – mixed forest, ENF – evergreen needle leaf forest, DBF – deciduous broad leaf forest, EBF – evergreen broad leaf forest, AVG – average over all forest types).


The three CRNS sites show an average relative change of ET RMSE of 2.6 %, 0.2 %, and 0.9 % for DE-HoH, DE-Wue, and DK-Glu, respectively. Therefore, although the CRNS measurements are more consistent with the large measurement area of the flux towers, no significant improvement in ET for these three sites can be achieved with the current implementation of the CRNS-SWC assimilation. We anticipate that the implementation of a more accurate observation operator would improve the modeled SWC. The current observation operator does not use vertical weighting to take the decreasing CRNS sensitivity with depth into account.

3.3 Evaluation of other land–atmosphere exchange fluxes

A comparison of measured and modeled sensible heat fluxes (SH) (Figs. 14 and 15) yields similar R² values for the OL and DASP approaches. The R² values range from 0.23 to 0.51, with an average of 0.36. This is similar to the ET results, where the R² values of measured and modeled (OL and DASP) ET range from 0.01 to 0.58, with an average of 0.37. Figures 14 and 15 show that the impact of SWC data assimilation on SH is small. On average, DASP improves the MBE by 4.66 W m−2 compared to OL; however, for five of the eight sites the improvement of the MBE is smaller than 1 W m−2. In contrast to the ET results, data assimilation of SWC reduces the MBE of SH for all sites.
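The evaluation metrics used throughout this section can be illustrated with a minimal sketch. It assumes that MBE is defined as observation minus simulation, which is consistent with the statement in the discussion that a positive MBE indicates underestimation by CLM5, and that R² is the squared Pearson correlation; neither definition is spelled out in this section. All function names and the example values are hypothetical and are not taken from the study.

```python
import math
from typing import Sequence


def rmse(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Root-mean-square error between paired daily values."""
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))


def mbe(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Mean bias error as obs minus sim (assumed sign convention).

    With this convention a positive value means the model underestimates.
    """
    return sum(o - s for o, s in zip(obs, sim)) / len(obs)


def r_squared(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Squared Pearson correlation (assumed definition of R²)."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_s = sum(sim) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_s = sum((s - mean_s) ** 2 for s in sim)
    return cov ** 2 / (var_o * var_s)


def relative_change(rmse_ol: float, rmse_da: float) -> float:
    """Relative RMSE change in percent; negative values mean improvement."""
    return 100.0 * (rmse_da - rmse_ol) / rmse_ol


# Hypothetical daily values, for illustration only.
obs = [1.0, 2.0, 3.0, 4.0]
sim_ol = [0.5, 1.5, 2.5, 3.5]      # open-loop run
sim_da = [0.75, 1.75, 2.75, 3.75]  # run with data assimilation

print(rmse(obs, sim_ol), mbe(obs, sim_ol), r_squared(obs, sim_ol))
print(relative_change(rmse(obs, sim_ol), rmse(obs, sim_da)))
```

In this toy example the simulated values are shifted by a constant offset, so R² is unaffected while RMSE and MBE change, which illustrates why similar R² values for OL and DASP do not rule out a change in bias.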

Figure 14. Scatter plots of observed sensible heat flux at eight study sites versus OL simulation results. The points represent daily averages for the days on which observation data are available.


Figure 15. Scatter plots of observed sensible heat flux at eight study sites versus DASP simulation results. The points represent daily averages for the days on which observation data are available.


The impact of updating SWC with data assimilation on modeled NEE, GPP, and LAI is shown in Fig. 16. The NEE is negative (land acts as carbon sink) for eight, seven, and six of the field sites for OL, DAS, and DASP, respectively. For DASP the GPP and LAI show an increase for two of the sites and a decrease for three of the sites, and they remain similar for eight of the sites. Figure 17 shows how average SWC at 5 and 50 cm depth, ET, NEE, GPP, and SH (average over all sites and all years) are affected by data assimilation. Although DASP adjusts SWC at 5 cm towards the observations, the correction for SWC at 50 cm depth is smaller because not all sites provide data at this depth. However, for all sites the data assimilation provides some improvement for SWC estimation, even in layers below the observation depth. In spite of improved SWC characterization, ET deviates slightly more from the observations after DASP, while sensible heat flux is very slightly closer to the observations. GPP is lower after DASP, and NEE is less negative. While the overall change for some of these variables is small, different variations throughout the year can be observed. This averaging hides the variations between sites and annual variability but highlights the overall model behavior. Notably, the data assimilation improves SWC estimation at 5 cm throughout the year, while at 50 cm depth the improvement can mainly be observed in late summer and autumn. For SH, a model structural bias is apparent, with large negative simulated SH values in late autumn, winter, and early spring, while the observations only show a few days with negative average values over all sites and all years.
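The averaging over all sites and all years described above can be sketched as a day-of-year climatology. The sketch below assumes that all available site-days are pooled with equal weight, which is one possible reading of the averaging; the function name, the record layout, and the example values are hypothetical.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, Optional, Tuple

Record = Tuple[str, date, Optional[float]]  # (site, day, daily average)


def seasonal_cycle(records: Iterable[Record]) -> Dict[int, float]:
    """Mean value per day of year, pooled over all sites and years.

    Missing daily values (None) are skipped. Note that in leap years the
    day-of-year index after 28 February is shifted by one day.
    """
    sums: Dict[int, float] = defaultdict(float)
    counts: Dict[int, int] = defaultdict(int)
    for _site, day, value in records:
        if value is None:
            continue
        doy = day.timetuple().tm_yday
        sums[doy] += value
        counts[doy] += 1
    return {doy: sums[doy] / counts[doy] for doy in sorted(sums)}


# Hypothetical SWC records (cm3 cm-3), for illustration only.
records = [
    ("DE-HoH", date(2009, 1, 1), 0.30),
    ("DK-Glu", date(2010, 1, 1), 0.20),
    ("DE-HoH", date(2009, 1, 2), 0.25),
    ("DE-HoH", date(2010, 1, 2), None),  # gap in the observations
]
cycle = seasonal_cycle(records)
print(cycle)
```

Pooling site-days gives more weight to sites with longer records; averaging per site first and then across sites would be an equally defensible choice and can yield a different curve.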

Figure 16. Open-loop (OL) and assimilation scenario (DAS and DASP) yearly averages of (a) net ecosystem exchange (NEE), (b) gross primary production (GPP), and (c) leaf area index (LAI) for all selected sites.


Figure 17. Seasonality of observed (OBS) and simulated (OL and DASP) states and fluxes based on daily averages from all years (2009 to 2018) and all sites: (a) soil water content (SWC) at 5 cm depth, (b) SWC at 50 cm depth, (c) evapotranspiration (ET), (d) sensible heat flux (SH), (e) net ecosystem exchange (NEE), and (f) gross primary production (GPP).


Figure 18. Seasonality of simulated LAI for each of the sites. The red line represents predicted LAI from the CLM5-BGC-DASP simulations, and the blue line represents LAI inputs used in the CLM5-SP-DASP simulations.


Figure 18 shows the LAI for each site averaged over all the simulated years and the difference between the prescribed LAI used in CLM5-SP and the simulated LAI by CLM5-BGC. Sites with the same PFT show clear differences in the yearly LAI cycle.

4 Discussion

4.1 Soil water content improvements

Our results confirm that assimilation of high-quality in situ SWC data improves the prediction of SWC by CLM5, as has been demonstrated in several other studies (Hung et al., 2022; Mahmood et al., 2019; Naz et al., 2019; Liu and Mishra, 2017; Han et al., 2015a). In our study, we were able to show that this also applies to forest sites with different climates, tree species, and soil properties.

Additionally, CRNS observations represent SWC for a larger area in better correspondence to the EC tower footprint. So far, only a few studies have used CRNS information in a data assimilation framework (Rosolem et al., 2014; Han et al., 2015b; Baatz et al., 2017; Patil et al., 2021). In line with our study, these studies show the high potential of CRNSs for improved soil moisture prediction with land surface models, in terms of both SWC prediction and improvement of soil hydraulic parameters. CRNS stations are currently operated in increasing numbers worldwide (Andreasen et al., 2017), in hydrological observatories (e.g., Bogena et al., 2018; Liu et al., 2018), as national networks (Zreda et al., 2012; Evans et al., 2016), and increasingly at continental scales (e.g., Hawdon et al., 2014; Bogena et al., 2022), which opens up new opportunities for assimilation of CRNS data in land surface models at various scales.

In our data assimilation approach, we assumed that the CRNS signal shows a constant sensitivity to SWC down to the penetration depth of the CRNS. However, Schrön et al. (2017) have shown that the integrated neutron signal over a vertical soil column exhibits a strong decrease in sensitivity with depth and suggested that this physical behavior of neutrons should be taken into account in model applications. For example, Shuttleworth et al. (2013) developed a simple, physically based analytical model to translate model-predicted soil moisture profiles into aboveground fast neutron counts within a data assimilation framework. A simpler method was proposed by Schrön et al. (2017) using vertical weighting functions that depend on SWC, atmospheric pressure, horizontal distance, and vegetation height. Therefore, in a follow-up study, we will test whether observation operators that account for the vertical weights of the different model soil layers according to the decreasing sensitivity of CRNSs with depth will improve our SWC prediction results.
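The difference between the constant-sensitivity assumption and a depth-weighted observation operator can be illustrated with a minimal sketch. The exponential decay with a fixed e-folding depth used here is a simplification chosen for illustration only: it is not the weighting function of Schrön et al. (2017), which additionally depends on SWC, atmospheric pressure, horizontal distance, and vegetation height, and it is not the operator implemented in CLM5-PDAF. Function names, layer thicknesses, and SWC values are hypothetical.

```python
import math
from typing import Sequence


def uniform_operator(swc: Sequence[float], thickness: Sequence[float],
                     penetration_depth: float) -> float:
    """Thickness-weighted mean SWC down to the penetration depth.

    Every depth above the penetration depth contributes equally
    (constant sensitivity). Depths and thicknesses are in metres.
    """
    if penetration_depth <= 0.0:
        raise ValueError("penetration_depth must be positive")
    total = 0.0
    covered = 0.0
    top = 0.0
    for theta, dz in zip(swc, thickness):
        # Part of this layer that lies above the penetration depth.
        part = min(dz, max(0.0, penetration_depth - top))
        total += theta * part
        covered += part
        top += dz
    return total / covered


def exponential_operator(swc: Sequence[float], thickness: Sequence[float],
                         efold_depth: float) -> float:
    """Mean SWC with layer weights that decay exponentially with depth.

    Each layer weight is proportional to the integral of
    exp(-z / efold_depth) over the layer (hypothetical weighting).
    """
    weights = []
    top = 0.0
    for dz in thickness:
        bottom = top + dz
        weights.append(math.exp(-top / efold_depth)
                       - math.exp(-bottom / efold_depth))
        top = bottom
    norm = sum(weights)
    return sum(w * theta for w, theta in zip(weights, swc)) / norm


# Hypothetical profile: dry topsoil over wetter subsoil, three 10 cm layers.
swc = [0.10, 0.20, 0.30]
thickness = [0.1, 0.1, 0.1]

flat = uniform_operator(swc, thickness, penetration_depth=0.3)
weighted = exponential_operator(swc, thickness, efold_depth=0.1)
print(flat, weighted)
```

For a profile that is drier near the surface, the depth-weighted value lies below the uniform average, so the two operators would compare the same model state against the CRNS observation differently. For a vertically uniform profile both operators return the same value.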

4.2 Evapotranspiration estimation without improvements from SWC DA

Several studies have demonstrated the potential of improved ET prediction using data assimilation of SWC measurements (Liu and Mishra, 2017; Girotto et al., 2017; Peters-Lidard et al., 2011). These studies focused on regional or global scales and showed heterogeneous spatial patterns of improvement in ET estimation. Baatz et al. (2017) showed that assimilation of CRNS observations altered the ET estimation in CLM4.5 in parts of their study area by up to 80 mm yr−1 compared to the OL approach.

However, in our study with the land surface model CLM5, we found that data assimilation of SWC does not improve the ET prediction for European forest sites. We also found that the impact on ET from assimilating CRNS observations is as limited as that from assimilating other in situ SWC data. Since our study sites cover a variety of climates and soil types, we assume that this result also applies to other forest sites worldwide with similar tree species.

The lack of improvement in ET prediction in the case of data assimilation of in situ soil moisture information is consistent with findings from other studies. Girotto et al. (2017) found a decrease in ET accuracy after assimilating GRACE data over India and attributed the results to the representation of irrigation in the model. Similarly, Peters-Lidard et al. (2011) showed mixed results after assimilating multiple satellite soil water content products over North America with spatial variation of improvements and deterioration of ET estimation. Overall, for 9 of the 13 forested study sites our OL simulations show positive mean bias error, indicating that CLM5 underestimates the ET compared to the FLUXNET observations. These underestimations agree with the results of Cheng et al. (2021), who showed that CLM5 underestimates ET relative to observations. Additionally, Nearing et al. (2018) investigated the contribution of model structural errors and model inputs for four different LSMs and concluded that SWC uncertainty was dominated by soil parameter uncertainty, while ET uncertainty was dominated by forcing uncertainty. In the absence of a similar in-depth benchmark study for CLM5, our results and the results of the previously mentioned CLM5 studies suggest that a similar conclusion can be drawn for CLM5.

A further caveat is that we treat the EC data as correct when validating our simulation results. However, the EC data might be affected by energy balance closure issues (Foken, 2008; Hendricks Franssen et al., 2010).

4.3 Methods to improve ET estimation

There are various approaches to improve modeled ET estimates. For example, Zhang et al. (2020) identified and optimized four hydraulic and three vegetation parameters in CLM4.0 that improved ET estimation by 7.3 % for the optimization period and 5.3 % for the validation period for China. Similarly, Post et al. (2018) calibrated eight parameters to improve NEE estimation in CLM4.5, and a similar approach to optimize vegetation parameters in CLM5.0 for ET estimation could improve simulation results. Tang et al. (2015) implemented a root hydraulic redistribution model in CLM4.5 to improve ET estimation but found that their method was only able to improve ET predictions north of 20° N. They identified the representation of deep roots, soil hydraulic parameterization for certain soils, meteorological forcings, and the parameterization of the water table dynamics and drainage as the main limitations to improving ET with their method.

Denager et al. (2023) used SWC measurements for an agricultural site in Denmark for parameter calibration of soil texture, LAI, stomatal conductance, and the root distribution in CLM5 and obtained improved energy partitioning of ET and SH. However, they also found it difficult to calibrate the parameters to get an improvement in SH estimation throughout the year and suggested that the difference in energy balance closure between LSMs and EC flux observations contributes to the bias.

Fox et al. (2022) concluded that errors in LAI estimations in LSMs lead to substantial flaws in the representation of carbon, water, and energy fluxes. Furthermore, they concluded that data assimilation to remove bias in LAI improves LSM results significantly and is advisable until the prognostically modeled LAI improves substantially. For example, Zhang et al. (2016) assimilated remotely sensed LAI data into the Biome-BGC model at two sites and improved both ET and NEE estimates, evaluated with EC tower measurements. Rahman et al. (2022) showed that the joint LAI and topsoil SWC assimilation from satellite products improved the ET estimation for the contiguous United States compared with independent validation datasets, while data assimilation of topsoil SWC alone only improved the SWC estimation.

As mentioned, LAI is identified as a key variable to improve ET estimation and representation of land carbon processes. Therefore, in future work we will investigate the effects of data assimilation of LAI and joint state–vegetation parameter estimation on the simulation of carbon, water, and energy fluxes with CLM5.

5 Conclusions

This paper analyzed the impact of the assimilation of in situ soil water content (SWC) data on SWC characterization, evapotranspiration (ET), sensible heat flux (SH), gross primary production (GPP), and net ecosystem exchange (NEE) for 13 forested sites in Europe. Assimilation of SWC, from both point-scale and plot-scale observations, with the ensemble Kalman filter, using the Community Land Model version 5 coupled to the Parallel Data Assimilation Framework (CLM5-PDAF), improves SWC prediction (RMSE reductions between 56 % and 64 % compared to the open-loop run, depending on measurement depth). However, assimilation of in situ SWC does not improve the ET prediction for the investigated European forest sites. For most of the sites, data assimilation showed almost no effect on ET fluxes (RMSE changes within ±1 %), and some sites showed strong negative effects of SWC assimilation on ET predictions (20 % to 30 % change in RMSE). The assimilation of in situ SWC from cosmic-ray neutron sensors (CRNSs), which determine SWC over a larger horizontal footprint more in correspondence with the eddy covariance footprint, for 3 of the 13 sites, also does not improve ET characterization. These results suggest that improving the SWC estimation of state-of-the-art LSMs such as CLM5 is not sufficient to improve ET estimation for forest sites. To improve ET estimation, it is also necessary to consider the representation of LAI in magnitude and timing, as well as uncertainties in water uptake by roots and vegetation parameters. In the future, to improve modeled ET using data assimilation, we will further examine the potential of assimilating different state variables, for example, leaf area index, and of updating related vegetation parameters. In addition, we will apply a measurement operator in the data assimilation framework that considers the vertical sensitivity of the CRNS signal.

Appendix A: Additional figures

Figure A1. Scatter plots of observed soil water content at 10 study sites versus OL simulation results at 5 cm depth. The points represent daily averages for the days on which observation data are available.


Figure A2. Scatter plots of observed soil water content at 10 study sites versus DASP simulation results at 5 cm depth. The points represent daily averages for the days on which observation data are available.


Figure A3. Scatter plots of observed soil water content at 10 study sites versus OL simulation results at 20 cm depth. The points represent daily averages for the days on which observation data are available.


Figure A4. Scatter plots of observed soil water content at 10 study sites versus DAS simulation results at 20 cm depth. The points represent daily averages for the days on which observation data are available.


Figure A5. Scatter plots of observed soil water content at 10 study sites versus DASP simulation results at 20 cm depth. The points represent daily averages for the days on which observation data are available.


Figure A6. Scatter plots of observed soil water content at eight study sites versus OL simulation results at 50 cm depth. The points represent daily averages for the days on which observation data are available.


Figure A7. Scatter plots of observed soil water content at eight study sites versus DASP simulation results at 50 cm depth. The points represent daily averages for the days on which observation data are available.


Figure A8. Time series for the Clapp–Hornberger shape parameter B (BSW) for each site in the DASP simulation. The grey line is the value at 5 cm depth, the blue line at 20 cm depth, and the green line at 50 cm depth.


Figure A9. Time series for the saturated soil matric potential for each site in the DASP simulation. The grey line is the value at 5 cm depth, the blue line at 20 cm depth, and the green line at 50 cm depth.


Figure A10. Time series for the porosity for each site in the DASP simulation. The grey line is the value at 5 cm depth, the blue line at 20 cm depth, and the green line at 50 cm depth.


Code availability

The code used in this study is available at (last access: 20 February 2024) or via Zenodo (Strebel et al., 2021).

Data availability

Data for the European FLUXNET sites are available from the European Fluxes Database (European Fluxes Database Cluster, 2024). Some additional data used in this study are from the eLTER data portal (eLTER, 2024) and ICOS data portal (ICOS, 2024). The CRNS data are published in Bogena and Ney (2021).

Author contributions

LS pre-processed the data, adjusted the code, performed the simulations, and prepared the manuscript. MA and SAB provided data and contributed to the manuscript. HB, HJHF, and HV supervised the research, co-designed the experiments, and contributed to the manuscript.

Competing interests

At least one of the (co-)authors is a member of the editorial board of Hydrology and Earth System Sciences. The peer-review process was guided by an independent editor, and the authors also have no other competing interests to declare.


Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors.


Acknowledgements

The authors gratefully acknowledge the computing time granted through JARA on the supercomputer JURECA at Forschungszentrum Jülich. The authors also gratefully acknowledge the support from the eLTER Plus project. We are thankful for all the data provided by FLUXNET, eLTER, ICOS, and COSMOS-Europe projects, and we thank all the site PIs and technical staff of the sites mentioned in this study.

Financial support

This project is co-funded by the LIFE programme of the European Union under contract number LIFE 17 CCA/ES/000063, with additional funding from the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – SFB 1502/1-2022 – project number 450058266.

The article processing charges for this open-access publication were covered by the Forschungszentrum Jülich.

Review statement

This paper was edited by Nunzio Romano and reviewed by two anonymous referees.


Andreasen, M., Jensen, K. H., Desilets, D., Franz, T. E., Zreda, M., Bogena, H. R., and Looms, M. C.: Status and perspectives on the cosmic‐ray neutron method for soil moisture estimation and other environmental science applications, Vadose Zone J., 16, 1–11, 2017. 

Arora, V. K., Katavouta, A., Williams, R. G., Jones, C. D., Brovkin, V., Friedlingstein, P., Schwinger, J., Bopp, L., Boucher, O., Cadule, P., Chamberlain, M. A., Christian, J. R., Delire, C., Fisher, R. A., Hajima, T., Ilyina, T., Joetzjer, E., Kawamiya, M., Koven, C. D., Krasting, J. P., Law, R. M., Lawrence, D. M., Lenton, A., Lindsay, K., Pongratz, J., Raddatz, T., Séférian, R., Tachiiri, K., Tjiputra, J. F., Wiltshire, A., Wu, T., and Ziehn, T.: Carbon–concentration and carbon–climate feedbacks in CMIP6 models and their comparison to CMIP5 models, Biogeosciences, 17, 4173–4222, 2020.

Baatz, R., Hendricks Franssen, H.-J., Han, X., Hoar, T., Bogena, H. R., and Vereecken, H.: Evaluation of a cosmic-ray neutron sensor network for improved land surface model prediction, Hydrol. Earth Syst. Sci., 21, 2509–2530, 2017.

Baldocchi, D. D.: How eddy covariance flux measurements have contributed to our understanding of Global Change Biology, Glob. Change Biol., 26, 242–260, 2020. 

Boas, T., Bogena, H., Grünwald, T., Heinesch, B., Ryu, D., Schmidt, M., Vereecken, H., Western, A., and Hendricks Franssen, H.-J.: Improving the representation of cropland sites in the Community Land Model (CLM) version 5.0, Geosci. Model Dev., 14, 573–601, 2021.

Bogena, H. and Ney, P.: Dataset of “COSMOS-Europe: A European network of Cosmic-Ray Neutron Soil Moisture Sensors”, Forschungszentrum Jülich [data set], 2021.

Bogena, H. R., Montzka, C., Huisman, J. A., Graf, A., Schmidt, M., Stockinger, M., von Hebel, C., Hendricks-Franssen, H. J., van der Kruk, J., Tappe, W., Lücke, A., Baatz, R., Bol, R., Groh, J., Pütz, T., Jakobi, J., Kunkel, R., Sorg, J., and Vereecken, H.: The TERENO‐Rur hydrological observatory: A multiscale multi‐compartment research platform for the advancement of hydrological science, Vadose Zone J., 17, 1–22, 2018. 

Bogena, H. R., Schrön, M., Jakobi, J., Ney, P., Zacharias, S., Andreasen, M., Baatz, R., Boorman, D., Duygu, M. B., Eguibar-Galán, M. A., Fersch, B., Franke, T., Geris, J., González Sanchis, M., Kerr, Y., Korf, T., Mengistu, Z., Mialon, A., Nasta, P., Nitychoruk, J., Pisinaras, V., Rasche, D., Rosolem, R., Said, H., Schattan, P., Zreda, M., Achleitner, S., Albentosa-Hernández, E., Akyürek, Z., Blume, T., del Campo, A., Canone, D., Dimitrova-Petrova, K., Evans, J. G., Ferraris, S., Frances, F., Gisolo, D., Güntner, A., Herrmann, F., Iwema, J., Jensen, K. H., Kunstmann, H., Lidón, A., Looms, M. C., Oswald, S., Panagopoulos, A., Patil, A., Power, D., Rebmann, C., Romano, N., Scheiffele, L., Seneviratne, S., Weltin, G., and Vereecken, H.: COSMOS-Europe: a European network of cosmic-ray neutron soil moisture sensors, Earth Syst. Sci. Data, 14, 1125–1151, 2022.

Bollmeyer, C., Keller, J. D., Ohlwein, C., Wahl, S., Crewell, S., Friederichs, P., Hense, A., Keune, J., Kneifel, S., Pscheidt, I., Redl, S., and Steinke, S.: Towards a high-resolution regional reanalysis for the European CORDEX domain, Q. J. Roy. Meteor. Soc., 141, 1–15, 2015.

Burgers, G., Jan van Leeuwen, P., and Evensen, G.: Analysis scheme in the ensemble Kalman filter, Mon. Weather Rev., 126, 1719–1724, 1998. 

Cheng, Y., Huang, M., Zhu, B., Bisht, G., Zhou, T., Liu, Y., Song, F., and He, X.: Validation of the Community Land Model version 5 over the contiguous United States (CONUS) using in situ and remote sensing data sets, J. Geophys. Res.-Atmos., 126, e2020JD033539, 2021.

Clapp, R. B. and Hornberger, G. M.: Empirical equations for some soil hydraulic properties, Water Resour. Res., 14, 601–604, 1978. 

Denager, T., Sonnenborg, T. O., Looms, M. C., Bogena, H., and Jensen, K. H.: Point-scale multi-objective calibration of the Community Land Model (version 5.0) using in situ observations of water and energy fluxes and variables, Hydrol. Earth Syst. Sci., 27, 2827–2845, 2023.

Dirmeyer, P. A., Chen, L., Wu, J., Shin, C.-S., Huang, B., Cash, B. A., Bosilovich, M. G., Mahanama, S., Koster, R. D., Santanello, J. A., Ek, M. B., Balsamo, G., Dutra, E., and Lawrence, D. M.: Verification of land–atmosphere coupling in forecast models, reanalyses, and land surface models using flux site observations, J. Hydrometeorol., 19, 375–392, 2018. 

Dombrowski, O., Brogi, C., Hendricks Franssen, H.-J., Zanotelli, D., and Bogena, H.: CLM5-FruitTree: a new sub-model for deciduous fruit trees in the Community Land Model (CLM5), Geosci. Model Dev., 15, 5167–5193, 2022.

Duarte, H. F., Raczka, B. M., Ricciuto, D. M., Lin, J. C., Koven, C. D., Thornton, P. E., Bowling, D. R., Lai, C.-T., Bible, K. J., and Ehleringer, J. R.: Evaluating the Community Land Model (CLM4.5) at a coniferous forest site in northwestern United States using flux and carbon-isotope measurements, Biogeosciences, 14, 4315–4340, 2017.

eLTER: eLTER Central Data Tools, eLTER [data set], last access: 1 February 2024.

European Fluxes Database Cluster: European Fluxes Database [data set], last access: 1 February 2024.

Evans, J. G., Ward, H. C., Blake, J. R., Hewitt, E. J., Morrison, R., Fry, M., Ball, L. A., Doughty, L. C., Libre, J. W., Hitt, O. E., Rylett, D., Ellis, R. J., Warwick, A. C., Brooks, M., Parkes, M. A., Wright, G. M. H., Singer, A. C., Boorman, D. B., and Jenkins, A.: Soil water content in southern England derived from a cosmic‐ray soil moisture observing system–COSMOS‐UK, Hydrol. Process., 30, 4987–4999, 2016. 

Evensen, G.: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics, J. Geophys. Res.-Oceans, 99, 10143–10162, 1994. 

Fertig, E., Baek, S.-J., Hunt, B., Ott, E., Szunyogh, I., Aravéquia, J., Kalnay, E., Li, H., and Liu, J.: Observation bias correction with an ensemble Kalman filter, Tellus A, 61, 210–226, 2009. 

Foken, T.: The energy balance closure problem: an overview, Ecol. Appl., 18, 1351–1367, 2008.

Fox, A. M., Huo, X., Hoar, T. J., Dashti, H., Smith, W. K., MacBean, N., Anderson, J. L., Roby, M., and Moore, D. J. P.: Assimilation of global satellite leaf area estimates reduces modeled global carbon uptake and energy loss by terrestrial ecosystems, J. Geophys. Res.-Biogeo., 127, e2022JG006830, 2022.

Friedland, B.: Treatment of bias in recursive filtering, IEEE T. Automat. Contr., 14, 359–367, 1969. 

Girotto, M., Lannoy, G. J., Reichle, R. H., Rodell, M., Draper, C., Bhanja, S. N., and Mukherjee, A.: Benefits and pitfalls of GRACE data assimilation: A case study of terrestrial water storage depletion in India, Geophys. Res. Lett., 44, 4107–4115, 2017.

Han, X., Franssen, H. J. H., Montzka, C., and Vereecken, H.: Soil moisture and soil properties estimation in the Community Land Model with synthetic brightness temperature observations, Water Resour. Res., 50, 6081–6105, 2014. 

Han, X., Li, X., He, G., Kumbhar, P., Montzka, C., Kollet, S., Miyoshi, T., Rosolem, R., Zhang, Y., Vereecken, H., and Franssen, H.-J. H.: DasPy 1.0 – the Open Source Multivariate Land Data Assimilation Framework in combination with the Community Land Model 4.5, Geosci. Model Dev. Discuss., 8, 7395–7444, 2015a.

Han, X., Franssen, H.-J. H., Rosolem, R., Jin, R., Li, X., and Vereecken, H.: Correction of systematic model forcing bias of CLM using assimilation of cosmic-ray Neutrons and land surface temperature: a study in the Heihe Catchment, China, Hydrol. Earth Syst. Sci., 19, 615–629, 2015b.

Hendricks Franssen, H.-J., Stöckli, R., Lehner, I., Rotenberg, E., and Seneviratne, S. I.: Energy balance closure of eddy-covariance data: A multisite analysis for European FLUXNET stations, Agr. Forest Meteorol., 150, 1553–1567, 2010. 

Hudiburg, T. W., Law, B. E., and Thornton, P. E.: Evaluation and improvement of the Community Land Model (CLM4) in Oregon forests, Biogeosciences, 10, 453–470, 2013.

Hung, C. P., Schalge, B., Baroni, G., Vereecken, H., and Hendricks Franssen, H.-J.: Assimilation of groundwater level and soil moisture data in an integrated land surface-subsurface model for southwestern Germany, Water Resour. Res., 58, e2021WR031549, 2022.

ICOS: ICOS Central Data Tools, ICOS [data set], last access: 1 February 2024.

Jung, M., Reichstein, M., Margolis, H. A., Cescatti, A., Richardson, A. D., Arain, M. A., Arneth, A., Bernhofer, C., Bonal, D., Chen, J., Gianelle, D., Gobron, N., Kiely, G., Kutsch, W., Lasslop, G., Law, B. E., Lindroth, A., Merbold, L., Montagnani, L., Moors, E. J., Papale, D., Sottocornola, M., Vaccari, F., and Williams, C.: Global patterns of land-atmosphere fluxes of carbon dioxide, latent heat, and sensible heat derived from eddy covariance, satellite, and meteorological observations, J. Geophys. Res., 116, G00J07, 2011.

Kennedy, D., Swenson, S., Oleson, K. W., Lawrence, D. M., Fisher, R., Lola da Costa, A. C., and Gentine, P.: Implementing plant hydraulics in the Community Land Model, version 5, J. Adv. Model. Earth Sy., 11, 485–513, 2019.

Köhli, M., Schrön, M., Zreda, M., Schmidt, U., Dietrich, P., and Zacharias, S.: Footprint characteristics revised for field‐scale soil moisture monitoring with cosmic‐ray neutrons, Water Resour. Res., 51, 5772–5790, 2015. 

Lawrence, D., Fisher, R., Koven, C., Oleson, K., Swenson, S., Vertenstein, M., Andre, B., Bonan, G., Ghimire, B., van Kampenhout, L., Kennedy, D., Kluzek, E., Knox, R., Lawrence, P., Li, F., Li, H., Lombardozzi, D., Lu, Y., Perket, P., Riley, W., Sacks, W., Shi, M., Wieder, W., Xu, C., Ali, A., Badger, A., Bisht, G., Broxton, P., Brunke, M., Buzan, J., Clark, M., Craig, T., Dahlin, K., Drewniak, B., Emmons, L., Fisher, J., Flanner, M., Gentine, P., Lenaerts, J., Levis, S., Leung, L., Lipscomb, W., Pelletier, J., Ricciuto, D., Sanderson, B., Shuman, J., Slater, A., Subin, Z., Tang, J., Tawfik, A., Thomas, Q., Tilmes, S., Vitt, F., and Zeng, X.: Technical description of version 5.0 of the Community Land Model (CLM), National Center for Atmospheric Research, University Corporation for Atmospheric Research, Boulder, CO, (last access: 14 February 2024), 2018.

Lawrence, D. M., Fisher, R. A., Koven, C. D., Oleson, K. W., Swenson, S. C., Bonan, G., Collier, N., Ghimire, B., van Kampenhout, L., Kennedy, D., Kluzek, E., Lawrence, P. J., Li, F., Li, H., Lombardozzi, D., Riley, W. J., Sacks, W. J., Shi, M., Vertenstein, M., Wieder, W. R., Xu, C., Ali, A. A., Badger, A. M., Bisht, G., Brunke, M. A., Burns, S. P., Buzan, J., Clark, M., Craig, A., Dahlin, K., Drewniak, B., Fisher, J. B., Flanner, M., Fox, A. M., Gentine, P., Hoffman, F., Keppel-Aleks, G., Knox, R., Kumar, S., Lenaerts, J., Leung, L. R., Lipscomb, W. H., Lu, Y., Pandey, A., Pelletier, J. D., Perket, J., Randerson, J. T., Ricciuto, D. M., Sanderson, B. M., Slater, A., Subin, Z. M., Tang, J., Thomas, R. Q., Val Martin, M., and Zeng, X.: The Community Land Model version 5: Description of new features, benchmarking, and impact of forcing uncertainty, J. Adv. Model. Earth Sy., 11, 4245–4287, 2019. 

Liu, D. and Mishra, A. K.: Performance of AMSR_E soil moisture data assimilation in CLM4.5 model for monitoring hydrologic fluxes at global scale, J. Hydrol., 547, 67–79, 2017. 

Liu, S., Li, X., Xu, Z., Che, T., Xiao, Q., Ma, M., Liu, Q., Jin, R., Guo, J., Wang, L., Wang, W., Qi, Y., Li, H., Xu, T., Ran, Y., Hu, X., Shi, S., Zhu, Z., Tan, J., Zhang, Y., and Ren, Z.: The Heihe Integrated Observatory Network: A basin‐scale land surface processes observatory in China, Vadose Zone J., 17, 1–21, 2018. 

Mahmood, T., Xie, Z., Jia, B., Habib, A., and Mahmood, R.: A Soil Moisture Data Assimilation System for Pakistan Using PODEn4DVar and CLM4.5, J. Meteorol. Res., 33, 1182–1193, 2019.

Naz, B. S., Kurtz, W., Montzka, C., Sharples, W., Goergen, K., Keune, J., Gao, H., Springer, A., Hendricks Franssen, H.-J., and Kollet, S.: Improving soil moisture and runoff simulations at 3 km over Europe using land surface data assimilation, Hydrol. Earth Syst. Sci., 23, 277–301, 2019.

Nearing, G. S., Ruddell, B. L., Clark, M. P., Nijssen, B., and Peters-Lidard, C.: Benchmarking and process diagnostics of land models, J. Hydrometeorol., 19, 1835–1852, 2018. 

Nerger, L., Hiller, W., and Schröter, J.: PDAF-the parallel data assimilation framework: experiences with Kalman filtering, in: Use of high performance computing in meteorology, World Scientific, 63–83, 2005.

Parr, T. W., Ferretti, M., Simpson, I. C., Forsius, M., and Kovács-Láng, E.: Towards a long-term integrated monitoring programme in Europe: network design in theory and practice, Environ. Monit. Assess., 78, 253–290, 2002. 

Patil, A., Fersch, B., Hendricks Franssen, H.-J., and Kunstmann, H.: Assimilation of Cosmogenic Neutron Counts for Improved Soil Moisture Prediction in a Distributed Land Surface Model, Front. Water, 3, 729592, 2021.

Peters-Lidard, C. D., Kumar, S. V., Mocko, D. M., and Tian, Y.: Estimating evapotranspiration with land data assimilation systems, Hydrol. Process., 25, 3979–3992, 2011.

Post, H., Hendricks Franssen, H.-J., Han, X., Baatz, R., Montzka, C., Schmidt, M., and Vereecken, H.: Evaluation and uncertainty analysis of regional-scale CLM4.5 net carbon flux estimates, Biogeosciences, 15, 187–208, 2018.

Raczka, B., Duarte, H. F., Koven, C. D., Ricciuto, D., Thornton, P. E., Lin, J. C., and Bowling, D. R.: An observational constraint on stomatal function in forests: evaluating coupled carbon and water vapor exchange with carbon isotopes in the Community Land Model (CLM4.5), Biogeosciences, 13, 5183–5204, 2016.

Rahman, A., Maggioni, V., Zhang, X., Houser, P., Sauer, T., and Mocko, D. M.: The joint assimilation of remotely sensed leaf area index and surface soil moisture into a land surface model, Remote Sensing, 14, 437, 2022.

Reichle, R. H., Koster, R. D., Liu, P., Mahanama, S. P., Njoku, E. G., and Owe, M.: Comparison and assimilation of global soil moisture retrievals from the Advanced Microwave Scanning Radiometer for the Earth Observing System (AMSR-E) and the Scanning Multichannel Microwave Radiometer (SMMR), J. Geophys. Res.-Atmos., 112, D09108, 2007.

Rosolem, R., Hoar, T., Arellano, A., Anderson, J. L., Shuttleworth, W. J., Zeng, X., and Franz, T. E.: Translating aboveground cosmic-ray neutron intensity to high-frequency soil moisture profiles at sub-kilometer scale, Hydrol. Earth Syst. Sci., 18, 4363–4379, 2014.

Schrön, M., Köhli, M., Scheiffele, L., Iwema, J., Bogena, H. R., Lv, L., Martini, E., Baroni, G., Rosolem, R., Weimar, J., Mai, J., Cuntz, M., Rebmann, C., Oswald, S. E., Dietrich, P., Schmidt, U., and Zacharias, S.: Improving calibration and validation of cosmic-ray neutron sensors in the light of spatial sensitivity, Hydrol. Earth Syst. Sci., 21, 5009–5030, 2017.

Shrestha, P., Kurtz, W., Vogel, G., Schulz, J. P., Sulis, M., Hendricks Franssen, H. J., Kollet, S., and Simmer, C.: Connection between root zone soil moisture and surface energy flux partitioning using modeling, observations, and data assimilation for a temperate grassland site in Germany, J. Geophys. Res.-Biogeo., 123, 2839–2862, 2018.

Shuttleworth, J., Rosolem, R., Zreda, M., and Franz, T.: The COsmic-ray Soil Moisture Interaction Code (COSMIC) for use in data assimilation, Hydrol. Earth Syst. Sci., 17, 3205–3217, 2013.

Strebel, L., Bogena, H., Vereecken, H., and Hendricks Franssen, H.-J.: lstrebel/TSMP: CLM5+PDAF with helper scripts (CLM5+PDAF-with_helper_scripts), Zenodo [code], 2021.

Strebel, L., Bogena, H. R., Vereecken, H., and Hendricks Franssen, H.-J.: Coupling the Community Land Model version 5.0 to the parallel data assimilation framework PDAF: description and applications, Geosci. Model Dev., 15, 395–411, 2022.

Tang, J., Riley, W. J., and Niu, J.: Incorporating root hydraulic redistribution in CLM4.5: Effects on predicted site and global evapotranspiration, soil moisture, and water storage, J. Adv. Model. Earth Sy., 7, 1828–1848, 2015.

Wilson, D. J., Western, A. W., and Grayson, R. B.: Identifying and quantifying sources of variability in temporal and spatial soil moisture observations, Water Resour. Res., 40, W02507, 2004.

Wurster, P., Maneta, M., Beguería, S., Cobourn, K., Maxwell, B., Silverman, N., Ewing, S., Jencso, K., Gardner, P., Kimball, J., Holden, Z., Ji, X., and Vicente-Serrano, S. M.: Characterizing the Impact of Climatic and Price Anomalies on Agrosystems in the Northwest United States, Agr. Forest Meteorol., 280, 107778, 2020.

Zhang, C., Di, Z., Duan, Q., Xie, Z., and Gong, W.: Improved land evapotranspiration simulation of the community land model using a surrogate-based automatic parameter optimization method, Water, 12, 943, 2020.

Zhang, L., Lei, H., Shen, H., Cong, Z., Yang, D., and Liu, T.: Evaluating the representation of vegetation phenology in the Community Land Model 4.5 in a temperate grassland, J. Geophys. Res.-Biogeo., 124, 187–210, 2019. 

Zhang, T., Sun, R., Peng, C., Zhou, G., Wang, C., Zhu, Q., and Yang, Y.: Integrating a model with remote sensing observations by a data assimilation approach to improve the model simulation accuracy of carbon flux and evapotranspiration at two flux sites, Science China Earth Sciences, 59, 337–348, 2016. 

Zreda, M., Desilets, D., Ferré, T. P. A., and Scott, R. L.: Measuring soil moisture content non‐invasively at intermediate spatial scale using cosmic‐ray neutrons, Geophys. Res. Lett., 35, L21402, 2008.

Zreda, M., Shuttleworth, W. J., Zeng, X., Zweck, C., Desilets, D., Franz, T., and Rosolem, R.: COSMOS: the COsmic-ray Soil Moisture Observing System, Hydrol. Earth Syst. Sci., 16, 4079–4099, 2012.

Short summary
We present results from assimilating soil water content measurements from 13 European forest sites into a state-of-the-art land surface model. Data assimilation combines observed and modeled soil water content, and we show that it improves the model's representation of soil water content. However, we also examine the impact on evapotranspiration and find no corresponding improvement.