Correction of systematic model forcing bias of CLM using assimilation of cosmic-ray neutrons and land surface temperature: a study in the Heihe Catchment

The recent development of the non-invasive cosmic-ray soil moisture sensing technique fills the gap between point-scale soil moisture measurements and regional-scale soil moisture measurements by remote sensing. A cosmic-ray probe measures soil moisture for a footprint with a diameter of ∼ 600 m (at sea level) and with an effective measurement depth between 12 and 76 cm, depending on the soil humidity. In this study, it was tested whether neutron counts also allow correcting for a systematic error in the model forcings. A lack of water management data often causes systematic input errors in land surface models. Here, the assimilation procedure was tested for an irrigated corn field (Heihe Watershed Allied Telemetry Experimental Research – HiWATER, 2012) where no irrigation data were available as model input, although a significant amount of irrigation water was applied in the area. The measured cosmic-ray neutron counts and Moderate-Resolution Imaging Spectroradiometer (MODIS) land surface temperature (LST) products were jointly assimilated into the Community Land Model (CLM) with the local ensemble transform Kalman filter. Different data assimilation scenarios were evaluated, with assimilation of LST and/or cosmic-ray neutron counts, and optionally with parameter estimation of leaf area index (LAI). The results show that the direct assimilation of cosmic-ray neutron counts can significantly improve the soil moisture and evapotranspiration (ET) estimation, correcting for the lack of information on irrigation amounts. The joint assimilation of neutron counts and LST further improved the ET estimation, but the information content of neutron counts exceeded that of LST. Additional improvement was achieved by calibrating LAI, which after calibration was also closer to independent field measurements. It was concluded that assimilation of neutron counts was useful for ET and soil moisture estimation even if the model has a systematic bias such as neglected irrigation. However, the assimilation of LST also helped to correct the systematic model bias introduced by neglecting irrigation, and LST could be used to update soil moisture with state augmentation.

on soil moisture. Neutron count intensity is measured noninvasively at an intermediate scale between the point scale and the coarse remote sensing scale (Zreda et al., 2008). A network of cosmic-ray sensors (CRSs) has been set up over North America.
Cosmic rays are composed mainly of primary protons. Collisions of high-energy neutrons with nuclei lead to the "evaporation" of fast neutrons, and the neutrons generated and moderated in the ground can diffuse back into the air, where their intensity can be measured by the cosmic-ray soil moisture probe. Soil moisture affects the rate of moderation of fast neutrons and controls the neutron concentration and the emission of neutrons into the air. Dry soils have low moderating power and are highly emissive; wet soils have high moderating power and are less emissive. The neutrons are mainly moderated by the hydrogen atoms contained in the soil water and emitted to the atmosphere, where the neutrons mix instantaneously at a scale of hundreds of meters. The measurement area of a cosmic-ray soil moisture probe represents a circle with a diameter of ∼ 600 m at sea level (Desilets and Zreda, 2013), and the measurement depth decreases nonlinearly from ∼ 76 cm (dry soils) to ∼ 12 cm (saturated soils) (Zreda et al., 2008). The measured cosmic-ray neutron counts show an inverse correlation with soil moisture content. The cosmic-ray neutron intensity could be reduced to 60 % of the surface cosmic-ray neutron intensity by increasing the soil moisture from 0 to 40 % (Zreda et al., 2008). Soil moisture estimation from cosmic-ray probe neutron counts over a horizontal footprint of hectometers has received considerable attention in the scientific literature in recent years (Desilets et al., 2010; Zreda et al., 2008, 2012). Hydrogen atoms are present as water in the soil, lattice soil water, belowground biomass, atmospheric water vapor, snow water, aboveground biomass, water intercepted by vegetation and water on the ground. These additional hydrogen sources contribute to the measured neutron intensity. The role of these additional hydrogen sources should be included in the analysis of the cosmic-ray measurements in order to isolate the main contribution from soil moisture. Formulations have been developed for handling water vapor, lattice water and organic carbon, and a litter layer present on the soil surface (Bogena et al., 2013).
The positive impact of soil moisture data assimilation has been shown in several studies. Importantly, surface soil moisture could be used to obtain a better characterization of the root zone soil moisture (Barrett and Renzullo, 2009; Crow et al., 2008; Das et al., 2008; Draper et al., 2011; Li et al., 2010). It has also been shown that the assimilation of soil moisture observations can be used to correct rainfall errors (Crow et al., 2011; Yang et al., 2009). Often a systematic bias between measured and modeled soil moisture content is found; soil moisture estimation can be significantly improved using joint state and bias estimation (De Lannoy et al., 2007; Kumar et al., 2012; Reichle, 2008). Studies on data assimilation of remotely sensed land surface temperature products also show a positive impact on the estimation of soil moisture, latent heat flux and sensible heat flux (Ghent et al., 2010; Xu et al., 2011). These studies also found that the bias of land surface models, in these cases a soil temperature bias, can be removed with land surface temperature assimilation (Bosilovich et al., 2007; Reichle et al., 2010). Other studies have updated both land surface model states and parameters with soil moisture and land surface temperature data (Bateni and Entekhabi, 2012; Han et al., 2014a; Pauwels et al., 2009). The assimilation of measured cosmic-ray neutron counts in a land surface model has been tested successfully, but these studies focused on state updating alone (Rosolem et al., 2014; Shuttleworth et al., 2013). In this paper we focus on the assimilation of measured cosmic-ray neutron counts for improving soil moisture content characterization at the field scale, specifically for the case in which the model input is biased. Land surface models are still affected by limited knowledge of water resources management, and for regions in China (and elsewhere) typically no information on irrigation amounts is available because irrigation is mainly applied by flooding.
We analyze whether measured neutron counts are able to correct for such biases. This case is relevant not only for neglected irrigation in China but also for other water resources management issues (e.g., groundwater pumping) that are neglected in the simulations. Neglecting irrigation in land surface models results in a large bias in the simulated soil moisture content because of a lack of water input. The bias in soil moisture content also results in a latent heat flux that is too small and a sensible heat flux that is too high. We hypothesize that data assimilation can also play an important role in removing such biases in data-deficient areas. One possible strategy in data assimilation studies for handling this type of bias, which is not followed in this paper, is to calibrate the simulation model (e.g., land surface model) prior to data assimilation to remove biases and to use the corrected simulation model in the context of sequential data assimilation. A different strategy was followed in this paper, and no a priori bias correction was carried out because this type of problem (neglecting water resources management) does not allow for such an a priori bias correction. The bias can be attributed to the model structure, model parameters, atmospheric forcing or observation data, and bias-aware assimilation requires the assumption that the bias comes from a particular source. If the bias is not attributed to the right source, model predictions cannot be improved (Dee, 2005). Therefore bias-blind assimilation was used for safety, and the bias estimation was not handled explicitly. Instead, we investigated whether neutron counts measured by the cosmic-ray probe were able to correct for the bias. The aim is to improve the soil moisture profile estimation in cropland with seed corn as the main crop type.
In CLM, land surface fluxes are calculated based on the Monin-Obukhov similarity theory. The sensible heat flux is formulated as a function of temperature and leaf area index (LAI), and the latent heat flux is formulated as a function of the temperature and leaf stomatal resistances. The leaf stomatal resistance is calculated from the Ball-Berry conductance model (Collatz et al., 1991). The updates of soil temperature and vegetation temperature are derived based on the solar radiation absorbed by top soil (or vegetation), longwave radiation absorbed by soil (or vegetation), sensible heat flux from soil (or vegetation) and latent heat flux from soil (or vegetation). Measured land surface temperature is composed of the ground temperature and vegetation temperature. Therefore a difference between measured and calculated land surface temperature can be adjusted by changing land surface fluxes. As land surface fluxes are sensitive to soil moisture content, land surface temperature is sensitive to soil moisture content.
Therefore, the land surface temperature (LST) products measured by the Moderate-Resolution Imaging Spectroradiometer (MODIS) Terra (MOD11A1) and Aqua (MYD11A1) are also assimilated jointly to improve the soil temperature profile estimation, because the evapotranspiration (ET) is sensitive to the soil temperature. Two Terra LST products can be obtained per day at 10:30/22:30 and two Aqua LST products per day at 01:30/13:30. Soil moisture, land surface temperature and LAI influence the estimation of latent and sensible heat fluxes (Ghilain et al., 2012; Jarlan et al., 2008; Schwinger et al., 2010; van den Hurk, 2003; Yang et al., 1999), and therefore this study also focused on the calibration of LAI with the help of the assimilation of land surface temperature. Moreover, there are large discrepancies between the remotely retrieved LAI and measured values: the MODIS LAI product underestimates in situ measured LAI by 44 % on average (http://landval.gsfc.nasa.gov/), which is a further reason to calibrate the LAI by data assimilation. In summary, the novel aspects of this work are the following: (1) investigating whether data assimilation is able to correct for missing water resources management data without a priori bias correction; (2) joint assimilation of cosmic-ray neutron counts and LST together with updating of LAI; and (3) application of this framework to real-world data in an irrigated area where detailed verification data were available.

Study area and measurement
The Heihe River basin is the second-largest inland river basin of China; it is located at 97.1–102.0° E and 37.7–42.7° N and covers an area of approximately 143 000 km². In 2012, a multi-scale observation experiment of evapotranspiration with a well-equipped superstation (Daman superstation) to measure the atmospheric forcings and soil moisture at 2, 4, 10, 20, 40, 80, 120 and 160 cm depth was carried out from June to September in the framework of the Heihe Watershed Allied Telemetry Experimental Research (HiWATER). SoilNet wireless network nodes (Bogena et al., 2010) were deployed to measure soil moisture content and soil temperature at four layers (4, 10, 20 and 40 cm). One cosmic-ray soil moisture probe (CRS-1000B) was installed (Han et al., 2014b) with 23 SoilNet nodes (Jin et al., 2014) in the footprint (Fig. 1). The main crop type within the footprint of the cosmic-ray probe is seed corn. The irrigation is applied through channels using the flooding irrigation method. Exact amounts of applied irrigation are therefore not available.
The measured cosmic-ray neutron count data were processed to remove the outliers according to the sensor voltage (≤ 11.8 V) and relative humidity (≥ 80 %). The surface fluxes were measured using the eddy covariance technique, and data were processed using the EdiRe software (http://www.geos.ed.ac.uk/abs/research/micromet/EdiRe), in which the anemometer coordinate rotation, signal lag removal, frequency response correction, density corrections and signal de-spiking were done for the raw data. The energy balance closure was not considered in this study. The LAI was measured by the LAI-2000 scanner during the field experiment; 17 samples were collected on 14 days over 3 months.
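The screening rule for the neutron counts can be sketched as follows. This is a minimal illustration that reads the two quoted conditions as the criteria for discarding a record; the record layout, the key names `counts`, `voltage` and `rh`, and the sample values are hypothetical and not taken from the study.

```python
# Thresholds quoted in the text; everything else here is illustrative.
MAX_BAD_VOLTAGE = 11.8    # V, records at or below this voltage are discarded
MIN_BAD_HUMIDITY = 80.0   # %, records at or above this humidity are discarded


def screen_neutron_records(records):
    """Keep only records that pass both the voltage and the humidity check."""
    kept = []
    for rec in records:
        low_voltage = rec["voltage"] <= MAX_BAD_VOLTAGE
        high_humidity = rec["rh"] >= MIN_BAD_HUMIDITY
        if not (low_voltage or high_humidity):
            kept.append(rec)
    return kept


# Made-up hourly records (counts per hour, volts, percent).
sample = [
    {"counts": 520, "voltage": 12.4, "rh": 45.0},
    {"counts": 610, "voltage": 11.6, "rh": 40.0},  # fails the voltage check
    {"counts": 480, "voltage": 12.5, "rh": 85.0},  # fails the humidity check
]
clean = screen_neutron_records(sample)
```

Only the first made-up record survives the screening in this example.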

Land surface model and data
The CLM was used to simulate the spatiotemporal distribution of soil moisture, soil temperature, land surface temperature, vegetation temperature, sensible heat flux, latent heat flux and soil heat flux of the study area. The coupled water and energy balances are modeled in CLM, and the land surface heterogeneity is represented by patched plant functional types and soil texture (Oleson et al., 2013). The soil properties used in CLM were taken from the soil database of China with 1 km spatial resolution (Shangguan et al., 2013). The MODIS 500 m resolution plant functional type product (MCD12Q1) (Sun et al., 2008), resampled by nearest-neighbor interpolation to 1 km resolution, and the MODIS LAI product (MCD15A3) with 1 km spatial resolution (Han et al., 2012) were used as input. Due to a lack of measurement data, two atmospheric forcing data sets were used. For the spin-up period, the Global Land Data Assimilation System reanalysis data (Rodell et al., 2004) were interpolated in space and time using the National Centers for Environmental Prediction (NCEP) bilinear interpolation library iplib (http://www.nco.ncep.noaa.gov/pmb/docs/libs/iplib/ncep_iplib.shtml) and used in the CLM. For the 3-month data assimilation period, hourly forcing data (incident longwave radiation, incident solar radiation, precipitation, air pressure, specific humidity, air temperature and wind speed) from the Daman superstation of HiWATER were available and used.

Cosmic-ray forward model
In this study, the newly developed COsmic-ray Soil Moisture Interaction Code (COSMIC) model was used as the cosmic-ray forward model to simulate the cosmic-ray neutron count rate using the soil moisture profile as input. The effective measurement depth of the cosmic-ray soil moisture probe ranges from 12 cm (wet soils) to 76 cm (dry soils) (Zreda et al., 2008), within which 86 % of the aboveground measured neutrons originate. COSMIC also calculates the effective sensor depth based on the cosmic-ray neutron intensity and the soil moisture profile values (Shuttleworth et al., 2013).
COSMIC makes several assumptions to calculate the number of fast neutrons reaching the cosmic-ray soil moisture probe ($N_{\mathrm{COSMOS}}$) at a near-surface measurement location. The complete soil profile, with a depth of 3 m, was discretized into 300 layers for the integration of Eq. (2) in COSMIC. The number of fast neutrons reaching the cosmic-ray probe $N_{\mathrm{COSMOS}}$ is formulated as

$$N_{\mathrm{COSMOS}} = N \int_0^{\infty} \left\{ A(z) \left[ \alpha \rho_s(z) + \rho_w(z) \right] \exp\left( -\left[ \frac{m_s(z)}{L_1} + \frac{m_w(z)}{L_2} \right] \right) \right\} \mathrm{d}z, \qquad (1)$$

$$A(z) = \frac{2}{\pi} \int_0^{\pi/2} \exp\left( \frac{-1}{\cos\theta} \left[ \frac{m_s(z)}{L_3} + \frac{m_w(z)}{L_4} \right] \right) \mathrm{d}\theta, \qquad (2)$$

where $N$ is the high-energy neutron flux; $z$ denotes the soil layer depth (m); $\rho_s$ the dry soil bulk density (g cm⁻³); $\rho_w$ the total water density, including the lattice water (g cm⁻³); and $\alpha$ denotes the ratio of the fast-neutron creation factors. $L_1$ is the high-energy soil attenuation length with a value of 162.0 g cm⁻² and $L_2$ the high-energy water attenuation length of 129.1 g cm⁻². In Eq. (2), $\theta$ is the angle between the vertical below the detector and the line between the detector and each point in the plane; $m_s(z)$ and $m_w(z)$ are the integrated mass per unit area of dry soil and water (g cm⁻²), respectively. $L_3$ denotes the fast-neutron soil attenuation length (g cm⁻²), and $L_4$ stands for the fast-neutron water attenuation length with a value of 3.16 g cm⁻².
The cosmic-ray neutron intensity reaching the land surface is influenced by air pressure, atmospheric water vapor content and incoming neutron flux. In order to isolate the contribution of soil moisture content to the measured neutron intensity, it is important to take these effects into account. The corrected neutron counts $N_{\mathrm{Corr}}$ are therefore derived from the measured neutron counts $N_{\mathrm{Obs}}$ using three correction factors: $f_P$ for air pressure, $f_{wv}$ for atmospheric water vapor and $f_i$ for incoming neutron flux. The correction factor for air pressure $f_P$ is calculated from the local air pressure $P$ (mbar), the average air pressure during the measurement period $P_0$ (mbar) and the mass attenuation length for high-energy neutrons $L$ (g cm⁻²); the default value of 128 g cm⁻² was used in this study.
The correction factor $f_{wv}$ for atmospheric water vapor is calculated from the absolute humidity at the measurement time $\rho_{v0}$ (kg m⁻³) and the average absolute humidity during the measurement period $\rho_{v0}^{\mathrm{ref}}$ (kg m⁻³).
Fluctuations in the incoming neutron flux should be removed because the cosmic-ray probe is designed to measure the neutron flux based on the incoming background neutron flux. The correction factor $f_i$ for the incoming neutron flux is calculated from the measured incoming neutron flux $N_m$ and the average incoming neutron flux during the measurement period $N_{\mathrm{avg}}$. The measured data at the Jungfraujoch station in Switzerland at 3560 m (http://cosray.unibe.ch/) were used to calculate $N_m$ and $N_{\mathrm{avg}}$. The temporal (secular or diurnal) variations caused by the sunspot cycle could be removed after this correction. In this study, the soil moisture for the CRS footprint scale was calculated from the arithmetic mean of the 23 SoilNet soil moisture observations. The calibration of the high-energy neutron intensity parameter $N$ in Eq. (1) was done using the measured cosmic-ray neutron count rate and the averaged soil moisture content at the CRS footprint scale. Because lattice water was unknown for this site, a value of 3 % was assumed in this study. Hourly soil moisture measurements for a period of 2.5 months were used for COSMIC calibration. Inside the cosmic-ray probe footprint, the amount of applied irrigation was spatially variable due to the different management practice of each farmer. The gradient search algorithm L-BFGS-B (Zhu et al., 1997) was used to minimize the root mean square error (RMSE) of the differences between simulated cosmic-ray neutron counts (using soil moisture measured by SoilNet as input to COSMIC) and the measured neutron counts $N_{\mathrm{Corr}}$. The optimized parameter value of $N$ was 615.96 counts h⁻¹ in this case.
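The three count corrections can be sketched as below. The functional forms are assumptions based on formulations commonly used for cosmic-ray probes (an exponential pressure term, a linear water vapor term with a coefficient of 0.0054 per g m⁻³, and a ratio for the incoming flux); they are not confirmed by the text above, and in particular the direction of the incoming-flux ratio is chosen here so that an above-average incoming flux lowers the corrected count. All input numbers are illustrative.

```python
import math

ATTENUATION_LENGTH = 128.0  # g cm-2, default value quoted in the text


def pressure_factor(p_mbar, p_ref_mbar, length=ATTENUATION_LENGTH):
    """Assumed exponential form; 1 mbar is roughly 1 g cm-2 of air column."""
    return math.exp((p_mbar - p_ref_mbar) / length)


def vapor_factor(rho_kg_m3, rho_ref_kg_m3, coeff_per_g_m3=0.0054):
    """Assumed linear form; the coefficient is per g m-3, hence the conversion."""
    return 1.0 + coeff_per_g_m3 * 1000.0 * (rho_kg_m3 - rho_ref_kg_m3)


def intensity_factor(n_m, n_avg):
    """Assumed direction: above-average incoming flux lowers the corrected count."""
    return n_avg / n_m


def corrected_counts(n_obs, p, p_ref, rho, rho_ref, n_m, n_avg):
    """Apply the three multiplicative corrections to a measured count."""
    return (n_obs
            * pressure_factor(p, p_ref)
            * vapor_factor(rho, rho_ref)
            * intensity_factor(n_m, n_avg))


# Illustrative numbers only (not HiWATER data).
reference = corrected_counts(500.0, 850.0, 850.0, 0.008, 0.008, 160.0, 160.0)
high_pressure = corrected_counts(500.0, 860.0, 850.0, 0.008, 0.008, 160.0, 160.0)
```

When every input sits at its reference value the count is returned unchanged; a pressure above the reference raises the corrected count because the thicker air column has shielded the probe.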
The simulated soil moisture content for the 10 CLM soil layers (3.8 m depth) was used as input to COSMIC in order to simulate the corresponding neutron count intensity and compare it with the measured neutron count intensity. Soil moisture below 1 m depth is unlikely to affect the results substantially because the effective measurement depth of the cosmic-ray probe is between 12 and 76 cm. The COSMIC model assumes a more detailed soil profile: it interpolates the soil moisture information from the 10 CLM soil layers to 300 soil layers of 1 cm depth. The contribution of each soil layer to the measured neutron flux changes in time depending on the soil moisture condition, and the effective measurement depth of the cosmic-ray probe therefore changes in time as well. COSMIC calculates the vertically weighted soil moisture content based on the vertical distribution of soil moisture content.
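A forward operator of this kind can be sketched numerically as follows. The sketch assumes the published COSMIC structure: the high-energy flux decays exponentially with the integrated soil and water mass (attenuation lengths L1 and L2), fast neutrons are created in each layer in proportion to α ρ_s + ρ_w, and they are attenuated on the way up (L3 and L4), averaged over the angle θ. The bulk density of 1.4 g cm⁻³, the expressions for α and L3, and the treatment of the 3 % lattice water as a volumetric-equivalent fraction are assumptions for illustration and are not taken from the text.

```python
import numpy as np

L1, L2, L4 = 162.0, 129.1, 3.16  # g cm-2, values quoted in the text


def cosmic_counts(theta, n_high=615.96, rho_s=1.4, lattice=0.03,
                  dz_cm=1.0, n_angles=90):
    """Simulate a neutron count rate from a layered soil moisture profile.

    theta   : volumetric soil moisture per layer (m3 m-3), top layer first
    rho_s   : dry bulk density (g cm-3), hypothetical value
    lattice : lattice water, treated here as a volumetric-equivalent fraction
    """
    theta = np.asarray(theta, dtype=float)
    rho_w = theta + lattice                 # g cm-3, water density taken as 1
    alpha = 0.404 - 0.101 * rho_s           # assumed parameterization
    l3 = -31.65 + 99.29 * rho_s             # assumed parameterization (g cm-2)

    # Mass per unit area above each layer midpoint (g cm-2).
    mid = (np.arange(theta.size) + 0.5) * dz_cm
    m_s = rho_s * mid
    m_w = np.cumsum(rho_w) * dz_cm - 0.5 * rho_w * dz_cm

    # Angular integral by the midpoint rule, which avoids cos(pi/2) = 0.
    # (2/pi) * sum(f) * (pi/2)/n_angles reduces to the mean over the angles.
    angles = (np.arange(n_angles) + 0.5) * (np.pi / 2.0) / n_angles
    fast = (m_s / l3 + m_w / L4)[:, None] / np.cos(angles)[None, :]
    a_z = np.exp(-fast).mean(axis=1)

    # Depth integral as a sum over the layers.
    high = np.exp(-(m_s / L1 + m_w / L2))
    return n_high * np.sum(a_z * (alpha * rho_s + rho_w) * high) * dz_cm


dry = cosmic_counts(np.full(300, 0.10))   # uniform 300 x 1 cm profile
wet = cosmic_counts(np.full(300, 0.40))
```

With these assumed parameters the wetter profile yields the lower count rate, in line with the inverse relation between neutron counts and soil moisture described earlier.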

Two-source formulation (TSF)
The land surface temperature products of MODIS are composed of a ground temperature and a vegetation temperature component, both of which are unknown. CLM models the ground temperature and vegetation temperature separately, but it does not model the composed land surface temperature as seen by MODIS. The corresponding land surface temperature of CLM should therefore be modeled for data assimilation purposes. The two-source formulation (Kustas and Anderson, 2009) was used in this study to calculate the land surface temperature from the MODIS view angle using ground temperature and vegetation temperature simulated by CLM:

$$T_S = \left[ F_c(\phi)\, T_c^4 + \left( 1 - F_c(\phi) \right) T_g^4 \right]^{1/4},$$

where $T_S$ (K) is the composed surface temperature as seen by the MODIS sensor, $F_c(\phi)$ is the fraction of vegetation cover observed from the sensor view angle $\phi$ (radians), $T_c$ (K) is the vegetation temperature and $T_g$ (K) is the ground temperature (Kustas and Anderson, 2009):

$$F_c(\phi) = 1 - \exp\left( \frac{-0.5\, \Omega(\phi)\, \mathrm{LAI}}{\cos\phi} \right),$$

where $\Omega(\phi)$ is a clumping index that represents the nonrandom leaf area distributions of farmland or other heterogeneous land surfaces, as defined by Anderson et al. (2005).
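The composition of canopy and ground temperature into a single radiometric temperature can be sketched as follows. The sketch assumes the widely used two-source form, in which the two temperatures are blended through their fourth powers and the vegetation fraction grows exponentially with LAI divided by the cosine of the view angle. The clumping index is passed in as a plain number because its definition is not reproduced here; all input values are illustrative.

```python
import math


def vegetation_fraction(lai, view_angle_rad, clumping=1.0):
    """Fraction of vegetation seen at the view angle (assumed exponential form)."""
    return 1.0 - math.exp(-0.5 * clumping * lai / math.cos(view_angle_rad))


def composite_lst(t_canopy, t_ground, lai, view_angle_rad, clumping=1.0):
    """Blend canopy and ground temperature (K) through their fourth powers."""
    f_c = vegetation_fraction(lai, view_angle_rad, clumping)
    return (f_c * t_canopy ** 4 + (1.0 - f_c) * t_ground ** 4) ** 0.25


# Illustrative values: cool canopy over warm soil, nadir and oblique views.
nadir = composite_lst(298.0, 310.0, lai=2.0, view_angle_rad=0.0)
oblique = composite_lst(298.0, 310.0, lai=2.0,
                        view_angle_rad=math.radians(50.0))
```

At an oblique view angle the sensor sees more canopy, so the composite temperature moves toward the canopy temperature. The same mechanism is why a higher LAI changes the modeled LST and why LST observations carry information about LAI.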

Assimilation approach
The local ensemble transform Kalman filter (LETKF) was used as the assimilation algorithm; it is one of the square-root variants of the ensemble Kalman filter (Evensen, 2003; Hunt et al., 2007; Miyoshi and Yamane, 2007). The model uncertainties are represented using the ensemble simulation of model states, and LETKF derives the background error covariance from the model state ensemble members. LETKF uses the non-perturbed observations to update all the ensemble members of model states at each assimilation step. In this study, $x^b_1, \ldots, x^b_N$ denote the model state ensemble members; $\bar{x}^b$ is the ensemble mean of $x^b_1, \ldots, x^b_N$; $N$ is the ensemble size; $y^b_1, \ldots, y^b_N$ denote the mapped model state ensemble members; $\bar{y}^b$ is the ensemble mean of $y^b_1, \ldots, y^b_N$; and $H$ is the observation operator (COSMIC for soil moisture or the two-source formulation for land surface temperature). The analysis step of LETKF can be summarized as follows.
Prepare the model state vector $X^b$:

$$X^b = \left[ x^b_1 - \bar{x}^b, \ldots, x^b_N - \bar{x}^b \right],$$

where $x^b$ is composed of one vertically weighted soil moisture content and the soil moisture content for 10 CLM layers, resulting in a state dimension equal to 11, if only the neutron count observation was assimilated; and $x^b$ is composed of surface temperature, ground temperature, vegetation temperature and soil temperature for 15 CLM layers, giving a state dimension of 18, if only the land surface temperature observations were assimilated without soil moisture update. The water and energy balance are coupled, and in CLM the energy balance is solved first; the derived surface fluxes are then used for updating soil moisture content. The cross correlation between the soil temperature and soil moisture can be calculated from the ensemble prediction in LETKF, and this makes the updating of soil moisture by assimilating land surface temperature possible. We also used the land surface temperature to update the soil moisture profile; in this case the soil moisture vector was augmented to the LETKF state vector of land surface temperature assimilation, resulting in a state dimension of 28.
Construct the mapped model state vector $Y^b$ after transformation by the observation operator:

$$y^b_i = H\left( x^b_i \right), \qquad Y^b = \left[ y^b_1 - \bar{y}^b, \ldots, y^b_N - \bar{y}^b \right].$$

The following analysis is looped for each model grid cell to calculate the update of model state ensemble members.
Calculate the analysis error covariance matrix $P^a$:

$$P^a = \left[ (N-1) I + \left( Y^b \right)^T R^{-1} Y^b \right]^{-1},$$

where $I$ is the identity matrix. The perturbations in ensemble space are calculated as

$$W^a = \left[ (N-1) P^a \right]^{1/2}.$$

Calculate the analysis mean $w^a$ in ensemble space and add it to each column of $W^a$ to get the analysis ensemble in ensemble space:

$$w^a = P^a \left( Y^b \right)^T R^{-1} \left( y^o - \bar{y}^b \right).$$

Calculate the new analysis:

$$X^a = \bar{x}^b + X^b \left[ w^a + W^a \right],$$

where $R$ is the observation error covariance matrix, $y^o$ is the observation vector and $X^a$ contains the updated model ensemble members.
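The analysis steps listed above can be sketched for a single grid cell as follows. This is a minimal illustration of the standard LETKF transform without localization or covariance inflation, not the code used in the study. The function name, the array layout, and the linear stand-in for the observation operator in the demo are hypothetical.

```python
import numpy as np


def letkf_update(x_ens, y_ens, y_obs, r_cov):
    """One LETKF analysis for a single grid cell.

    x_ens : (n_state, n_ens) background state ensemble
    y_ens : (n_obs, n_ens) ensemble mapped to observation space
    y_obs : (n_obs,) observation vector
    r_cov : (n_obs, n_obs) observation error covariance
    """
    n_ens = x_ens.shape[1]
    x_mean = x_ens.mean(axis=1, keepdims=True)
    y_mean = y_ens.mean(axis=1, keepdims=True)
    x_pert = x_ens - x_mean                       # background perturbations
    y_pert = y_ens - y_mean                       # mapped perturbations

    c = y_pert.T @ np.linalg.inv(r_cov)           # (Y^b)^T R^-1
    # Invert and take the square root through a symmetric eigendecomposition.
    evals, evecs = np.linalg.eigh((n_ens - 1) * np.eye(n_ens) + c @ y_pert)
    p_a = evecs @ np.diag(1.0 / evals) @ evecs.T
    w_pert = evecs @ np.diag(np.sqrt((n_ens - 1) / evals)) @ evecs.T
    w_mean = p_a @ c @ (y_obs.reshape(-1, 1) - y_mean)

    # Broadcasting adds the mean weights to every column of w_pert.
    return x_mean + x_pert @ (w_mean + w_pert)


# Demo: 11 soil moisture states, 50 members, one neutron count observation.
rng = np.random.default_rng(0)
x_b = rng.normal(0.25, 0.05, size=(11, 50))
y_b = 900.0 - 1000.0 * x_b[:1, :]   # hypothetical linear stand-in for COSMIC
x_a = letkf_update(x_b, y_b, np.array([600.0]), np.array([[600.0]]))
```

In the demo the observed count is lower than the ensemble-mean simulated count, so the analysis moves the first state toward wetter values and reduces its ensemble spread. The observation error variance is set equal to the count, following the statement in the experiment setup that the variance of the instantaneous neutron intensity equals the measured intensity.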
The LETKF method can also be extended to do parameter estimation using a state augmentation approach (Bateni and Entekhabi, 2012; Li and Ren, 2011; Moradkhani et al., 2005; Nie et al., 2011). An alternative strategy for parameter estimation is a dual approach (Moradkhani et al., 2005) with separate updating of states and parameters. Vrugt et al. (2005) also proposed a dual approach with parameter updating in an outer optimization loop using a Markov chain Monte Carlo method, and state updating in an inner loop. The a priori calibration of model parameters is also an option. With the augmentation approach, the state vector of LETKF can be augmented by a parameter vector that includes soil properties (sand fraction, clay fraction and organic matter density) and vegetation parameters (LAI, etc.). In a preliminary sensitivity study it was found that for this site simulation results were more sensitive to the LAI than to soil properties. Soil texture is also quite well known for this site from measurements. Therefore only the LAI was calibrated in this study, and only in some of the simulation scenarios. In the corresponding scenarios of land surface temperature assimilation, the LETKF state vector was augmented to include LAI as a calibration target. As a consequence, the augmented state vector contains surface temperature, ground temperature, vegetation temperature, 15 layers of soil temperature and LAI, making up a state dimension equal to 19 for the scenarios of land surface temperature assimilation without soil moisture update; for the scenarios of land surface temperature assimilation with soil moisture update, the state dimension is 29. The 10 layers of soil moisture and 15 layers of soil temperature are the standard CLM layout. The hydrology calculations are done over the top 10 layers; the bottom 5 layers are specified as bedrock and are hydrologically inactive. Temperature calculations are done over all layers (Oleson et al., 2013).
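The state dimensions quoted for the different configurations follow from simple bookkeeping, sketched below. The function and argument names are hypothetical; the layer counts and the resulting dimensions are those given in the text.

```python
N_SOIL_MOISTURE_LAYERS = 10     # hydrologically active CLM layers
N_SOIL_TEMPERATURE_LAYERS = 15  # all CLM soil layers


def state_dimension(assimilate_lst, update_soil_moisture, estimate_lai):
    """Return the length of the LETKF state vector for one grid cell."""
    if not assimilate_lst:
        # Neutron counts only: weighted soil moisture plus the layer values.
        return 1 + N_SOIL_MOISTURE_LAYERS
    # Surface, ground and vegetation temperature plus the soil temperatures.
    size = 3 + N_SOIL_TEMPERATURE_LAYERS
    if update_soil_moisture:
        size += N_SOIL_MOISTURE_LAYERS   # soil moisture augmented to the state
    if estimate_lai:
        size += 1                        # LAI augmented as a parameter
    return size


dimensions = {
    "neutron counts only": state_dimension(False, True, False),
    "LST only": state_dimension(True, False, False),
    "LST with soil moisture update": state_dimension(True, True, False),
    "LST with LAI": state_dimension(True, False, True),
    "LST with soil moisture update and LAI": state_dimension(True, True, True),
}
```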

Experiment setup
First, the 50 ensemble members of CLM with perturbed soil properties and atmospheric forcing data were run from 1 January to 31 May 2012 for the CLM spin-up; second, an additional assimilation run with cosmic-ray neutron counts was done from 1 June to 30 August 2012 to reduce the spin-up error. The final CLM states on 30 August 2012 were used as the initial states on 1 June 2012 for the data assimilation scenarios. Perturbed soil properties were generated by adding a spatially uniform perturbation sampled from a uniform distribution between −10 and 10 % to the values extracted from the Soil Database of China for Land Surface Modeling (1 km spatial resolution). The LAI was perturbed with multiplicative, uniformly distributed random noise in the range [0.8, 1.2]. The perturbations added to the model forcings are correlated in space and time. The spatial correlation was induced by a fast Fourier transform, and the temporal correlation by a first-order auto-regressive model (Han et al., 2013; Kumar et al., 2009; Reichle et al., 2010). The statistics on the perturbation of the forcing data are summarized in Table 1. The values of standard deviations and temporal correlations in Table 1 were chosen based on previous catchment-scale and regional-scale data assimilation studies (De Lannoy et al., 2012; Kumar et al., 2012; Reichle et al., 2010).
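The ensemble perturbations can be sketched for a single grid cell as follows. The sketch covers the uniform soil property and LAI perturbations and the first-order auto-regressive forcing noise; the spatial correlation by fast Fourier transform is omitted. The sand content, the standard deviation and the lag-1 correlation are hypothetical placeholders and are not the Table 1 values.

```python
import numpy as np

rng = np.random.default_rng(42)
n_ens, n_steps = 50, 24 * 7   # 50 members as in the text; one week of hours


def ar1_noise(n_members, n_times, corr, rng):
    """Unit-variance noise with lag-1 correlation `corr` for each member."""
    noise = np.empty((n_members, n_times))
    noise[:, 0] = rng.standard_normal(n_members)
    scale = np.sqrt(1.0 - corr ** 2)   # keeps the variance stationary at 1
    for t in range(1, n_times):
        noise[:, t] = (corr * noise[:, t - 1]
                       + scale * rng.standard_normal(n_members))
    return noise


# Soil property: one relative perturbation per member, uniform within +/-10 %.
sand_percent = 40.0                                        # hypothetical value
sand_ens = sand_percent * (1.0 + rng.uniform(-0.10, 0.10, n_ens))

# LAI: multiplicative uniform noise in [0.8, 1.2].
lai_factor = rng.uniform(0.8, 1.2, n_ens)

# Forcing: additive, temporally correlated air temperature error (K).
temp_std, temp_corr = 1.0, 0.95                            # hypothetical
air_temp = 290.0 + temp_std * ar1_noise(n_ens, n_steps, temp_corr, rng)
```

Scaling the innovation by the square root of one minus the squared correlation keeps the noise variance constant in time, so the prescribed standard deviation applies at every step.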
The cosmic-ray neutron intensity was assimilated every 3 days at 12:00 Z from 1 June 2012 onwards. We found that the differences between daily assimilation and 3-day assimilation were small; therefore only the results of the 3-day assimilation are shown. The measured neutron count intensity showed large temporal fluctuations that did not correspond to the temporal variations of soil moisture. Therefore the measured neutron count intensity was smoothed with the Savitzky-Golay filter using a moving window of 31 h and a polynomial of order 4 (Savitzky and Golay, 1964). The originally measured neutron counts and the smoothed neutron counts are plotted in Fig. 2. The MODIS LST products MOD11A1 and MYD11A1 were assimilated up to 4 times per day, depending on the data availability. In total, 230 observations (cosmic-ray probe neutron counts and MODIS MOD11A1 and MYD11A1 LST) fall within the whole assimilation window. The variance of the instantaneous measured neutron intensity is equal to the measured neutron count intensity and is smaller when the counts are averaged in time for daily or sub-daily applications. The instantaneous neutron intensity was assimilated in this study. The variance of MODIS LST was assumed to be 1 K (Wan and Li, 2008). The 4-day MODIS LAI product was aggregated and used as the CLM LAI parameter. Because the LAI from MODIS is usually lower than the true value (compared with the field-measured LAI in the HiWATER experiment) and because the surface flux and surface temperature are sensitive to the LAI, two additional scenarios were investigated in which LAI was calibrated to study the impact of LAI estimation on surface flux estimation within the data assimilation framework.
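The smoothing step can be sketched with a small least-squares implementation of the Savitzky-Golay filter. The window of 31 hourly values and the polynomial order of 4 are taken from the text; the treatment of the series edges (only fully covered hours are returned) and the synthetic input series are assumptions for illustration.

```python
import numpy as np


def savgol_weights(window, order):
    """Weights that return the fitted polynomial's value at the window center."""
    half = window // 2
    offsets = np.arange(-half, half + 1, dtype=float)
    vander = np.vander(offsets, order + 1, increasing=True)
    # Row 0 of the pseudo-inverse maps the window samples to the constant
    # term of the least-squares polynomial, i.e. its value at offset 0.
    return np.linalg.pinv(vander)[0]


def smooth_counts(counts, window=31, order=4):
    """Smooth an hourly series; only fully covered (interior) hours are returned."""
    weights = savgol_weights(window, order)
    series = np.asarray(counts, dtype=float)
    return np.convolve(series, weights[::-1], mode="valid")


# Synthetic hourly counts: slow trend plus noise (illustrative only).
rng = np.random.default_rng(1)
hours = np.arange(240)
raw = 600.0 - 0.2 * hours + rng.normal(0.0, 25.0, hours.size)
smoothed = smooth_counts(raw)
```

Because the filter fits a polynomial in each window, it reproduces any signal that is itself a polynomial of order 4 or lower while damping hour-to-hour noise. A plain moving average of the same width would flatten peaks more strongly.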
The following assimilation scenarios were compared:
1. CLM: open-loop simulation without assimilation.
2. Only_CRS: only the measured cosmic-ray neutron counts were assimilated.
3. Only_LST: only the MODIS LST products were assimilated. The quality control flags of LST products were used to select the data with good quality for assimilation.
4. CRS_LST: the measured neutron counts and MODIS LST products were assimilated jointly. In the above scenarios, the neutron count data were used to update the soil moisture and the LST data were used to update the ground temperature, vegetation temperature and soil temperature.
5. LST_Feedback: we also evaluated the scenario of assimilating the LST measurements to update the soil moisture profile.
6. CRS_LST_Par_LAI: the LAI was included as a variable to be calibrated; otherwise the scenario was the same as CRS_LST.
7. LST_Feedback_Par_LAI: the LAI was included as a variable to be calibrated; otherwise the scenario was the same as LST_Feedback.
8. CRS_LST_True_LAI: the in situ measured LAI during the HiWATER experiment was used in the model simulation.

Results and discussion
In order to evaluate the assimilation results for the different scenarios outlined in Sect. 3, the RMSE was used:

$$\mathrm{RMSE} = \sqrt{ \frac{1}{N} \sum_{n=1}^{N} \left( \mathrm{estimated}_n - \mathrm{measured}_n \right)^2 },$$

where "estimated" is the ensemble mean without assimilation or the ensemble mean after assimilation, and "measured" is the measured soil moisture content evaluated at the SoilNet nodes (or latent heat flux, sensible heat flux or soil heat flux). N is the number of time steps. For the soil moisture analysis in this study, N is equal to 2184. The smaller the RMSE value is, the closer the assimilation results are to the measured values. The temporal evolution of soil moisture content at 10, 20, 50 and 80 cm depth for different scenarios is plotted in Figs. 3 and 4. The RMSE values for different scenarios are summarized in Table 2. Assimilating the land surface temperature could improve the soil moisture profile estimation in the scenario of LST_Feedback_Par_LAI; the soil moisture results are better than the open-loop run at all depths. With the assimilation of CRS neutron counts, the soil moisture RMSE values at 10 and 20 cm depth (scenarios CRS_LST_Par_LAI and CRS_LST_True_LAI) CRS_LST_Par_LAI, which indicates that the main improvement for the soil moisture profile characterization is achieved by neutron count assimilation; and land surface temperature assimilation and LAI estimation play a minor role. Without assimilation of cosmic-ray probe neutron counts, the soil moisture simulation cannot be improved (scenario Only_LST). However, the scenarios of LST_Feedback and LST_Feedback_Par_LAI improve the soil moisture profile characterization, which shows that explicitly using LST to update soil moisture content in the data assimilation routine gives better results than using LST only to update soil moisture by the model equations. Results of LST_Feedback and LST_Feedback_Par_LAI are similar; therefore only results for LST_Feedback_Par_LAI are shown in Figs. 3 and 4. This implies that the improvement in soil moisture characterization due to LAI calibration is small. The results for the cosmic-ray probe neutron count assimilation showed that the cosmic-ray probe sensor can be used to improve the soil moisture profile estimation at the footprint scale. Figure 5 depicts the scatterplots of measured ET versus modeled ET for different scenarios, and the accumulated ET values for all scenarios are summarized in the lower-right corner of Fig. 5. The EC-measured ET is 384.7 mm for the assimilation period, without energy balance closure correction. The true evapotranspiration is therefore likely larger, but not much larger as the energy balance gap was limited (3.7 %). The CLM-estimated ET without data assimilation, using only precipitation as water input, is 223.7 mm and is much smaller than the measured value because applied irrigation is not considered in the model. This open-loop simulated value would imply water stress and a limitation of canopy transpiration and soil evaporation due to low soil moisture content. Assimilation of land surface temperature only (Only_LST) hardly affected the estimated ET and was not able to correct for the artificial water stress condition. However, if land surface temperature was used to update soil moisture directly, taking into account correlations between the two states in the data assimilation routine, the ET estimates improved to 336.8 and 354.8 mm for the scenarios of LST_Feedback and LST_Feedback_Par_LAI, respectively. The assimilation of MODIS land surface temperature with soil moisture update thus results in significant improvements of ET.
The different neutron count assimilation scenarios also resulted in significantly improved estimates of ET. Univariate assimilation of cosmic-ray neutron data (Only_CRS) resulted in 301.9 mm ET. This shows that the impact of neutron count assimilation to correct evapotranspiration estimates is slightly smaller than the impact of land surface temperature with soil moisture update. Joint assimilation of land surface temperature data and cosmic-ray neutron data (CRS_LST) gave a slightly larger ET of 310.6 mm than Only_CRS. The scenarios of CRS_LST_Par_LAI and CRS_LST_True_LAI gave the best ET estimates (360.5 and 349.3 mm). This shows that correcting the biased LAI estimates from MODIS by in situ data or calibration helped to improve model estimates.
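The accumulated ET values quoted above can be compared directly with the EC-measured value. The short sketch below only repeats this arithmetic; the helper name `et_deficit` and the label "Open loop" for the run without assimilation are chosen here for illustration.

```python
MEASURED_ET_MM = 384.7  # EC-measured ET, without energy balance closure correction

# Accumulated ET (mm) per scenario, as quoted in the text.
scenario_et_mm = {
    "Open loop": 223.7,
    "LST_Feedback": 336.8,
    "LST_Feedback_Par_LAI": 354.8,
    "Only_CRS": 301.9,
    "CRS_LST": 310.6,
    "CRS_LST_Par_LAI": 360.5,
    "CRS_LST_True_LAI": 349.3,
}

def et_deficit(et_mm, reference_mm=MEASURED_ET_MM):
    """Return the absolute (mm) and relative (%) shortfall against the reference."""
    deficit = reference_mm - et_mm
    return deficit, 100.0 * deficit / reference_mm

for name, et_mm in scenario_et_mm.items():
    deficit, percent = et_deficit(et_mm)
    print(f"{name:22s} {et_mm:6.1f} mm  deficit {deficit:6.1f} mm ({percent:4.1f} %)")
```

All scenarios remain below the measured ET, and the shortfall is smallest for CRS_LST_Par_LAI.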
The RMSE values of latent heat flux, sensible heat flux and soil heat flux for all scenarios are summarized in Fig. 6. It is obvious that the RMSE values are very large for both the latent heat flux (123.9 W m⁻²) and sensible heat flux ( all other scenarios where the soil moisture was not updated. When the land surface temperature was assimilated to update the soil moisture, the latent heat flux RMSE decreased to 60.5 (LST_Feedback) and 62.5 W m⁻² (LST_Feedback_Par_LAI). The scenario where soil moisture and LAI are jointly updated (LST_Feedback_Par_LAI) thus gave a slightly larger RMSE than the scenario of LST_Feedback. Again, the assimilation of neutron counts also resulted in a strong RMSE reduction for the latent heat flux (76.5 W m⁻² for Only_CRS). When in addition land surface temperature was assimilated and LAI optimized, the RMSE value of latent heat flux further decreased to 56.1 W m⁻² (70.7 W m⁻² without LAI optimization). When the field-measured LAI was used instead in the assimilation (CRS_LST_True_LAI), the RMSE was 61.0 W m⁻². These results are consistent with the ones discussed before for soil moisture characterization. Evidently, the combined assimilation of cosmic-ray probe neutron counts and land surface temperature, and calibration of LAI (or use of field-measured LAI as model input) shows the strongest improvement for the estimation of land surface fluxes. The soil heat flux did not show a clear improvement related to assimilation and showed only some improvement when LAI was calibrated. For the scenario of land surface temperature assimilation without soil moisture update (Only_LST), estimates of latent and sensible heat flux are not improved. This means that, under water stress conditions, the improved characterization of land surface temperature (and soil temperature) does not contribute to a better estimation of land surface fluxes.
The updated LAI for the scenarios of LST_Feedback_Par_LAI and CRS_LST_Par_LAI is shown in Fig. 7. The MODIS LAI product was used as input for CLM, and its time series is plotted as a blue line in Fig. 7 (Background). The LAI was also measured in the HiWATER experiment, and the measured values are shown as green stars (Observation). Ens_Mean represents the mean LAI of all ensemble members (Ensembles). It is obvious that MODIS underestimates the LAI compared with the observations. With the assimilation of land surface temperature, the LAI could be updated and be closer to the observations, but there is still a significant discrepancy between the measured LAI and the updated one. The LAI values for the scenario with LAI calibration (CRS_LST_Par_LAI) are close to the measured LAI values (CRS_LST_True_LAI), which is an encouraging result. The calibrated LAI shows some unrealistic increases and decreases during the assimilation period, which is inherent to the data assimilation approach. A smoothed representation of the LAI might provide a more realistic picture.
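One simple way to obtain such a smoothed representation is a centered moving average. The sketch below is only an illustration of this idea and was not applied in this study; the function name, the window length and the LAI values are hypothetical.

```python
import numpy as np

def moving_average(series, window):
    """Centered moving average; near the edges only the available samples are averaged."""
    series = np.asarray(series, dtype=float)
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    return np.array([
        series[max(0, i - half): i + half + 1].mean()
        for i in range(len(series))
    ])

# Hypothetical calibrated LAI values with step-to-step jumps (m2 m-2).
lai = [2.0, 2.6, 2.1, 3.0, 2.7, 3.4, 3.1]
lai_smooth = moving_average(lai, window=3)
print(lai_smooth)
```

A longer window removes more of the step-to-step jumps but also flattens real seasonal changes in LAI, so the window length would need to be chosen with the crop growth rate in mind.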
This study illustrates that, for an irrigated farmland, the measured cosmic-ray probe neutron counts can be used to improve the soil moisture profile estimation significantly. Without irrigation data, CLM underestimated soil moisture content. The cosmic-ray neutron count data assimilation can be used as an alternative way to retrieve the soil moisture content profile in CLM. The improved soil moisture simulation was helpful for the characterization of the land surface fluxes. The univariate assimilation of land surface temperature without soil moisture update is not helpful for the estimation of land surface fluxes and even worsened the sensible heat flux characterization (Fig. 6). However, in a multivariate data assimilation framework where land surface temperature was assimilated together with measured cosmic-ray probe neutron counts, the land surface temperature assimilation contributed significantly to an improved ET estimation. The simulated canopy transpiration in CLM was in general too low, even when the water stress condition was corrected by assimilating neutron counts, which was related to small values of the LAI. The additional estimation of LAI through the land surface temperature assimilation resulted in an increase of the LAI, yielding an increase of estimated ET.
In general, land surface models need to be calibrated before use in land data assimilation, especially if there is an apparent large bias in the model simulation (Dee, 2005). The simulation of soil moisture and surface fluxes was biased in our study, mainly due to the lack of irrigation water as input. This bias cannot be corrected a priori without exact irrigation data, which are not available in the field. The data assimilation was proven to be an efficient way to remove the model bias in this case. We also calculated the equivalent water depth to analyze the equivalent irrigated water after each step of soil moisture update. For the scenarios of CRS_LST_Par_LAI and CRS_LST_True_LAI, the equivalent irrigation in 3 months was 693.6 and 607.6 mm, respectively. Because the irrigation method is flood irrigation, it is not easy to evaluate the true irrigation applied in the field. From the results we see, however, that the applied irrigation (in the model) is much larger than actual ET (∼ 600 to 700 mm vs. ∼ 400 mm). This could indicate that the amount of applied irrigation in the model is too large, but irrigation by flooding is also inefficient and results in excess runoff and infiltration to the groundwater, because it cannot be controlled as well as sprinkler irrigation or drip irrigation. Therefore, the calculated amount of irrigation could be realistic but might also be too large if soil properties are erroneous in the model.
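The equivalent water depth of a single soil moisture update can be sketched as the change in volumetric water content multiplied by the layer thickness, summed over the soil layers. This is an assumed formulation for illustration; the exact bookkeeping used in this study (for example, how negative increments were treated) is not specified here, and the layer thicknesses and moisture values below are hypothetical.

```python
import numpy as np

def equivalent_water_depth(theta_before, theta_after, layer_thickness_mm):
    """Water depth (mm) added to the profile by one soil moisture update."""
    d_theta = np.asarray(theta_after, dtype=float) - np.asarray(theta_before, dtype=float)
    return float(np.sum(d_theta * np.asarray(layer_thickness_mm, dtype=float)))

# Hypothetical three-layer profile (not the CLM layer structure).
thickness_mm = [100.0, 200.0, 300.0]
theta_before = [0.15, 0.18, 0.20]   # volumetric water content before the update
theta_after = [0.25, 0.22, 0.21]    # volumetric water content after the update

depth_mm = equivalent_water_depth(theta_before, theta_after, thickness_mm)
print(f"equivalent water depth: {depth_mm:.1f} mm")
```

Summing such depths over all update steps of the assimilation period gives a seasonal total of the kind reported above.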
The soil moisture content measured by the cosmic-ray probe represents a depth between 12 cm (very humid case) and 76 cm (extremely dry case), depending on the amount of soil water (soil moisture content and lattice water). Therefore the effective sensor depth of the cosmic-ray probe will change over time. In order to model the variable sensor depth and the relationship between the soil moisture content and neutron counts, the newly developed COSMIC model was used as the observation operator in this study. Additionally, the influences of air pressure, atmospheric vapor pressure and incoming neutron counts were removed from the originally measured neutron counts. Because there is still some water in the crop which also affects the cosmic-ray probe sensor, the COSMIC observation operator could be improved to include vegetation effects. Several default parameters proposed by Shuttleworth et al. (2013) were used in the COSMIC model, and these parameters probably need further calibration following the development of the COSMIC model.
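The removal of pressure, humidity and incoming-flux effects from raw neutron counts can be sketched with multiplicative correction factors of the form commonly used in cosmic-ray soil moisture sensing. The exact correction procedure of this study is not given here, so the functional forms, the function name and the default coefficients (attenuation length and humidity coefficient) are assumptions for illustration only.

```python
import math

def corrected_neutron_count(raw_count, pressure_hpa, abs_humidity_g_m3,
                            incoming_intensity, ref_pressure_hpa,
                            ref_abs_humidity_g_m3, ref_incoming_intensity,
                            attenuation_length_hpa=130.0,   # assumed value
                            humidity_coeff=0.0054):         # assumed value
    """Scale a raw neutron count to reference atmospheric conditions."""
    # Higher air pressure shields more neutrons, so counts are scaled up.
    f_pressure = math.exp((pressure_hpa - ref_pressure_hpa) / attenuation_length_hpa)
    # More atmospheric water vapor moderates more neutrons, so counts are scaled up.
    f_humidity = 1.0 + humidity_coeff * (abs_humidity_g_m3 - ref_abs_humidity_g_m3)
    # A stronger incoming flux than the reference inflates counts, so they are scaled down.
    f_incoming = ref_incoming_intensity / incoming_intensity
    return raw_count * f_pressure * f_humidity * f_incoming

# Hypothetical hourly reading and reference conditions.
n_corr = corrected_neutron_count(
    raw_count=1500.0, pressure_hpa=850.0, abs_humidity_g_m3=8.0,
    incoming_intensity=100.0, ref_pressure_hpa=845.0,
    ref_abs_humidity_g_m3=5.0, ref_incoming_intensity=100.0,
)
print(f"corrected count: {n_corr:.1f}")
```

At reference conditions all three factors equal one and the raw count is returned unchanged, which makes the function easy to check.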
The spatial distribution of soil moisture for the study area was very heterogeneous due to the small farmland patches and different irrigation periods for the different farmlands. Therefore the soil moisture content inferred by SoilNet may not represent the true soil moisture content of the cosmic-ray probe footprint, which is a further limitation of this study. Although the Cosmic-ray Soil Moisture Observing System (COSMOS) has been designed as a continental-scale network by installing 500 COSMOS probes across the USA, there are still some disadvantages of COSMOS compared with remote sensing. COSMOS is also expensive for extensive deployment to measure continental/regional-scale soil moisture.

Summary and conclusions
In this paper, we studied the univariate assimilation of MODIS land surface temperature products, the univariate assimilation of measured neutron counts by the cosmic-ray probe, the bivariate assimilation of land surface temperature and neutron count data, and the additional calibration of LAI for an irrigated farmland at the Heihe Catchment in China, where data on the amount of applied irrigation were lacking. The most important objective of this study was to test whether data assimilation is able to correct for the absence of information on water resources management as model input, a situation commonly encountered in large-scale land surface modeling. For the specific case of lacking irrigation data, no prior bias correction is possible. The bias-blind assimilation without explicit bias estimation was used. We focused on the model bias introduced by the forcing data and the LAI, and neglected the other sources of bias. When LAI was calibrated, this was done at each data assimilation step of land surface temperature. The data assimilation experiments were carried out with the CLM, and the data assimilation algorithm used was the LETKF. A likely further model bias, besides missing information on irrigation, is the underestimation of LAI by MODIS, which was used to force the model.
The results show that the direct assimilation of measured cosmic-ray neutron counts improves the estimation of soil moisture significantly, whereas univariate assimilation of land surface temperature without soil moisture update does not improve soil moisture estimation. However, if the land surface temperature was assimilated to update the soil moisture profile directly with the help of the state augmentation method, the evapotranspiration and soil moisture could be improved significantly. This result suggests that the land surface temperature remote sensing products are needed to correct the characterization of the soil moisture profile and the evapotranspiration. The improved soil moisture estimation after the assimilation of neutron counts resulted in a better ET estimation during the irrigation season, correcting the too-low ET of the open-loop simulation. The joint assimilation of neutron counts and MODIS land surface temperature improved the ET estimation further compared to neutron count assimilation only. The best ET estimation was obtained for the joint assimilation of cosmic-ray neutron counts and MODIS land surface temperature, including calibration of the LAI (or if field-measured LAI was used as input). This shows that bias due to neglected information on water resources management can be corrected by data assimilation if a combination of soil moisture and land surface temperature data is available.
We can conclude that data assimilation of neutron counts and land surface temperature is useful for ET and soil moisture estimation of an irrigated farmland, even if irrigation data are not available and are excluded from model input. The land surface temperature measurements are an alternative data source to improve the soil moisture and land surface flux estimation under water stress conditions. This shows the potential of data assimilation to also correct a systematic model bias. LAI optimization further improves simulation results, which is also likely related to a systematic underestimation of LAI by the MODIS remote sensing product. The results of using the calibrated LAI are comparable to the results of using field-measured LAI as model input.