Land surface modeling over the Dry Chaco: the impact of model structures, and soil, vegetation and land cover parameters

In this study, we tested the impact of a revised set of soil, vegetation and land cover parameters on the performance of three different state-of-the-art land surface models (LSMs) within the NASA Land Information System (LIS). The impact of this revision was tested over the South American Dry Chaco, an ecoregion characterized by deforestation and forest degradation since the 1980s. Most large-scale LSMs may lack the ability to correctly represent the ongoing deforestation processes in this region, because most LSMs use climatological vegetation indices and static land cover information. The default LIS parameters were revised with (i) improved soil parameters, (ii) satellite-based interannually varying vegetation indices (leaf area index and green vegetation fraction) instead of climatological vegetation indices, and (iii) yearly land cover information instead of static land cover. A relative comparison in terms of water budget components and “efficiency space” for various baseline and revised experiments showed that large regional and long-term differences in the simulated water budget partitioning relate to different LSM structures, whereas smaller local differences resulted from updated soil, vegetation and land cover parameters. Furthermore, the different LSM structures redistributed water differently in response to these parameter updates. A time-series comparison of the simulations to independent satellite-based estimates of evapotranspiration and brightness temperature (Tb) showed that no LSM setup significantly outperformed another for the entire region and that not all LSM simulations improved with updated parameter values. However, the revised soil parameters generally reduced the bias between simulated surface soil moisture and pixel-scale in situ observations and the bias between simulated Tb and regional Soil Moisture Ocean Salinity (SMOS) observations. Our results suggest that the different hydrological responses of various LSMs to vegetation changes may need further attention to gain benefits from vegetation data assimilation.


Abstract.
In this study, we tested the impact of a revised set of soil, vegetation and land cover parameters on the performance of three different state-of-the-art land surface models (LSMs) within the NASA Land Information System (LIS). The impact of this revision was tested over the South American Dry Chaco, an ecoregion characterized by deforestation and forest degradation since the 1980s. Most large-scale LSMs may lack the ability to correctly represent the ongoing deforestation processes in this region, because most LSMs use climatological vegetation indices and static land cover information. The default LIS parameters were revised with (i) improved soil parameters, (ii) satellite-based interannually varying vegetation indices (leaf area index and green vegetation fraction) instead of climatological vegetation indices, and (iii) yearly land cover information instead of static land cover. A relative comparison in terms of water budget components and "efficiency space" for various baseline and revised experiments showed that large regional and long-term differences in the simulated water budget partitioning relate to different LSM structures, whereas smaller local differences resulted from updated soil, vegetation and land cover parameters. Furthermore, the different LSM structures redistributed water differently in response to these parameter updates. A time-series comparison of the simulations to independent satellite-based estimates of evapotranspiration and brightness temperature (T b ) showed that no LSM setup significantly outperformed another for the entire region and that not all LSM simulations improved with updated parameter values. However, the revised soil parameters generally reduced the bias between simulated surface soil moisture and pixel-scale in situ observations and the bias between simulated T b and regional Soil Moisture Ocean Salinity (SMOS) observations. Our results suggest that the different hydrological responses of various LSMs to vegetation changes may need further attention to gain benefits from vegetation data assimilation.

Introduction
Land surface models (LSMs) aim at providing a complete and self-consistent description of the temporal and spatial distribution of water and energy over land (Clark et al., 2015). The output from LSMs is used for many applications, such as the monitoring of water resources, floods and droughts, and their impact on natural hazards, biomass production, ecology or soil salinity. In many cases, the LSM performance is improved by including remotely sensed observations through (i) the dynamic integration of observations into LSMs via data assimilation, (ii) the mapping of model input parameters to characterize the representation of land properties within the model (e.g., soil properties, land cover) and (iii) the validation and development of LSMs. In addition, contrasting model output with remote sensing is a powerful method to identify unmodeled processes in a LSM, such as irrigation Brocca et al., 2018) or groundwater withdrawal . Furthermore, LSMs are an essential part of weather forecast systems and of climate models that simulate past, present and future climate (Pitman, 2003;Clark et al., 2015). They also offer ancillary information to decompose, interpolate and extrapolate sparse ground measurements and remote sensing data. However, the degree to which LSMs can serve these various purposes depends on how well their given structure, forcing data and parameters can represent regional land surface processes (Wood et al., 2011;Clark et al., 2015). This study tests the impact of a revised set of soil, vegetation and land cover parameters on the performance of different LSMs.
Most LSMs use climatological or time-invariant parameters related to vegetation, land cover and soil properties and thereby assume stationary land processes; i.e., given similar meteorological input, the statistical distribution of the land surface variables would by design not change in time. These parameters can be properly calibrated for small-scale applications when suitable historical local data are available. However, for large-scale applications, it is common practice to provide the best possible, often satellite-based, large-scale input datasets to existing modeling systems (Jiang et al., 2010).
Satellite-based green vegetation fraction (GVF) and leaf area index (LAI) are example input datasets that directly or indirectly provide vegetation parameters (also referred to as vegetation indices) to represent the horizontal and vertical density of plant vegetation (Gutman and Ignatov, 1998), used for the calculation of transpiration, interception and radiative shading. Large-scale LSMs without dynamic vegetation modeling are strongly limited by the assumption that vegetation has a recurring annual cycle, i.e., using climatological LAI and GVF input. In reality, the vegetation's response to meteorological and climate conditions varies due to interand intra-annual weather and climate anomalies (Case et al., 2013).
The current abundance of satellite-based vegetation datasets allows us to constrain LSMs and account for unmodeled processes in order to better understand the impact of vegetation changes on the water budget components. High-quality and long-term vegetation products from various remote-sensing platforms (Tucker et al., 2005;Liang et al., 2013;Zhu et al., 2013) can provide temporally varying parametric input to LSMs (Boussetta et al., 2015). In addition, they can also be assimilated for state updating in LSMs with dynamic vegetation simulation (Sabater et al., 2008;Barbu et al., 2011Barbu et al., , 2014Albergel et al., 2017;Kumar et al., 2019). Earlier studies indicated that replacing climatological vegetation by interannually varying satellite-derived indices can improve modeled energy fluxes as well as surface temperature and moisture in both offline LSM simulations (Miller et al., 2006;Case et al., 2013;Yin et al., 2016) and atmosphere-coupled LSMs (Crawford et al., 2001;James et al., 2009;Boussetta et al., 2013;Ge et al., 2014;Kumar et al., 2014). The largest improvements are obtained during extreme meteorological anomalies (Case et al., 2013). In this study, it is expected that, besides meteorological anomalies, large-scale land cover conversions, such as deforestation, also alter the vegetation strongly from its climatological representation. Therefore, it is tested whether the use of satellite-derived vegetation indices in LSMs is also gainful in regions characterized by land cover changes.
Besides temporally varying vegetation indices, a temporally varying description of land cover is also required over regions with major land cover change. In LSMs, land cover is represented by the use of plant functional types (Kumar et al., 2006;Peters-Lidard et al., 2007). These are groups of plant species that share similar structural, phenological, and physiological traits (Bonan et al., 2002a). These features are integrated into several model-specific surface parameters for each land cover type, summarized in lookup tables (Dickinson, 1995). The sensitivity of LSMs to plant functional types or land-cover-related parameters has been illustrated in both offline  and atmosphere-coupled LSMs (Pitman et al., 2009;Grossman-Clarke et al., 2010;Cao et al., 2015;Ruiz-Vasquez et al., 2020). Most of these studies solely focused on changes in model-specific surface parameters without taking into account changes in LAI or GVF. In contrast, this study aims at implementing large-scale land cover changes in LSMs by feeding them with both temporally varying vegetation indices and land cover parameters.
Soil properties also form a suite of crucial input parameters in LSMs (Dai et al., 2019). Most LSMs derive soil hydraulic properties (SHPs) from lookup tables or pedotransfer functions, using soil texture information. Different LSMs have different soil parameterizations and use model-specific, often historically tuned, SHPs. Dai et al. (2019) stated that popular soil datasets currently used in LSMs are often outdated or have limited accuracy. Furthermore, the derivation of SHPs from soil texture is highly uncertain (Wösten et al., 2001). At the global scale, there are only a few generally accepted global soil maps, such as the Food and Agriculture Organization (FAO) Soil Map of the World (FAO, 1971), the Harmonized World Soil Database (HWSD) product (FAO and ISRIC, 2012) and SoilGrids (Hengl et al., 2017). As shown by De Lannoy et al. (2014), the implementation of more accurate soil texture and related SHPs can lead to reduced bias and error estimates of soil moisture when compared to in situ surface soil moisture and even impact simulated runoff and evaporation estimates. During the last decade, several operational institutes have improved their soil parameters to enhance global land surface simulations (Balsamo et al., 2009;De Lannoy et al., 2014;Chadburn et al., 2015).
In this study, soil, vegetation and land cover parameters are updated in three LSMs. The study domain is the South American Dry Chaco. The region covers parts of Argentina, Bolivia, and Paraguay (Vallejos et al., 2015) and has been characterized by deforestation since the 1980s, now being one of the largest deforestation hotspots in the world (Hansen et al., 2013). These large-scale land cover conversions impact the local hydrology of the region. Under natural circumstances, deep percolation of water towards groundwater is low or even absent due to intensive evapotranspiration of the original dry forest vegetation (Giménez et al., 2016;Jobbágy et al., 2020). After deforestation, Giménez et al. (2016), Magliano et al. (2017), Marchesini et al. (2017) and Nosetto et al. (2012) all observed increases in soil moisture and deep drainage, resulting in a rise in the groundwater table. If this trend of a  rising water table continues, this may result in salt accumulation close to the soil surface and eventually result in reduced  plant growth and soil degradation (Giménez et al., 2016).
Given the Dry Chaco's recent and large-scale deforestation history, it is a unique area to test the impact of temporally evolving vegetation and land cover parameters. It is expected that by feeding LSMs with time-varying vegetation and land cover, together with an updated set of soil parameters, the most accurate spatial and temporal representation of the Chaco's water distribution could be obtained. Additionally, it is hypothesized that the use of similar soil, vegetation and land cover parameters in various LSMs would result in similar accurate estimates of the long-term simulated water budgets. Lastly, it is hypothesized that soil and vegetation parameter updates would contribute differently towards the model performance improvement.
The general objectives of this study are (i) to evaluate the simulated water budget components over the Dry Chaco using three different LSMs within the NASA Land Information System (LIS), (ii) to quantify how the simulated water budget components respond when more accurate soil texture and related SHPs are implemented or when the static climatological vegetation (LAI and GVF) indices and land cover map are replaced with interannually varying satellite-based indices and yearly updated land cover maps, and (iii) to identify the remaining deviations in the modeled hydrology compared to different satellite-based and in situ observations of water budget components. Besides these general objectives, the simulated water budget components are further framed within the hydrological context of the Dry Chaco; i.e., it is verified whether the different LSMs simulate the increased deep percolation and higher soil moisture values after deforestation, similarly to the field-based findings of Nosetto et al. (2012), Giménez et al. (2016), Magliano et al. (2017) and Marchesini et al. (2017).
The impact of the revised set of soil parameters and updated vegetation and land cover treatment is analyzed incrementally. In a first phase, models were run with their default model-specific soil parameters, climatological vegetation (LAI and GVF) and static land cover. Next, the models were supplied with more accurate soil texture and related SHPs, and their impact on the simulated water budgets was quantified. In a third phase, the ongoing land cover changes were implemented using interannually varying satellite-based indices and yearly updated land cover maps, and it was analyzed how the major land cover changes alter the hydrological balance. Lastly, the impact of the various model structures, soil texture and dynamic vegetation input was assessed using the concept of "efficiency space", and the performance of each set of experiments was evaluated against independent satellite-based estimates of evapotranspiration, brightness temperature and in situ soil moisture.
2 Study area, models and datasets

Study area
The Dry Chaco is a relatively flat plain covering parts of northwestern Argentina, western Paraguay and southeastern Bolivia and has an area of approximately 787 000 km 2 . The ecoregion has a semi-arid climate with a north-south gradient in the mean annual temperature (from 24 to 19 • C) and annual rainfall (from 898 to 712 mm yr −1 ) . Minetti et al. (1999) report precipitation values up to 1000 mm yr −1 in the eastern and western parts of the region and 400 mm yr −1 in the central Dry Chaco. Soils in the Dry Chaco are the result of alternating aeolian and alluvial deposits whereby loess prevails. Field dunes and paleochannels with coarse sediments are common in the western Dry Chaco (Marchesini et al., 2017). The Dry Chaco hosts the largest dry forest in the world and, historically, land use in the region was limited to extensive cattle ranching and semiindustrial or manual logging for timber and charcoal (Clark et al., 2010). Since the 1980s, the region has been characterized by large-scale deforestation for soybean production and intensive cattle ranching (Vallejos et al., 2015). The region had the world's highest rate of subtropical forest loss between 2000 and 2012, and already 20 % of the original dry forest in the region has been lost (Hansen et al., 2013;Vallejos et al., 2015), with a transformation of 158 000 km 2 between 1976 and 2012. Marchesini et al. (2017) mentioned agricultural and technological evolution together with growing international demand for grain as some of the causes of the agricultural expansion. In addition, gradual changes in forest structure, biomass and functioning are observed due to forest logging and forest grazing. Figure 1 shows the location of the Dry Chaco, together with the spatial and temporal extent of land cover changes for the period 1992-2015 derived from the European Space Agency Climate Change Initiative (ESA-CCI) land cover product upscaled to a 0.125 • resolution (see Sect. 2.3.4). The derived spatiotemporal pattern of deforestation agrees well with the 30 to 60 m resolution deforestation product of Vallejos et al. (2015) (not shown). The deforested area of the Dry Chaco in the overlapping period (1992-2013) of both products is 23 % for the 0.125 • ESA-CCI data and only 15 % for the finer-scaled dataset of Vallejos et al. (2015). The main reasons for this discrepancy are the different spatial resolutions of both products.

Models
Three LSMs within the NASA LIS (Kumar et al., 2008) were selected to simulate land surface states and fluxes over the Dry Chaco: the Community Land Model version 2 (CLM2.0) (Bonan et al., 2002b;Oleson et al., 2004), Catchment LSM-Fortuna 2.5 (CLSM-F2.5)  and Noah LSM version 3.6 (Ek et al., 2003), hereafter simply referred to as CLM, CLSM and NOAH, respectively. More recent versions of these models are available and might provide better simulations over the Dry Chaco but are not yet implemented in the LIS. This study relied on the LIS to facilitate a consistent parameter revision across multiple LSMs, and one of the main goals was to see the impact of various LSM structures, regardless of their version. To demonstrate the impact of the updated soil treatment and dynamically evolving vegetation and land cover, a baseline simulation for each LSM was conducted and followed by various revised experimental runs. For all simulations, the spatial resolution was 0.125 • , and the output was created daily (model integration timestep of 15 min). The LSMs were spun up for 10 years from 1 January 1982 through 31 December 1991 using the land cover of 1992. The subsequent simulations from 1 January 1992 through 31 December 2015 were used for further analysis. The meteorological forcing data (precipitation, temperature, specific humidity, radiation, wind and surface pressure) were extracted from the Modern-Era Retrospective analysis for Research and Applications (MERRA-2) product (Gelaro et al., 2017) including gauge-based precipitation corrections .
By default, the LIS LSMs use climatological LAI data based on a 4-year average derived from the Moderate Resolution Imaging Spectroradiometer (MODIS) Collection 4 LAI product (Kaufmann et al., 2000) and climatological global GVF data (Gutman and Ignatov, 1998) derived from 5 years of normalized difference vegetation index (NDVI) data from the Advanced Very-High-Resolution Radiometer (AVHRR) (Miller et al., 2006). The default land cover map is the University of Maryland (UMD) global land cover product (Hansen et al., 2000) based on AVHRR data from 1 April 1992 through 31 March 1993. The default soil properties in the LIS are derived from the FAO Soil Map of the World (FAO, 1971). In this study, vegetation, land cover and soil input data were revised using the Global Land Surface Satellite (GLASS) LAI (Liang et al., 2013;Xiao et al., 2016), the Global Inventory Modelling and Mapping Studies (GIMMS) NDVI (Tucker et al., 2005;Pinzon and Tucker, 2014), the ESA-CCI land cover product (Kirches et al., 2014;ESA, 2017) and soil properties derived from the HWSD v1.2 (FAO and ISRIC, 2012;De Lannoy et al., 2014).

GLASS LAI
The GLASS LAI product is a global spatiotemporally complete dataset, based on MODIS and AVHRR reflectance time-series data (Liang et al., 2013). This product is available for the period 1981-2015, with a temporal resolution of 8 d and a spatial resolution of 0.05 • (Liang et al., 2013). Cloud-contaminated data are filled in using an optimum interpolation algorithm (Xiao et al., 2016). According to Liang et al. (2013) and Xiao et al. (2016), the GLASS LAI features more realistic and smoother seasonal variations than the MODIS LAI product (MOD15) (Knyazikhin et al., 1998) and the first version of the Geoland2 (GEOV1) LAI product (Baret et al., 2013). For the baseline simulations, a climatological LAI dataset was created using 24 years of GLASS data  to replace the default 4-year AVHRR climatology. This allowed us to solely display the effect of interannual and short-term vegetation variations in the revised experiments with time-varying GLASS data and land cover changes. The LAI observations were upscaled to the 0.125 • resolution by spatial averaging.

GIMMS NDVI
The GIMMS NDVI product was assembled from AVHRR NDVI data (Tucker et al., 2005). The GIMMS dataset covers the period 1982-2015 and has a 15 d temporal resolution. The spatial resolution is 0.0833 • . The maps are 15 d max-imum value composites, and cloud-contaminated pixels are replaced by NDVI values derived from either spline interpolation or average season profiles (Pinzon and Tucker, 2014). In this study, the GIMMS GVF was derived from its NDVI and used as input for the LSMs. According to Gutman and Ignatov (1998), the GVF i,j (-) is given by the following equation: where NDVI i,j is the NDVI value observed at time i and pixel j and NDVI min and NDVI max are NDVI values over barren vegetation classes and fully covered vegetation classes, respectively. For simplicity, we chose to use the values proposed by Gutman and Ignatov (1998), i.e., 0.04 for NDVI min and 0.52 for NDVI max , and the GVF is restricted to the 0-1 interval. Multiple other approaches exist to derive GVF from NDVI, each with their own advantages and limitations (Jiang et al., 2010). For the baseline simulations, a climatological GVF dataset was created using 24 years of GIMMS GVF data . For the revised experiments with inclusion of land cover changes, the time-varying GIMMS data were used. To match the model's 0.125 • resolution, NDVI values were upscaled by spatial averaging.

ESA-CCI land cover
The ESA-CCI land cover product offers yearly varying information on 37 land cover classes from 1992 through 2015 at a spatial resolution of 300 m (Kirches et al., 2014;ESA, 2017). This long-term land cover time series was achieved by combining surface reflectance data of different observation systems (Medium Resolution Imaging Spectrometer (MERIS), AVHRR, Satellite Pour l'Observation de la Terre (SPOT) Vegetation and PROBA-V) (Bontemps et al., 2012;Kirches et al., 2014). In this study, the ESA-CCI land cover maps were reclassified into 13 UMD classes (see Appendix A) and upscaled to a 0.125 • resolution assigning the most dominant land cover to each pixel. Figure 1 shows how deforestation in the Dry Chaco has progressed in both time and space, based on this ESA-CCI land cover information. A static water mask file was created based on pixels that were classified as water at least once during the period 1992-2015. For CLSM, the UMD land cover classes were further combined into modelspecific classes.

HWSD soil texture and SHPs
The global 1 km soil texture data were taken from the Soil and Terrain database for Latin America and the Caribbean as part of the HWSD v1.21 and were translated to 0.125 • dominant soil texture. Similarly to in De Lannoy et al. (2014), the soil texture and organic matter data were used to estimate SHPs using the pedotransfer functions of Wösten et al. (1999). The derived SHPs include the soil porosity, bulk density, Clapp-Hornberger parameter B, saturated hydraulic conductivity, saturated matric potential, soil wilting point and field capacity (see Sect. 3.2.1). These HWSD-based soil parameters are also used in the operational Soil Moisture Active Passive (SMAP) Level-4 soil moisture product .

Evaluation data
Precipitation input and surface soil moisture and evapotranspiration output were evaluated against in situ measurements and satellite-based estimates. Furthermore, an integrated evaluation of surface soil moisture, surface temperature and LAI was performed through the comparison of diagnosed L-band brightness temperature (T b ) simulations against satellite-observed T b obtained from Soil Moisture and Ocean Salinity (SMOS).

Precipitation
The quality of the MERRA-2 precipitation over the Argentinean Chaco was evaluated against in situ data obtained from the Instituto Nacional de Tecnología Agropecuaria (INTA, 2020). Two sets of daily precipitation data were downloaded: data from meteorological stations covering the period 1992-2015 (10 stations) and data covering the period 2010-2015 (8 stations). The exact locations of these stations are shown in Fig. 1. As this study mainly focused on long-term simulations, the precipitation evaluation was conducted using monthly averaged data.

Monte Buey soil moisture
Pixel-scale in situ surface soil moisture was obtained from 17 nodes of the Monte Buey soil moisture network. The network is located in fields surrounding the town of Monte Buey, Cordoba, Argentina, just outside the Dry Chaco ( Fig. 1), and covers three 0.125 • model pixels. At each node, the surface soil moisture is measured using a Hydra Probe II buried at a depth of 5 cm. The network adopts the highest quality standards (Thibeault et al., 2015) and serves as a calibration and validation site for various satellite missions, e.g., the SAtélite de Observación COn Microondas and SMAP. To allow for comparison with the daily modeled surface soil moisture, averaged over three 0.125 • model pixels, the hourly in situ measurements were averaged to daily values across the 17 nodes. The period of validation was from 1 January 2013 through 31 December 2015.

Evapotranspiration
The modeled evapotranspiration components were evaluated against Global Land Evaporation Amsterdam Model (GLEAM) data. The GLEAM consists of a set of algorithms that are driven by satellite-based observations and globally estimate all the daily evapotranspiration com-ponents at 0.25 • spatial resolution (Miralles et al., 2011;Martens et al., 2017). Central in the GLEAM is the use of the modified Priestley and Taylor (1972) equation. Daily LIS simulations of latent heat flux were upscaled to 0.25 • to allow for comparison to GLEAM data for the entire simulation period, i.e., 1992-2015.

L-band microwave brightness temperature
The satellite-observed L-band T b data were extracted from the SMOS SCLF1C product, version 620, projected onto the 36 km Equal-Area Scalable Earth Grid, version 2 (EASEv2), angular fitted and quality controlled as in De Lannoy et al. (2015), e.g., excluding areas close to open water, urban areas, and areas with high radiofrequency interference or with a poor angular fit. Both the horizontally (H ) and vertically (V ) polarized data at a 40 • incidence angle were used for evaluation of the daily simulations (see Sect. 3.5) from 1 January 2011 through 31 December 2015, using both ascending (06:00 LT) and descending (18:00 LT) half-orbits.
3 Methodology Table 1 summarizes the different experiments that were performed for each of the three selected LSMs. The experiments include a baseline (BL) simulation and two revised simulations, one with updated soil parameters (REV S ) and one with both updated soil, vegetation and land cover parameters (REV SV ) and two sensitivity simulations (SENS V , SENS LC ), which are explained in Sect. 3.2.3.

Baseline simulations
The set of baseline (BL in Table 1) simulations used the default FAO surface soil texture data and the related default model-specific SHPs, assuming a vertically homogeneous soil profile for CLM and NOAH. By design, CLSM uses the 0-30 cm soil texture to compute parameters related to surface water transport, whereas the 0-100 cm soil texture is used for computation of all other parameters De Lannoy et al., 2014). The vegetation input for the BL simulations consisted of monthly climatological vegetation datasets newly created based on GLASS LAI and GIMMS GVF, together with the ESA-CCI 1992 land cover map. This specific setup with climatological vegetation and static land cover does not account for major interannual changes in vegetation, e.g., due to deforestation. Note that CLM only requires LAI data, and GVF is not an input parameter.

Revised soil parameters
In a first set of revised experiments, we replaced the modelspecific SHPs (derived from FAO surface texture) for each LSM with the "topsoil" SHPs associated with a vertically homogeneous 0-100 cm HWSD soil texture and organic matter similar to in De Lannoy et al. (2014). Corresponding to the default structure of CLSM, only the 0-30 cm HWSD soil texture is used to obtain model-specific parameters for the surface soil water transport. In these experiments, only the soil parameters were revised (REV S in Table 1), whereas the climatological vegetation and land cover information was the same as in the BL simulations. Our hypothesis was that by feeding the different LSMs with similar SHPs, differences between model output could be reduced.
It is important to note that the different LSMs do not use the same subsets of SHPs. By default, CLM uses pedotransfer functions (Cosby et al., 1984) to relate sand and clay fractions to SHPs. The used SHPs in CLM are the saturated conductivity (K sat ), porosity (θ sat ), Clapp-Hornberger parameter B, and saturated matric potential (ψ sat ) (Han et al., 2014). The soil profile is divided into 10 layers, and K sat decreases exponentially with depth throughout those layers.
CLSM's soil parameterization differs fundamentally from the other two LSMs. It uses several model parameters and moisture deficit model prognostic variables to dynamically partition the computational pixel into three distinctly different moisture regimes (saturated, transpiring and wilting regimes) to account for spatial variability of soil moisture within that pixel. CLSM's soil parameterization uses the spatial distribution of topographic indices and SHPs to derive model parameters . For baseflow generation only, CLSM assumes an exponential decay of K sat with depth, with the decay factor ν = 2.17 in the BL simulations and ν = 1 in the REV S simulations, whereas for the vertical moisture transport, CLSM assumes K sat to be vertically homogenous. Other important SHPs in CLSM are ψ sat , the parameter B and soil wetness at wilting point θ wilt .
The soil parameters in NOAH are discussed by Kishné et al. (2017), and the default SHPs are calculated from texture-based lookup tables (Cosby et al., 1984). The SHPs include the parameter B, θ sat , ψ sat , and K sat . The soil profile is divided into four layers but with a constant K sat value over depth (Ek et al., 2003). Other parameters include the soil water content threshold for ceasing evaporation from the top soil layer (DRYSMC), the saturated hydraulic diffusivity (SATDW), the soil water content at field capacity (REF-SMC), soil wetness at wilting point θ wilt , parameter for soil thermal diffusivity (F11) and quartz content (QTZ). Note that DRYSMC, SATDW, F11 and QTZ are not included in the SHPs of De Lannoy et al. (2014), and their values for the revised SHPs were derived from lookup tables based on the HWSD soil texture data.  The spatial boxplots in Fig. 2a-c show three SHPs used in the BL simulations of each LSM and their updated values used for the REV S and REV SV simulations over the Dry Chaco. The mean values for the updated B parameter, θ sat and ψ sat are smaller than for BL but have larger spatial variances. The slightly higher mean and standard deviation in the associated HWSD 0-100 cm sand fraction map (REV S : 43±25 %, Fig. 2e), compared to that of the FAO surface sand fraction map (BL: 41 ± 15 %, Fig. 2d), explain the change in these updated parameter values.

Revised vegetation and land cover parameters
To analyze the response of the Chaco's hydrology to deforestation or vegetation changes in general, time-varying vegetation observations from GLASS and GIMMS were incorporated into the LSMs together with yearly updated ESA-CCI land cover maps in a second set of revised experiments (REV SV in Table 1). This specific setup includes the effect of vegetation changes (such as deforestation) compared to the use of climatological vegetation parameters and static land cover in the BL simulations. As described in Sect. 2.3.2 and 2.3.3, the GLASS LAI and GIMMS NDVI were available every 8 and 15 d, respectively. However, the data were linearly interpolated to obtain daily parameter updates in the LIS. The ESA-CCI land cover maps were updated each year on 1 January. LSMs only use land cover information to define model-specific parameters associated with each land cover type; i.e., ESA-CCI land cover is used as a predictor of parameters such as rooting depth, stomatal conductance and surface roughness. The exact values and implementation of each parameter are model-dependent.
Note that albedo calculations in each model were kept unchanged, as the focus of this study is primarily on the water balance. In CLM, the albedo for vegetated areas is calculated based on soil properties, LAI and plant functional types (Oleson et al., 2003). CLSM uses a surface albedo parameterization scheme (based on the Simple Biosphere Model, Sellers et al., 1986) that incorporates LAI, GVF and sun incidence angle to calculate the albedo and is rescaled to fit the annual cycle of MODIS-observed albedo. NOAH uses an albedo climatology based on 5 years (1985)(1986)(1987)(1988)(1989) of visible and near-infrared radiation from AVHRR (Csiszar and Gutman, 1999). To see how deforestation influences the water budget component, the water budget analysis of the REV SV output mainly focused on the subset of 197 px that were deforested between 2002 and 2006 (based on the ESA-CCI land cover).

Sensitivity experiments
Our results will indicate (see Sect. 4.1.3) that the various LSMs react differently to the simultaneously updated interannually varying vegetation and land cover parameters. To disentangle the impact of vegetation and land cover parameters separately for each LSM, two sets of synthetic sensitivity experiments were conducted. A first set of sensitivity experiments (SENS V in Table 1) included five experiments ("exp1" through "exp5") in which monthly spatially distributed climatological GLASS LAI values of the BL simulations were systematically increased by 1 and GVF values were raised proportionately. To estimate the corresponding GVF values, an empirical exponential relationship between LAI and GVF (GVF = a +(b·exp(c·LAI))) was derived for each individual pixel based on daily LAI and GVF values from 1992 to 2000. LAI values much larger than about 6 do not occur in reality; nevertheless, they are kept in this sensitivity analysis to understand the LSM response in a synthetic setting. In CLM only the LAI was altered because CLM does not use GVF as an input, whereas both LAI and GVF were altered in CLSM and NOAH. Each experiment was run for the years 1992-2015 with a fixed ESA-CCI land cover map of 1992.
To test the sensitivity to land cover (SENS LC in Table 1), two simulations were conducted per LSM. In a first simulation, the entire Dry Chaco was assumed to be covered by deciduous broadleaf forest, whereas in the second scenario cropland was assumed. These vegetation classes are associated with the major land cover conversion in the Dry Chaco. The climatological GLASS LAI and GIMMS GVF datasets were used in SENS LC simulations.

Model evaluation in terms of water budget components
To assess the relative impact of different LSM structures, vegetation and land cover parameters, various water budget components were compared in each simulation: where P is the precipitation (mm), ET is the total evapotranspiration (mm), Q is the total runoff (mm), and S is the change in soil water storage (mm) for a given time period.
The total ET and Q in the water budget equation (Eq. 2) can be written as the sum of their different components where E V is the vegetation transpiration (mm), E B is the bare soil evaporation (mm), E I is the evaporation from canopy interception (mm), Q S is the surface runoff (mm), and Q SB is the subsurface runoff (mm).

Model evaluation with efficiency curves
The relative differences in the various LSM behaviors were further analyzed using efficiency curves, without associating any performance assessment. Q and ET are strongly regulated by soil moisture. However, soil moisture is also a highly model-dependent quantity (Koster et al., 2009), complicating the evaluation of the modeled Q and ET. To avoid this issue, Koster and Mahanama (2012) and Koster (2015) proposed to evaluate hydrological behavior in terms of "efficiency space". The main reasoning behind this approach is that higher soil moisture content generally leads to both increased ET for a given amount of net incoming radiation (ET efficiency) and increased Q for a given amount of P (Q efficiency). The ET efficiency (-) is the ratio of the latent heat flux to the net radiation: where λ (J kg −1 ) is the latent heat of vaporization, λET (J m −2 ) is the latent heat flux, R net (J m −2 ) is net incoming radiation and β(.) (-) is a function of moisture content, here assumed to be the 1 m soil moisture content mc 1 m . The Q efficiency (-) is the ratio of the total Q production to the total P : where Q is total runoff, P is the precipitation and F (.) (-) again is a function of mc 1 m . By plotting the Q efficiency against ET efficiency and identifying their relationship, a unique signature of the hydrological behavior (without model-dependent soil moisture) of a LSM can be obtained. Ideally, this would allow us to shift a LSM signature towards an in situ observed land surface signature. Due to lack of suitable and publicly available Q measurements for the Dry Chaco, this study uses the efficiency space to relatively compare the hydrological behavior of different LSMs with various parameter settings (BL, REV S , REV SV ). For each pixel, water budget components (λET, R net , Q, mc 1 m and P ) were averaged over each month for the period 1992-2015. Different locations had different minimum mc 1 m values, induced by differences in soil texture. Prior to plotting, the mc 1 m values for each pixel were normalized to reduce spatial differences in mc 1 m and to dominantly focus on the temporal LSM signature. Both the Q and ET efficiency were plotted as a function of the normalized mc 1 m for each LSM to visualize the model-specific soil moisture dependencies. For each pixel, a curve drawn through the points was constructed by computing the median of the ET and Q efficiencies over narrow bins of normalized mc 1 m . The combination of all fitted curves for all pixels was visualized using a scatter density cloud. The re-sulting "efficiency space" plots (showing the obtained ET efficiency as a function of Q efficiency) show how evaporation and runoff efficiencies vary with each other as the soil gets drier or wetter. The efficiency plots are used to see whether the three LSMs provide a consensus on the simulated hydrological behavior and on the impact of the updated soil, vegetation and land cover treatment.

Model evaluation with independent data
For each experiment, the quality of model input P and the model performance in terms of output surface soil moisture content (sfmc) and ET were evaluated against independent data (Sect. 2.4), for simplicity referred to as "observations", using four skill metrics: bias: long-term mean difference between simulations and observations (i.e., model minus observation), -ubRMSD: unbiased root-mean-square difference, calculated by first removing the bias from both the simulated and observed time series (Entekhabi et al., 2010), -R: temporal Pearson correlation coefficient between simulations and observations, -aR: temporal anomaly Pearson correlation coefficient between simulations and observations, calculated after removing the mean climatology from each time series. The climatology is computed as the multi-year average of 30 d smoothed time series of daily values. This removed the trivial agreement in seasonality and allowed us to focus on interannual and short-term dynamics.
For an integrated evaluation of the model sfmc, surface temperature and LAI of the various experiments, a zero-order tau-omega microwave radiative transfer model (RTM) was used to convert these modeled variables into L-band T b (K) estimates. The modeled T b were compared to SMOS T b observations, similarly to the approach presented by Albergel et al. (2012). For a detailed description of the tau-omega RTM, we refer to De . Feldman et al. (2018) showed that a zero-order tau-omega model would suffice for the dry forests of the Dry Chaco. Instantaneous sfmc, surface temperature and LAI at 06:00 and 18:00 LT were used as input in the RTM, and simulated T b were evaluated against 06:00 and 18:00 LT SMOS-observed T b (both at the top of vegetation). The T b simulations were computed at the 0.125 • model resolution and then spatially averaged to the 36 km EASEv2 grid for comparison to SMOS-observed T b for the period 2011-2015. Short-term variability in both simulations and observations was reduced using a 1-month averaging window. This allowed us to focus on the effect of the implemented vegetation changes, which occur on a longer seasonal timescale.
The RTM required specific input parameters related to soil properties (Wang and Schmugge, 1980) and vegetation classes. The related soil parameters differ thus for the BL and REV S simulations, and the parameters related to land cover class also vary in time for the REV SV simulations due to yearly updated land cover. The literature offers various lookup tables to estimate the RTM parameters related to vegetation classes (soil roughness, scattering albedo, vegetation structure parameters). Here, we used the lookup tables of the SMAP soil moisture retrieval (O'Neill et al., 2015;Quets et al., 2019). Another choice would affect the overall T b bias for all experiments but would minimally affect the relative comparison of the various experiments with revised soil and vegetation parameters.  2020) and Minetti et al. (1999). All LSMs confirm a water storage ( S) deficit for the dry season (April-September), which is compensated for during the wetter months with a water surplus. Figure 3a shows the corresponding yearly averaged water budget components relative to the total P whereby the total ET is subdivided into its three components (see Sect. 3.3). The variation in the total amount of ET and its components shows that there is large variability between the models, even though all of them use the same forcing data, land cover, vegetation, soil texture (not SHP) and topography input. In addition, there are significant disagreements in the ET partitioning. The largest ET component is different in each LSM; i.e., E I is largest for CLM, E B for CLSM and E V for NOAH. The high fraction of E I for CLM is physically unrealistic (see Sect. 5). CLM and NOAH both have a Q SB component, whereas CLSM does not. The latter has a very large fraction of Q S .

Revised soil parameters
The impact of the revised soil parameters (REV S ) on the yearly mean relative water budget components is shown in Fig. 3b, again for the period 1992-2015. Most striking is that, despite the similarity in SHPs, the various models still produce a very different partitioning of the water budget components. For CLM, the revised SHPs cause a yearly reduction in Q S by 4 % which is compensated for by an increase in E V and Q SB . For CLSM, the mean water budget compo-  nents hardly change. For NOAH, there is a small increase in Q SB and a decrease in ET, mainly due to a large decrease in E B (−9 %), of which 6 % is compensated for by extra E V . For each model, the differences in the water budget components between the BL and REV S simulations, averaged over the entire Dry Chaco, are relatively small. However, within the study domain, there is a large local spatial variability in the changes in the various water budget components (not shown). Figure 4 summarizes the impact of the updated SHPs on the soil moisture content in the first meter of the soil (mc 1 m ). The boxplots of time-averaged (1992-2015) mc 1 m over the Dry Chaco (Fig. 4a) show that the REV S simulations are drier than for BL: the mc 1 m decreases from 0.16 to 0.11 m 3 m −3 for CLM, from 0.22 to 0.13 m 3 m −3 for CLSM and from 0.19 to 0.17 m 3 m −3 for NOAH. These lower mean soil moisture values can be related to the larger fraction of sandy soils (Fig. 2d and e) and the associated reduced water retention (SHPs) in the REV S simulations. However, again, at the local scale, some areas become wetter and others drier. Despite the relatively large difference in long-term mean soil moisture between BL and REV S , the differences in ET and Q S fluxes are small: the meteorological forcings are the primary drivers of these fluxes, whereas the SHPs only have a secondary impact. Figure 4c and d show how the texture pattern dominates the long-term averaged mc 1 m for the period 1992-2015 for the NOAH BL and REV S simulations (similarly for CLM and CLSM). Figure 4b illustrates that the updated SHPs change the temporal standard deviation of mc 1 m , which significantly decreased for NOAH.

Revised vegetation and land cover parameters
The impact of deforestation over the Dry Chaco was assessed via a relative comparison of the water budget components of the REV S and REV SV simulations for each LSM, using a subset of 197 px that was deforested between 2002 and 2006. Keep in mind that the REV S uses climatological vegetation indices and a static land cover map, which do not account for interannual vegetation changes. The REV SV simulations include the effects of vegetation changes by the implementation of satellite-derived dynamic vegetation indices as well as yearly updated land cover parameters. The differences between the REV S and REV SV simulations thus solely stem from interannual vegetation variations and land cover changes. By design, the former would only introduce minimal long-term differences over the entire simulation period. Therefore, the output differences were only analyzed after deforestation, i.e., for the period 2007-2015. Figure 5 compares the water budget components for the REV S and REV SV simulations after deforestation, i.e., for the period 2007-2015. For CLM, we observed a relative decrease in E V (−6 %) that is compensated for by an increase in E B and E I , maintaining a constant amount of ET. For CLSM, there is a 5 % decrease in total ET (with a similar relative distribution of the ET components), which is compensated for by a Q S increase of 5 %. For NOAH, deforestation also resulted in smaller ET (−3 %), which is compensated for by an increase in Q SB . Regarding the relative ET distribution of NOAH, there is an increase in E B of 4 % and a decrease in E V of 6 %.
The impact of vegetation changes on the temporal evolution of LAI and moisture content in the first 2 m of the soil (mc 2 m ) is illustrated in Fig. 6 for a representative pixel (28.0625 • S, 63.6875 • W). Figure 6a shows the climatological REV S LAI and the interannually varying REV SV LAI, to-  gether with P for a pixel that was deforested in 2004 (based on ESA-CCI land cover). The LAI values are generally low both before and after deforestation. Years where the REV SV LAI is larger than its climatology mainly correspond to wetter years. Deforestation does not cause a sudden drop in LAI, probably because the LAI for fully developed crops is not very different from that of dry forests, or the smallerscale deforestation signal may be suppressed in the upscaled 0.125 • LAI values. However, the REV SV LAI is significantly smaller than its climatology in drier years after 2004, which may be indicative of less crop cover and earlier deforestation. In short, the coarse-resolution LAI does not necessarily reflect land cover changes and is strongly influenced by P .
The combined impact of time-varying LAI and land cover on mc 2 m is shown in Fig. 6b-d for the REV S and REV SV simulations of each LSM. From the deforestation in 2004 onwards, the mc 2 m of the REV SV simulations increases for each model, but at a different rate. The main driver for the increase in mc 2 m is the change in land cover. The updated parameters related to a land cover change push the model out of its equilibrium, and water is gradually redistributed to achieve a new balance with a higher soil moisture con- The reason for the different behavior of each LSM will be explained in the following paragraph.

Sensitivity experiments
Sensitivity experiments with synthetic vegetation and land cover parameters were conducted to get more insight into the distinct behavior of soil moisture for the different LSMs in response to dynamic vegetation and land cover input. Figure 7a-c illustrate how domain-averaged water budget components change when the LAI is systematically increased by 1 to 5 units ("exp1" through "exp5") as explained in Sect. 3.2.3. To mimic the effect of deforestation, assumed to result in smaller LAI values, one should read the figure from exp5 (high LAI) to exp1 (low LAI). For CLM, a reduction in LAI introduces a reduction in E I , an increase in E B and a slight decrease in E V . For CLSM, a reduction in LAI and GVF yields a strong decrease in E I , fully compensated for by an increase in E B , whereas E V is barely affected. Concerning NOAH, smaller LAI and GVF values result in a significant decrease in E V . When the LAI and/or GVF are reduced, the Q SB increases in both CLM and NOAH.
Figure 7d-f illustrate how the various LSM soil moisture profiles respond to changing LAI (and corresponding GVF for CLSM and NOAH). A decrease in LAI barely affects soil moisture in CLSM, whereas it increases soil moisture for CLM (moderately) and NOAH (strongly), in line with the respective LSM's degree of decrease in E V (water extraction from the soil). The increase in soil moisture climatology for an imposed decrease in LAI (while assuming a persistent vegetation type) is partly an artifact resulting from a one-way dependency of soil moisture to vegetation in the absence of a dynamic vegetation growth module. In nature, LAI and soil moisture evolve together. Figure 7g-i show soil moisture profiles for the sensitivity experiment in which the land cover was either uniformly cropland or forest. For CLM, there is almost no difference in soil moisture between both. CLSM has a slightly wetter soil moisture profile for cropland than for forest. For NOAH, the soil moisture content is higher under cropland than forest in the deeper layers. The distinct model response to land cover changes is related to the fact that each model uses a modelspecific set of land cover parameters, each with different values and behavior. These parameters impact the total E V (not shown) and the related water extraction from the soil. For example, the rooting depth (the depth to which roots of plants can extract water from the soil profile) is defined differently in each LSM. In NOAH, the forest land cover class has a rooting depth of 2 m; i.e., roots take up water over the whole profile. Roots of crop are parameterized to 1 m and are not able to extract water from the deepest soil layers, resulting in wetter soils at the bottom of the soil profile (Fig. 7i). CLM makes use of a root fraction distribution whereby root fraction is a function of depth and land cover type as described in Zeng (2001). CLSM uses a rooting depth of 1 m, regardless of the land cover.
The sensitivity experiments were designed to explain the distinct behavior of soil moisture in response to changing land cover or vegetation parameters separately and do not provide insights into possible interactions between both. The relative change in the water budget components of REV SV against REV S , as shown in Fig. 5, originates from changes in LAI and GVF, land cover, and their interactions. This explains, for example, the increase in model Q S and decrease  Long-term (1992Long-term ( -2015 water budget components averaged over the Dry Chaco for SENS V experiments, for CLM, CLSM and NOAH, respectively. Exp1 to exp5 refer to simulations whereby the monthly climatological LAI maps were increased by, respectively, 1 to 5 units (Q SB : subsurface runoff, Q S : surface runoff, E V : vegetation transpiration, E B : bare soil evaporation, E I : interception evaporation). (d-f) Associated impact on the soil moisture profile (spatially averaged profile) for CLM, CLSM and NOAH, respectively. (g-i) Impact of changing land cover parameters (SENS LC ) from deciduous forest to cropland on the soil moisture profile for CLM, CLSM and NOAH, respectively. in total ET after deforestation for CLSM. This cannot be explained by the CLSM results of the SENS V (Fig. 7b) or SENS LC experiments alone.

Baseline simulations
To compare the BL hydrological behavior of the three different LSMs, Fig. 8 presents the Q and ET efficiency as a function of mc 1 m along with the "efficiency space". For each pixel within the Dry Chaco, these quantities are computed for each month across the 24-year time period, and a curve is fitted using a simple binning procedure. Figure 8 summarizes the hydrological signature of the Chaco by showing the density scatter cloud of all fitted curves. Figure 8a, d and g show a very different Q efficiency as a function of mc 1 m for the three LSMs. For CLSM, a majority of the curves reaches Q efficiency values up to 0.4, whereas the median values are around 0.2 for CLM and even below 0.1 for NOAH. This different behavior was also observed in the annual water budget analysis (see Table 2). Note that for CLM and NOAH, some high Q efficiencies are found at relatively low mc 1 m values (black circle). This is caused by high Q SB rates, which allow precipitation from the wet season to run off during the dry season months even if P in the latter period is small.
For all LSMs, the ET efficiency (Fig. 8b, e and h) is larger than the Q efficiency, and the relationship of the ET efficiency with mc 1 m is more similar across the three LSMs than that of the Q efficiency. The distinct BL signature of the three LSMs in efficiency space is summarized in Fig. 8c, f, i and mainly explained by differences in Q. Figure 9 shows the impact of the REV S and REV SV experiments in efficiency space. The impact of the updated SHPs is summarized by a small shift between the mean values of the BL and REV S in efficiency space, with a decrease in Q efficiency for CLM and increase for CLSM and NOAH. Figure 9 also shows the REV S and REV SV scatter density clouds (and mean value) in efficiency space for the 197 px deforested between 2002 and 2006 for the period 2007-2015. The inclusion of vegetation changes (REV SV ) causes a shift towards larger Q efficiencies for CLSM (increased Q S ) and NOAH (increased Q SB ). The opposite is found for CLM. When analyzing the effect of deforestation on ET efficiency, it is important to keep in mind that these efficiencies are calculated as the ratio of the latent heat flux to the net incoming radiation (available energy). By definition, the net radia- tion is also impacted by vegetation changes. Therefore, one cannot directly relate changes in ET efficiency to changes in total ET. For CLM, the REV SV density cloud tends to larger ET efficiencies, indicating that croplands need less energy than forests for the same amount of ET. For CLSM and NOAH, the scatter cloud shows a slight shift to the right: croplands use a slightly larger portion of the available energy than forest for ET.

Precipitation
Because the quality of input P will greatly influence the quality of ET, sfmc and T b simulations, independently of the used LSM, the quality of the MERRA-2 P product was first evaluated against in situ P data. Table 3 summarizes how well the monthly averaged MERRA-2 P corresponds to in situ observations for two types of stations. For the 10 stations that cover the 1992-2015 period, the mean correlation coefficient is 0.83 (±0.07), the mean monthly bias is 5 (±11) mm per month and the mean ubRMSD equals 38 (±17) mm per month. For the 2010-2015 stations (n = 8), the overall skill metrics are lower: a mean correlation of 0.74 (±0.11), a mean bias of 18(±13) mm per month and a mean ubRMSD of 42 (±14) mm per month, respectively. The positive bias indicates that MERRA-2 P is overestimated.   Figure 10 shows daily modeled sfmc time series for CLM, CLSM and NOAH, together with in situ Monte Buey sfmc (5 cm) observations (both averaged over 3 px) for the period 2013-2015. For CLSM and NOAH, the sfmc depths are 2 and 10 cm, respectively. For CLM, the soil moisture content in the four upper soil layers was averaged to obtain a 5 cm sfmc value.

Monte Buey soil moisture
The largest R and aR values between observed and modeled sfmc are obtained with CLM and NOAH (R and aR values between 0.72 and 0.81 for BL). By updating the soil parameters (REV S ), some R and aR values increase, but others decrease. However, the bias is decreased for all models. For NOAH, the updated soil parameters result in substantially wetter sfmc during the dry season. Note that these results are only valid for the 3 model pixels covering the Monte Buey site and cannot be extended to the entire Dry Chaco. Table 3. Skill metrics for monthly MERRA-2 relative to in situ precipitation: spatial average ± spatial standard deviation.

Evapotranspiration
The skill of daily simulated total ET relative to that of GLEAM-based ET estimates is shown in Fig. 11a-d for the period 1992-2015 over the entire Dry Chaco. In general, NOAH has the largest R and aR and the smallest ubRMSD. When averaged across the entire region and time period, the impact of the updated soil, vegetation and land cover parameters on the total ET is negligible and even deteriorates the overall CLSM performance relative to GLEAM. The relative distribution of the different ET components is very different for GLEAM than for the selected LSMs. Figure 11e shows that all the models (REV S ) are possibly underestimating the E V . For GLEAM, 89 % of the total ET is E V , whereas this is only 47 % for NOAH, 31 % for CLSM and 18 % for CLM. E I is the second largest component in GLEAM (7 %). CLM (48 %) and NOAH (27 %) are likely overestimating the E I , whereas CLSM overestimates the E B .

L-band microwave brightness temperature
Time series of the simulated 40 • T b at horizontal polarization (T b H ) with NOAH BL, REV S and REV SV input and SMOS T b H are shown in Fig. 12a. The inputs used in the RTM for the simulated T b include simulated surface soil moisture (using FAO texture and related SHPs in the BL simulations and HWSD-based texture and SHPs in the REV S and REV SV simulations), temperature, LAI (climatological in BL and REV S , interannually varying in REV SV ), land cover (static in BL and REV S , yearly updated in REV SV ) and the associated literature-based lookup RTM parameters. These time series are for the 36 km EASEv2 pixel that includes the deforested 0.125 • pixel shown in Fig. 6. The corresponding sfmc and LAI are shown in Fig. 12b and c. Time series of NOAH T b at vertical polarization (T b V ) and the simulations for CLM and CLSM showed similar behavior and are not shown. Figure 12 highlights how the REV SV LAI deviates from the long-term climatology (REV S ) during the SMOS observation period. Not shown here is that the LAI also deviates from the climatology earlier in time; i.e., on average the interannual variations cancel each other out by design, as in Fig. 6.
The updated soil parameters (REV S ) affect the sfmc and RTM parameters, causing a different behavior in simulated T b H . The reduced LAI and changes in land cover (REV SV ) result in wetter sfmc, but the sfmc increase is not as pronounced as for the mc 2 m in Fig. 6d. This wetter sfmc and reduced LAI propagate through the RTM and result in colder simulated T b H . Figure 13a-d summarize the T b skill metrics for the different experiments, only for the subset of pixels that were deforested between 2002 and 2006. The metrics for both polarizations (T b H and T b V ) were averaged before creating the spatial boxplots. In general, the T b bias for each model is negative, indicating that the simulated T b is colder than the SMOS T b . NOAH has the largest absolute bias, and the updated soil treatment reduces the T b bias of each model. CLSM has the best R and aR values, but the updated soil parameters reduce its performance in terms of R, aR and ubRMSD values. This is in line with the reduced performance of CLSM relative to the GLEAM ET. For NOAH, the updated soil treatment increases the R and aR values and reduces the ubRMSD. The additional impact of the updated vegetation and land cover treatment is small.
To visualize the local effects of the vegetation changes, spatial maps of the differences in temporal R values ( R) between the REV SV and REV S are shown for each model in Fig. 13e-g. The modeled T b and the spatial pattern of R over the Dry Chaco are mainly driven by changes in sfmc, and these are driven by vegetation and land cover changes.
The R values are relatively small for CLM because the model sfmc is only slightly sensitive to changes in vegetation and land cover parameters. For CLSM, the REV SV simulations yield larger R values over certain areas compared to the REV S simulations because temporal changes in modeled sfmc mainly result from changes in land cover parameters. Consequently, the R pattern is similar to the deforestation pattern (see Fig. 1). In NOAH, the sfmc is sensitive to land cover and LAI changes and, therefore, the R pattern depends on both land cover and LAI changes. For NOAH, the R values do not increase everywhere. At some pixels with reduced REV SV performance, we noticed unexpected trends in the LAI time series (not shown); i.e., LAI would not show the expected decrease during the dry season. This possibly deteriorated the T b simulations. In short, the extent of the simulated T b response to vegetation and land cover changes is model-specific and is mainly driven by the sensitivity of the model's sfmc to vegetation and land cover parameters.

LSM comparison
The relative evaluation of LIS CLM, CLSM and NOAH in terms of their water budget components (Table 2 and Fig. 3) and efficiency space (Fig. 8) highlights that each LSM has a distinct hydrological signature. CLSM has more Q S than NOAH and CLM, but CLSM has no Q SB component, unlike NOAH and CLM. It is striking how the various LSMs partition the total ET very differently into its three components, as was reported before by Wang and Dickinson (2012), Kumar et al. (2018), Rigden et al. (2018) and Zhang et al. (2020). For example, we found that CLM strongly overestimates E I and E B compared to GLEAM over the Dry Chaco (Fig. 11e). Lawrence et al. (2007) came to similar findings and indicated that by reducing the tuning parameter α l from 1 to 0.25 in the canopy interception equation, the amount of E I could be reduced to more realistic values at the global scale. They also indicated that the relative contribution of E V and E B in vegetated regions could be improved by altering two parameters (C s,dense and α) in the equation to calculate the turbulent transfer coefficient between the soil and canopy air. For CLSM, the portion of E I over the Dry Chaco is similar to that of the GLEAM product. Reichle et al. (2011) described how the amount of interception evaporation is influenced by the rainfall interception parameters FWETL and FWETC and that realistic interception rates are found when both parameters equal 0.02 (applied in our simulations). These parameters describe the fractional areas of canopy leaves (interception reservoir) on which large-scale and convective rainfall is applied. The bias between modeled ET and GLEAM is high for NOAH, indicating an overestimation of the total ET. Wei  CLSM and NOAH, respectively. et al. (2013) observed that NOAH yields substantial biases in latent heat flux, total Q and land surface skin temperature when compared to independent data over the United States. They showed that adapting model parameterizations, such as including a seasonal factor for the root distribution and selecting optimal model parameters related to the canopy resistance calculations (an important factor for E V simulation), could reduce these biases.
Our results show that constraining the LSMs with the same optimal soil, vegetation and land cover parameter input (REV S and REV SV ) does not result in more similar water budget partitioning, because the results are dominated by internal model-specific equations and parameters. The latter have historically been tuned globally with particular input datasets and may compensate for errors in those datasets, so that "better" input does not necessarily lead to "better" output. In addition, some LSMs are inherently more or less sensitive to time-varying parameters, and therefore climatological vegetation input may suffice if the LSM is simply not very sensitive to vegetation changes  or if the vegetation changes are not extensive. LSMs could benefit from further development to obtain a more realistic hydrologic response to vegetation changes and to better include dynamic vegetation phenology. This should lead to more realistic simulations of the interaction between the carbon and water cycles.
Lack of sufficient in situ Q and ET measurements over the Dry Chaco prevents identification of which LSM setup best approaches the long-term observed water budget and its components. However, our study highlights the large estimation uncertainty due to model structures and the secondary impact of (soil and vegetation) input parameter uncertainties. It is important to mention again that LSMs are continuously upgrading and that our study provides a relative comparison of older LSM versions within the LIS.

Effects of soil and vegetation parameters on soil moisture
Despite soil moisture being a highly model-dependent variable, it is important that soil moisture simulations closely mimic natural patterns to allow their use as underlying information in applications such as soil salinity monitoring over the Dry Chaco. Providing LSMs with the best available soil and vegetation parameters is one step towards this goal. The soil parameters strongly determine the spatial (horizontal and vertical) pattern of soil moisture (Fig. 4) and only have a secondary impact on its temporal variability, which is mainly driven by meteorology. This was also reported by Morgan et al. (2017), Kishné et al. (2017) and Zhao et al. (2018). The long-term soil moisture response to revised vegetation and land cover parameters differs among the three LSMs for two reasons.
-Soil moisture sensitivity to LAI and GVF indices (SENS V ): a decrease in LAI and GVF barely affects soil moisture in CLSM, whereas it increases soil moisture for CLM (moderately) and NOAH (strongly). This is in line with the respective LSM's degree of E V extracting more or less water from the soil.
-Soil moisture sensitivity to land cover parameters (SENS LC ): a land cover change from forest to cropland barely affects CLM soil moisture, slightly increases soil moisture in CLSM, and mainly increases deep soil moisture in NOAH. This is related to the distinct root distribution and root water uptake (controlled by the stomatal conductance and rooting depth) in the various LSMs, impacting the E V and related water extraction from the soil.
The REV SV simulations showed that all three LSMs simulate wetter soils after deforestation in the Dry Chaco. In addition, larger values of total Q are found for two LSMs, which might implicitly be indicative of groundwater recharge (groundwater is only simulated in CLSM). This is in agreement with studies that have been carried out on a local scale by Giménez et al. (2016), Marchesini et al. (2017), Magliano et al. (2017) and Nosetto et al. (2012). They all reported wetter soils on agricultural land compared to forest across different regions of the Chaco. However, it must be mentioned that deforestation itself does not necessary lead to wetter soils. The impact of deforestation on soil moisture is a combination of an increased rate of E B (drier soils) and a decreased rate of E V (wetter soils). The net effect depends on the local climate and the length of the wet or dry periods (Chen et al., 2009). The impact may also vary across the vertical soil profile depending on the change in root distribution and soil characteristics after deforestation (de Queiroz et al., 2020). Amdan et al. (2013) stated that the agriculture system applied in northwestern Argentina also influences the soil moisture status. The prevalent agriculture system in the Dry Chaco is mostly based on no-tillage management followed by fallow seasons. The crop residue that remains on the field decreases E B and therefore increases available soil water content (Villegas et al., 2010). During the fallow season, there is no consumption of soil water over agriculture areas, increasing deep percolation of water towards the groundwater table. These changes in soil characteristics, soil management and natural mulching are not included in LSM simulations.

Model evaluation using independent data
The reasonable agreement between MERRA-2 and the longterm in situ P data supports an analysis of the water budget components in absolute values (as in Table 2), but with caution: the more recent P stations suggest larger biases and may possibly not have been included in the gauge-based P corrections of MERRA-2. Due to its coarse resolution, MERRA-2 does not take into account microclimatic effects over the mosaic landscape associated with deforestation.
Relative to independent data, there is no LSM setup that yields significantly better output for the entire Dry Chaco in terms of time-series metrics. CLM sfmc yields the largest R and aR values when compared to the in situ Monte Buey sfmc. NOAH outperforms both CLM and CLSM in terms of ET (when compared to GLEAM ET), whereas CLSM yields the best T b results (when compared to SMOS T b ). Updating the model soil parameters with more recent (assumed to be more accurate) information reduces the sfmc bias relative to in situ observations for the 3 Monte Buey pixels as well as the T b bias relative to region-wide SMOS observations for all models. On the other hand, the update reduces the R values between CLSM and GLEAM ET and between CLSM and SMOS T b .
The spatially complete time-series evaluation with longterm SMOS T b observations is unique for assessing model performance over areas with limited long-term in situ data. A similar approach was conducted by Albergel et al. (2012) over the United States. The high R values between the SMOS T b observations and RTM simulations indicate that LSMs simulate realistic temporal variations and that their use over the Dry Chaco is justified (despite the lower agreement between MERRA-2 and recent in situ P ). After implementation of time-varying vegetation and land cover changes, we found no significant or unanimous model improvement when compared to independent data. This is most likely due to the coarse spatial resolution of the satellite-based evaluation, which suppresses the signal of the rather small-scale mosaic deforestation activities over the Chaco, and to the fact that model simulations do not necessarily perform better when given supposedly better input parameters. Furthermore, removal of subtropical dry forest is likely to only have a little impact on L-band T b , whereas the removal of dense tropical rainforest would probably have a much stronger impact on both T b simulations and observations. However, Fig. 13 indicates that time-varying vegetation and land cover parameters did introduce slightly better agreements between simulated and SMOS-observed T b for some deforested areas. Local calibration of different RTM parameters could possibly further improve the agreements between modeled and observed T b .
An evaluation with finer-scale sfmc retrievals, e.g., obtained from Sentinel-1 or after downscaling passive microwave products, is not included in this study because by design, sfmc retrievals do not necessarily account well for deforestation, are prone to vegetation bias in general (Zwieback et al., 2018), and would thus not serve well as reference data. A finer-scale evaluation in terms of land surface temperature would add value, but an analysis of the energy budget was beyond the scope of this study.

Shortcomings and scope for further research
Our simulations of the Dry Chaco's hydrology with revised soil, vegetation and land cover parameters have some shortcomings and caveats that could partly be overcome in future research. First, feeding the models with similar, e.g., satellite-based, albedo input could further lead to a homogenization in model output, as this is a key parameter in the calculation of the land surface energy budget (Alton, 2009;Yin et al., 2016;Houspanossian et al., 2017). Second, we showed that the spatial pattern of simulated soil moisture is closely related to the spatial pattern of the soil texture data. Regional simulations could thus benefit from regional soil maps, where available. In addition, land cover changes can affect soil properties (Wiekenkamp et al., 2020), which means that the adjustment of SHPs in response to vegetation changes (such as deforestation) could further improve LSM simulations. Third, the relatively coarse spatial resolution of our simulations (0.125 • ) required an upscaling of the LAI, GVF and land cover input data, and the satellite-based evaluation even required an aggregation to 36 km. This caused a suppression of the deforestation signature in our simulations. Fourth, our LSM simulations did not include any dynamic vegetation growth module, which would couple soil moisture and vegetation in two ways instead of only one way. More generally, LSMs can benefit from further developments to better represent soil-water-vegetation processes, because the three tested LSMs now show a questionable range in their partitioning of the water budget components and in their response to parameter changes. Some of the above caveats will be addressed in future research towards using LSM output as underlying information for the assessment of dryland salinity over the Dry Chaco.

Conclusions
In this study, we updated the soil-and vegetation-related parameters of three LSMs (CLM2.0, CLSM-F2.5 and Noah3.6), grouped within NASA's LIS, to obtain the best modeled representation of the hydrology over the South American Dry Chaco. We used HWSD v1.21 soil texture and time-varying satellite-based GLASS and GIMMS vegetation indices, along with yearly updated ESA-CCI land cover information. The impact of the various model structures, soil texture and dynamic vegetation input was assessed in terms of water budget partitioning and efficiency space. Our results indicate that the three LSMs yield a different partitioning of the water budget, with 74 % to 95 % of the total annual P over the Dry Chaco contributing to ET; the soil texture pattern is the main driver of the spatial pattern of soil moisture; introducing similar soil, vegetation and land cover parameters in the various LSMs does not result in a homogenization of the long-term water budget components; i.e., the various LSM structures primarily determine the water distribution, whereas soil, vegetation and land cover parameters only have a secondary impact.
The updated vegetation and land cover treatment allowed us to explore to which extent large-scale land cover changes in the Dry Chaco affect the different water budget components. It was found that deforestation increases soil moisture for all LSMs, but the degree of increase depends on the model structure; a change in land cover results in a shift of the model climatology and a (non-stationary) redistribution of the water budget, which is different for each LSM; and the implemented satellite-based vegetation indices do not fully depict deforestation, because the 0.125 • spatial resolution partially suppresses the deforestation signal, and the replacing agricultural crop may have similar LAI and GVF values to the initial dry forest.
The model input and output were further evaluated against independent data of in situ P and sfmc and spatially covering GLEAM ET and SMOS T b . The latter offers the unique possibility for an integral evaluation of simulated soil moisture, soil temperature and LAI after forwarding these variables through a zero-order RTM. Relative to independent data, no specific LSM structure, soil or vegetation input is significantly better than another in terms of time-series metrics. Updated soil parameters reduce T b bias relative to region-wide SMOS observations for all models, improve the R and ubRMSD values for CLSM and NOAH, but reduce the R values between CLSM and GLEAM ET and CLSM and SMOS T b . Interannually varying vegetation and land cover input generally have a marginal impact. Our methodology is a first step towards a better representation of the Dry Chaco ecosystem using dynamic LSM simulations. The assimilation of multi-source remote-sensing data for state (e.g., soil moisture and temperature) and parameter (e.g., albedo) updating may help to further constrain the models and correct for unmodeled processes, such as land cover changes. However, the impact of such updates will depend on the used LSM and on the optimization of data assimilation systems in the presence of non-stationary processes that are associated with land cover changes (such as deforestation). Vegetation data assimilation can only have the desired impact if the sensitivity of simulated hydrological fluxes to vegetation changes is realistic, and this may need further research, especially at the global scale. To better represent small-scale land cover conversions and their impact on the water distribution for research on dryland salinity or ecology, simulations should also be conducted at a finer spatial resolution.   (Kumar, 2021).
Data availability. The updated vegetation, land cover and soil parameters, as well as validation datasets are available online, as documented in Sect. 2. The Monte Buey soil moisture data can be obtained from the Comisión Nacional de Actividades Espaciales (CONAE) upon request. The LIS simulation output can be obtained via https://doi.org/10.5281/zenodo.5091408 (Maertens et al., 2021).
Author contributions. MM performed the model simulations, produced figures and wrote the paper. GJMDL and SA contributed to the experiment setup and the writing of the paper. SVK and SPPM provided relevant input data for the simulations and contributed to the writing of the paper.
Competing interests. The authors declare that they have no conflict of interest.
Disclaimer. Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Acknowledgements. The authors thank Mark Thibeault (CONAE) for providing the Monte Buey soil moisture data and David Mocko and Randy Koster for insightful discussions. We thank Clément Albergel and one anonymous reviewer for their very thorough and excellent reviews.
Financial support. This study was performed within the framework of the Belgian Science Policy (BELSPO) project REFOR15 CHA (SR/00/338), with support of the Flemish Research Foundation (FWO 1512817N), C1 KU Leuven (C14/16/045) and the High Performance Computing system of the Vlaams Supercomputer Centrum.
Review statement. This paper was edited by Wouter Buytaert and reviewed by Clément Albergel and one anonymous referee.