Assessment of soil moisture fields from imperfect climate models with uncertain satellite observations

We demonstrate that global satellite products can be used to evaluate climate model soil moisture predictions but conclusions should be drawn with care. The quality of a limited area climate model (LAM) was compared to a general circulation model (GCM) using soil moisture data from two different Earth observing satellites within a model validation scheme that copes with the presence of uncertain data. Results showed that in the face of imperfect models and data, it is difficult to investigate the quality of current land surface schemes in simulating hydrology accurately. Nevertheless, a LAM provides, in general, a better representation of spatial patterns and dynamics of soil moisture compared to a GCM. However, in months when data uncertainty is higher, particularly in colder months and in periods when vegetation cover is too dense (e.g. August in the case of Western Europe), it is not possible to draw firm conclusions about model acceptability. For periods of higher confidence in observation data, our work indicates that a higher resolution LAM has more benefits to soil moisture prediction than are due to the resolution alone and can be attributed to an overall enhanced representation of precipitation relative to the GCM. Consequently, heterogeneity of rainfall patterns is better represented in the LAM and thus adequate representation of wet and dry periods leads to an improved acceptability of soil moisture (with respect to uncertain satellite observations), particularly in spring and early summer. Our results suggest that remote sensing, albeit with its inherent uncertainties, can be used to highlight which model should be preferred and as a diagnostic tool to pinpoint regions where the hydrological budget needs particular attention.
Correspondence to: G. Schumann (guy.schumann@bristol.ac.uk)


Introduction
The land surface is a key component in climate models (CMs) and controls the partitioning of available energy at the surface between sensible and latent heat, and of available water between evaporation and runoff (Pitman, 2003). The dynamics of soil moisture content is a key process that controls the partitioning of the heat fluxes, which in turn influence the variability in both weather and climate (Entekhabi et al., 1996). Simulations from climate models are increasingly being used in other applications. In particular, where hydrological models are applied to assess future or past changes in hydrological behaviour, land surface schemes (LSSs) in CMs need to be evaluated against some form of observational data. Given that these models are run over scales for which ground data collection is hardly possible, evaluation of land surface variables is difficult to perform. Cornwell and Harvey (2007) argue that while the quality of CM-LSSs seems to be improving (although there are still large inconsistencies in predictions), without high-quality long-term observations there remains significant uncertainty in the evolution of CM predictions of soil moisture change.
However, satellites acquire data over very similar scales to climate models, and thus present an invaluable source of validation data. For instance, Bastiaanssen et al. (2005) used radiance values from satellites in association with an energy balance model to derive large scale evapo-transpiration. Similarly, techniques exist to derive soil moisture from passive as well as active microwave sensors. These range from less complex change detection algorithms (Wagner et al., 1999) to more sophisticated integration of land parameter retrieval models (Owe et al., 2001). It is important to note the many differences in definition of soil moisture that exist. LSSs are most concerned with heat and energy fluxes, and soil moisture is treated and defined differently by different CM-LSSs (Cornwell and Harvey, 2007). Although efforts are being made to improve soil moisture retrieval from remote sensing, in some existing algorithms soil moisture is represented as a wetness index whereby changes with depth or the effects of heat and moisture fluxes are only partially represented.
Despite considerable limitations and uncertainties, the success of satellite remote sensing to help improve the functioning of land surface schemes has been illustrated in some studies. Large scale vegetation cover and land cover from remote sensing have been shown to possess potential for integration with land surface schemes (e.g. Lu and Shuttleworth, 2002; Crawford et al., 2001). Also, large scale satellite-derived soil moisture has been used to initialize numerical weather prediction models (Drusch, 2007) and climate models (e.g. Walker and Houser, 2001).
To complement and extend this part of the literature, we use the global soil moisture products from the Advanced Microwave Scanning Radiometer onboard the Earth Observation satellite (AMSR-E) and the scatterometer sensor on the European Remote Sensing satellite (ERS-2) to propose a scheme to assess the quality and value of spatial patterns of monthly soil moisture simulated by a limited area model (LAM), HadRM (Jones et al., 1995), and a general circulation model (GCM), HadAM3 (Pope et al., 2000). In other words, this paper introduces a methodology to evaluate the acceptability of soil moisture heterogeneity from imperfect climate models in the presence of uncertain observation data. By doing so, we hypothesize that this helps justify the development and use of a LAM. For instance, more reliable higher resolution limited area models could provide valuable input data for large scale hydrological modelling under different climate change scenarios. While it is reasonable to assume that a quantitative estimation of all sources of uncertainty is hardly possible when dealing with global-scale imperfect models and observations, we believe that it is sensible to introduce an acceptability scheme (rather than a performance or model matching scheme) for climate models that is partly based on qualitative analysis of the different error sources.

Passive AMSR-E data
Parameters for soil moisture retrieval are derived from passive microwave remote sensing data using the Land Parameter Retrieval Model (LPRM). The LPRM is based on a forward radiative transfer model to retrieve surface soil moisture and vegetation optical depth (VOD), i.e. vegetation wetness (Owe et al., 2001). A unique feature of this method is that it may be applied at any microwave frequency (<20 GHz), making it very suitable to exploit all the available passive microwave data from historic satellites (Owe et al., 2008). This dataset describes volumetric soil moisture (in m³ m⁻³) of the top few centimeters (1-2 cm) with an average accuracy of 0.06 m³ m⁻³ for sparsely to moderately vegetated regions (de Jeu et al., 2008). For direct data use, five years (2002-2007) of daily 0.25 degree surface soil moisture data from AMSR-E C-band are available free of charge at http://www.geo.vu.nl/~jeur/lprm/.

Soil moisture from climate models
Soil moisture is simulated within both CMs using MOSES (Meteorological Office Surface Exchange Scheme) from the UK Meteorological Office (UKMO) (Cox et al., 1999). MOSES is a land surface scheme that reproduces terrestrial processes according to a simplified surface flux partitioning scheme. For moisture flux, the surface hydrology is defined in terms of the soil moisture vertical profile, snow lying on the ground and water on plant leaves or on the soil surface. The soil hydrology component of MOSES is based on a finite difference approximation to Richards' equation (Cox et al., 1999) and moisture content is output for four vertical profiles (0.1, 0.25, 0.65 and 2.0 m), of which the top one (10 cm) is used for comparison in this study. Both CMs generate atmospheric conditions that interact with MOSES to output soil moisture. Runoff, which is another important hydrological variable and closely linked to soil moisture, is generated via infiltration excess and drainage from the bottom of the soil column. Surface runoff takes place when water flux at the soil surface exceeds the saturated hydraulic conductivity. An exponential sub-grid distribution of rainfall intensities is assumed in order to calculate gridbox mean values of surface runoff (Cox et al., 1999).
HadAM3, with a grid resolution of 2.5 degrees latitude by 3.75 degrees longitude, is an improved version of the former Hadley Centre atmospheric model HadAM2b (Stratton, 1999) and is based on the Unified Model (UM) system. The HadAM3 simulations assessed in this paper are identical to the modern control simulations of Jost et al. (2005). The atmospheric model is initialized by observed climatological monthly mean sea surface temperatures (SSTs) (over 30 years), and so simulations do not correspond to any particular year. In other words, this study compares the seasonal cycle between data sets and not individual years.
HadRM is a limited area atmospheric model also belonging to the UM system, which is driven at its lateral boundaries and at the sea surface by a time series of data archived from a previous integration of HadAM3. Locatable over any part of the globe, it is typically run for short periods (i.e. <20 years) at a horizontal grid resolution of 0.44 degrees, as opposed to global simulations over long periods of time. This generates high-resolution climate change information for particular regions; however, the relative computational cost is high (Jones et al., 1995). Again, the HadRM simulations are identical to the modern control simulations of Jost et al. (2005).
[...] common grid resolution of 0.5 degrees, as it does not assume any special arrangement of the data points used. Thereafter, for the sake of comparing different sources of data, soil moisture values were re-scaled in the spatial domain between the maximum and minimum value for each month (Fig. 1).
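The re-scaling step can be illustrated with a minimal Python sketch. It assumes a simple min-max normalization of each monthly field and represents missing pixels as None; the function name and the sample values are hypothetical and are not taken from the study.

```python
def rescale_minmax(field):
    """Rescale a 2-D field (list of rows) to [0, 1] using its own min and max.

    Missing pixels are given as None and are returned as None.
    """
    valid = [v for row in field for v in row if v is not None]
    lo, hi = min(valid), max(valid)
    span = hi - lo
    if span == 0:
        # Flat field: there is no spatial contrast, so map every pixel to 0.
        return [[None if v is None else 0.0 for v in row] for row in field]
    return [[None if v is None else (v - lo) / span for v in row]
            for row in field]


# Hypothetical monthly field in arbitrary units (values invented).
may_field = [[0.10, 0.25, None],
             [0.40, 0.30, 0.20]]
rescaled = rescale_minmax(may_field)
```

Because each month and each data source is normalized separately, only relative spatial patterns remain comparable afterwards, not absolute moisture values.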
Given that climate model outputs are available on a monthly basis for six years constrained by climatological mean SSTs (with an additional eight years spin-up time) for both the LAM and GCM, averaging all daily remote sensing values over a month for 3-4 years from 2004 to 2006/2007 for each sensor separately was assumed sensible. Although AMSR-E data are available on a daily basis whilst ERS data are acquired every 3 days, it is assumed that this difference is negligible when averaging data from each sensor separately over a month. It is worth noting that although data from both sensors are also available for the entire year of 2003, this year was omitted because (a) the ERS sensor was switched off a number of times and (b) the strong abnormal heat wave over Europe that summer had considerable effects on soil moisture conditions (Fischer et al., 2007), and as a result, both climate models (for normal atmospheric conditions) would be inappropriately penalized during evaluation.
Arguably, averaging quasi daily values over one month might introduce more uncertainty, given the different sampling intervals and high variability of surface soil moisture. We try to minimise this effect by treating monthly soil moisture from both space-borne sensors separately in the proposed climate model acceptability scheme and comparing both climate models in terms of their ability to adequately reproduce spatial patterns of soil moisture rather than absolute values in the presence of uncertain observations. Moreover, given the inherent uncertainties and process simplifications, we believe that current climate models cannot be expected to reproduce temporal and/or spatial variability of soil moisture adequately on a daily scale.

Defining an appropriate LSS assessment scheme
After data pre-processing, the agreement of monthly soil moisture values between the two remote sensing sensors was assessed using histogram distributions of absolute differences. This gave an appreciation of data consistency over a full year. A strong positive skewness of the distribution and a mean absolute difference value over Europe of ~0.2 in re-scaled soil moisture for most months indicated a good agreement (pixel by pixel) between the two products. Given the uncertainties affecting both products, establishing the exact quality of either one is hardly possible (de Jeu et al., 2008). Uncertainties, other than differences in sampling resolution and interval and monthly averaging, include: (i) different instrument technologies and retrieval approaches, (ii) effects of dense vegetation and remaining snow and frozen soil pixels (n.b. products are available with quality flags for these pixels), and (iii) minor differences in retrieval depth.
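As an illustration of this consistency check, the sketch below computes pixel-wise absolute differences between two re-scaled fields, their mean and the skewness of their distribution. The sample values are invented, and the population skewness (third standardized moment) is assumed here; the study does not state which estimator was used.

```python
def abs_differences(a, b):
    """Pixel-wise absolute differences, skipping pixels missing in either field."""
    return [abs(x - y) for x, y in zip(a, b)
            if x is not None and y is not None]


def mean(values):
    return sum(values) / len(values)


def skewness(values):
    """Population skewness: third central moment over variance to the power 1.5."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical re-scaled soil moisture for six pixels (flattened fields).
amsre = [0.20, 0.35, 0.50, 0.65, 0.80, 0.10]
ers = [0.25, 0.30, 0.55, 0.60, 0.20, 0.15]

diffs = abs_differences(amsre, ers)
mad = mean(diffs)       # mean absolute difference over the domain
skew = skewness(diffs)  # positive when most differences are small
```

A positively skewed histogram means that most pixels show small differences and only a few show large ones, which is the pattern interpreted above as good agreement.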
Reciprocally, the degree of disagreement between the two products may be viewed as a measure of the uncertainty associated with the data. In regions where both products agree, in particular over large spatial distances, the values are likely to represent actual soil moisture characteristics (de Jeu et al., 2008). This information can be used in an evaluation scheme where both model and observation data are known to be in error. Such schemes have recently gained popularity in hydrological studies (Beven, 2006) and an adaptation is used here to assess the acceptability of both climate models. It is clear that other satellite products available at similar scales, such as global precipitation (e.g. Hong et al., 2004) and/or evaporation data (e.g. Bastiaanssen et al., 2005; Cleugh et al., 2007), might be used within similar acceptability schemes. Given the rather limited availability of global-scale spatially distributed hydrological parameters and the simplicity as well as difficulties with which current climate models reproduce spatially averaged seasonal hydrological behaviour, we believe that assessing the acceptability of models to adequately reproduce monthly changes in spatial heterogeneity of near-surface soil moisture saturation is a fair test to evaluate surface hydrological response.
Given that soil moisture is highly variable in its nature, is poorly defined and reproduced differently by climate modeling as well as remote sensing, a fit-for-purpose model evaluation scheme, which copes with errors in both data and models as described above, needed to be defined. In the hydrology literature a number of different fuzzy rules-based membership functions have been used (e.g. Beven, 2006; Pappenberger et al., 2007). We implemented a trapezoidal-like non-linear function (Fig. 2) that defines an adequate number of degrees of freedom within a fuzzy membership definition based on an interval of monthly mean remotely sensed soil moisture, the bounds of which are defined by the two satellite observations (RS) from 2004 to 2006/2007. Outside either side of this interval a Gaussian function is defined using ±σ, where σ reflects the interannual variability in spatial heterogeneity of soil moisture from both satellites combined and is taken here as a safeguard against climate model over-penalization. As both products have been validated with ground observations for some areas in Europe (Owe et al., 2008; de Jeu et al., 2008), accuracy indications exist and errors are generally below 10% but are difficult to compare (de Jeu et al., 2008). In order to account for product accuracy, we make the top of the function wider by adding 10% of re-scaled soil moisture to each side. The membership function that gives the acceptability, A, of the monthly soil moisture outputs from the LAM and GCM over Europe, denoted CS, is defined in Eq. (1) below.
Where A denotes climate model acceptability (not actual performance) and σ is the interannual variability in spatial soil moisture patterns from both satellites, RS_1 and RS_2. 0.9 RS_1 and 1.1 RS_2 represent the position of the center of the peak, and σ controls the width of the function. By adopting a conditional membership function, it is ensured that the LSS is given a maximum acceptability of 1 if it falls inside an interval defined by the mean soil moisture of the two satellite instruments over the 3-4 years and decreasing acceptabilities following a Gaussian distribution on either side of the interval ±σ.
Fig. 5. (a) The plots show LAM acceptabilities for selected months in each season, where a maximum of 1 is attributed to a simulation that falls within the interval [0.9 RS_1, 1.1 RS_2] while a Gaussian function is applied outside that interval ±σ on either side. The plots in (b) give the interannual variability (σ) in remote sensing-observed spatial heterogeneity.
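The membership function described above can be sketched as follows for a single pixel. The Gaussian form exp(-(CS - b)^2 / (2σ^2)), with b the nearest edge of the interval [0.9 RS_1, 1.1 RS_2], is an assumption that is consistent with the stated roles of peak position and width; the function and argument names are hypothetical.

```python
import math


def acceptability(cs, rs_a, rs_b, sigma, margin=0.10):
    """Fuzzy acceptability A in [0, 1] of a simulated value cs at one pixel.

    rs_a, rs_b: re-scaled monthly soil moisture from the two satellites.
    sigma:      interannual variability of the observations.
    margin:     widening of the plateau for product accuracy (10% here).
    """
    rs1, rs2 = min(rs_a, rs_b), max(rs_a, rs_b)  # RS_1 is the lower product
    lower = (1.0 - margin) * rs1
    upper = (1.0 + margin) * rs2
    if lower <= cs <= upper:
        return 1.0  # inside the observational interval: fully acceptable
    if sigma <= 0:
        return 0.0  # no tolerance outside the interval
    edge = lower if cs < lower else upper
    # Assumed Gaussian decay away from the nearest edge of the plateau.
    return math.exp(-((cs - edge) ** 2) / (2.0 * sigma ** 2))


inside = acceptability(0.50, 0.4, 0.6, 0.1)     # within [0.36, 0.66]
one_sigma = acceptability(0.76, 0.4, 0.6, 0.1)  # one sigma above the plateau
far_below = acceptability(0.00, 0.4, 0.6, 0.1)  # far below the plateau
```

With these assumptions a simulation lying one σ outside the plateau receives an acceptability of exp(-0.5), about 0.61, so a larger σ makes the scheme more tolerant of model departures.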

Remote sensing soil moisture
Figure 1 shows that there is a strong spatial consistency between the AMSR-E derived soil moisture and that simulated by the LAM. In some places different spatial patterns are observed in the ERS scatterometer soil moisture. This is most probably because the AMSR-E algorithm and MOSES are physically based whereas the scatterometer data are obtained using a multi-incidence change detection algorithm which may lead to greater spatial inconsistencies. Note that AMSR-E soil moisture along the west coast of Ireland and Northern Scotland is largely overestimated as a result of significant excess standing water on the surface of pixels in that area (leading to very low returns of brightness temperature) and values in these parts should thus be interpreted with care or ignored.
A more thorough analysis of the data revealed that there is in general a good agreement between the monthly soil moisture values of the two satellite sensors. Over Europe, absolute differences are for most months generally less than 0.2 in re-scaled soil moisture (Fig. 3). This is supported by the skewness of the data distribution histograms plotted alongside. Figure 4 confirms this by illustrating that observed data variability is generally around 0.2, except for colder months, when the skewness of the distribution in Fig. 3 flattens out. It is believed that alongside the natural variability of soil moisture, differences during colder months are primarily related to difficulties in retrieving soil moisture at lower temperatures and over snow-covered areas, especially for single or extreme events. Larger disagreement also occurs during August, which may result (i) from differences in retrieval algorithms and also to a certain degree in soil depth (1-2 cm for AMSR-E, <5 cm for ERS and 10 cm for MOSES) especially in summer, (ii) when vegetation and soil moisture are "out of phase" (for active microwave sensors), and (iii) when vegetation wetness (i.e. VOD) is too high, resulting in larger uncertainties in retrieved soil moisture (de Jeu et al., 2008). Also, at higher altitudes, large topographic variations as well as high vegetation add considerable uncertainties to the retrieval of soil moisture. In such areas correlation (ρ) is fairly low and even negative (15% of all pixels). Nevertheless, there is in general a good correlation between both products, with ρ > 0.5 for 60% of the area.
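A pixel-wise correlation analysis of this kind might look as follows. The sketch assumes that ρ is the Pearson correlation between the monthly series of the two products at each pixel, which is not stated explicitly; the data and function names are invented for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fraction_above(series_a, series_b, threshold=0.5):
    """Fraction of pixels whose correlation exceeds threshold, plus all rhos."""
    rhos = [pearson(a, b) for a, b in zip(series_a, series_b)]
    return sum(r > threshold for r in rhos) / len(rhos), rhos


# Two hypothetical pixels, four monthly values each.
amsre_series = [[0.1, 0.2, 0.3, 0.4], [0.1, 0.2, 0.3, 0.4]]
ers_series = [[0.2, 0.4, 0.6, 0.8], [0.8, 0.6, 0.4, 0.2]]

frac, rhos = fraction_above(amsre_series, ers_series)
```

Here the first pixel is perfectly correlated and the second perfectly anti-correlated, so half of the pixels exceed the 0.5 threshold.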

LSS acceptability
The analysis of the remote sensing soil moisture products confirms that both the degree of natural variability and the level of disagreement can be used to set limits of acceptability inside which the LAM and GCM need to fall to be acceptable. Figure 5a illustrates the acceptability of the LAM HadRM for selected months alongside interannual spatial variability (Fig. 5b), which is generally σ < 0.2. In particular months with larger data intervals (e.g. August), apparent model performances increase. With higher degrees of freedom to fit the data, higher levels of model acceptability are to be expected, given the nature of the proposed evaluation scheme. In months of higher natural data variability, there is at the same time more uncertainty in the data retrieval because of the factors outlined earlier.
Fig. 6. Average precipitation, evaporation, runoff (in mm day⁻¹) and top layer soil moisture (in kg m⁻²) for each month as simulated by both climate models. Plots show average values for a selected region (centred at 50° long. and 13.12° lat.) where there is considerable improvement in the LAM over the GCM, with respect to uncertain satellite soil moisture (see particularly "May" plot in Fig. 7). In the case of the GCM, May is too wet relative to any other region in the model compared to satellite and the LAM. For comparison with satellite data, soil moisture plotted for both models only represents near surface soil moisture and not the entire vertical soil column. Note that in the lowest panel the notion "hydrological budget" refers to the imbalance between total precipitation, evaporation and runoff.
To assess the percentage improvement that is to be gained with the HadRM over Europe as opposed to the global HadAM3, the acceptability of the latter was computed and compared to the LAM translated onto the GCM grid. Equation (2) below gives the percentage improvement, I_LAM, that can be achieved with a high resolution CM.
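Under the assumption that I_LAM is the relative difference between LAM and GCM acceptability expressed in percent, the comparison can be sketched as below. Block averaging stands in for the translation of the LAM onto the GCM grid and is a simplification (the 2.5 by 3.75 degree GCM cells are not an integer multiple of the 0.5 degree grid in longitude); all names and values are hypothetical.

```python
def block_mean(fine, factor):
    """Aggregate a fine 2-D grid by averaging factor x factor blocks."""
    rows, cols = len(fine), len(fine[0])
    coarse = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            block = [fine[i][j]
                     for i in range(r, min(r + factor, rows))
                     for j in range(c, min(c + factor, cols))]
            out_row.append(sum(block) / len(block))
        coarse.append(out_row)
    return coarse


def percent_improvement(a_lam, a_gcm):
    """Assumed form of I_LAM: 100 * (A_LAM - A_GCM) / A_GCM.

    Returns None where the GCM acceptability is zero (ratio undefined).
    """
    if a_gcm == 0:
        return None
    return 100.0 * (a_lam - a_gcm) / a_gcm


# Hypothetical LAM acceptabilities on a 2 x 2 fine grid covering one GCM cell.
a_lam_fine = [[1.0, 0.8],
              [0.6, 0.8]]
a_lam_on_gcm = block_mean(a_lam_fine, 2)
gain = percent_improvement(a_lam_on_gcm[0][0], 0.4)
```

In this example the LAM acceptability averaged over the coarse cell is 0.8 against 0.4 for the GCM, giving an improvement of 100%; negative values would indicate a lower acceptability of the LAM.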
In HadRM, processes are better discretized due to the higher resolution, which results in an enhanced representation of precipitation (Jones et al., 1995). This is clearly highlighted in Fig. 6. With respect to other regions, the selected region is too wet in the GCM, particularly for May, when compared to satellite observations. Indeed, the LAM performs considerably better in these months, with the region being drier in relative spatial terms. Adequate representation of the total precipitation is crucial to prevent unrealistic soil saturation (see particularly the spring months in Fig. 6). As a result of a better discretization of processes, outputs of highly spatially varying parameters are more heterogeneous in HadRM, which can lead to a better fit with spatially (and also temporally) varying observations. It is important to note that actual values in remote sensing soil moisture cannot be compared to climate model values due to the differences in estimation methods. Figure 7 shows I_LAM for four months plotted on the GCM grid in order to highlight improvement not directly related to resolution. For months in which the two satellites disagree most (e.g. August), the gain in acceptability is less meaningful and conclusions should be drawn with care. However, for months when both observation sources give more similar values (e.g. May, and also April) despite often more complex vegetation-soil moisture relations (e.g. April), higher acceptability of either model is physically more meaningful. Figure 7 illustrates that for all months shown the improvement in acceptability of the LAM over the GCM exceeds 100% averaged over Europe (this also applies to all other months). A higher acceptability of the HadRM on average justifies the development and use of a LAM.
[...] of the spread in predictions. Although this might be true, our results indicate that precipitation in the GCM needs to be improved first, particularly in warmer months.
This study has demonstrated that satellite observations, albeit with their inherent uncertainties, can pinpoint areas that are either too dry or too wet relative to other areas in a given model and also indicate which model represents processes more adequately. Although satellite observations might not allow a comparison of actual values or indeed tell us the exact quantities a model should simulate because of significant differences in retrieval techniques, space-borne remote sensing can be used as a tool to diagnose imperfect climate models. As remote sensing enables identifying a more adequate model, subsequent improvement can be undertaken by examining hydrological components in more detail (as illustrated in Fig. 6). From our findings, it may be concluded that a high resolution climate model is necessary to model hydrological change which is happening at much smaller scales, such as soil moisture, that cannot be adequately represented by a GCM.

Conclusions
In a review of LSSs, Pitman (2003) points out that significant problems remain to be addressed, including difficulties in parameterizing hydrology and sub-grid-scale heterogeneity. Continued development of land surface models requires more multidisciplinary efforts by scientists with a wide range of skills. In this context, we have presented a possible evaluation scheme for CM-LSSs in the face of imperfect models and uncertain observations. The scheme reflects limitations of both current CM-LSSs and available remote sensing data.
Our work indicates that the use of a higher resolution LAM has more benefits to soil moisture prediction than are due to the resolution alone and can be attributed to improved representation of precipitation and thus the hydrological cycle at an enhanced horizontal resolution (Jones et al., 1995). It is clear that given current data uncertainties, global products from space-borne remote sensing might not yet allow us to fully validate the actual performance of climate models and land surface schemes but clearly possess the potential to indicate where models are in error and which parts need improvement. Currently, diagnosing the hydrology in climate models seems only really possible with global satellite observations, as the scale of field data collection is incompatible with the scale at which climate models are run.
In conclusion, we believe that higher resolution CMs need to be evaluated with observations that space-borne remote sensing could potentially provide. Using the proposed fuzzy model acceptability scheme, this study has clearly demonstrated the potential of remote sensing for large-scale model diagnostics. However, our results indicate that for some months model differences are around the same order of magnitude as the uncertainty in current data sets. Therefore, further effort is needed to obtain more accurate soil moisture observations. Availability of adequate estimations of uncertainty associated with the data, such as presented by de Jeu et al. (2008), is also expected to aid evaluation. Noteworthy is also the progress researchers are currently making with higher resolution soil moisture retrieval from Envisat ASAR observations (<1 km resolution, see e.g. Loew et al., 2006). This is still in an experimental stage, but may become routinely available in the very near future. Also, improved data products from new satellite missions, such as the Soil Moisture and Ocean Salinity (SMOS) mission, may allow us to better assess models.

Fig. 1. Study area and soil moisture estimated from space-borne sensors (AMSR-E and ERS scatterometer, left) and climate models (HadRM LAM and HadAM3 GCM, right) for selected months. For comparison, soil moisture values are re-scaled between 0 and 1.

Fig. 2. Evaluation scheme based on a non-linear fuzzy membership function. RS_1 and RS_2 represent monthly re-scaled soil moisture values from the two satellites. Note that RS_1 and RS_2 can take the value of either ERS or AMSR-E, where RS_1 < RS_2 (i.e. RS_1 is the lowest of both products and RS_2 the highest). The grey area shows the entire LSS acceptability domain. This scheme is applied to every pixel in the domain separately.

Fig. 3. Absolute errors (AE) between AMSR-E and ERS re-scaled soil moisture for each month (average AE values for Europe are given above each plot).The corresponding histogram distribution is also shown.

Fig. 4. Line plot showing for each month the interquartile range as well as the 90% range of the remote sensing soil moisture interval [0.9 RS_1, 1.1 RS_2] used in the proposed model acceptability scheme. In other words, the ranges shown illustrate the distribution of disagreement in spatial heterogeneity between the two remote sensing products.

Fig. 7. Percentage improvement of acceptability when using a LAM instead of a GCM, for four months. The LAM is plotted to the GCM grid, which implicitly reflects the enhancement in surface dynamics representation that is to be gained with a LAM. Positive values indicate that the LAM gives a higher acceptability than the GCM and negative values show a lower acceptability of the LAM. Values above each plot give the average percentage improvement over Europe.

Hydrol. Earth Syst. Sci., 13, 1545-1553, 2009, www.hydrol-earth-syst-sci.net/13/1545/2009/