Pedotransfer functions are used to relate gridded databases of soil texture information to the soil hydraulic and thermal parameters of land surface models. The parameters within these pedotransfer functions are uncertain and calibrated through analyses of point soil samples. How these calibrations relate to the soil parameters at the spatial scale of modern land surface models is unclear because gridded databases of soil texture represent an area average. We present a novel approach for calibrating such pedotransfer functions to improve land surface model soil moisture prediction by using observations from the Soil Moisture Active Passive (SMAP) satellite mission within a data assimilation framework. Unlike traditional calibration procedures, data assimilation always takes into account the relative uncertainties given to both model and observed estimates to find a maximum likelihood estimate. After performing the calibration procedure, we find improved estimates of soil moisture and heat flux for the Joint UK Land Environment Simulator (JULES) land surface model (run at a 1 km resolution) when compared to estimates from a cosmic-ray soil moisture monitoring network (COSMOS-UK) and three flux tower sites. The spatial resolution of the COSMOS probes is much more representative of the 1 km model grid than traditional point-based soil moisture sensors. For 11 cosmic-ray neutron soil moisture probes located across the modelled domain, we find an average 22 % reduction in root mean squared error, a 16 % reduction in unbiased root mean squared error and a 16 % increase in correlation after using data assimilation techniques to retrieve new pedotransfer function parameters.

Land surface models are important tools for translating meteorological forecasts and reanalyses into real-world impacts: they provide schemes for how energy, water and other matter interact with the Earth's surface, output relevant diagnostics and variables, and help us understand the role of variability in the terrestrial hydrological cycle within the Earth system. As the spatial resolution of available meteorological information has become increasingly fine

There now exists a large amount of information from different satellite missions relating to the spatial and temporal variability of soil moisture. These can be based on either active (e.g. the Advanced Scatterometer (ASCAT);

Data assimilation provides methods for combining new observations with land surface models in order to improve predictions. These techniques can either be used for state-estimation to update soil moisture values of the model in real time as new observations are available

We have used the Land Variational Ensemble Data Assimilation Framework (LAVENDAR)

We defined two objectives for this study: firstly, to examine the ability of 9 km SMAP data to update pedotransfer parameters in a 1 km land surface model and, secondly, to assess the resulting prediction of modelled soil moisture against (a) SMAP data from a different time period and (b) independent in situ data from the COSMOS-UK network. We also assess the impact on modelled latent and sensible heat flux at three flux tower sites.

The Joint UK Land Environment Simulator (JULES) is a community-developed, process-based land surface model that forms the land surface component of the next-generation UK Earth System Model (UKESM). A description of the energy and water fluxes is given in

Maps of soil properties from the Harmonised World Soil Database (HWSD)

The JULES model implements both the

Prior values for parameters of the

Static parameter values for the

The NASA Soil Moisture Active Passive (SMAP) satellite mission provides estimates of soil moisture every 2–3 d

Location of COSMOS probes (blue circles) and flux towers (black crosses) used in validation. Red shading indicates number of SMAP observations assimilated in experiment period (2016). No colour corresponds to no observations being assimilated in that location due to low-quality retrieval or surface flag. The black dot shows the location of London, UK.

The COSMOS-UK network has been producing observations of soil moisture and other meteorological variables at an expanding number of stations (currently 52) since 2013

To understand how updating the JULES soil parameters might affect the model prediction of latent and sensible heat flux, we compare prior and posterior estimates to observations at three flux towers. The locations of these flux towers are shown by black crosses in Fig.

The Met Office site at Cardington (29 m above sea level) is an 18 ha area laid mainly to manicured grass set within generally flat, semi-rural surroundings

The Redmere and Great Fen sites are located on lowland peat soils in the East Anglian Fens. Both sites are nodes of the UK Land Flux Network (UKLFN) operated by the UK Centre for Ecology and Hydrology (UKCEH). The Redmere site is cropland, producing maize and lettuce in 2016 and 2017, respectively. The Great Fen site is an area of extensively managed grassland. Instrumentation is identical at both locations, consisting of a Windmaster ultrasonic anemometer (Gill Instruments Ltd.) and a LI-7500A infrared gas analyser (LI-COR Biosciences, Ltd). Raw (20 Hz) EC (eddy covariance) data were reduced to 30 min flux densities using the EddyPRO v7.0.6 flux calculation software

In order to estimate the identified pedotransfer function parameters, we use the LAVENDAR data assimilation framework

Using a smoother instead of a filter has advantages

We show a schematic of how this system works in Fig.

Schematic of the LAVENDAR data assimilation framework, showing the workflow for the experiment. Here

We conducted our pedotransfer function parameter estimation for 2016 using all SMAP observations in this period. We also ran the prior and posterior JULES ensembles into 2017 so that we could evaluate the results against independent SMAP observations in a “hindcast” experiment and assess whether any skill added by the assimilation persisted into the future. For the 2016–2017 period, we then used the available COSMOS probe observations for validation, comparing both prior and posterior JULES soil moisture estimates to these observations. Using the COSMOS-UK observations in this way gives us a better understanding of whether information added by the assimilation of SMAP observations can help to improve model estimates at in situ scales.

The input to the data assimilation routine is an ensemble of 50 unique

Distributions of prior and posterior pedotransfer function parameters grouped by the term in the equations (Eq.

Maps showing the difference between the prior and posterior mean JULES model soil parameters, created by applying the prior and posterior PTFs to the HWSD maps of soil properties. Brown corresponds to a decrease in the soil parameter after data assimilation and green to an increase.

In Fig.

Map showing the difference between yearly mean soil moisture for the prior and posterior ensemble of JULES model runs in 2016. Blue corresponds to the posterior ensemble estimate being wetter, and red corresponds to the posterior being drier.

Figure

Map showing the difference between root mean squared error (RMSE) when JULES spatially aggregated estimates are compared to SMAP observations for the prior and posterior ensemble. Blue corresponds to reductions in RMSE for the posterior ensemble and red to an increase. Grid cells displaying stippling signify low-quality SMAP pixels, which have not been used in the assimilation procedure. Over the whole domain, we find an average reduction in RMSE of 20 % after data assimilation for 2016 and 21 % for 2017.

Time series of soil moisture for 52.96

Time series of soil moisture for 51.81

Spatially averaged RMSE and ensemble spread for JULES prior and posterior model estimate. Solid blue line: prior JULES RMSE, dashed blue line: prior JULES ensemble spread, solid orange line: posterior JULES RMSE, dashed orange line: posterior JULES ensemble spread. The dotted black line represents the end of the assimilation window and start of the hindcast period.
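The two diagnostics plotted in this figure can be illustrated with a minimal sketch. It assumes that RMSE is computed from the ensemble mean against the observations, that spread is the ensemble standard deviation, and that both are averaged only over pixels with a valid observation at each time step; these definitions, the array layout and the function name `rmse_and_spread` are illustrative assumptions, not details taken from the LAVENDAR code.

```python
import numpy as np

def rmse_and_spread(ensemble, obs):
    """Per-time-step spatial RMSE of the ensemble mean and mean ensemble spread.

    ensemble : array (n_members, n_times, n_pixels), model soil moisture
    obs      : array (n_times, n_pixels), NaN where no observation exists
    Both definitions are assumptions made for this sketch.
    """
    ens = np.asarray(ensemble, dtype=float)
    obs = np.asarray(obs, dtype=float)

    ens_mean = ens.mean(axis=0)            # (n_times, n_pixels)
    ens_std = ens.std(axis=0, ddof=1)      # sample std across members

    valid = np.isfinite(obs)
    sq_err = np.where(valid, (ens_mean - obs) ** 2, np.nan)
    spread = np.where(valid, ens_std, np.nan)

    # Average over pixels, ignoring locations without observations.
    rmse_t = np.sqrt(np.nanmean(sq_err, axis=1))
    spread_t = np.nanmean(spread, axis=1)
    return rmse_t, spread_t
```

Comparing the two curves gives a rough check on ensemble calibration: a spread far smaller than the RMSE suggests an overconfident ensemble.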

After performing the data assimilation procedure, we use the observation operator described in Sect.

The COSMOS-UK observations we have used for independent validation of the results are representative of depths from 14 cm up to around 40 cm. The SMAP satellite observations, used within the assimilation algorithm to find a new set of pedotransfer functions for the experiment domain, are representative of soil moisture for the top 2.5–5 cm of soil. The distinct improvement we find at in situ COSMOS probe locations after assimilation therefore indicates that, although the SMAP observations are only sensitive to shallow depths, combining them with the JULES model also improves estimates at deeper levels. The large errors in our prior JULES estimates for the COSMOS sites in Figs.

Time series of water budget variables and soil temperature at Cardington COSMOS site. Black plus signs: COSMOS-UK observations, grey crosses: SMAP observations for closest 9 km pixel, blue line: prior JULES estimate for closest 1 km grid cell, orange line: posterior JULES estimate for closest 1 km grid cell.

Time series of water budget variables and soil temperature at Morley COSMOS site. Black plus signs: COSMOS-UK observations, grey crosses: SMAP observations for closest 9 km pixel, blue line: prior JULES estimate for closest 1 km grid cell, orange line: posterior JULES estimate for closest 1 km grid cell.

Time series of water budget variables and soil temperature at Stoughton COSMOS site. Black plus signs: COSMOS-UK observations, grey crosses: SMAP observations for closest 9 km pixel, blue line: prior JULES estimate for closest 1 km grid cell, orange line: posterior JULES estimate for closest 1 km grid cell.

Time series of water budget variables and soil temperature at Redmere COSMOS site. Black plus signs: COSMOS-UK observations, grey crosses: SMAP observations for closest 9 km pixel, blue line: prior JULES estimate for closest 1 km grid cell, orange line: posterior JULES estimate for closest 1 km grid cell.

Summary statistics for comparison of JULES-CHESS soil moisture estimates to COSMOS probe observations over the experiment period. Over all sites, we find a 16 % increase in correlation, 16 % reduction in ubRMSE and 22 % reduction in RMSE after performing the calibration using LAVENDAR.
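The three summary statistics quoted here can be computed as in the following minimal sketch. It assumes the common definition of ubRMSE as the RMSE after removing the mean bias, and a Pearson correlation; the function names and the pairwise handling of missing values are illustrative assumptions rather than the code used for the paper.

```python
import numpy as np

def summary_stats(model, obs):
    """Return (correlation, RMSE, ubRMSE) for paired model/observation series.

    Pairs containing NaN are dropped. ubRMSE is taken as the RMSE of the
    bias-removed differences (an assumed, commonly used definition), so that
    RMSE**2 == bias**2 + ubRMSE**2.
    """
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    ok = np.isfinite(model) & np.isfinite(obs)
    m, o = model[ok], obs[ok]

    diff = m - o
    bias = diff.mean()
    rmse = np.sqrt(np.mean(diff ** 2))
    ubrmse = np.sqrt(np.mean((diff - bias) ** 2))
    corr = np.corrcoef(m, o)[0, 1]   # Pearson correlation
    return corr, rmse, ubrmse

def percent_change(prior, posterior):
    """Relative change of a statistic from prior to posterior, in percent."""
    return 100.0 * (posterior - prior) / prior
```

Under this definition a constant offset between model and observations contributes to RMSE but not to ubRMSE, which is why the two statistics can improve by different amounts.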

In this section, we compare our results to heat flux observations made at three flux tower sites during the experiment period. Although updating the soil parameters and soil moisture in our experiments will have an impact on the modelled heat fluxes, there are multiple model components that will affect the heat flux estimates (vegetation schemes, roughness length parameterisations, etc.), so that improving modelled soil moisture does not necessarily lead to improved modelled heat fluxes. However, if these other model components perform adequately, we should see some improvement in heat flux estimates from improved soil moisture predictions. In Figs.

Time series of heat flux variables and soil moisture at Cardington flux tower site. Black crosses: flux tower observations, grey crosses: SMAP observations for closest 9 km pixel, blue line: prior JULES estimate for closest 1 km grid cell, orange line: posterior JULES estimate for closest 1 km grid cell.

Time series of heat flux variables and soil moisture at Great Fen flux tower site. Black crosses: flux tower observations, grey crosses: SMAP observations for closest 9 km pixel, blue line: prior JULES estimate for closest 1 km grid cell, orange line: posterior JULES estimate for closest 1 km grid cell.

Time series of heat flux variables and soil moisture at Redmere flux tower site. Black crosses: flux tower observations, grey crosses: SMAP observations for closest 9 km pixel, blue line: prior JULES estimate for closest 1 km grid cell, orange line: posterior JULES estimate for closest 1 km grid cell.

Summary statistics for comparison of JULES-CHESS latent heat estimates to flux tower observations over the experiment period. Over all sites, we find a 34 % increase in correlation, 15 % reduction in ubRMSE and 26 % reduction in RMSE after performing the calibration using LAVENDAR.

Summary statistics for comparison of JULES-CHESS sensible heat estimates to flux tower observations over the experiment period. Over all sites, we find a 1 % increase in correlation, 16 % reduction in ubRMSE and 22 % reduction in RMSE after performing the calibration using LAVENDAR.

This study aimed to determine the suitability of satellite observations to optimise pedotransfer functions and improve soil moisture estimates for a land surface model. Currently, pedotransfer functions are calibrated through analyses of point soil samples, and it is unclear how these calibrations and their resultant soil model parameters relate to the varying spatial resolutions of modern land surface models. Adding information from satellite estimates into the calibration of pedotransfer functions should address a key uncertainty with respect to the larger scales of land surface model estimates.

We used the LAVENDAR hybrid data assimilation framework

The correlated nature of the PTF parameters in Eq. (

Within the DA procedure used to optimise the PTF parameters, there are uncertainties that have not been explicitly prescribed. There will be inherent bias and errors in both the observations and model. For SMAP, any bias contained in the observations could cause us to retrieve PTF parameters that result in erroneous soil hydraulic conductivities and ultimately degrade the performance of other model components. It has been shown that the Level-3 9 km SMAP observations used here do not have a significant bias

In the initial application of this technique, we have focused on a specific region at a high resolution. Here we have utilised 256 processors to run the JULES model ensemble, with each JULES run using the Message Passing Interface (MPI) to decompose the spatial domain of the model and split the computational load across multiple processors. In this setup, it has taken approximately 1.5 d to complete 100 JULES model runs, each covering 30 614 grid cells and 6 years (2016 to 2017, with a 4-year spin-up). In order to find a set of pedotransfer function parameters valid at the global scale, using the technique presented here, we would need to decrease the spatial resolution. Working at the scale of 0.5

Both SMAP and COSMOS-UK observations represent a valuable resource for validation and improvement of land surface models and could be utilised further. Our spatially aggregated observation operator, which compares SMAP 9 km estimates to JULES 1 km estimates, could possibly be improved: more signal may be coming from the centre of the satellite pixel, in which case we could weight these JULES model pixels more highly within the observation operator. In future work, it may also be beneficial to build towards a full radiative transfer scheme on top of JULES to assimilate the raw brightness temperature observations from the SMAP satellite, increasing the representativity between the observations and the model and reducing sources of bias that may be introduced by the use of ancillary data in the soil moisture retrieval. Other studies utilising different land surface models have shown this works well
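The idea of a spatially aggregated observation operator, with an optional heavier weighting towards the pixel centre, can be sketched as follows. This is a simplified illustration, not the operator used in the experiments: it assumes each 9 km pixel is covered by an aligned 9 × 9 block of 1 km model cells on a regular grid, and the Gaussian weighting, its width `sigma` and the function name `aggregate_to_coarse` are hypothetical choices made for the example.

```python
import numpy as np

def aggregate_to_coarse(field, block=9, sigma=None):
    """Aggregate a 2-D fine-grid field to coarse pixels of block x block cells.

    sigma=None  -> equal weights (plain block mean).
    sigma=float -> Gaussian weights centred on each coarse pixel, with
                   standard deviation `sigma` in fine-grid cells (a
                   hypothetical weighting chosen for illustration).
    NaN cells (e.g. sea or masked points) are excluded from the average.
    """
    field = np.asarray(field, dtype=float)
    ny, nx = field.shape
    if ny % block or nx % block:
        raise ValueError("field shape must be a multiple of block")

    if sigma is None:
        w = np.ones((block, block))
    else:
        off = np.arange(block) - (block - 1) / 2.0   # distance from centre
        d2 = off[:, None] ** 2 + off[None, :] ** 2
        w = np.exp(-0.5 * d2 / sigma ** 2)

    # View the field as (coarse_y, block, coarse_x, block).
    blocks = field.reshape(ny // block, block, nx // block, block)
    wb = w[None, :, None, :]
    valid = np.isfinite(blocks)

    num = (np.where(valid, blocks, 0.0) * wb).sum(axis=(1, 3))
    den = (valid * wb).sum(axis=(1, 3))
    with np.errstate(invalid="ignore", divide="ignore"):
        return num / den   # NaN where a coarse pixel has no valid cells
```

With `sigma=None` the function reduces to a plain block mean. A finite `sigma` gives more weight to model cells near the pixel centre; whether such a weighting reflects the true sensitivity of the SMAP footprint would need to be established separately.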

In this paper, we have focused on the optimisation of pedotransfer function parameters to improve estimates of water balance from land surface models. In other regions across the globe where underlying soil texture maps are highly uncertain, it may be necessary to also consider optimising estimates of soil properties per grid cell, given satellite and in situ observations

We have presented novel methods for calibrating pedotransfer functions used to create the soil parameter ancillaries of a land surface model by using satellite data from the NASA SMAP mission. After the retrieval of an optimised parameter set, using new hybrid data assimilation techniques, we find an average 20 % reduction in error for JULES model estimates of soil moisture when compared to SMAP satellite estimates. There are still areas which remain problematic such as working over urban locations and peatlands. These will require additional modelling efforts and new model components. The resultant posterior pedotransfer functions also improve the prediction of soil moisture and heat fluxes for the JULES land surface model when compared to independent in situ estimates from the COSMOS-UK network and three flux tower sites. At 11 COSMOS-UK research sites distributed across the experiment domain, we find an average 16 % increase in correlation, 16 % reduction in ubRMSE and a 22 % reduction in RMSE for the posterior pedotransfer functions compared to the prior.

In this Appendix we summarise the process to get the analysis (or posterior) ensemble of extended state variables (variables and parameters). In the case of this paper, the variables and parameters correspond to the 15 PTF parameters in Table

Let us start with a background ensemble of

To reduce the difficulty in finding the ensemble analysis mean, we use an incremental and preconditioned algorithm. “Incremental” means that we express the analysis mean as a perturbation from the background mean, i.e.

In practice, we do not compute the linearised version of JULES. Instead one can define statistics in the observation space in the following manner. The background ensemble of

Computing the minimum of the cost function (

In our case, the matrix square root is computed via Cholesky decomposition. Finally the posterior ensemble of

The code used in experiments is available from the MetOffice JULES repository (

The supplement related to this article is available online at:

EP designed the data assimilation system and conducted all experiments with input from all co-authors. JA contributed to the underlying mathematical framework. EC wrote the algorithm to compare JULES soil moisture to cosmic-ray probe measurements. ER processed the HWSD and built the initial system for relating soil textural information to the parameters of JULES used in these experiments. EP prepared the manuscript with input from all co-authors.

The authors declare that they have no conflict of interest.

This work was funded by the UK Natural Environment Research Council's Hydro-JULES project (NE/S017380/1). The contribution of Tristan Quaife and Javier Amezcua was funded via the UK National Centre for Earth Observation (NCEO) at the University of Reading (grant no. nceo020004). The authors gratefully acknowledge the provision by UKCEH of hydrometeorological and soil data collected by the COSMOS-UK project.

This research has been supported by the UK Natural Environment Research Council (grant no. NE/S017380/1).

This paper was edited by Harrie-Jan Hendricks Franssen and reviewed by Roland Baatz, Long Zhao, and one anonymous referee.