SMOS near-real-time soil moisture product: processor overview and first validation results

Measurements of the surface soil moisture (SM) content are important for a wide range of applications. Among them, operational hydrology and numerical weather prediction need soil moisture information in near-real-time (NRT), typically not later than 3 hours after sensing. The European Space Agency (ESA) Soil Moisture and Ocean Salinity (SMOS) satellite is the first mission specifically designed to measure soil moisture from space. The ESA Level 2 SM retrieval algorithm is based on detailed geophysical modelling and cannot provide SM in NRT. This paper presents the new ESA SMOS NRT SM product. It uses a neural network (NN) to provide SM in NRT. The NN inputs are SMOS brightness temperatures for horizontal and vertical polarizations and incidence angles from 30° to 45°. In addition, the NN uses surface soil temperature from the European Centre for Medium-Range Weather Forecasts (ECMWF) Integrated Forecast System (IFS). The NN was trained on SMOS Level 2 SM. The swath of the NRT SM retrieval is somewhat narrower (∼915 km) than that of the L2 SM dataset (∼1150 km), which implies a slightly longer revisit time. The new SMOS NRT SM product was compared to the SMOS Level 2 SM product. The NRT SM data show a standard deviation of the difference with respect to the L2 data of < 0.05 m³ m⁻³ over most of the globe and a Pearson correlation coefficient higher than 0.7 in large regions of the globe. The NRT SM dataset does not show a global bias with respect to the L2 dataset but can show local biases of up to 0.05 m³ m⁻³ in absolute value. The two SMOS SM products were evaluated against in situ measurements of SM from more than 120 sites of the SCAN (Soil Climate Analysis Network) and the USCRN (United States Climate Reference Network) networks in North America. The NRT dataset obtains similar but slightly better results than the L2 data.
In summary, the neural network SMOS NRT SM product exhibits performances similar to those of the Level 2 SM product but it has the advantage of being available in less than 3.5 hours after sensing, complying with NRT requirements. The new product is processed at ECMWF and it is distributed by ESA and via the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT) multicast service (EUMETCast).


Introduction
Surface soil moisture (SM) represents less than 0.001 % of the global freshwater budget by volume but it plays an important role in the water, carbon and energy cycles (Lahoz and De Lannoy, 2014). SM is the water reservoir for plants and agriculture and it affects the propagation of diseases such as malaria (Montosi et al., 2012; Peters et al., 2014). The amount of moisture in the soil is an important variable to understand the coupling of the continental surface and the atmosphere (Koster et al., 2004; Seneviratne et al., 2006; Tuttle and Salvucci, 2016). Surface SM softens the effect of precip[...]
Published by Copernicus Publications on behalf of the European Geosciences Union.
Regarding flood forecasting, in the framework of the European Flood Awareness System (EFAS) the forecast accuracy improves significantly (5-10 %) when remotely sensed SM is assimilated in addition to discharge data (Wanders et al., 2014). SM initial conditions are among the most important hydrological properties affecting flash flood triggering (Norbiato et al., 2008; Ponziani et al., 2012). The assimilation of SM products from the Advanced Scatterometer (ASCAT) has been successfully used in the context of flash flood early-warning systems in Mediterranean catchments (Cenci et al., 2016).
In addition to operational hydrology applications, operational numerical weather prediction also benefits from remotely sensed SM data assimilation. Meteorological agencies such as the European Centre for Medium-Range Weather Forecasts (ECMWF) and the United Kingdom Met Office assimilate ASCAT surface SM into their operational numerical weather prediction models (de Rosnay et al., 2013; Dharssi et al., 2011). The approach has also been tested in offline mode at Météo-France (Barbu et al., 2014). To be useful for operational applications, remotely sensed data should be available in near-real-time (typically less than 3-4 h after sensing, hereafter NRT).
The Soil Moisture and Ocean Salinity (SMOS) European Space Agency (ESA) satellite (Kerr et al., 2010) is the first instrument that has been specifically designed to measure SM from space. It carries an L-band (1.4 GHz) radiometer to perform full-polarization and multi-angular (0-60°) measurements of the Earth's thermal emission. ECMWF uses SMOS NRT brightness temperatures (Tb) in its operational integrated forecasting system (IFS; Muñoz Sabater et al., 2012). The ESA SMOS operational Level 2 SM retrieval algorithm is based on a point-per-point iterative minimization of the difference between a physical model and the satellite measurements (Kerr et al., 2012). The free parameters are the SM content and the 1.4 GHz opacity, which is mainly due to the water content of the vegetation between the soil surface and the sensor (which some authors refer to as VOD, vegetation optical depth).
Many studies have evaluated the SMOS L2 SM dataset in comparison to other remote-sensing datasets, models and in situ measurements (Al Bitar et al., 2012; Wanders et al., 2012; Albergel et al., 2012; Bircher et al., 2013; Al-Yaari et al., 2014a, b; Leroux et al., 2014; Louvet et al., 2015; Kerr et al., 2016). SMOS shows very good global performance although other remote-sensing and model products can perform better at some sites. In any case, datasets from the only two instruments specifically conceived to measure SM, SMOS and NASA's Soil Moisture Active Passive (SMAP), compare very well with each other (Jackson et al., 2016; Burgin et al., 2017).
As already mentioned, most operational users over land, in particular in numerical weather prediction and operational hydrology, require SM information to be available in NRT, typically less than 3-4 h after sensing. For instance, ASCAT SM data are distributed by the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT) within 135 min after data acquisition (Wagner et al., 2013), which allows the data to be assimilated by operational numerical weather prediction centres such as ECMWF (de Rosnay et al., 2013). In the case of the current SMOS ground segment, the production of Level 1C Tb data from raw data typically takes 1 h of processing time and the Level 2 SM inversion up to 80 min for a half-orbit. However, due to data handling operations, the synchronization of some operations and the dissemination orchestration, the total latency from data acquisition to SM dissemination is of the order of 6 h. Therefore, this processing chain does not fulfil the NRT requirements. SMOS Tb measurements, however, are provided in NRT to ECMWF. In addition, with 6 years of SMOS measurements available, statistical algorithms can be exploited to provide faster retrievals, and neural networks (NNs) have been shown to be a promising technique to generate a SM dataset from SMOS Tb measurements (Rodríguez-Fernández et al., 2015). Based on the latter, a NN processing chain to provide SMOS SM in NRT has been implemented by ESA. The requirements are that the NRT dataset should display at least the same accuracy as the geophysical Level 2 SM data product, the data should be retrieved over a large swath and the retrieval should rely on a minimum of auxiliary data files. As with the ASCAT NRT SM processing chain cited above, the SMOS NRT SM chain handles model parameters derived offline using a database with a large number of past observations. The advantages are that the processing is robust and very fast. If the Level 1 data used as input change significantly, the model parameters should be updated correspondingly.
The new SMOS NRT SM product is available from 2016 onwards and it is distributed through the World Meteorological Organization's Global Telecommunication System (GTS) and the EUMETCast service from EUMETSAT in NetCDF format. EUMETCast is a dissemination system that uses commercial telecommunication geostationary satellites and research networks to multi-cast data files to a wide user community. This paper describes the SMOS NRT SM processing chain and discusses the first evaluation results. It is organized as follows. Section 2 describes the data used for the implementation and the validation of the SMOS NRT SM product. Section 3 discusses the NRT SM processing chain (more details are given in the Appendix). Section 4 contains a description of the methods used to evaluate the NRT SM product. Section 5 presents the evaluation results. Finally, a summary is presented in Sect. 6.
Hydrol. Earth Syst. Sci., 21, 5201-5216, 2017, www.hydrol-earth-syst-sci.net/21/5201/2017/

Data
SMOS satellite
SMOS (Mecklenburg et al., 2012; Kerr et al., 2010) measures the thermal emission from the Earth at a frequency of 1.4 GHz in full polarization and for incidence angles from 0 to ∼60°. The full incidence-angle range is accessible in the centre of the swath. In contrast, only angles in the 40-45° range are accessible all across the swath. SMOS has 69 antennas to perform interferometry and synthesize an aperture of ∼7.5 m (Anterrieu and Khazaal, 2008). The spatial resolution on the ground, defined as the projection of the full width at half maximum of the synthesized beam, is 43 km on average over the field of view (Kerr et al., 2010). The satellite follows a sun-synchronous polar orbit with 06:00 LST (local solar time) (ascending half-orbit) and 18:00 LST (descending half-orbit) Equator overpass times.

SMOS level 2 SM
The SMOS Level 2 processor performs a detailed modelling of the Earth's emission at 1.4 GHz at two polarizations and a large number of incidence angles using the τ-ω (optical depth, single scattering albedo) approach to account for the interaction of L-band radiation with the vegetation (Wigneron et al., 2007). The ground effective temperature is computed from the soil temperature at a deep layer (∼1 m) and the surface layer (a few centimetres), using the formulation of Choudhury et al. (1982) with the parameterization by Wigneron et al. (2001). The soil temperature for those two layers is taken from ECMWF IFS model simulations.
For each grid node, the surface is modelled with 4 km × 4 km cells taking into account different land covers. Then the processor computes the contributions of those cells within 123 km × 123 km, accounting for the projection of the SMOS synthesized antenna power pattern on the Earth surface, to model SMOS-like Tb values. The vegetation optical depth (τ) and the SM content are free parameters that are allowed to vary to minimize the difference between the simulated Tb values and the SMOS Level 1C Tb measurements. In the case of forest, two contributions to the opacity are taken into account: one from the trees, which is estimated from the maximum leaf area index (LAI; Ferrazzoli et al., 2002), and another from the understory vegetation. Soil temperature is obtained from ECMWF IFS data. For footprints with mixed land cover, the SM content of the minor land cover is estimated from ECMWF IFS and its contribution to the Tb is fixed. For such cases, the SMOS SM retrieval is only performed for the dominant land cover class within the footprint (Kerr et al., 2012). ESA Level 2 SM data are provided in an icosahedral equal area (ISEA) 4H9 grid (Sahr et al., 2003) with a sampling space of 15 km.
The version of the SMOS L2 SM dataset used in this study is v620, which became operational in May 2015. In order to have enough data for a robust training of the NN, an additional dataset from 1 June 2010 to 30 June 2012 was reprocessed with the same version v620 of the L2 SM algorithm. The evaluation of the NRT SM product has been done from May 2015 to the time of the NRT SM implementation (end of 2015).

SMOS NRT SM
The SMOS NRT SM product was obtained by training a NN using SMOS Tb measurements and soil temperature from ECMWF models as input. SMOS Tb measurements are provided by ESA to ECMWF in NRT (less than 3 h after sensing) for operational monitoring within the IFS (Muñoz-Sabater, 2015).
The training dataset used for the supervised learning phase of the NN was the SMOS Level 2 SM product. The training was done using data from June 2010 to June 2012 and it is described in Sect. 3. The NRT SM processing chain was evaluated using data from May to November 2015. Taking into account the satisfactory results of the evaluation (presented in Sect. 5), the SMOS NRT SM product became operational in January 2016. The SMOS NRT SM product is computed at ECMWF and delivered to ESA, where the data are sent to EUMETSAT for dissemination via EUMETCast. SMOS NRT SM data can also be accessed via the SMOS online dissemination service from the ESA Earth Online portal. The SMOS NRT SM data are provided in NetCDF files in the same ISEA 4H9 grid as other ESA SMOS products. The version of the SMOS NRT SM data used in this study is version 100. More details on the NRT SM processor are presented in Sect. 3 and in the Appendix.

In situ SM measurements
The SMOS NRT SM product was evaluated against in situ measurements of SM over a large number of sites. The same evaluation was done with the Level 2 SM product. The in situ data used for those evaluations are described below.
The Soil Climate Analysis Network (SCAN) of the United States Department of Agriculture (Schaefer et al., 2007) has been widely used to evaluate modelled and remote-sensing SM datasets and it contains over 100 sensors or sites. The sensors are located in agricultural regions with a relatively homogeneous landscape in many cases. The sensors used in this study are placed horizontally at 5 cm depth.
The United States Climate Reference Network (USCRN; Bell et al., 2013) is a network of climate monitoring stations with sites across the USA, managed and maintained by the National Oceanic and Atmospheric Administration (NOAA). This network was designed with climate science in mind. The stations are placed in pristine environments expected to be free of development for many decades. There are around 140 stations with sensors at different depths. The sensors used in this study are horizontally installed at 5 cm.
The in situ data have been obtained directly from the teams operating both networks but these datasets are also available from the International Soil Moisture Network (Dorigo et al., 2011).

The SMOS NRT SM processor
The SMOS NRT SM processor is based on a NN approach to retrieve SM from SMOS observations. Rodríguez-Fernández et al. (2015) showed that SMOS Tb values binned in 5°-wide incidence-angle bins (the L3TB product; Al Bitar et al., 2017) can be used to retrieve SM on a daily basis. They used ECMWF SM model fields as reference dataset to train the NN. Rodríguez-Fernández et al. (2013) had previously shown that SMOS Level 3 SM (Al Bitar et al., 2017) can also be used as reference data to train the NN instead of ECMWF modelled SM fields. They also showed that the additional input dataset with the most significant impact on the retrieval is the soil temperature dataset. In the context of the ESA operational NRT SM processor, the goal was to obtain a SM dataset as similar as possible to the ESA Level 2 SM dataset but in NRT. Therefore, the ESA SMOS Level 2 SM dataset (Kerr et al., 2012) was used as reference dataset for the training phase. Finally, taking into account operational constraints, the only complementary data used for the retrieval are soil temperature estimations for the first layer (0-7 cm) of ECMWF models, which is the complementary dataset with the most significant impact (∼3 %) on the retrieval (Rodríguez-Fernández et al., 2013). This dataset was chosen because it is the same model data used by the Level 2 SM algorithm (Sect. 2).

Input data
The inputs to the SMOS NRT SM processor are SMOS NRT Tb values, which are distributed by ESA to ECMWF in BUFR (binary universal form for the representation of meteorological data) format (Gutierrez and Canales Molina, 2010; de Rosnay et al., 2012) [...] 2016), using these three angle bins is the best trade-off between performance (which improves with a large angular signature) and swath width of the retrieval (which decreases with an increasing number of angle bins used). With this configuration SM is retrieved in the central 914 km of the swath (the maximum possible swath is ∼1150 km). A SM retrieval can only be done if there is a well-defined value of the Tb for all three angle bins and the two polarizations H and V. The current implementation of the NRT SM processor does not perform any interpolation of the Tb vs. incidence-angle profiles.
Using the SMOS Tb values measured at a time t for a given latitude (λ) and longitude (φ) grid point and for each polarization and incidence-angle bin, Tb_λφ(t), a local normalized index I can be computed from the local extreme Tb values and the associated L2 SM values. The extreme values have been computed using data from 1 June 2010 to 30 June 2012 (the same period used to train the NN; see Sect. 3.2). This linear expectation index is inspired by the approach used to retrieve SM with scatterometers such as ASCAT (Wagner et al., 1999; Bartalis et al., 2007) and helps to improve the performance of the NN retrieval (Rodríguez-Fernández et al., 2015).
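The expression of the index is not reproduced in the text above. As a minimal sketch, assuming the index is a plain linear interpolation between the SM values associated with the local minimum and maximum Tb (an assumption suggested by the term "linear expectation index", not a confirmed formula), it could be computed as follows. The function name and argument names are hypothetical.

```python
import numpy as np

def local_index(tb, tb_min, tb_max, sm_at_tb_min, sm_at_tb_max):
    """Hypothetical linear index: map Tb onto the SM range of the local extremes.

    tb           : measured Tb at one grid point, polarization and angle bin (K)
    tb_min/max   : local extreme Tb values from the reference period (K)
    sm_at_tb_*   : L2 SM values associated with those extreme Tb values
    """
    tb = np.asarray(tb, dtype=float)
    span = tb_max - tb_min
    if span == 0:
        # The index is undefined when the two extremes coincide.
        return np.full_like(tb, np.nan)
    fraction = (tb - tb_min) / span
    return sm_at_tb_min + fraction * (sm_at_tb_max - sm_at_tb_min)
```

For example, with extremes of 200 K (wet, SM = 0.45) and 280 K (dry, SM = 0.05), a measurement of 250 K sits 62.5 % of the way towards the dry extreme and maps to an index of 0.20.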
The only auxiliary data used by the SMOS SM NRT processor are snow depth and soil temperature from the latest high-resolution forecast produced by the ECMWF IFS, with a typical latency of less than 1 h. The ECMWF IFS soil temperature in the 0-7 cm layer is used as input to the NN, as it increases the performance of the retrieval (Rodríguez-Fernández et al., 2016). SMOS data from a given grid point are not used if snow is found at that point based on the latest ECMWF snow depth forecast field or if the soil temperature forecast of the top 7 cm of soil is below 274 K. The SMOS NRT SM dataset is a land-only product and a SM retrieval is not provided if more than 50 % of the SMOS footprint is covered by water. This filter avoids spurious SM values near the coastlines, for instance. Of course, even if less than 50 % of the footprint surface is covered by water, the SM retrieval can still be affected by the free water. Users interested in regions close to the coast or to water bodies are advised to use the land-sea mask available on the ESA SMOS data products portal. This mask was computed from the 1 km USGS (US Geological Survey) land-sea mask aggregated into the ISEA grid common to Level 1 and Level 2 SMOS products.
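The three screening rules above can be sketched as a boolean mask. This is an illustration only: the function and argument names are hypothetical, and treating any positive forecast snow depth as "snow found" is an assumption, since the text does not give a snow threshold.

```python
import numpy as np

def retrieval_mask(snow_depth_m, soil_temp_k, water_fraction):
    """Return True where a SM retrieval would be attempted.

    snow_depth_m   : ECMWF snow depth forecast (m); > 0 is taken as snow (assumed)
    soil_temp_k    : ECMWF 0-7 cm soil temperature forecast (K)
    water_fraction : fraction of the footprint covered by water (0 to 1)
    """
    snow_depth_m = np.asarray(snow_depth_m, dtype=float)
    soil_temp_k = np.asarray(soil_temp_k, dtype=float)
    water_fraction = np.asarray(water_fraction, dtype=float)

    no_snow = snow_depth_m <= 0.0          # grid points with snow are rejected
    not_frozen = soil_temp_k >= 274.0      # soil temperature below 274 K is rejected
    mostly_land = water_fraction <= 0.5    # more than 50 % water is rejected
    return no_snow & not_frozen & mostly_land
```

A grid point passes only when all three conditions hold, so a single mask can be applied to a whole half-orbit at once.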

The NN processor
The HV angle-binned Tb values have been collocated with ECMWF IFS forecasts for the soil temperature and snow cover and finally they have been collocated with version 620 SMOS L2 SM data (Kerr et al., 2012) in the 1 June 2010 to 30 June 2012 period. As discussed above, a local normalized index I has been computed from extreme Tb values and the associated L2 SM. In addition to the filters discussed above (hard RFI, Sun tails, frozen or snow-covered soil, water fraction), the following filters have also been applied to compute the extreme value tables and to train the NN: the latitude is limited to the (−60°, 75°) range; a SMOS L2 SM value associated with the maximum or minimum Tb is required (otherwise I cannot be defined); and the SM uncertainty provided by the DQX (data quality index) parameter in L2 SM data files is required to be lower than 0.06 m³ m⁻³, so that only the most reliable data are used for the training.
The input vectors contain the Tb values and the I indices for the H and V polarizations and the three angle bins from 30 to 45° (six Tb values and six indices), plus the soil temperature of the 0-7 cm layer from the ECMWF IFS. Therefore, input vectors have a total of 13 elements. All 13 elements must be well-defined to train the NN and there must be a well-defined, associated SM value.
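As a minimal sketch of how such a 13-element vector could be assembled, the snippet below stacks six Tb values, six indices and the soil temperature, and rejects the vector if any element is not well-defined. The function name and the ordering of the elements are assumptions made for illustration; the text does not specify an ordering.

```python
import numpy as np

def build_input_vector(tb, index, soil_temp_k):
    """Assemble one NN input vector, or return None if any element is undefined.

    tb, index   : arrays of shape (2, 3) = (polarization H/V, three 5-degree bins
                  between 30 and 45 degrees)
    soil_temp_k : ECMWF IFS 0-7 cm soil temperature (K)
    """
    tb = np.asarray(tb, dtype=float)
    index = np.asarray(index, dtype=float)
    if tb.shape != (2, 3) or index.shape != (2, 3):
        raise ValueError("tb and index must have shape (2, 3)")
    # 6 Tb values + 6 indices + 1 soil temperature = 13 elements (ordering assumed).
    vector = np.concatenate([tb.ravel(), index.ravel(), [float(soil_temp_k)]])
    return vector if np.all(np.isfinite(vector)) else None
```

Returning None for an incomplete vector mirrors the requirement that no retrieval is made unless all angle bins and both polarizations are available.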
One-fifth of the vectors in the training database were selected randomly to have ∼3 × 10⁵ vectors. A subset of 60 % of those vectors is used for the actual training; 20 % is used to evaluate the NN performance during the training and to avoid over-training; the final 20 % is used to test the performance of the trained NN a posteriori. Gradient back-propagation and minimization with the Levenberg-Marquardt algorithm have been used. One single hidden layer with 5 neurons has been used, as it has been shown by Rodríguez-Fernández et al. (2016) that it is enough to capture the relationship between the input data and the reference SM, while keeping the NN as simple as possible. No signs of overtraining were found and the training was stopped after 50 iterations when the mean squared difference was asymptotically approaching a minimum. When the trained NN was applied to the test subset and the NN output was compared to the SMOS L2 SM, the Pearson correlation R was 0.86, the standard deviation of the difference (STDD) was 0.068 m³ m⁻³ and the root mean square error or difference (RMSE) was also 0.068 m³ m⁻³, which implies that there was not a significant bias between both SM datasets. These results show that the ability of the NN to capture the dynamics of the current L2 SM dataset is very good. The evaluation results discussed in Sect. 5 below confirm that the quality of the NRT SM NN product fulfils the specifications of the operational product.
NRT SM NN uncertainties were computed by error propagation through the NN, taking into account the error of the Tb measurements used as input, as explained in the Appendix.
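The data split and the network shape described above can be sketched as follows. This is not the operational code: the tanh hidden activation and the linear output are assumptions (the text does not name the activation functions), the function names are hypothetical, and the Levenberg-Marquardt training itself is not shown.

```python
import numpy as np

def split_indices(n_samples, rng, fractions=(0.6, 0.2, 0.2)):
    """Randomly split sample indices into training, validation and test subsets."""
    permutation = rng.permutation(n_samples)
    n_train = int(round(fractions[0] * n_samples))
    n_val = int(round(fractions[1] * n_samples))
    train = permutation[:n_train]
    validation = permutation[n_train:n_train + n_val]
    test = permutation[n_train + n_val:]
    return train, validation, test

def forward(x, w1, b1, w2, b2):
    """One hidden layer (5 neurons when w1 has shape (13, 5)) and a scalar output.

    x  : (n_samples, 13) input vectors
    w1 : (13, 5) hidden weights, b1 : (5,) hidden biases
    w2 : (5,) output weights,   b2 : scalar output bias
    """
    hidden = np.tanh(x @ w1 + b1)   # assumed activation
    return hidden @ w2 + b2         # assumed linear output: one SM value per sample
```

With 13 inputs and 5 hidden neurons, the network has 13 × 5 + 5 + 5 + 1 = 76 free parameters, which is small compared with the ∼1.8 × 10⁵ vectors in the 60 % training subset.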
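The Appendix is not part of this section, so the exact formulation is not shown here. As a generic illustration of first-order error propagation through a model, assuming uncorrelated input errors and a numerically estimated gradient (both assumptions, and not necessarily the approach of the Appendix), one could write the following. The function name is hypothetical.

```python
import numpy as np

def propagate_uncertainty(model, x, sigma_x, eps=1e-4):
    """First-order uncertainty of model(x) for uncorrelated input errors.

    model   : callable mapping a 1-D input vector to a scalar
    x       : input vector
    sigma_x : 1-sigma uncertainty of each input (0 for inputs treated as exact)
    """
    x = np.asarray(x, dtype=float)
    sigma_x = np.asarray(sigma_x, dtype=float)
    gradient = np.empty_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        # Central finite difference for the partial derivative along input i.
        gradient[i] = (model(x + step) - model(x - step)) / (2.0 * eps)
    return float(np.sqrt(np.sum((gradient * sigma_x) ** 2)))
```

Setting the uncertainty of the soil temperature element to zero would restrict the propagation to the Tb errors, which is the case described in the text.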

SMOS NRT SM processor output
The SMOS NRT SM product is a land-only product, collocated and delivered in the ISEA 4H9 grid (Sahr et al., 2003) common to other ESA SMOS products. The main characteristics of the product and the description of the fields are presented in Muñoz-Sabater et al. (2016). The processor output fields are shown in Table 1. Figure 1 shows the NRT SM product and its associated uncertainty for a portion of an orbit on 27 May 2012.

Methods

Global evaluation
Several metrics have been used to evaluate the NRT SM dataset from 15 May to 25 November 2015 against the SMOS L2 SM dataset. For all grid points λφ, the temporal means of both SM datasets, SM^L2_λφ and SM^NRT_λφ, have been computed using only the times t_i for which a well-defined value is present simultaneously in both datasets. The number of such times is in principle different for each λφ grid point, but it will be noted as N_t in the following instead of N_t^λφ to simplify the notation.
A bias map has been computed from the local (λ and φ) means of the two datasets. In order to compare the temporal dynamics of the two datasets, the Pearson correlation R has also been computed, with the sums running over all the N_t points available at a given position λφ. The absolute values of the two datasets have been evaluated using the standard deviation of the difference as a metric: the local difference time series D of the two datasets was formed at each grid point and its standard deviation (STDD) was computed. In some studies, the STDD is calculated indirectly from the bias and the root mean squared difference (RMSD) and called unbiased RMSD (ubRMSD).
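The expressions for these metrics are not reproduced in the text above. A set of standard definitions consistent with the description would be the following. The sign convention of the bias (NRT minus L2, suggested by the negative values reported where the NRT product underestimates L2) and the 1/N_t normalization of the STDD are assumptions rather than confirmed forms.

```latex
\begin{align}
\overline{SM}^{X}_{\lambda\phi} &= \frac{1}{N_t}\sum_{i=1}^{N_t} SM^{X}_{\lambda\phi}(t_i),
  \qquad X \in \{\mathrm{L2}, \mathrm{NRT}\} \\
B_{\lambda\phi} &= \overline{SM}^{\mathrm{NRT}}_{\lambda\phi} - \overline{SM}^{\mathrm{L2}}_{\lambda\phi} \\
R_{\lambda\phi} &= \frac{\sum_{i=1}^{N_t}
  \left(SM^{\mathrm{NRT}}_{\lambda\phi}(t_i) - \overline{SM}^{\mathrm{NRT}}_{\lambda\phi}\right)
  \left(SM^{\mathrm{L2}}_{\lambda\phi}(t_i) - \overline{SM}^{\mathrm{L2}}_{\lambda\phi}\right)}
  {\sqrt{\sum_{i=1}^{N_t}\left(SM^{\mathrm{NRT}}_{\lambda\phi}(t_i) - \overline{SM}^{\mathrm{NRT}}_{\lambda\phi}\right)^2}\,
   \sqrt{\sum_{i=1}^{N_t}\left(SM^{\mathrm{L2}}_{\lambda\phi}(t_i) - \overline{SM}^{\mathrm{L2}}_{\lambda\phi}\right)^2}} \\
D_{\lambda\phi}(t_i) &= SM^{\mathrm{NRT}}_{\lambda\phi}(t_i) - SM^{\mathrm{L2}}_{\lambda\phi}(t_i) \\
\mathrm{STDD}_{\lambda\phi} &= \sqrt{\frac{1}{N_t}\sum_{i=1}^{N_t}
  \left(D_{\lambda\phi}(t_i) - \overline{D}_{\lambda\phi}\right)^2} \\
\mathrm{RMSD}_{\lambda\phi}^2 &= \mathrm{STDD}_{\lambda\phi}^2 + B_{\lambda\phi}^2
\end{align}
```

With these definitions the mean of D equals the bias B, so the last relation is an identity. It is the reason why an RMSE equal to the STDD (both 0.068 m³ m⁻³ on the NN test subset) implies a negligible bias, and why the STDD is sometimes called the unbiased RMSD.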

Local evaluation against in situ measurements
Evaluating remote-sensing measurements against in situ measurements is a difficult exercise. The spatial resolution of coarse-scale remote-sensing observations (∼40 km) is very different from that of point-like measurements by in situ sensors. The large-scale spatial representativeness of the in situ measurements is not guaranteed (see for instance Gruber et al., 2012). In addition, the depth of the microwave emitting layer can be different from the sensing depth of the in situ measurements. The goal of the current study is not to deal with these open issues but to compare two different retrieval approaches using the same instrument. Spatial representativeness or sensing depth differences will therefore not affect the comparison. More detailed evaluations of SMOS SM retrievals can be found in the following references: Al Bitar et al. (2017), Kerr et al. (2016), Leroux (2012), Van der Schalie et al. (2016) and Jackson et al. (2012).
The SMOS NRT SM and the SMOS L2 SM datasets have been evaluated against the in situ measurements discussed in Sect. 2 in a consistent manner. First, for each station available, a quality check of the data was performed. Sites with suspicious data (e.g. measurement discontinuity, spurious jumps) were eliminated. The locations of the 127 retained stations are shown in Fig. 4.
The same metrics discussed in the previous section have been computed for the NRT SM dataset with respect to the in situ measurements and for the L2 SM dataset with respect to the in situ measurements. The Pearson correlation was used to compare temporal dynamics of the two SM datasets. The long-term (seasonal) dynamics were compared by computing the Pearson correlation coefficient R of SM_L2 and SM_NRT with respect to SM_inSitu, site per site. In addition, the short-term (1-30 days) dynamics were evaluated by computing site per site the Pearson correlation of the anomaly time series. Following Albergel et al. (2009), the SM anomaly at a given time t, SM_a(t), was computed relative to SM(t − 15, t + 15), the ensemble of measurements in a 31-day window centred at t. The Pearson correlation coefficient R computed with the anomaly time series will be referred to as R_a in the following.
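The anomaly expression is not reproduced in the text above. A minimal sketch, assuming the form commonly attributed to Albergel et al. (2009), in which the value at time t is centred on the window mean and scaled by the window standard deviation, is given below. The function name, the use of the population standard deviation and the minimum number of points per window are assumptions made for illustration.

```python
import numpy as np

def sm_anomaly(days, sm, half_window=15, min_points=5):
    """Anomaly of each SM value relative to a window centred on its own date.

    days        : observation times in days (not necessarily regular)
    sm          : SM values, NaN where no retrieval is available
    half_window : 15 days on each side gives the 31-day window of the text
    min_points  : hypothetical minimum number of valid values in the window
    """
    days = np.asarray(days, dtype=float)
    sm = np.asarray(sm, dtype=float)
    anomalies = np.full(sm.shape, np.nan)
    valid = np.isfinite(sm)
    for i, t in enumerate(days):
        in_window = valid & (np.abs(days - t) <= half_window)
        if not valid[i] or in_window.sum() < min_points:
            continue
        window = sm[in_window]
        spread = window.std()
        if spread > 0:
            anomalies[i] = (sm[i] - window.mean()) / spread
    return anomalies
```

The correlation R_a is then the Pearson correlation between the anomaly series of a satellite product and that of the in situ sensor, restricted to dates where both anomalies are defined.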
The metrics were computed independently for the NRT and the L2 datasets in a first step.In a second step, the metrics were recomputed only using times for which both the NRT and the L2 were simultaneously available, and thus, using the same number of points for the two time series.
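The second step, restricting a comparison to times where both series are defined, can be sketched as below. The function name is hypothetical and the bias is taken here as the first series minus the second, which is an assumption: the Table 2 caption uses the in situ mean minus the SMOS mean, so the argument order should be chosen accordingly.

```python
import numpy as np

def pair_metrics(sm_a, sm_b):
    """Bias, Pearson R, STDD and RMSD using only times valid in both series."""
    sm_a = np.asarray(sm_a, dtype=float)
    sm_b = np.asarray(sm_b, dtype=float)
    common = np.isfinite(sm_a) & np.isfinite(sm_b)
    a, b = sm_a[common], sm_b[common]
    difference = a - b
    return {
        "n": int(common.sum()),
        "bias": float(difference.mean()),
        "r": float(np.corrcoef(a, b)[0, 1]),
        "stdd": float(difference.std()),              # population standard deviation
        "rmsd": float(np.sqrt(np.mean(difference ** 2))),
    }
```

To compare the NRT and L2 products on an equal footing, one would first set to NaN every time at which either product is missing and then call the function once per product against the in situ series, so that both evaluations use the same number of points.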
Results: SMOS NRT SM evaluation

Swath-level comparison to SMOS L2 SM
Figure 1a and c show the NRT NN SM product and its associated uncertainty for a portion of an orbit on 27 May 2012. The corresponding L2 SM and its associated uncertainty as given by the DQX (data quality index) parameter (Kerr et al., 2012) are also shown (Fig. 1b and d). As discussed in Sect. 3, the swath width of the NRT SM retrieval is somewhat narrower than the L2 SM one but both maps show similar spatial structures and numerical values. The uncertainties have similar numerical values as well, but the spatial patterns are not the same. This is expected as the two retrieval algorithms are different. Finally, it should be noted that the spatial coverage can be different for both products as shown in Fig. 2:
- The NRT SM product can show circle-arc gaps when not all of the angle bins have a well-defined Tb value, while in contrast the L2 algorithm can perform an inversion even if some Tb values have been filtered out.
- The NRT SM global retrieval algorithm can provide a SM estimate even when the local minimization of the L2 algorithm does not converge. This can happen mainly in dense forest areas.

Global evaluation with respect to SMOS L2 SM
The SMOS NRT SM product has also been compared to the SMOS L2 SM product globally and over the period mentioned in Sect. 4. Figure 3a and b show the mean of the NRT and L2 SM products over the period of the study. Both maps show an overall excellent agreement, although there is a significant negative bias (−0.05 m³ m⁻³) in the NRT SM product in the regions with the highest L2 SM (tropical and boreal forest). The typical number of points with both NRT SM NN and SM L2 in the evaluation period is ∼100. The correlation of both products is high (> 0.7) over a large part of North America, the southernmost part of South America, the Iberian peninsula, the Sahel, South Africa, Australia and parts of central Eurasia. The correlation is significantly lower over forest (both tropical and boreal) and in deserts such as the Sahara. In the Sahara, the low correlation is probably not significant because the SM values are very low and the variance is driven by the noise. Indeed, Fig. 3f shows that the STDD is also very low in this region, so L2 SM and NRT SM have similar values there. In contrast, dense forest regions show a high STDD in addition to a low R, so the two products do differ in these regions. Unfortunately, in situ measurements are not available to perform an independent evaluation of both datasets for dense-forest sites. In conclusion, both products show similar dynamics over large parts of the globe. The bias map (Fig. 3d) shows that the NRT SM product tends to underestimate the L2 SM dataset, which is expected as it has been obtained using a regression technique and extreme values are underrepresented in the reference dataset. The most significant effect of the bias is to increase the RMSD (Fig. 3e) with respect to the STDD in parts of Europe and Canada. However, one should note that both the RMSD and the STDD are lower than 0.04 m³ m⁻³ over most of the globe (all except the reddish regions in Fig. 3e and f).
The quality metrics discussed in Sect. 4 have been computed site per site and independently for the SMOS NRT and L2 products. The mean number of points in the time series from May 2015 to November 2015 is 186 for the L2 product while it is only half of that value for the NRT product. The reason is the longer revisit time of the SM NRT NN product, which is due to the narrower swath of the retrievals and to the lack of retrievals when the six Tb values (both polarizations and the three angle bins from 30 to 45°) are not all well-defined.
Table 2 summarizes the results in the form of averages over all the sites (for the Pearson correlation, the median value is also given). Both SMOS products show a similar mean bias with respect to the in situ measurements, while the mean STDD and RMSD are slightly lower for the NRT SM product. In order to get further insight into the intrinsic quality differences of both datasets, the same statistics have been computed but only using times for which both SMOS products are retrieved. The results are also shown in Table 2. The differences in the evaluation of both products decrease, but the NRT product still shows a larger correlation and lower STDD with respect to the in situ measurements than the L2 product.
Table 2. SMOS Level 2 and NRT NN SM comparison to in situ measurements over the USCRN and SCAN networks. The columns are as follows: the SM product, the mean number of points in the time series, the mean and median Pearson correlation with respect to in situ measurements, the mean bias (mean in situ SM minus mean SMOS SM), the root mean square and standard deviation of the difference time series averaged over all sites, and the Pearson correlation of the anomaly time series (R_a). The statistics have been computed independently for the NRT SM NN and the L2 SM product. The number of SM retrievals is, on average, larger for the L2 SM. The two lower rows show the results using only times for which both the NRT SM NN and the L2 SM products are simultaneously available.
Since the mean or median values alone do not show the full picture of the evaluation for more than 100 sites, Fig. 5a and b show box plots for the Pearson correlation coefficient of the time series ($R$) and the anomalous time series ($R_a$), respectively. Figure 5c and d show box plots for the bias and the STDD. As expected, there is a large variation from one site to another. The bias and STDD distributions are similar for both products. The correlation is almost 1 for some sites for both the NRT SM and the L2 SM (the maximum is slightly higher for the latter). Interestingly, the lower values of the correlation distribution are higher for the NRT product. In summary, the bias and $R_a$ distributions are very similar for both products, while the NRT product shows a lower STDD and a higher $R$ for the central two quartiles of the distribution (green boxes in Fig. 5). This behaviour was already found by Rodríguez-Fernández et al. (2015), who analysed different NN models to retrieve SM from SMOS observations after training the NN on ECMWF simulated SM fields. When compared to in situ measurements, the best NN models showed a higher Pearson $R$ and a lower STDD than those obtained for the ECMWF SM model simulations. These results can be understood because, provided that the training is done with a large number of statistically representative samples, the NN will not be significantly affected by outliers or inconsistent values during the training phase. The NN output is the most likely (in the sense of the Bayes theorem) SM value taking into account a given set of input data. Thus, a good NN model can show slightly better quality metrics when compared to in situ measurements than the dataset used as reference to train the NN.
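The argument that a statistical model can perform slightly better than the noisy reference it was trained on can be illustrated with a toy example. The sketch below is not the SMOS NN: it assumes a synthetic linear relation between a single predictor and a "true" SM, Gaussian noise on the reference, and an ordinary least-squares fit standing in for the NN. All names and numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Synthetic predictor and "true" soil moisture (m3 m-3), both hypothetical.
x = rng.uniform(0.0, 1.0, n)
truth = 0.05 + 0.40 * x

# Reference dataset used for training: the truth plus random retrieval noise.
reference = truth + rng.normal(0.0, 0.05, n)

# Least-squares fit trained on the noisy reference (stand-in for the NN).
slope, intercept = np.polyfit(x, reference, 1)
prediction = intercept + slope * x

def rmsd(a, b):
    """Root mean square difference between two series."""
    return float(np.sqrt(np.mean((a - b) ** 2)))

rmsd_reference = rmsd(reference, truth)   # error of the training reference
rmsd_model = rmsd(prediction, truth)      # error of the trained model
```

Because the random errors of the reference average out over many training samples, `rmsd_model` is smaller than `rmsd_reference` in this idealized setting. Systematic errors in the reference would not average out in the same way.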
Finally, Fig. 6 shows scatter plots of the correlation for the time series and for the anomalous time series, taking into account the respective confidence intervals. For most of the sites, both products show the same statistics with respect to the in situ measurements and, overall, the scatter plot points lie close to the 1:1 line.
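The text does not state how the confidence intervals of the correlation coefficients were obtained. One common choice, shown here purely as an assumption, is the Fisher z-transform, which treats the samples as independent (SM time series are autocorrelated, so this tends to give intervals that are too narrow). The function names are hypothetical.

```python
import math

def pearson_ci(r, n, z_crit=1.959964):
    """Approximate 95 % confidence interval of a Pearson r from n samples."""
    if n <= 3:
        raise ValueError("at least 4 samples are needed")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    z = math.atanh(r)                  # Fisher z-transform
    se = 1.0 / math.sqrt(n - 3)        # standard error in z space
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

def intervals_overlap(ci_a, ci_b):
    """True if two confidence intervals share at least one value."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]

# Hypothetical site: R = 0.72 for NRT and R = 0.70 for L2, 100 points each.
ci_nrt = pearson_ci(0.72, 100)
ci_l2 = pearson_ci(0.70, 100)
same_within_ci = intervals_overlap(ci_nrt, ci_l2)
```

Sites for which the two intervals overlap would be counted as showing the same correlation for both products.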

Conclusions
This paper describes the ESA SMOS NRT SM processor and the first evaluation of this new operational dataset. The processor is based on a NN algorithm that uses SMOS NRT brightness temperatures and ECMWF IFS soil temperature in the 0-7 cm layer as input. It has been trained with SMOS Level 2 SM data as reference. The SMOS NRT brightness temperatures have been transformed from the antenna reference frame to the ground reference frame to express the polarization as horizontal and vertical components. In addition, they have been binned in 5°-wide incidence-angle bins. Soil temperature and snow cover forecasts from ECMWF IFS are used to filter out frozen soil or soil covered by snow. The uncertainties of the NRT SM data were estimated from the input brightness temperature uncertainties. The SMOS NRT SM product was evaluated with respect to the original SMOS Level 2 SM product using several months of data. The NRT SM product compares well with the L2 product. The most significant difference is that the NRT SM dataset shows a local negative bias at the positions where the highest SM values were found (basically under tropical forest).

Hydrol. Earth Syst. Sci., 21, 5201-5216, 2017 www.hydrol-earth-syst-sci.net/21/5201/2017/
The SMOS NRT SM product was also evaluated against in situ measurements of SM from the SCAN and USCRN networks. The NRT product shows similar performance to that of the L2 product. In fact, the mean and median correlations are slightly higher than those obtained for the L2 product. In addition, the STDD with respect to the in situ measurements is lower for the NRT product than for the L2 product.
In summary, the SMOS NRT SM product shows similar performance to the Level 2 product but has the advantage of being available in NRT. NRT brightness temperatures are received by ECMWF from ESA in less than 3 h after sensing. The NRT SM production takes on average 15 min (the arrival of new NRT $T_b$ data is checked every 30 min and the actual NRT SM production takes a few minutes). The SMOS NRT SM product is delivered to ESA and EUMETSAT for dissemination via EUMETCast. Therefore, the SMOS NRT SM data are available for a large range of operational applications such as numerical weather prediction, hydrological forecasting and crop modelling.
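The 15 min average can be read as a consequence of the polling scheme. As a rough sketch, assume (this is not stated in the text) that new $T_b$ data arrive at a time uniformly distributed within the 30 min polling interval, and denote the processing time of a few minutes by $t_{\mathrm{proc}}$:

```latex
\begin{align}
  \langle t_{\mathrm{wait}} \rangle
    &= \frac{1}{30}\int_{0}^{30} \tau \,\mathrm{d}\tau = 15~\mathrm{min}, \\
  \langle t_{\mathrm{NRT}} \rangle
    &\lesssim 3~\mathrm{h} + \langle t_{\mathrm{wait}} \rangle + t_{\mathrm{proc}}
     \approx 3~\mathrm{h}~15~\mathrm{min} + t_{\mathrm{proc}} .
\end{align}
```

Under this assumption, and with $t_{\mathrm{proc}}$ of a few minutes, the mean latency stays below the 3.5 h after sensing quoted for the product availability.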
Data availability. The datasets used in this study (Sect. 2) are publicly available. The SMOS L2 SM and NRT SM data can be downloaded from ESA. The SMOS NRT SM data are also available via EUMETCast in NRT. The in situ measurements can be downloaded from the International Soil Moisture Network (Dorigo et al., 2011).

Appendix A: NRT SM algorithm
The SMOS NRT SM algorithm has been described qualitatively in Sect. 3. The current section describes the algorithm and the calculation of the output uncertainties in detail. Complementary information can be found in Rodríguez-Fernández et al. (2016) and Muñoz-Sabater et al. (2016).

A1 NN specification
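The NN discussed in Sect. 3 has two layers. The first layer contains $n_{L1}$ nodes or neurons ($j = 1, \dots, n_{L1}$) with a hyperbolic tangent as activation function. The second layer contains a single neuron with a linear activation function. The number of input elements $n_{\mathrm{in}}$ is 13: six $T_b$ values (H and V for incidence-angle bins from 30 to 45°), six values of the index $I$ (H and V for incidence-angle bins from 30 to 45°), and the ECMWF soil temperature. The input ranges should be normalized to have values in the $(-1, 1)$ range. If, for each input vector element, the minimum and maximum values found during the training phase are given by the vectors $v^{\min}_i$ and $v^{\max}_i$ ($i = 1, \dots, n_{\mathrm{in}}$), the normalization can be computed as follows:

$$v^{\mathrm{norm}}_i = 2\,\frac{v_i - v^{\min}_i}{v^{\max}_i - v^{\min}_i} - 1. \tag{A1}$$

The normalized input, together with the first layer weights ($W^{L1}$) and bias $B^{L1}$, is used to compute the first layer outputs $v^{L1}$ as follows:

$$v^{L1}_j = \tanh\!\left(\sum_{i=1}^{n_{\mathrm{in}}} W^{L1}_{ji}\, v^{\mathrm{norm}}_i + B^{L1}_j\right). \tag{A2}$$

The output of the second layer is computed from the first layer outputs and the second layer weights ($W^{L2}$) and bias $B^{L2}$ as follows:

$$v^{L2} = \sum_{j=1}^{n_{L1}} W^{L2}_j\, v^{L1}_j + B^{L2}. \tag{A3}$$

The values of the weights $W^{L1}$ and $W^{L2}$ and the biases $B^{L1}$ and $B^{L2}$ are determined during the training phase. The exact values for the operational NRT SM processor can be found in Muñoz-Sabater et al. (2016). Finally, to obtain the NN output ($v^{\mathrm{out}}$), the output of the second layer has to be renormalized as follows:

$$v^{\mathrm{out}} = \frac{\left(v^{L2} + 1\right)\left(v^{\max}_{\mathrm{out}} - v^{\min}_{\mathrm{out}}\right)}{2} + v^{\min}_{\mathrm{out}}, \tag{A4}$$

where $v^{\min}_{\mathrm{out}}$ and $v^{\max}_{\mathrm{out}}$ denote the minimum and maximum values of the output found during the training phase.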
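The two-layer structure described above (min-max normalization to the (−1, 1) range, a hyperbolic tangent hidden layer, a single linear output neuron and a renormalization of the output) can be sketched as follows. This is a minimal illustration, not the operational code: the weights are random placeholders (the operational values are given by Muñoz-Sabater et al., 2016), the number of hidden neurons and the input ranges are hypothetical, and the output renormalization is assumed to be the inverse of the min-max mapping with the hypothetical bounds `out_min` and `out_max`.

```python
import numpy as np

def normalize(v, v_min, v_max):
    """Map each input element linearly from [v_min, v_max] to [-1, 1]."""
    return 2.0 * (v - v_min) / (v_max - v_min) - 1.0

def denormalize(y, out_min, out_max):
    """Inverse min-max mapping from [-1, 1] to [out_min, out_max]."""
    return (y + 1.0) * (out_max - out_min) / 2.0 + out_min

def nn_forward(v, v_min, v_max, W1, B1, W2, B2, out_min, out_max):
    """Forward pass: tanh hidden layer followed by one linear neuron."""
    v_norm = normalize(v, v_min, v_max)
    v_l1 = np.tanh(W1 @ v_norm + B1)      # first layer, shape (n_l1,)
    v_l2 = float(W2 @ v_l1 + B2)          # second layer, scalar
    return denormalize(v_l2, out_min, out_max)

# Hypothetical setup: 13 inputs (6 Tb, 6 index I, 1 soil temperature).
rng = np.random.default_rng(42)
n_in, n_l1 = 13, 5                        # n_l1 = 5 is an arbitrary choice
v_min = np.concatenate([np.full(6, 180.0), np.zeros(6), [260.0]])
v_max = np.concatenate([np.full(6, 300.0), np.full(6, 0.6), [320.0]])
W1 = rng.normal(0.0, 0.5, (n_l1, n_in))   # placeholder weights
B1 = rng.normal(0.0, 0.5, n_l1)
W2 = rng.normal(0.0, 0.5, n_l1)
B2 = 0.1

v = v_min + rng.uniform(0.2, 0.8, n_in) * (v_max - v_min)
sm = nn_forward(v, v_min, v_max, W1, B1, W2, B2, out_min=0.0, out_max=0.6)
```

Because the output neuron is linear, the NN output is not bounded: `sm` can fall slightly outside the training range of the reference data when the hidden-layer outputs combine to a second-layer value outside (−1, 1).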

A2 NN output uncertainties
From the definition of $I_{\lambda\phi}(t)$ (Eq. 1), it is possible to compute the uncertainties $\Delta I_{\lambda\phi}(t)$ from the $T_b$ and SM uncertainties. First, Eq. (1) can be rewritten as the following:

$$I_{\lambda\phi}(t) = SM^{T_b^{\min}}_{\lambda\phi} + \left(SM^{T_b^{\max}}_{\lambda\phi} - SM^{T_b^{\min}}_{\lambda\phi}\right) \times I^{1}_{\lambda\phi}(t), \tag{A5}$$

where $I^{1}_{\lambda\phi}(t)$ is given by

$$I^{1}_{\lambda\phi}(t) = \frac{T_{b\,\lambda\phi}(t) - T^{\min}_{b\,\lambda\phi}}{T^{\max}_{b\,\lambda\phi} - T^{\min}_{b\,\lambda\phi}}. \tag{A6}$$

The uncertainties $\Delta I_{\lambda\phi}(t)$ and $\Delta I^{1}_{\lambda\phi}(t)$ can be computed from the uncertainties in the $T_b$ measurements, in the maximum and minimum $T_b$ and in the associated SM values, that is, as a function of the uncertainty of the local instantaneous measurement $\Delta T_{b\,\lambda\phi}(t)$ and the uncertainties of the local extreme $T_b$ values ($\Delta T^{\max}_{b\,\lambda\phi}$ and $\Delta T^{\min}_{b\,\lambda\phi}$).

The uncertainties of the NN output given by Eqs. (A1)-(A4) can be estimated from the uncertainties in the input vector elements ($\Delta v_i$) in three steps. First, the uncertainties of the normalized input vector are computed. Second, using those quantities, the uncertainty of the two-layer NN given by Eqs. (A2) and (A3) is expressed in terms of a factor $\sigma_j$ for each first-layer neuron $j$. It is worth noting that in the current implementation, the NN weights are assumed to be constant after training. There exist some methods to estimate the additional output uncertainty that originates from the uncertainty of the NN weights, which in turn comes from the uncertainties in the reference data used for the training (see for instance Aires et al., 2004), but they are too complex to be implemented in the SMOS NRT SM operational processor. In contrast, some uncertainties in the reference data used for the training have already been taken into account through $\Delta SM_{\lambda\phi}$ in Eq. (A7). Finally, the uncertainty after the renormalization of the output is computed.

Expressing the output uncertainty as Eq. (A10) implies that the vector elements $v_i$ are independent. However, when using the index $I$ as input as well as the actual $T_b$ measurements, some elements are not independent. Since the uncertainties in Eq. (A10) are expressed in quadratic form, Eq. (A10) gives an upper limit to the output uncertainty.

The $T_b$ values are provided with the polarization referred to the antenna reference frame $XY$. Several quality checks are performed to filter the $T_b$ values: $T_{bX}$ and $T_{bY}$ should be in the expected physical range (80-340 K) and the real and imaginary components of the cross-polarized measurements ($T_{bXY}$) should be in the range (−50, 50 K); otherwise the $T_b$ values are considered to be corrupted or affected by RFI (radio frequency interference from human-built equipment). In order to keep information on the possible residual RFI contamination, an RFI probability was computed for each observation as the number of BUFR $T_b$ measurements filtered out due to the RFI flags with respect to the total number of $T_b$ measurements. The observed $T_b$ values are also filtered out if a specific BUFR flag indicates that the observation is located in a zone affected by the aliased image of the Sun. The selected NRT $T_b$ values are transformed from the antenna-based $XY$ reference frame to the ground-based horizontal and vertical ($HV$) reference frame as described by Al Bitar et al. (2017). In a second step, the $HV$ $T_b$ values are averaged in 5°-wide incidence-angle bins. Three angle bins are actually used for training and applying the NN: 30-35°, 35-40° and 40-45°. As discussed by Rodríguez-Fernández et al. ( […]

[…] and minimum values of the $T_b$ in the local ($\lambda$, $\phi$) time series, and $SM^{T_b^{\min}}$ and $SM^{T_b^{\max}}$ are the associated SM in the SMOS Level 2 SM reference dataset. The index $I$ is computed for each incidence-angle bin and polarization at the time $t$ of the SMOS acquisition and it contains local information on the dynamic ranges of both the measured $T_b$ and the reference SM. In the current version of the processor (v100), $T_b^{\max,\min}$ and $SM^{T_b^{\max,\min}}$ […]

Figure 1. Comparison of the NRT SM NN product (a) and the Level 2 SM (b) for one orbit of 27 May 2012. The corresponding NRT SM uncertainty is shown in (c), while the L2 SM uncertainty is shown in (d).

Figure 2. Comparison of the NRT SM NN product (a) and the Level 2 SM (b) for one orbit of 27 May 2012.

Figure 3. Mean SM for the NRT (a) and the L2 (b) SMOS products. Pearson correlation (c), bias (d), root mean square of the difference (e) and standard deviation of the difference (f) of the NRT SM and L2 SM.

Figure 4. Locations of the in situ measurement sites used in this study.

Figure 5. Box plots for (a) the Pearson correlation coefficient ($R$) of the NRT and L2 time series with respect to the in situ measurements, (b) the Pearson correlation coefficient of the anomalous time series ($R_a$), (c) the bias (mean in situ minus mean SMOS SM) and (d) the STDD of the two SMOS products in comparison to in situ measurements. The green boxes contain the middle 50 % of the data; the central bar represents the median value of the distribution. The upper edge (hinge) of the box indicates the 75th percentile of the dataset ($q_3$), and the lower hinge indicates the 25th percentile ($q_1$). The mean values are also shown as black crosses. The upper and lower bars represent the maximum and minimum values of the distribution excluding outliers. Points are considered as outliers if they are larger than $q_3 + 1.5(q_3 - q_1)$ or smaller than $q_1 - 1.5(q_3 - q_1)$.

Figure 6. (a) Scatter plots showing the Pearson correlation coefficient of the NRT and L2 SM time series with respect to the in situ measurements ($R$). The error bars account for the 95 % confidence intervals. The red symbols represent averaged values. (b) Same as (a) but for the anomalous time series ($R_a$).
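The propagation of the input uncertainties through the two-layer NN outlined in Sect. A2 can be sketched as follows. The code is a standard first-order (linearized) propagation under stated assumptions, not a transcription of the paper's equations: inputs are treated as independent and summed in quadrature, the factor `sigma` is taken to be the derivative of the hyperbolic tangent at each first-layer neuron, the weights are treated as exact, and the output renormalization is assumed to be an inverse min-max mapping. All parameter names and values are hypothetical.

```python
import numpy as np

def forward(v, p):
    """Two-layer NN: min-max normalization, tanh layer, linear neuron."""
    v_norm = 2.0 * (v - p["v_min"]) / (p["v_max"] - p["v_min"]) - 1.0
    v_l1 = np.tanh(p["W1"] @ v_norm + p["B1"])
    v_l2 = p["W2"] @ v_l1 + p["B2"]
    return float((v_l2 + 1.0) * (p["out_max"] - p["out_min"]) / 2.0 + p["out_min"])

def output_uncertainty(v, dv, p):
    """First-order output uncertainty for independent input uncertainties dv."""
    scale = p["v_max"] - p["v_min"]
    v_norm = 2.0 * (v - p["v_min"]) / scale - 1.0
    dv_norm = 2.0 * dv / scale                      # normalized uncertainties
    # Derivative of tanh at each first-layer neuron: 1 - tanh(x)^2.
    sigma = 1.0 - np.tanh(p["W1"] @ v_norm + p["B1"]) ** 2
    # Sensitivity of the second-layer output to each normalized input.
    sensitivity = (p["W2"] * sigma) @ p["W1"]       # shape (n_in,)
    dv_l2 = np.sqrt(np.sum((sensitivity * dv_norm) ** 2))
    # Undo the output normalization (a pure rescaling for uncertainties).
    return float(dv_l2 * (p["out_max"] - p["out_min"]) / 2.0)

# Hypothetical parameters: 13 inputs (6 Tb, 6 index I, 1 soil temperature).
rng = np.random.default_rng(1)
params = {
    "v_min": np.concatenate([np.full(6, 180.0), np.zeros(6), [260.0]]),
    "v_max": np.concatenate([np.full(6, 300.0), np.full(6, 0.6), [320.0]]),
    "W1": rng.normal(0.0, 0.5, (5, 13)),
    "B1": rng.normal(0.0, 0.5, 5),
    "W2": rng.normal(0.0, 0.5, 5),
    "B2": 0.1,
    "out_min": 0.0,
    "out_max": 0.6,
}
v = params["v_min"] + rng.uniform(0.2, 0.8, 13) * (params["v_max"] - params["v_min"])
# Assumed input uncertainties: 3 K on Tb, 0.04 m3 m-3 on I, none on temperature.
dv = np.concatenate([np.full(6, 3.0), np.full(6, 0.04), [0.0]])
d_sm = output_uncertainty(v, dv, params)
```

The quadrature sum is only exact for independent inputs. Since the index $I$ is itself computed from the $T_b$ values, a full treatment would need the covariance between inputs, which this sketch ignores.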