Hydrology and Earth System Sciences

An Algorithm for Generating Soil Moisture and Snow Depth Maps from Microwave Spaceborne Radiometers: HydroAlgo

Systematic and timely monitoring of the land surface parameters that affect the hydrological cycle at local and global scales is of primary importance for obtaining a better understanding of geophysical processes and for managing environmental resources and natural disasters. Soil moisture and snow water equivalent are two quantities that play a major role in these applications. In this paper, an algorithm for hydrological purposes (hereinafter called HydroAlgo), which is able to generate maps of snow depth (SD) and soil moisture content (SMC) from AMSR-E data, has been developed and implemented within the framework of the JAXA ADEOS-II/AMSR-E and GCOM/AMSR-2 programs, as well as of a project of the Italian Space Agency devoted to civil protection from floods and landslides. As an auxiliary output, the algorithm also generates maps of vegetation biomass (VB). An initial pre-processing phase includes the improvement of spatial resolution, as well as masking for urban areas, water bodies, and dense vegetation. The algorithm then splits into two branches, the first focused on the retrieval of SMC and the second on SD. Both parameters are retrieved using Artificial Neural Network (ANN) methods. The algorithm was calibrated using a wide set of experimental data collected on three sites, Mongolia and Australia (for SMC) and Siberia (for SD), integrated with model simulations. The results were then validated by comparing the algorithm outputs with experimental data collected on two additional sites: part of a watershed in Northern Italy, and a large portion of Scandinavia. An additional test of the algorithm was also performed on a large scale, and included sites characterized by differing climatic and meteorological conditions.


Introduction
The number of weather-related natural disasters, such as floods, storms, cyclones, droughts and extreme temperatures, is increasing dramatically, resulting in human and economic losses that strike at least one third of the world's population. Such disasters are primarily due to environmental change and land degradation, which are mostly caused by human impact on the territory (e.g. Bates et al., 2008).
A more in-depth knowledge of the environment, together with further study of the temporal evolution of the distribution and extent of ecosystems, can help to break this vicious cycle. In particular, close observation of land surface properties can be crucial when analyzing the two fundamental cycles of our planet, namely the global carbon and hydrological cycles. Water scarcity is one of the main problems that we will have to face in the very near future. A precise evaluation of water utilization on several spatial scales (from local to regional) can be helpful for assessing the waste of water due to human activities, especially agricultural and industrial consumption (Zang et al., 2012). Earth observation satellites can be very useful tools in monitoring the basic parameters that affect these cycles, in particular soil moisture (SMC), snow water equivalent (SWE) or snow depth (SD), and vegetation biomass (VB). These parameters play significant roles in the distribution of water between blue and green water, the latter being essential for agricultural purposes (Liu et al., 2009; Liu and Yang, 2010). The possibility of monitoring the water cycle in its various components is, indeed, very appealing, particularly for agricultural water use. Joint efforts by the national space agencies are currently underway towards developing a global-scale monitoring system that features numerous satellites equipped with onboard sensors for global surveillance and the retrieval of information regarding the Earth's conditions.

Published by Copernicus Publications on behalf of the European Geosciences Union.

E. Santi et al.: HydroAlgo
Several major ongoing projects focus on estimating the most important parameters of the hydrological cycle. These include the AQUA/AMSR-E (Advanced Microwave Scanning Radiometer for EOS) of NASA (National Aeronautics and Space Administration) and JAXA (Japan Aerospace Exploration Agency) (Kawanishi et al., 2003), the ESA Soil Moisture and Ocean Salinity (SMOS) mission (Kerr et al., 2010), as well as the future Soil Moisture Active Passive (SMAP) mission of NASA (Edelstein et al., 2010), which is the follow-up to an initial project called Hydros, and the Global Change Observation Mission-Water (GCOM-W/AMSR-2) of JAXA (Shimoda, 2009). An interesting survey of these projects was published in a recent Special Issue of the Proceedings of the IEEE (Tsang and Jackson, 2010). In Italy, the PROSA (Products of Earth Observation for the Meteorological Alert) national project, funded by the Italian Space Agency (ASI), aimed to contribute to civil protection from floods and landslides by developing a series of products derived from microwave and optical satellite sensors (Pettinato et al., 2009). During such emergencies, these products enable immediate assessment of the areas at risk, and/or provide support in the decision-making process regarding relief and clean-up operations. The generation of real-time SMC and SD maps from passive microwave sensors is the key output of this project.
Research enabling the retrieval of SMC and SD from single or multifrequency radiometric data dates back to the late 1970s, when several investigations indicated the sensitivity of microwave emission to SMC and SWE (e.g. Njoku and Kong, 1977; Ulaby and Stiles, 1980; Hofer and Mätzler, 1980; Shutko, 1982; Jackson et al., 1982; Chang et al., 1982). Although spaceborne microwave radiometers have a coarse ground resolution, they are able to produce daily maps of brightness temperature (T_b), which can then be converted to SMC and SD by using appropriate inversion algorithms (e.g. Shibata et al., 2003; Njoku et al., 2003; Kelly et al., 2003).
Measurements at frequencies between 1 and 3 GHz (L band) are best suited for SMC detection, because energy is emitted from a deeper soil layer and less energy is attenuated by vegetation (e.g. Shutko, 1982; Paloscia et al., 1993). The SMOS mission, which is specifically dedicated to the estimation of SMC, is currently operating at 1.4 GHz (Barre et al., 2008). However, there is potential in retrieving SMC from spaceborne instruments at higher frequencies, as demonstrated by over ten years of research on the sensitivity of emission at C-band (the lowest frequency channel available from AMSR-E) to the moisture of soils with low vegetation (e.g. Vinnikov et al., 1999; Jackson and Hsu, 2001; Macelloni et al., 2003). This higher frequency band has the advantage of being less affected by Radio Frequency Interference (RFI), which may severely limit the proper functioning of L-band systems (Skou et al., 2010; Balling et al., 2010). RFI can be a serious problem, especially over densely populated areas, and it affects different frequencies depending on the country. For example, C-band data are significantly contaminated in the US, Japan and the Middle East, so that some algorithms for the retrieval of SMC employ higher frequency data despite their higher sensitivity to vegetation and surface roughness. In Europe, the problem is just the opposite, since X-band data have been found to be the ones most affected by RFI (Njoku et al., 2005).
Several approaches for the retrieval of SMC from single or multifrequency radiometric data have been investigated in previous studies. Most of these studies (Njoku et al., 2000, 2003; Jackson, 1993; Wigneron et al., 1995; Njoku and Li, 1999; Jackson et al., 2002; Paloscia et al., 2001, 2006; Owe et al., 2001) are based on the inversion of the so-called tau-omega model (Mo et al., 1982) by means of an iterative minimization of the root mean square error between model simulations and measurements, and differ primarily in the methods used to correct for the effects of soil roughness, texture, vegetation, and surface temperature. For example, in the National Snow and Ice Data Center (NSIDC) algorithm (Njoku et al., 2003), the correction for the effects of surface roughness is based on an empirical formulation that relates the reflectivity of a rough soil surface to that of the equivalent smooth surface (Wang and Choudhury, 1981). The retrieval methodology used in the Land Parameter Retrieval Model (LPRM) (Owe et al., 2001, 2008) is a nonlinear iterative procedure in a forward modeling approach, which solves for the canopy optical depth analytically, partitions the surface emission into the soil and the canopy emission, and then optimizes the soil dielectric constant. Measurement errors, and several other sources of uncertainty that affect the accuracy of a theoretical retrieval based on the tau-omega model, are assessed in Davenport et al. (2005). The two techniques used to retrieve SMC from AMSR-E data that are described in Njoku et al. (2003) and Owe et al. (2008) were compared in Wagner et al. (2007a). The authors found that the NSIDC product (Njoku et al., 2003) performed less well than the LPRM, and suggested that the NSIDC algorithm is not able to describe the effects of vegetation and/or surface temperature properly.
A powerful alternative method for retrieving SMC is based on the Artificial Neural Network (ANN). An ANN, especially if combined with an electromagnetic model, can be a very useful tool for inversion in remote sensing, particularly when real-time estimates are needed. An ANN is an interconnection of processing elements (nodes) that are organized into a sequence of fully connected layers. Each node calculates a weighted sum of its inputs, and then transmits the value of its activation function to other nodes. There are two main phases in the operation of a network. In the first, the training phase, the connection weights are adapted in response to the training data presented at the inputs and to the desired response at the output layer. The response of the output layer is then obtained in the second, the validation phase, during which the performance of the trained ANN is also assessed. The training of the ANN can be carried out with model simulations, experimental data, or a combination of the two. In the past ten years, ANNs have been applied in several studies for the retrieval of SMC from radiometric data (e.g. Liou et al., 2001; Liu et al., 2002; Del Frate et al., 2003; Jiang and Cotton, 2004; Angiuli et al., 2008; Chai et al., 2010). In general, the most widely used topology is based on multilayer perceptrons with two or more hidden layers, a nonlinear activation function, and a back-propagation learning rule. A newly developed back-propagation neural network trained with simulated data was used to retrieve SMC from microwave T_b at L-, C- and X-band (Liou et al., 2001; Liu et al., 2002). Del Frate et al. (2003) used two neural network algorithms trained by a physical vegetation model to retrieve SMC and vegetation variables of wheat canopies throughout the entire crop cycle. A similar approach was used in Angiuli et al. (2008). More recently, Chai et al. (2010) developed a novel approach based on an ANN with two inputs, one hidden layer of 20 neurons, and one output, to predict SMC at a 1-km resolution on different dates. Good reviews of the potential of SMC retrieval algorithms for hydrological applications are given in Wigneron et al. (2003) and Wagner et al. (2007b).
While the retrieval of SMC is based on low frequency channels, detection of SD requires the use of higher frequencies (Kelly et al., 2003; Chang et al., 1987; Hallikainen and Jolma, 1992; Rott and Nagler, 1995; Jin, 1997; Goodison and Walker, 1995; Grody and Basist, 1996; Hall et al., 2001; Pulliainen and Hallikainen, 2001; Tsang et al., 1992; Davis et al., 1993; Tedesco et al., 2004; Pulliainen, 2006). Indeed, previous research has pointed out that the Frequency Index (FI), i.e. the difference between the low (18/19 GHz) and high (35/37 GHz) frequency T_b, may be related to the SWE or SD (Chang et al., 1982, 1987; Kelly et al., 2003). For example, good results for SWE retrieval were obtained in Finland by adding the X-band channel of the Scanning Multichannel Microwave Radiometer (SMMR) and performing a correlation analysis for 17 different brightness temperature functions, each of which involved one or several frequencies and polarizations (Hallikainen and Jolma, 1992). The 85 GHz channel was added in the algorithms developed in Rott and Nagler (1995) and Jin (1997) in order to monitor shallow snow from the Special Sensor Microwave Imager (SSM/I) data, while a vertically polarized T_b gradient ratio algorithm was developed in Canada (Goodison and Walker, 1995). A SWE regression algorithm based on spectral and polarization differences was proposed in Hall et al. (2001) and tested in Skou et al. (2010).
All of these approaches generally assumed that the average snow density and grain size did not change over time. However, changes in these quantities can also affect the difference between low and high frequency T_b. A dynamic approach to global SD estimation is presented in Kelly et al. (2003). The algorithm is still based on FI, and adjusts the dimensional coefficient (cm K⁻¹) used to retrieve SD by predicting how the grain size and snow density might vary and affect the emission from a snowpack, by means of a Dense Medium Radiative Transfer Model (Tsang et al., 2000). Compared with static approaches, this dynamic algorithm tends to estimate SD with a greater root mean squared error, but a lower mean error. The potential of ANNs in retrieving snow parameters was evaluated by Tsang et al. (1992), Davis et al. (1993) and Tedesco et al. (2004), while a novel approach to improving the accuracy of SWE retrieval by assimilating satellite radiometric data and ground-based observations was introduced in Pulliainen (2006).
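The FI-based retrieval described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, the brightness temperatures are invented, and the default coefficient of 1.59 cm K⁻¹ is the value commonly quoted for the static algorithm of Chang et al. (1987), used here as a placeholder rather than as a value taken from this paper.

```python
def frequency_index(tb_low_k, tb_high_k):
    """Frequency Index (K): low-frequency (18/19 GHz) minus high-frequency (35/37 GHz) T_b."""
    return tb_low_k - tb_high_k


def snow_depth_cm(tb_low_k, tb_high_k, coeff_cm_per_k=1.59):
    """Static FI retrieval: SD = coefficient * FI, clipped at zero (no snow).

    coeff_cm_per_k is an ASSUMED illustrative value. A dynamic algorithm would
    adjust it as the predicted grain size and snow density evolve.
    """
    return max(0.0, coeff_cm_per_k * frequency_index(tb_low_k, tb_high_k))


# Hypothetical brightness temperatures (K) over a snow-covered pixel: FI = 20 K.
sd_static = snow_depth_cm(245.0, 225.0)
# Same FI, but with a different (again assumed) coefficient, as a dynamic scheme might supply.
sd_adjusted = snow_depth_cm(245.0, 225.0, coeff_cm_per_k=1.2)
```

The only difference between the static and the dynamic variants in this sketch is where the coefficient comes from; the FI itself is computed in the same way.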
Vegetation cover is both the most important disturbing factor in reducing the sensitivity of T_b to SMC and SD and an additional target for land hydrology. Thus, the estimation of vegetation biomass (VB), needed to correct for the effect of low vegetation in the retrieval of SMC and snow cover or to mask densely vegetated areas where the retrieval is impossible, has led to the generation of vegetation maps as a useful byproduct. One very effective index for characterizing vegetation biomass, and in particular the Plant Water Content (PWC, i.e. the total amount of vegetation water per square meter), independently of the characteristics of the individual plant, is the Polarization Index, as defined in Paloscia and Pampaloni (1988) and Becker and Choudhury (1988) and tested on a global scale in several works (e.g. Owe et al., 2008; Choudhury, 1989; Paloscia, 1995; Wang and Choudhury, 1995). Other indexes capable of characterizing the VB of agricultural fields on local and global scales were also assessed in Macelloni et al. (2003) and Paloscia and Pampaloni (1992). In forests, the situation is more complex: although the first studies of microwave emission from forests date back to the mid-1970s (Borodin et al., 1976), the retrieval of SMC and SD under trees continues to pose a challenge. Specific studies of the transmissivity of forest canopies were described in Pampaloni (2004), Hallikainen et al. (1988), Calvet et al. (1994), Kurvonen et al. (1998), Kruopis et al. (1999), Pulliainen et al. (1999) and Santi et al. (2009).
In this paper, the proposed algorithm, HydroAlgo, which focuses on estimating the SD and SMC of bare or weakly vegetated soils, has been implemented and validated within the framework of both JAXA and ASI (Italian Space Agency) pre-operational programs. This novel algorithm, which also generates maps of vegetation cover/biomass as an auxiliary product, has been optimized by using data from the AMSR-E sensor and is able to produce daily maps at a spatial resolution comparable to that of the 37 GHz frequency channel of this sensor (10 km). However, its use can be extended to other sensors operating in similar frequency channels and, in particular, to AMSR-2 onboard GCOM-W, which will be the heir to the AMSR-E on AQUA. The snow product can also be generated by using SSM/I data, although with lower spatial resolution and retrieval accuracy. The short processing time of the algorithm was considered a major requirement for operational services generating real-time maps. Thus, the retrieval procedures for estimating the surface parameters from microwave data are based on ANN methods, which offer the best compromise between retrieval accuracy and processing time for SMC and SD estimates.
The algorithm has been developed and calibrated on the basis of very large sets of experimental data acquired on three test areas in Mongolia, Australia and Siberia within the framework of the JAXA ADEOS-II/AMSR-E and GCOM-W/AMSR-2 programs. The validation was then carried out by comparing the satellite-generated outputs with experimental data collected on different test areas, including Northern Italy and four areas in the US (for soil moisture) and Scandinavia (for snow).
This paper is organized as follows: Sect. 2 summarizes the characteristics of the test sites and datasets used for the development and validation of the algorithm, and Sect. 3 describes the HydroAlgo algorithm, which is then validated in Sect. 4. Section 5 includes several examples of applications at global scale, while Sect. 6 provides a summary and a few concluding remarks.

Study sites and datasets for algorithm development and validation
The development, testing, and validation of the algorithm made use of large sets of experimental data that were acquired on different test sites.

Soil moisture
An extensive experimental dataset used for the development of the SMC algorithm was kindly provided by JAXA. This dataset consisted of two years of AMSR-E acquisitions, from 1 January 2003 to 31 December 2004, over two test sites located in Mongolia and Australia. The Australian test area (central coordinates: Lat. 35.10° S, Lon. 147.70° E) was characterized by low to moderate vegetation conditions, with a marked seasonal vegetation cycle. The Mongolian site (Lat. 46.25° N, Lon. 106.75° E) was instead typified by semi-arid conditions, with sparse vegetation and the presence of snow in winter. Both sites covered an area of approximately 120 km × 120 km, which corresponded to at least 100 AMSR-E acquisitions. These acquisitions were co-located with direct measurements of volumetric SMC derived from an automatic network of TDR probes, for a total of 18 sampling points in Australia and 15 sampling points in Mongolia (CEOP, Coordinated Enhanced Observing Period: http://www.ceop.net). SMC over a surface layer 3-4 cm deep was sampled every 30-60 min, together with the soil surface temperature. However, only the measurements collected simultaneously with the AMSR-E overpasses were considered in the dataset. For each test area, all the AMSR-E acquisitions (both ascending and descending orbits) and the corresponding SMC measurements, recorded within ±1 h of the satellite acquisition, were averaged daily. The resulting dataset was composed of about 3000 measurements of T_b from C- to Ka-band and the corresponding volumetric SMC measurements, in the range from 0.05 m³ m⁻³ to about 0.40 m³ m⁻³, under different vegetation conditions.
An area for validating the SMC product in Northern Italy was selected on the Scrivia watershed. The area is located in northwestern Italy, close to the town of Alessandria. It is a flat alluvial agricultural area of 100 × 100 km² that is crossed by many important rivers (Po, Tanaro, Scrivia, Bormida), which subject the area to frequent flood events. This location is characterized by large agricultural fields cropped with wheat, corn, and potatoes. Several ground campaigns were carried out in some selected subareas in order to collect vegetation and soil parameters (crop type, plant height and density, biomass, SMC, and surface roughness). The volumetric SMC (in cm³ cm⁻³) was measured by using portable TDR probes and averaged over a surface soil layer 10-15 cm in depth. Surface roughness was measured (along and across rows) by using a 4 m needle profilometer; the digitized soil profiles were processed to retrieve the height standard deviation and the correlation length of the surface. In this area, AMSR-E images were gathered in different seasons from November 2003 to June 2009. In this case, ground measurements sampled over an area of 10 × 10 km² were compared with the output of the algorithm for a pixel centered on 45° N, 8.85° E.
Four experimental watersheds of the Agricultural Research Service (ARS) in the US were selected for a further test of the algorithm. Ground SMC data to be compared with AMSR-E data were kindly provided by Dr. Tom Jackson. These watersheds are well instrumented with multiple surface SMC and temperature sensors and have been the core sites for several AMSR-E validation campaigns. Overall, they represent a wide range of ground conditions and precipitation regimes. The test areas are the following: Little Washita (OK) (610 km²), which is dominated by rangeland and pastures; Little River (GA) (334 km²), which is heavily vegetated (forests, croplands, and pasture); Walnut Gulch (AZ) (148 km²), which is a brush- and grass-covered area characterized by a semi-arid climate; and Reynolds Creek (ID) (238 km²), which is a rangeland area with snow-dominated precipitation (Jackson et al., 2010).
An additional test area (0°-20° N, 16°-17° E) was identified in a wide portion of Africa, from the Sahara desert to the Equatorial forest, which includes a very high variability of vegetation types and landscape. This area was used for checking the capability of the Polarization Index at X band (PI_X) to identify vegetation cover and biomass (VB), and for comparing its performance with that of NDVI. Data collected over this region with AMSR-E and SPOT4 in different seasonal periods have been analyzed. Africa was also chosen due to the availability of large homogeneous regions that are compatible with the coarse ground resolution of the microwave sensor.

Snow
As in the case of Mongolia, JAXA provided a significant collection of data for developing the SD algorithm. The dataset was composed of co-located AMSR-E acquisitions and hourly ground measurements of SD and air temperature provided by 7 stations located in the eastern part of Siberia. The stations were distributed so as to cover a flat area of about 20° in latitude and 45° in longitude, at an average altitude of 300 m a.s.l., characterized by low vegetation. In this region, snow is generally present from the beginning of October to the end of May, with a depth that does not exceed 50 cm. The average air temperature ranges from −50 [...] The dataset was obtained by considering all the AMSR-E acquisitions from C- to Ka-band with the footprint center within a radius of 10 km from the coordinates of each station. These data were combined with the ground measurements recorded within ±1 h of the satellite acquisition. After filtering out the no-data and no-snow values, a dataset was obtained that included 17 000 values of T_b at all bands and the associated direct measurements of SD and air temperature. On this relatively small area, a further averaging of the 10-15 AMSR-E acquisitions collected daily, as well as of the corresponding ground measurements, was carried out in order to obtain daily mean values representative of the whole test area. This operation resulted in an averaged dataset of about 1500 samples, in which the radiometric data displayed a certain sensitivity to the snow parameters.
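The co-location rule described above (footprint center within 10 km of a station, ground record within ±1 h, no-data and no-snow values discarded) can be sketched as follows. The record layout, the function names and the sample values are hypothetical and chosen only for illustration; the processing chain actually used for the dataset is not described at this level of detail.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def colocate(footprints, station, ground_obs, max_km=10.0, max_dt=timedelta(hours=1)):
    """Pair each footprint near the station with the closest-in-time valid ground record.

    footprints: list of dicts with 'lat', 'lon', 'time', 'tb' (assumed layout)
    ground_obs: list of dicts with 'time', 'sd_cm'; None marks missing data
    """
    # Discard no-data and no-snow ground records before matching.
    valid = [g for g in ground_obs if g["sd_cm"] is not None and g["sd_cm"] > 0]
    pairs = []
    for fp in footprints:
        if haversine_km(fp["lat"], fp["lon"], station["lat"], station["lon"]) > max_km:
            continue
        if not valid:
            continue
        nearest = min(valid, key=lambda g: abs(g["time"] - fp["time"]))
        if abs(nearest["time"] - fp["time"]) <= max_dt:
            pairs.append((fp["tb"], nearest["sd_cm"]))
    return pairs


# Hypothetical station, footprints and hourly ground records.
station = {"lat": 62.0, "lon": 130.0}
ground_obs = [
    {"time": datetime(2003, 1, 10, 2, 0), "sd_cm": 42.0},
    {"time": datetime(2003, 1, 10, 3, 0), "sd_cm": None},
]
footprints = [
    {"lat": 62.05, "lon": 130.0, "time": datetime(2003, 1, 10, 2, 20), "tb": 231.5},  # kept
    {"lat": 62.20, "lon": 130.0, "time": datetime(2003, 1, 10, 2, 20), "tb": 233.0},  # too far
    {"lat": 62.05, "lon": 130.0, "time": datetime(2003, 1, 10, 12, 0), "tb": 236.0},  # too late
]
pairs = colocate(footprints, station, ground_obs)
```

The daily averaging over the whole test area would then be a plain mean over the pairs collected on the same day.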
The AMSR-E acquisitions, which were collected during the 2002-2003 and 2003-2004 winters, were related to the SD measured by the stations. The ground measurements of SD were derived from the Russian archives (http://meteo.infospace.ru). For both winters, snow was present from the end of October to the middle of May, with the depth reaching 60-70 cm. The resulting dataset was made up of more than 400 daily AMSR-E measurements and the corresponding ground data.

Description of the algorithm
In the HydroAlgo algorithm, the retrieval of SMC is mainly based on the low frequency C-band channel, together with X-, Ku-, and Ka-band, while a combination of only high-frequency (X-, Ku-, and Ka-band) data enables the retrieval of SD. As a secondary quantity, the Vegetation Biomass (VB) is also obtained by means of the X-band Polarization Index (PI_X). VB is expressed as the Plant Water Content (PWC, in kg m⁻²), a parameter that is closely related to total biomass and physically influences microwave emission (Macelloni et al., 2003; Paloscia and Pampaloni, 1992). The flowchart of the algorithm is shown in Fig. 1.
The algorithm presents the results on three different maps, one for each quantity. However, the retrieval of SMC and SD cannot be carried out beneath forest and dense vegetation, due to the high attenuation of soil emission caused by the overlying cover. Moreover, snow cover also hampers the estimate of the SMC below it. Thus, the output of VB is used to exclude the regions covered by dense vegetation in the SMC and SD maps, while the areas covered by snow are masked out in the SMC maps. In addition, VB maps are also used to correct the retrieval of SMC of poorly vegetated soils, as described in greater detail later in this section.
1. Extraction of the T_b collected over the areas of interest from the Hierarchical Data Format (HDF) files delivered by the National Snow and Ice Data Center (NSIDC), which contain the calibrated and geocoded acquisitions of AMSR-E on the AQUA satellite (Level 2 data) at C-, X-, Ku- and Ka-band in both polarizations (H, V).
2. Check of the data for possible miscalibration (Paloscia et al., 2006) and for the presence of Radio Frequency Interference (RFI) at C- and X-bands. The check for RFI was carried out using a simple threshold method (Njoku et al., 2005) at both C- and X-bands, and all data over this threshold were eliminated from the dataset.
3. Application of the multisensor image fusion procedure to enhance the spatial resolution of the low frequency channels and to reduce the effect of mixed pixels. This procedure, which is based on the SFIM (Smoothing Filter-based Intensity Modulation) technique (Santi, 2010; Liu, 2000), is aimed at increasing the resolution of C- and X-bands up to values close to the sampling rate (i.e. 10 km × 10 km) by means of the higher resolution Ka-band channel.
4. Computation of the PI_X, which is to be used for estimating vegetation biomass and is defined as follows:

$$\mathrm{PI}_X = \frac{2\,(T_{bVX} - T_{bHX})}{T_{bVX} + T_{bHX}}$$

where $T_{bVX}$ and $T_{bHX}$ are the brightness temperatures at X band at V and H polarizations, respectively.

With this index it is possible to separate deserts and poorly vegetated areas, where SMC can be estimated, from forests and dense vegetation regions, where retrieval is unrealistic due to the high attenuation induced by vegetation material. The ability of the polarization index to estimate the vegetation optical depth and to identify different levels of biomass, already established in past research carried out on agricultural fields (Choudhury, 1989; Paloscia and Pampaloni, 1992; Paloscia, 1995; Wang and Choudhury, 1995), is due to the depolarization of the soil emission, which depends on the amount of vegetation overlying the soil. This effect is particularly evident at X band, which is consequently the most suitable frequency for quantifying vegetation biomass and was also used in this paper to correct for the effect of low vegetation on the SMC estimate. It should be noted that PI_X is also sensitive to SMC, although the effect of vegetation is clearly dominant (Njoku et al., 2003; Choudhury, 1989; Paloscia and Pampaloni, 1992).
The performance of PI_X was tested on a wide portion of Africa, from the Sahara desert to the Equatorial forest, an area which includes a very high variability of vegetation types and landscape. On this area, the PWC (in kg m⁻²) computed from AMSR-E PI_X (Paloscia and Pampaloni, 1988) was compared with the PWC values derived from NDVI by means of the relationship established by Jackson et al. (2004). Although the latter relationship was initially developed for corn and soybean vegetation, it has been found to be valid for other types of vegetation, too (Paloscia et al., 2011). NDVI data, which were obtained from http://free.vgt.vito.be/home.php as 10-day composites of SPOT4 acquisitions, were resampled at a 10 km × 10 km resolution and compared with the corresponding 10 days of AMSR-E acquisitions, in both ascending and descending orbits, for November 2003, April 2004, June 2004, and January 2005, in order to be representative of the whole seasonal cycle.
The result of this comparison is shown in Fig. 2; the relationship obtained has a determination coefficient R² = 0.92 and an RMSE of 0.63 kg m⁻².
According to this result, the PI_X can legitimately be used to produce vegetation maps on a global scale by separating 3-4 levels of biomass without any need for further information from other sensors.
5. Masking of the areas where the parameters cannot be reliably estimated: deserts and dense vegetation for SMC and SD, and snow cover for SMC. Dense vegetation is identified by means of PI_X (PI_X < 0.05), while the map of snow cover extent is generated by the algorithm itself.
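The resolution enhancement of step 3 can be illustrated with a generic SFIM sketch: the coarse channel, resampled to the fine grid, is multiplied by the ratio between the high-resolution channel and its smoothed version. The window size, the edge handling, the grid and the sample values below are assumptions made for illustration, and the implementation used in HydroAlgo (Santi, 2010) may differ in all of these respects.

```python
import numpy as np


def box_mean(img, size):
    """Mean filter with an odd size x size window and edge replication."""
    pad = size // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (size * size)


def sfim(low_res_on_fine_grid, high_res, window=5):
    """SFIM: modulate the coarse channel by the local detail of the fine channel."""
    smoothed = box_mean(high_res, window)   # fine channel degraded to the coarse resolution
    return low_res_on_fine_grid * high_res / smoothed


# Hypothetical 8 x 8 scene on the fine grid (values in K).
ka = np.full((8, 8), 260.0)
ka[3:5, 3:5] = 230.0                  # cold feature resolved by the Ka-band channel
c_coarse = np.full((8, 8), 255.0)     # C-band T_b in which the feature is blurred out
c_sharp = sfim(c_coarse, ka, window=5)
```

Where the fine channel is locally uniform the ratio is 1 and the coarse value is returned unchanged, so the technique only injects detail that is actually present in the high-resolution channel.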
After this joint initial process, the algorithm is split into two main parts, which generate the output products of SMC and SD.
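The index computation and masking that conclude the joint pre-processing can be sketched per pixel as below. The polarization index is written in the form commonly attributed to Paloscia and Pampaloni (1988), i.e. the V-H difference divided by the mean of the two brightness temperatures; the function names, the dictionary layout and the sample values are hypothetical, and the desert mask mentioned in the text is omitted because its criterion is not given.

```python
def polarization_index(tb_v, tb_h):
    """PI = (TbV - TbH) / (0.5 * (TbV + TbH)) for one frequency channel."""
    return (tb_v - tb_h) / (0.5 * (tb_v + tb_h))


def classify_pixel(tb_v_x, tb_h_x, snow_covered, dense_veg_threshold=0.05):
    """Return PI_X and the products that may be retrieved for one pixel.

    dense_veg_threshold follows the PI_X < 0.05 criterion quoted in the text;
    snow_covered is assumed to come from the snow cover extent map.
    """
    pi_x = polarization_index(tb_v_x, tb_h_x)
    dense_vegetation = pi_x < dense_veg_threshold
    return {
        "pi_x": pi_x,
        "smc": not dense_vegetation and not snow_covered,  # SMC needs snow-free, sparse cover
        "sd": not dense_vegetation,                        # SD only needs sparse cover
    }


# Hypothetical X-band brightness temperatures (K).
forest = classify_pixel(280.0, 276.0, snow_covered=False)   # nearly unpolarized emission
sparse = classify_pixel(275.0, 250.0, snow_covered=False)   # strongly polarized emission
snowy = classify_pixel(255.0, 235.0, snow_covered=True)
```

A dense canopy depolarizes the emission, so its PI_X falls below the threshold and both products are masked, while the snow-covered pixel keeps only the SD product.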
Along with the maps of SMC and SD, a reliability index of each output product is computed.This index accounts for the percentage of bad input data (including those affected by RFI) and the estimate of output parameters outside the established range.When inputs outside the range considered for training are presented to the ANN, the latter is unable to predict the right output and answers with an "outlier", i.e. an estimate that falls outside the range of values considered in the training phase.An evaluation of the consistency of the output product can therefore be done by accounting for the percentage of outliers.This reliability index is listed in the header file associated with each output.
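The text does not give the formula of the reliability index, so the following is only one plausible reading, stated as an assumption: the index is taken as the percentage of pixels whose inputs passed the quality checks and whose estimate lies inside the range used for training. The function name, the argument layout and the SMC range are hypothetical.

```python
def reliability_index(input_ok, estimates, valid_range):
    """Percentage of pixels with good inputs AND an in-range (non-outlier) estimate.

    ASSUMPTION: this definition is illustrative; the paper only states that the
    index accounts for bad input data and for out-of-range estimates.
    """
    low, high = valid_range
    n = len(estimates)
    if n == 0:
        return 0.0
    good = sum(1 for ok, est in zip(input_ok, estimates) if ok and low <= est <= high)
    return 100.0 * good / n


# Four hypothetical pixels: one outlier (0.62) and one with RFI-flagged input.
smc_estimates = [0.12, 0.62, 0.20, 0.31]
inputs_passed_checks = [True, True, False, True]
smc_reliability = reliability_index(inputs_passed_checks, smc_estimates, (0.05, 0.50))
```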

Estimate of soil moisture content (SMC)
The estimate of SMC is based on an Artificial Neural Network (ANN) algorithm trained with both experimental and simulated data. The basic microwave measurement is the T_b at C band, i.e. the lowest AMSR-E frequency, in order to minimize the vegetation attenuation. The use of vertical polarization at the nominal incidence angle of AMSR-E (53°, close to the Brewster angle) guarantees a relative independence from the soil surface roughness (e.g. Schwank et al., 2010). Moreover, a closer look at the experimental data reveals that T_b in H polarization appears to be less related to SMC than T_b in V polarization, probably due to the greater influence of the surface features. Figure 3 [...]
Additional parameters include:
- The AMSR-E T_b at X-band (H and V polarizations), for computing PI_X and correcting for the effect of low vegetation on soil emission.
- The T_b at Ka-band, V polarization, used to normalize for the daily and seasonal variation of the surface temperature, due to its strong relationship with the latter parameter (Owe and Van De Griend, 2001; Paloscia et al., 2006).
The ANN used has a feed-forward multilayer perceptron (MLP) configuration, with a certain number of hidden layers of neurons between the input and the output. In MLPs, successive layers of neurons are fully interconnected, with trainable connection weights that control the strength of the connections. MLP ANNs can be trained to represent arbitrary input-output relations (Hornik, 1989; Linden and Kinderman, 1989). The trained ANN can be considered to be a type of nonlinear, least-mean-square interpolation formula for the discrete set of data points in the training set. The algorithm chosen for the training phase was the back-propagation learning rule, which is an iterative gradient descent algorithm designed to minimize the mean square error between the desired target vectors and the actual output vectors.

It should be noted that the gradient-descent method sometimes suffers from slow convergence, due to the presence of one or more local minima, which may also affect the final result of the training. In order to overcome this problem, the training was repeated several times, with a resetting of the initial conditions and a verification that each training process led to the same convergence results in terms of R² and RMSE. The size of the ANN was defined by starting from a small configuration and increasing it until negligible improvements were obtained. This was done in order to define the minimal ANN architecture capable of providing an adequate fit for the training data, so as to prevent overfitting problems. Overfitting is related to the oversizing of the ANN, and may cause considerable errors when testing the ANN with input data that are not included in the training set. In order to define the optimal ANN architecture, after the training phase, the ANN was tested using data not included in the training set, and the training and testing results were then compared. The ANN configuration was then enlarged until a negligible improvement in the training results and a worsening in the test results were found. A configuration with two hidden layers of ten perceptrons each was finally chosen as the optimal one.
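The stopping rule described above (enlarge the network until the training error barely improves while the test error gets worse) can be illustrated with a minimal sketch. The function name, the tolerance and the RMSE values below are hypothetical and chosen only for illustration; they are not results from this study.

```python
def select_architecture(results, tol=0.005):
    """Return the label of the last architecture before overfitting sets in.

    `results` is a list of (label, train_rmse, test_rmse) tuples ordered
    from the smallest network to the largest.
    """
    chosen = results[0]
    for prev, curr in zip(results, results[1:]):
        train_gain = prev[1] - curr[1]   # improvement on the training data
        test_change = curr[2] - prev[2]  # positive means the test got worse
        if train_gain < tol and test_change > 0:
            break                        # the larger network only overfits
        chosen = curr
    return chosen[0]


# Hypothetical RMSE values (m3 m-3), for illustration only.
# "2x10" stands for two hidden layers of ten neurons each.
results = [
    ("1x5", 0.060, 0.062),
    ("1x10", 0.050, 0.053),
    ("2x10", 0.040, 0.043),
    ("2x15", 0.039, 0.047),
    ("3x15", 0.0385, 0.052),
]
best = select_architecture(results, tol=0.005)
```

With these invented numbers, the step from "2x10" to "2x15" yields a negligible training gain and a worse test RMSE, so the search stops at "2x10".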

ANN training and test
The training of the ANN was carried out by using the extensive experimental dataset available on the Mongolia and Australia sites, integrated with model simulations. PI_X was able to indicate the vegetation seasonal cycle of the Australian site, as is shown for example for one of the ground stations in the site (ADELONG ROCHEDALE station, Lat. 35.37° S, Lon. 148.06° E) (Fig. 4), whereas the semi-arid region of Mongolia did not show any significant periodic variation.
On both these sites, the T_b at C-band, in V polarization and at an incidence angle > 50°, showed a noticeable sensitivity to SMC (see Fig. 3). The spread of the data indicates that other factors also had an important effect, and that vegetation undoubtedly plays a major role among them. On the other hand, the PI_X, used as an input of the ANN, performs the correction for vegetation effects through its correlation to the optical depth.
In order to increase the amount of data for the training and testing processes, the experimental dataset described above was enlarged with simulated data by using the Radiative Transfer Theory in the formulation of the tau-omega model. Model simulations performed at all the frequencies and polarizations considered were iterated by randomly varying the input values of SMC and surface temperature, T_s, in a reasonable range of expected values (i.e. SMC from 0.05 m³ m⁻³ to 0.5 m³ m⁻³, and T_s from 275 K to 320 K). The lower threshold of 275 K was selected in order to eliminate frozen soils. The effect due to surface roughness was taken into account by including in the ANN training set T_b data corresponding to different surface roughness conditions. In the end, a dataset of 10 000 simulated values of T_b was generated. The dielectric constant was derived from the input of SMC by means of the model from Dobson et al. (1985), and the range of the other two inputs required by the model, namely the optical depth (τ) and the equivalent single scattering albedo (ω), was set so as to assure consistency between the model simulations and the experimental data.
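As an illustration of how such a simulated dataset could be generated, the sketch below draws random inputs within the stated ranges and evaluates the standard zero-order tau-omega expression for one polarization. Several elements are assumptions made for this example only: the incidence angle of 55°, a canopy temperature equal to T_s, uniform sampling, τ and ω drawn from the C-band ranges reported in the text, and above all the function `toy_emissivity`, which is a hypothetical linear placeholder for the Dobson et al. (1985) dielectric model combined with the surface reflectivity computation.

```python
import math
import random


def tau_omega_tb(emissivity, t_s, tau, omega, theta_deg=55.0):
    """Zero-order tau-omega brightness temperature (K), one polarization.

    Assumes that the canopy is at the soil temperature t_s.
    """
    # Canopy transmissivity along the slant path
    gamma = math.exp(-tau / math.cos(math.radians(theta_deg)))
    soil = emissivity * gamma
    canopy = (1.0 - omega) * (1.0 - gamma) * (1.0 + (1.0 - emissivity) * gamma)
    return t_s * (soil + canopy)


def toy_emissivity(smc):
    """Hypothetical placeholder: emissivity decreasing linearly with SMC."""
    return 0.95 - 0.6 * smc


def simulate(n, seed=0):
    """Return n tuples (smc, t_s, tb) with randomly drawn inputs."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        smc = rng.uniform(0.05, 0.5)      # m3 m-3
        t_s = rng.uniform(275.0, 320.0)   # K, frozen soils excluded
        tau = rng.uniform(0.16, 1.1)      # C-band range quoted in the text
        omega = rng.uniform(0.03, 0.08)   # C-band range quoted in the text
        tb = tau_omega_tb(toy_emissivity(smc), t_s, tau, omega)
        rows.append((smc, t_s, tb))
    return rows


dataset = simulate(10_000)
```

A realistic implementation would replace `toy_emissivity` with a dielectric model and a roughness-corrected reflectivity, and would repeat the computation for every frequency and polarization.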
Since no direct measurements of vegetation were included in the dataset, the values of τ and ω were estimated from the experimental data by using a direct minimization method. This was done by searching for a pair of τ and ω values that would minimize the RMS error between the T_b simulated for each measured SMC value and the corresponding AMSR-E acquisition. The minimization was implemented through the Nelder-Mead simplex algorithm (Nelder and Mead, 1965), which is a popular search method for multidimensional unconstrained minimization.
In this case, the Cost Function (CF) to be minimized by varying τ and ω was the RMS error between the measured and simulated T_b, where:
- T_bVm(f) and T_bHm(f) are the T_b measured at the frequency f (from C- to Ka-band).
- T_bVs(f) and T_bHs(f) are the outputs of the tau-omega model for each measured value of SMC and surface temperature, which were obtained by varying the τ and ω values until the minimum of the CF was reached.
The above procedure was repeated for each T_b pair (V and H pol.) of the experimental dataset, thus enabling us to associate the estimated values of τ and ω with each AMSR-E acquisition and to establish empirical relationships between these two quantities and the frequency.
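The inversion described above can be sketched with the Nelder-Mead implementation available in SciPy (`scipy.optimize.minimize` with `method="Nelder-Mead"`). This is a simplified illustration, not the code used in this study: it works on a single frequency, takes the soil emissivities at V and H polarization as known inputs (hypothetical values here), assumes an incidence angle of 55° and a canopy at the soil temperature, and checks the procedure on synthetic T_b generated with known τ and ω.

```python
import math

from scipy.optimize import minimize

THETA_DEG = 55.0  # assumed incidence angle


def tau_omega_tb(emissivity, t_s, tau, omega):
    """Zero-order tau-omega brightness temperature (K), one polarization."""
    gamma = math.exp(-tau / math.cos(math.radians(THETA_DEG)))
    return t_s * (emissivity * gamma
                  + (1.0 - omega) * (1.0 - gamma)
                  * (1.0 + (1.0 - emissivity) * gamma))


def cost(params, tb_v_meas, tb_h_meas, e_v, e_h, t_s):
    """RMS difference between measured and simulated T_b (V and H pol.)."""
    tau, omega = params
    dv = tb_v_meas - tau_omega_tb(e_v, t_s, tau, omega)
    dh = tb_h_meas - tau_omega_tb(e_h, t_s, tau, omega)
    return math.sqrt(0.5 * (dv * dv + dh * dh))


def invert_tau_omega(tb_v_meas, tb_h_meas, e_v, e_h, t_s, start=(0.5, 0.05)):
    """Estimate (tau, omega) for one V/H pair with the Nelder-Mead simplex."""
    result = minimize(
        cost, start,
        args=(tb_v_meas, tb_h_meas, e_v, e_h, t_s),
        method="Nelder-Mead",
        options={"xatol": 1e-8, "fatol": 1e-8, "maxiter": 2000},
    )
    return float(result.x[0]), float(result.x[1])


# Synthetic check: hypothetical emissivities and known tau = 0.4, omega = 0.05
e_v, e_h, t_s = 0.93, 0.75, 290.0
tb_v = tau_omega_tb(e_v, t_s, 0.4, 0.05)
tb_h = tau_omega_tb(e_h, t_s, 0.4, 0.05)
tau_est, omega_est = invert_tau_omega(tb_v, tb_h, e_v, e_h, t_s)
```

Because the search is unconstrained, nothing prevents the simplex from visiting non-physical values (e.g. negative ω) on noisy data; a practical implementation would probably check the result against the expected ranges.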
For the dataset considered, the τ values obtained at C-band ranged between 0.16 and 1.1, while the corresponding ω values were between 0.03 and 0.08. The variation of τ and ω with the frequency was also investigated, in order to establish empirical relationships for deriving their values at frequencies higher than C-band. For example, relationships were derived between the average values of τ(f) and ω(f) of the entire dataset and the frequency f (in GHz). The reliability of this inversion method in estimating τ values was verified by representing the polarization index at X-band (PI_X) derived from the AMSR-E as a function of τ (at the same frequency) estimated as above by using the Nelder-Mead inversion.
The relationship obtained is shown in Fig. 5 and in the following Eq. (7):

$$PI_X = 11.18\,\exp(-3.12\,\tau) \qquad (7)$$

which is in agreement with the results found in Paloscia and Pampaloni (1988).
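The exponential regression between PI_X and τ reported in the caption of Fig. 5 (PI_X = 11.18 exp(−3.12 τ)) can be evaluated, and formally inverted, as in the short sketch below. The inverse function is added here for illustration only; the text does not state that τ was retrieved from PI_X in this way.

```python
import math

A, B = 11.18, 3.12  # regression coefficients quoted for Fig. 5


def pi_x_from_tau(tau):
    """PI_X predicted by the regression for a given optical depth."""
    return A * math.exp(-B * tau)


def tau_from_pi_x(pi_x):
    """Formal inverse of the regression (illustrative only)."""
    if pi_x <= 0.0:
        raise ValueError("PI_X must be positive to invert the regression")
    return -math.log(pi_x / A) / B


pi_bare = pi_x_from_tau(0.0)       # no vegetation attenuation
pi_dense = pi_x_from_tau(1.1)      # upper end of the C-band tau range
tau_back = tau_from_pi_x(pi_dense)
```

The regression makes the role of PI_X explicit: the index decreases monotonically as the optical depth grows, which is why it can serve as a proxy for vegetation.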
- Surface temperature between 275 K and 320 K.
- ω (C-band) between 0.03 and 0.08, with ω at higher frequencies computed from the one at C-band.
The results of these iterations, combined with the experimental data, are shown in Fig. 6, where T_b at C- (top) and Ka- (bottom) bands (V pol.) are represented as a function of SMC, within the variability of the surface parameters, as assumed above. The training of the ANN was carried out by using half (6500) of all these experimental and simulated data. The test performed on the second half of the experimental data produced the diagram of Fig. 7, in which the soil moisture estimated by the algorithm (SMC_est) is compared with the soil moisture measured on the ground (SMC_meas). This result can be considered to be the main test of the algorithm's performance in estimating SMC.

Estimate of Snow Depth
The estimate of SD was likewise carried out by means of a second ANN, trained with an extensive set of experimental data (Siberian dataset) kindly provided by JAXA. The ANN used had the same basic characteristics (e.g. type and training procedures) as the one used for SMC retrieval. The key frequency channels for detecting the presence of snow on the ground and its depth or water equivalent were at Ku- and Ka-band (V and H polarizations) (Chang et al., 1982; Kelly et al., 2003). Moreover, X-band data were also considered, due to a certain sensitivity to SD demonstrated by this frequency. Thus, all three of these frequencies were used for implementing the ANN algorithm. The index FI used below to detect snow is computed from the T_b of these channels, where V and H are the polarizations, and Ku and Ka are the frequencies considered.
The analysis of the experimental data collected in the Siberian site and in other regions of the world with SSM/I and AMSR-E (Macelloni et al., 2003) showed that FI is a good indicator of the presence of snow. The threshold for having snow on the ground was established as

$$FI \geq 4\ \mathrm{K} \qquad (10)$$

Thus, the retrieval of SD was planned in two phases. The first step was the identification of the snow-covered area by using FI; the ANN was then used to retrieve SD.
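The two-phase logic (snow detection with the 4 K threshold on FI, followed by the ANN retrieval) can be sketched as follows. The definition of FI used here, i.e. the mean over V and H polarizations of the Ku minus Ka T_b difference, is an assumption of this example, since its explicit expression is not available in the text above. The argument `sd_model` is a hypothetical callable standing in for the trained ANN.

```python
FI_SNOW_THRESHOLD_K = 4.0  # threshold of Eq. (10)


def fi_index(tb_ku_v, tb_ku_h, tb_ka_v, tb_ka_h):
    """Assumed FI: mean over V and H of the Ku minus Ka T_b difference (K)."""
    return 0.5 * ((tb_ku_v - tb_ka_v) + (tb_ku_h - tb_ka_h))


def retrieve_sd(pixel, sd_model):
    """Return the snow depth for a pixel, or None when no snow is detected.

    `pixel` is a dict of brightness temperatures in K; `sd_model` is any
    callable mapping a pixel to a snow depth in cm (the ANN in practice).
    """
    fi = fi_index(pixel["ku_v"], pixel["ku_h"], pixel["ka_v"], pixel["ka_h"])
    if fi < FI_SNOW_THRESHOLD_K:
        return None            # phase 1: pixel classified as snow-free
    return sd_model(pixel)     # phase 2: depth retrieval on snow pixels


# Hypothetical pixels and a dummy model returning a constant depth
snowy = {"ku_v": 250.0, "ku_h": 240.0, "ka_v": 230.0, "ka_h": 222.0}
bare = {"ku_v": 260.0, "ku_h": 250.0, "ka_v": 259.0, "ka_h": 249.0}
dummy_model = lambda pixel: 25.0

sd_snowy = retrieve_sd(snowy, dummy_model)
sd_bare = retrieve_sd(bare, dummy_model)
```

Keeping detection separate from retrieval means that the ANN is only ever applied to pixels already classified as snow-covered.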

ANN training and test
In this case, the training of the ANN was carried out by using the extensive experimental dataset available on the Siberian sites. The temporal trends of T_b at X-, Ku- and Ka-band at V polarization collected from 2002 to 2009 on these sites showed good agreement with the corresponding SD measurements for the whole dataset, as can be observed in Fig. 8. Linear relationships between T_b and SD were derived for each band, where V is the polarization and Ku (or Ka or X) is the frequency band considered.
No model simulations were added to the training of the ANN, due to the very large extent of the database. The training of the ANN was carried out by using half of all these experimental data. The test performed on the second half of the dataset produced the diagram in Fig. 9, in which the SD estimated by the algorithm (SD_est) is compared with the SD measured on the ground (SD_meas). The regression equation is

$$SD_{est} = 0.78\,SD_{meas} + 5.97 \qquad (14)$$

with R² = 0.79, RMSE = 5.54 cm, and BIAS = 0.059 cm. Also in this case, the result can be considered to be the main test of the performance of the algorithm in estimating SD.
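The statistics quoted for the test (regression line, R², RMSE and BIAS) can be computed from paired measured and estimated values as in the sketch below. The definitions adopted are assumptions of this example, since they are not spelled out in the text: R² is taken as the squared Pearson correlation, and BIAS as the mean of the estimated minus measured values. The sample values are invented for illustration.

```python
import math


def regression_stats(measured, estimated):
    """Least-squares line of estimated vs. measured, plus R2, RMSE and BIAS."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_e = sum(estimated) / n
    s_mm = sum((m - mean_m) ** 2 for m in measured)
    s_ee = sum((e - mean_e) ** 2 for e in estimated)
    s_me = sum((m - mean_m) * (e - mean_e)
               for m, e in zip(measured, estimated))
    slope = s_me / s_mm
    intercept = mean_e - slope * mean_m
    r2 = s_me ** 2 / (s_mm * s_ee)      # squared Pearson correlation
    rmse = math.sqrt(sum((e - m) ** 2
                         for m, e in zip(measured, estimated)) / n)
    bias = sum(e - m for m, e in zip(measured, estimated)) / n
    return {"slope": slope, "intercept": intercept,
            "r2": r2, "rmse": rmse, "bias": bias}


# Invented snow depths in cm, for illustration only
sd_meas = [10.0, 20.0, 30.0, 40.0]
sd_est = [12.0, 18.0, 33.0, 41.0]
stats = regression_stats(sd_meas, sd_est)
```

Note that RMSE and BIAS are computed on the raw differences, not on the residuals of the fitted line, so a retrieval can show a high R² and still carry a systematic offset.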

Validation
Validation of the algorithm was carried out on some test areas in Europe and the US, where ground measurements were available. For SMC, one area was located in Northern Italy (approximately 100 × 100 km²) and four others of smaller dimensions in the US. Moreover, a further area in Scandinavia (200 × 100 200 km²) was chosen for SD validation. This validation procedure was also useful in evaluating the performance of HydroAlgo at different spatial scales.

Soil moisture
The validation of HydroAlgo for the retrieval of SMC was performed on the Scrivia watershed in Italy, where a long-term experimental study devoted to SMC and vegetation was carried out in the hopes of fine-tuning operational procedures for flood forecasting and alert. The validation was repeated for all the dates for which ground measurements were available. The results are shown in Table 1. The statistical parameters of the regression between estimated and measured SMC are: R² = 0.82, RMSE = 0.035 m³ m⁻³, BIAS = 0.09 m³ m⁻³.
A further, more extensive test was carried out by comparing AMSR-E data to the ground SMC data collected in four experimental watersheds of the Agricultural Research Service (ARS) in the US, kindly provided by Dr. Tom Jackson (Jackson et al., 2010).
These watersheds have already been described in Sect. 2.1. Table 2 shows the values of R², RMSE and BIAS of the relationship between estimated and ground-measured SMC. These statistical parameters were obtained for each test area and for both ascending and descending orbits. R² is generally not very high, whereas RMSE and BIAS are rather low, at ≤ 0.05 m³ m⁻³ and ≤ 0.02 m³ m⁻³, respectively. The results demonstrated that the algorithm performs within a specified accuracy of ≤ 0.06 m³ m⁻³ (Paloscia et al., 2012).

Snow Depth
The SD retrieval was validated over a test area in Scandinavia, by comparing the algorithm outputs with the averaged SD measurements of four meteorological stations located in Kautokeino, Sodankyla, Muonio, and Pajala. Once the snow-covered areas had been identified by means of the FI ≥ 4 K threshold, the relationship obtained by comparing the ground-measured SD (SD_meas) and the corresponding outputs of the ANN (SD_est) was the following:

$$SD_{est} = 0.81\,SD_{meas} + 7.54 \qquad (15)$$

with R² = 0.79, RMSE = 9.13 cm, and BIAS = −0.95 cm. The results obtained are shown in Fig. 10.

Algorithm applications
Once the algorithm was validated on relatively small areas, an attempt was made to test its validity further on a larger scale. Although it cannot be considered a real validation, due to the absence of corresponding and adequate ground data, this study can be useful for understanding the capability of the algorithm to reasonably estimate SMC, SD and PWC in regions other than those where it has been tested, and therefore also for verifying its flexibility. This is particularly important for evaluating the capability of the ANN to generalize from a training phase that was based on data derived from small areas. Although it is difficult to obtain ground data of SMC, SD and PWC in order to validate the algorithm at such a large scale, we have observed that the range of these parameters is generally compatible with the climatic regions and the meteorological conditions related to latitude and seasons.
In order to do this, the algorithm was assessed on the Po Valley in Northern Italy and over the entire terrestrial globe for the SMC, on a portion of Europe and over the entire terrestrial globe for the SD, and over Africa for PWC. In all these cases, only modest information on ground truth was available, and an evaluation of the resulting maps was thus performed on the basis of climatic and meteorological characteristics of the regions of the globe investigated.

Fig. 13. SD maps (in cm) retrieved on Europe before and after heavy snowfall events in December 2009 and 2010. On 15 December 2009 and 9 December 2010 the snow cover is sparse and almost limited to Scandinavia and the Alps, whereas, after the events, the snow cover appears to be much more widespread and evident even in Central Italy, where the snow depth measured on 20 December 2009 in the area close to Florence (white circle) was about 10 cm on the ground, which is the value estimated by the algorithm.

Soil moisture
SMC maps produced with the algorithm over all of Northern Italy are shown in Fig. 11. The maps refer to 27 November 2003 and 4 June 2004. In spite of the coarse ground resolution, a marked difference in SMC between the two dates is recognizable and is in agreement with the seasonal and meteorological conditions. In November, the weather was wet with frequent rainfalls, whereas in June a severe drought occurred. The black circles represent the test area of Alessandria, where ground measurements were collected on the same dates and the algorithm was validated. It is interesting to note that a region of rice fields close to Vercelli, in the northwestern area of the images, is clearly recognizable as it is generally wetter than the other agricultural fields, especially in June when the rice fields were flooded.

Global SMC maps are shown in Fig. 12a, b. Snow cover and forests are masked in the images. At least 4 levels of SMC can easily be identified. Although no ancillary information is available, the results are in reasonable agreement with the climatic and seasonal meteorological conditions of the various zones. The slightly higher SMC values for the Arabian and Australian coasts correspond to the presence of sparse vegetation, as these regions are more humid than the desert zones. The seasonal variation in SMC shows an opposite trend in the two hemispheres: e.g. Australia is wetter in August than in February.

Snow Depth
The SD maps of all of Europe, generated in December 2009 and 2010, are shown in Fig. 13, in order to include the Alps, the Apennines, and the Balkan Mountains as well. The snow-covered areas are clearly visible, and at least 4 ranges of SD can also be distinguished. The maps were made before and immediately after heavy snowfall events. On 15 December 2009 and 9 December 2010, the snow cover was sparse and almost limited to Scandinavia and the Alps, whereas the snow cover after the events appears to have been much more widespread and evident even in Central Italy, where the SD measured on the ground in the area close to Florence was about 10 cm, which is the value estimated by the algorithm. Lastly, two SD maps of the whole world, obtained in December 2009 and February 2010 by using HydroAlgo, are shown in Fig. 14. The presence of snow, especially in the Northern hemisphere, is clearly visible.

Vegetation biomass
In this context, vegetation maps of PWC (kg m⁻²) are generated from PI_X mainly to mask dense vegetation in SMC and SD maps and to correct the SMC estimate for the effects of low vegetation. However, these maps can represent an additional output of the algorithm.
For example, a vegetation map of Africa computed from PI_X is shown in Fig. 15a and b, in which the PWC obtained from PI_X is compared with the one derived from (optical) NDVI obtained from Free Vegetation Products (http://free.vgt.vito.be/home.php). The direct comparison in the two maps between the PWC values from SPOT4 and from PI_X, carried out pixel by pixel, gave the following statistical parameters: R² = 0.87, RMSE = 1 kg m⁻², and BIAS = 1.89 × 10⁻² kg m⁻². According to these results, the vegetation maps on a global scale can reasonably be generated by using PI_X as a byproduct of HydroAlgo, instead of NDVI, in view of the advantage of using the same sensor for all applications.

Summary and conclusions
A new algorithm (HydroAlgo) for generating simultaneous maps of SMC and SD from AMSR-E data has been implemented within the framework of the GCOM-W/AMSR-2 project of JAXA and the Italian National Project ASI/PROSA, for the purpose of developing products useful for hydrological applications and natural disaster management. The algorithm makes exclusive use of AMSR-E-like data. The C-band channel provides basic information for the retrieval of SMC, while SD is essentially obtained from X-, Ku- and Ka-band channels. Additional information on surface temperature and vegetation cover, which was useful for improving the retrieval accuracy of the algorithm, was obtained from the T_b at Ka-band (V polarization) and from the Polarization Index at X-band (PI_X), respectively. The latter quantity made possible the generation of maps of vegetation biomass (VB, expressed as PWC) as an auxiliary product.
No other ancillary data were required to obtain the results presented here. HydroAlgo was able to separate 4-5 levels of SMC and SD at a nominal ground resolution of 10 km × 10 km, by using a specific algorithm for improving the spatial resolution. Both SMC and SD were retrieved by using ANN methods trained with a large set of experimental data. For the retrieval of SMC, the dataset was enriched by model simulations. The global maps of SMC, SD and PWC were reprojected over a fixed grid, in geographical coordinates with a spatial resolution of about 10 km × 10 km. This represented an improvement in the spatial resolution of the input C- and X-band channels involved in SMC and PWC estimates. Processing an entire day of AMSR-E acquisitions required about 20 min.
In order to compute a reliability index of the output products, the entire process took into account the percentage of bad input data (including those affected by RFI) and the estimates of output parameters outside the established range. This index has been listed in the header file associated with each output.
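The text does not give the formula of the reliability index. Purely as an illustration of how the two ingredients mentioned above could be combined, the sketch below defines a hypothetical index as the percentage of pixels that are neither flagged as bad input nor retrieved outside the accepted range. Both the formula and the assumption that the two categories do not overlap are inventions of this example.

```python
def reliability_index(n_total, n_bad_input, n_out_of_range):
    """Hypothetical reliability index, in percent.

    Share of pixels that are neither bad input (e.g. RFI-affected) nor
    retrieved outside the accepted range; the two counts are assumed
    not to overlap.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    flagged = n_bad_input + n_out_of_range
    return 100.0 * max(0.0, 1.0 - flagged / n_total)


# Invented counts: 1000 pixels, 50 bad inputs, 25 out-of-range outputs
index = reliability_index(1000, 50, 25)
```

A single number of this kind is easy to store in a header file, which is consistent with how the index is said to be distributed.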
Furthermore, the algorithm was applied at several spatial scales over different regions of the Earth. Although this part of the paper cannot be considered a real validation, due to the absence of adequate ground data, it is important to note that the algorithm is able to reasonably follow the variations of SMC and SD at different latitudes and in various climatic conditions. In any case, additional tests on different areas and seasons would be desirable, in order to evaluate more thoroughly the operational capabilities of the implemented code.

Fig. 2. The Plant Water Content (PWC, in kg m⁻²) estimated from the X-band Polarization Index, compared to the PWC estimated from NDVI, for a large area in Africa (0-20° N/16°-17° E). The line represents the regression equation.

Fig. 3. The brightness temperatures (T_b) measured at C-band (in V and H pol.) in the Australian and Mongolian test sites as a function of volumetric SMC (m³ m⁻³).

Fig. 5. PI_X, derived from the AMSR-E measurements, as a function of the optical depth estimated by using the Nelder-Mead inversion method. The obtained regression is: PI_X = 11.18 exp(−3.12 τ) (R² = 0.99).

Fig. 6. Experimental (red) and simulated (blue) T_b data (V pol.) of the whole dataset (Australia and Mongolia) as a function of SMC, in m³ m⁻³ (top: C-band; bottom: Ka-band).

Fig. 7. SMC estimated by using the ANN algorithm as a function of SMC measured on the ground for the part of the Australian and Mongolian dataset not used for training.

Fig. 8. Temporal trends of T_b at X-, Ku- and Ka-bands and the corresponding snow depth (SD, in cm) measurements obtained for the Siberia dataset from 2002 to 2009.

Fig. 11. SMC maps generated by using HydroAlgo in Northern Italy. The maps refer to 27 November 2003 and 4 June 2004. Black circles indicate the ground truth data area.

A direct correlation between T_b at Ka-, X-, and Ku-band and SD resulted in the following relationships: T_b XV = −0.50 SD + 149.34 (

Fig. 12. (a) SMC maps (in m³ m⁻³) of the entire world obtained in December 2009, February and April 2010, by using HydroAlgo. Some AMSR-E scans are missing, as we can see in Africa and North America in February 2010. (b) SMC maps (in m³ m⁻³) of the entire world obtained in June, August, and October 2010, by using HydroAlgo. Some AMSR-E scans are missing, as we can see in Africa and South America (black lines).

Fig. 14. SD map (in cm) of the entire world obtained by HydroAlgo in December 2009 and February 2010. The greater snow cover in Europe in February is evident. Some AMSR-E scans are missing, as we can see in Africa and North America in February 2010.

Fig. 15. Vegetation maps of PWC for the entirety of Africa extracted from PI_X (top) and NDVI (bottom), respectively. The relationship between NDVI and PWC was derived from Jackson et al. (2004).

°C in winter to 20 °C in summer. The acquired dataset covered 7 winter seasons, from October 2002 to May 2009, with a significant lack of data for the 2008-2009 winter.

Table 1. Comparison between measured and estimated averaged values of SMC (in m³ m⁻³) for the Scrivia test area at different dates.

Table 2. Statistical parameters of the relationships between measured and estimated averaged values of SMC (in m³ m⁻³) for each ARS test area and for both ascending (top) and descending (bottom) orbits.