The International Soil Moisture Network: serving Earth system science for over a decade

W. Dorigo et al., Hydrol. Earth Syst. Sci., 25, 5749–5804, 2021, https://doi.org/10.5194/hess-25-5749-2021

Abstract. In 2009, the International Soil Moisture Network (ISMN) was initiated as a community effort, funded by the European Space Agency, to serve as a centralised data hosting facility for globally available in situ soil moisture measurements (Dorigo et al., 2011b, a). The ISMN brings together in situ soil moisture measurements collected and freely shared by a multitude of organisations, harmonises them in terms of units and sampling rates, applies advanced quality control, and stores them in a database. Users can freely retrieve the data from this database through an online web portal (https://ismn.earth/en/, last access: 28 October 2021). Since then, the ISMN has evolved into the primary in situ soil moisture reference database worldwide, as evidenced by more than 3000 active users and over 1000 scientific publications referencing the data sets provided by the network. As of July 2021, the ISMN contains the data of 71 networks and 2842 stations located all over the globe, with a time period spanning from 1952 to the present. The number of networks and stations covered by the ISMN is still growing, and approximately 70 % of the data sets contained in the database continue to be updated on a regular or irregular basis. The main scope of this paper is to inform readers about the evolution of the ISMN over the past decade, including a description of network and data set updates and quality control procedures. A comprehensive review of the existing literature making use of ISMN data is also provided in order to identify current limitations in functionality and data usage and to shape priorities for the next decade of operations of this unique community-based data repository.

Introduction
Ground-based soil moisture measurements of land surface variables are indispensable in the process of developing, validating, and advancing spatially contiguous data sets derived from satellites or models (Gruber et al., 2020). Although the first systematic measurements of soil moisture started well before the satellite era in the former Soviet Union to support agricultural decision-making (Robock et al., 2000), it was not until the early 2000s that soil moisture monitoring networks started being widely established as part of hydrological and meteorological observing capacities. In particular, the launch of the Soil Moisture Ocean Salinity (SMOS) mission of the European Space Agency (ESA) in 2009 and the launch of the Soil Moisture Active Passive (SMAP) mission of the National Aeronautics and Space Administration (NASA) in 2015 boosted the establishment of new research networks.
While all networks are valuable assets for assessing the skill of soil moisture products under various conditions and scales, their usage is hampered by the diversity of sensors, data formats, quality control, and accessibility mechanisms. The need to bring together and to harmonise available soil moisture data was recognised by the international soil moisture community and expedited by the Global Energy and Water cycle Exchanges (GEWEX) project of the World Climate Research Programme (WCRP) with support of the Committee on Earth Observation Satellites (CEOS), the Global Climate Observing System (GCOS), and the Group on Earth Observations (GEO). In the advent of the SMOS mission, ESA decided to provide the financial impetus to establish a global reference database of in situ soil moisture measurements for the purpose of satellite product development and validation. As a result, the International Soil Moisture Network (ISMN) went online in 2010 (Dorigo et al., 2011b, a).
The primary objective of the ISMN is to collect in situ soil moisture data sets, shared by various data organisations on a voluntary basis, and make them available in a harmonised format through a centralised free and open web portal (https://ismn.earth/en/, last access: 28 October 2021). While 10 years after its launch the core objective of the ISMN remains valid, its functionality has expanded since then. This new functionality includes the integration of advanced quality control methods (Sect. 3), the provision of additional metadata and ancillary variables (e.g. precipitation and soil and air temperature), ongoing automation, the provision of software code to users, and the implementation of various tools to promote the information exchange between users, the ISMN, and the data providers.
Moreover, the ISMN has substantially grown in terms of networks, stations, and data sets.
Data from the ISMN have supported hundreds of scientific papers on soil moisture, in particular on the validation of satellite products and land surface models (e.g. Al-Yaari et al., 2019b; Brocca et al., 2014a; Beck et al., 2021). Several operational data producing services routinely access the ISMN data for repeated quality assurance, including ESA's Climate Change Initiative, the Copernicus Global Land Service (Bauer-Marschallinger et al., 2018), and the Copernicus Climate Change Service (C3S; Dorigo et al., 2017). Other domains have also exploited the ISMN data, e.g. meteorology, drought monitoring, or land-atmosphere coupling (Sect. 4).
Despite the valuable contribution of the ISMN to satellite and climate communities, multiple challenges have yet to be mastered, including the heterogeneous availability in space and time, scale differences between in situ measurements and satellite or model sampling, full characterisation and traceability of uncertainties, and differences in spatiotemporal support of the observations caused by different measurement techniques and landscape heterogeneity (Ochsner et al., 2013). New scientific avenues to improve the spatial coverage could be the inclusion of soil moisture data sets from low-cost sensors collected by citizens (Sect. 5.1.4). For climate applications, stable long-term reference data are required, calling for the coordinated establishment and maintenance of Fiducial Reference Measurement (FRM) stations, as outlined by the GEO/CEOS Quality Assurance framework for Earth Observation (QA4EO) (GCOS, 2016; Gruber et al., 2020).
The scope of this paper is to inform readers about the evolution of the ISMN over the past decade, including a description of network and data set updates, new quality control procedures, and new functionality of the data portal. We also review scientific literature making use of ISMN data to assess the achievements facilitated by the ISMN and to identify current limitations in data availability and functionality and challenges in data provision and use. Based on this review, prerequisites and priorities needed to ensure another decade of this unique community-based data repository are defined.

The ISMN data hosting facility
Although the ISMN may be considered a mere data repository, there is much more to it. Its core functionality includes collecting data from participating data providing networks, harmonising them in terms of units, sampling rates, naming, and metadata, performing automated quality control, storing the data and metadata in a searchable database, and making them available through a web interface. And, from a system perspective, it entails even more, e.g. communication with (potential) data providers and users (Fig. B1).

Data and metadata summary
As of July 2021, the ISMN contains 71 networks providing access to a total of 2842 stations, approximately 10 000 soil moisture data sets, and an additional 10 000 data sets of other meteorological variables (collocated with the soil moisture measurements). Although the ISMN is a global network, most networks and stations are located in North America (15), Europe (28), Australia (3), and Asia (19; Fig. 1). New networks are continuously being added, while many existing networks are still being upgraded with additional stations, soil moisture sensors, and meteorological variables. The diversity of networks is large, ranging from networks with a single station to networks comprising more than 400 stations covering different landscape types as well as periods. The distribution of stations per Köppen-Geiger climate class is given in Fig. 2.
Most of the networks originate from scientific initiatives in various disciplines (e.g. remote sensing, soil sciences, agriculture, and meteorology), and only a few are run by operational entities like national weather or environmental services. Consequently, a lack of sustainable project funding has forced several scientific networks to close after some time. As a result, 18 out of the 71 networks contained in the ISMN have become inactive and will no longer provide data set updates (Fig. 3). Data go back as far as 1952, but none of the data sets spans the entire ∼ 70 year period. The longest available time series (∼ 40 years) is provided by the RUSWET-AGRO network in the former Soviet Union, while the longest operating network still active is SNOTEL in the USA.
Most networks provide data set updates at yearly or irregular intervals. Data sets from six networks (ARM, COSMOS, FMI, SCAN, SNOTEL, U.S. Climate Reference Network (USCRN), and WegenerNet), comprising approximately 900 stations, are updated in near-real time (NRT; status in July 2021), which is currently defined as once per day. While the earliest networks were sampled manually at weekly, fortnightly, or even monthly intervals, most current networks take their measurements using electronic devices at daily, hourly, or even more frequent sampling rates. For more details on the networks, see Sect. A, Dorigo et al. (2011b, 2013), or the individual references herein.
The variables contained in the ISMN (Tables 1 and 2) originate from networks that were built for various purposes, which consequently do not all contribute the same information. Since the ISMN was initially founded as a validation database for satellite (surface) soil moisture data, each station in the database provides soil moisture for the upper soil layer (≤ 0.10 m depth). Soil moisture data sets in the ISMN can go as deep as 2 m, but generally with a decreasing number of measurement locations with depth (Table C1). Some stations deploy more than one sensor at a certain depth, either as replacement in case of failure of one of the sensors or to characterise local soil moisture variability. The availability of meteorological data, like precipitation and soil and air temperature, is even more heterogeneous, depending on the scope of the network or the data sharing policy of the data providing organisation. It is also quite common that single time series in the database are composed of the consecutive measurements of two or more different sensors when a sensor is replaced after failure.
Metadata information can be divided into two categories, i.e. mandatory metadata, which allow for an unambiguous identification of each network, station, and measurement in the ISMN database (Fig. D1), and optional metadata, shared by data providers to allow more in-depth analysis of their data sets. To be consistent between sites, the mandatory variables of climate, land cover, and soil characteristics are taken from external databases. Climate classification is taken from the Köppen-Geiger database, with a resolution of 0.1° (Peel et al., 2007). Dynamically evolving land cover for the years 2000, 2005, and 2010 is obtained from ESA's Climate Change Initiative (CCI) land cover v1.6.1 with 300 m resolution. Soil information is retrieved from the Harmonized World Soil Database (HWSD v1.1; FAO/IIASA/ISRIC/ISS-CAS/JRC, 2009) with a 30″ (approximately 1 km) sampling, although the actual resolution may strongly vary from location to location.
If available, data providers can optionally share their own, more detailed, characterisations of land cover, soil, and quality flags with the ISMN. These are stored in addition to the same variables from external sources. All static variables per measurement site and depth are listed in Table 2.

Data collection and harmonisation
Data collection is done either manually (mostly by email) or automatically, depending on the degree of automation at the data provider side. Although standards for in situ soil moisture data collection are available, there is no general agreement on them within the community, and compliance is not prescribed for participation in the ISMN. Thus, the data being contributed to the ISMN are heterogeneous with regard to units (e.g. volumetric soil moisture, water depth, mass, soil saturation, and plant-available water), installation depth, integration length, and positioning of the sensors (vertical and horizontal), the metrical system, the sampling interval, and the time zones used for the measurement time stamps.
The first harmonisation step for all data and metadata involves the conversion of units to internationally agreed scientific units (e.g. metres and degrees Celsius). Then, following the recommendations of the World Meteorological Organization for weather observation and forecasting (Williams, 2010), all data are resampled to hourly Coordinated Universal Time (UTC) reference time steps. Data that are available at higher sampling rates are thinned by selecting the individual measurements at the hourly UTC reference time step or the ones that are closest in time within a window of ±0.5 h. If there is no measurement available within this interval, the respective time step is not stored in the database but can be recreated and filled with a no-data value upon download. The temporal resampling scheme is valid for all included dynamic variables, except precipitation. Since precipitation is a flux and not a state variable, all measurements within the hourly interval are summed to represent the full amount of precipitation within the preceding hour (Dorigo et al., 2011b).
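The resampling rules described above can be illustrated with a minimal Python sketch. It is not the ISMN implementation: the function names are hypothetical, a sample lying exactly 30 min from two reference steps is assigned to the later one, and the "preceding hour" for precipitation is assumed to be the half-open interval (t − 1 h, t].

```python
from datetime import timedelta, timezone

HALF_WINDOW = timedelta(minutes=30)  # the +/- 0.5 h window described in the text


def nearest_hour(ts):
    """Return the full hour closest to ts (assumption: ties go to the later hour)."""
    floor = ts.replace(minute=0, second=0, microsecond=0)
    return floor + timedelta(hours=1) if ts - floor >= HALF_WINDOW else floor


def thin_state_variable(samples):
    """Keep, per hourly UTC step, the sample closest in time to that step.

    samples: iterable of (timezone-aware datetime, value) pairs.
    Every sample lies within 0.5 h of its nearest hour, so the window is
    respected by construction; hours without any sample are simply absent.
    """
    best = {}
    for ts, value in samples:
        ts = ts.astimezone(timezone.utc)
        step = nearest_hour(ts)
        gap = abs(ts - step)
        if step not in best or gap < best[step][0]:
            best[step] = (gap, value)
    return {step: value for step, (_, value) in sorted(best.items())}


def sum_precipitation(samples):
    """Sum precipitation samples onto the hourly step that closes their interval."""
    totals = {}
    for ts, value in samples:
        ts = ts.astimezone(timezone.utc)
        floor = ts.replace(minute=0, second=0, microsecond=0)
        # Assumption: a sample exactly on the hour belongs to the preceding interval.
        step = floor if ts == floor else floor + timedelta(hours=1)
        totals[step] = totals.get(step, 0.0) + value
    return dict(sorted(totals.items()))
```

Missing hourly steps are left out of the returned dictionaries, mirroring the statement that such time steps are not stored and are only filled with a no-data value upon download.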
All soil moisture measurements provided to the ISMN are converted to volumetric soil moisture in cubic metres per cubic metre (hereafter m³ m⁻³). Since the majority of networks already share their measurements in volumetric soil moisture units, often no unit conversion is needed. For the other (mostly historical) networks, measurements are converted to volumetric units using metadata on soil properties (in case measurements are provided as saturation level or plant-available water) and/or the thickness of the soil layer represented by the measurement (in case measurements are provided as water height or mass; Dorigo et al., 2011b). Note that, even if all measurements are harmonised in terms of units, differences in sampling volumes related to the sensor design and installation are not accounted for.
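As a rough illustration of the unit conversions mentioned above, the following Python sketch uses textbook relations between the listed quantities and volumetric soil moisture. The exact formulas applied by the ISMN are given in Dorigo et al. (2011b) and may differ; all function and parameter names here are hypothetical, and water mass is assumed to be expressed per unit area.

```python
WATER_DENSITY = 1000.0  # kg m-3, assumed constant


def from_water_depth(water_depth_m, layer_thickness_m):
    """Water height stored in a soil layer -> volumetric soil moisture (m3 m-3)."""
    return water_depth_m / layer_thickness_m


def from_water_mass(mass_kg_per_m2, layer_thickness_m):
    """Water mass per unit area in a soil layer -> volumetric soil moisture."""
    return mass_kg_per_m2 / (WATER_DENSITY * layer_thickness_m)


def from_saturation(saturation_fraction, porosity):
    """Degree of saturation (0-1) -> volumetric soil moisture, given porosity."""
    return saturation_fraction * porosity


def from_plant_available_water(paw_depth_m, layer_thickness_m, wilting_point):
    """Plant-available water height -> volumetric soil moisture.

    Assumption: plant-available water is the water held above the wilting
    point, so the wilting point (m3 m-3) is added back.
    """
    return wilting_point + paw_depth_m / layer_thickness_m
```

For example, 0.02 m of water stored in a 0.10 m thick layer corresponds to about 0.2 m3 m-3 under these relations.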
The harmonisation of measurement depths is particularly challenging, as different networks adopt different measurement depths, similar sensors are positioned differently (horizontally vs. vertically), or their measurements represent different observation volumes, which may even differ according to the soil wetness (as for cosmic-ray probes). Thus, harmonising soil moisture measurements to one or several reference depths would require either assumptions on the measurements and soil properties or supplemental soil modelling. Additionally, since there are lots of potential uses for the data, there is no common agreement on the optimum sampling depths. For example, satellite calibration-validation generally requires observations of the 0-5 cm layer, while land surface model evaluation requires reference measurements that are representative for the layers defined in the model (Dorigo et al., 2011b). Consequently, the ISMN does not harmonise measurement depths.
After data harmonisation, the data sets are submitted to extensive quality control procedures (Sect. 3). After quality control, all data sets of soil moisture and other variables, metadata information on networks, responsible organisations, sites, sensors, and static soil attributes for each station are stored in a relational database.

Data portal
The ISMN can be accessed at https://ismn.earth/ (last access: 28 October 2021) and consists of a project website containing, e.g., information about networks, data, quality control, and partners and a data interface where users can view, query, and download the data contained in the database.
The data interface displays the location and information of networks and single stations and allows plotting of the available data to gain an impression of data availability and quality. Data can be selected for time period, area, single networks, or entire continents. Alternatively, the data download can be selected via an advanced SQL query, which allows users to make more specific selections (e.g. for a sensor brand or a certain depth range).
The selected data are directly extracted from the database, and downloads are organised per network. For each network, the download contains (i) the measurements and their quality flags, (ii) information about the file data organisation, (iii) a description of the ISMN quality flags, (iv) a metadata file compliant with ISO 19115 and INSPIRE (Infrastructure for Spatial Information in the European Community) metadata standards, and (v) information about static site characteristics (e.g. land cover, climate class, and soil characteristics).
The extracted data set files are formatted according to either the CEOP (Coordinated Energy and Water Cycle Observations Project; Williams, 2010) standard or a slightly modified version of the CEOP format which improves data usability (Dorigo et al., 2011b). These standards use the ASCII format, but a NetCDF format is foreseen for the near future.

Data provider and user involvement
The ISMN is entirely built on the voluntary, free-of-charge contributions from scientific and operational providers. This prevents the ISMN from being too prescriptive towards the data providers in terms of delivery intervals, automation, and formatting. Hence, a careful balance is needed between inclusiveness, on the one hand, and data quality standards, on the other. The ISMN mediates between users and data providers by reporting data quality issues and user feedback to the providers every 6 months. This is done by means of a report on data usage statistics for each individual network, e.g. on the number of downloads, the usage of their data in scientific publications, and flagging statistics. Together with obtaining visibility and citations, obtaining feedback on data usage and data quality is one of the primary motivations for data providers to join the ISMN.
More than 3000 active users have registered since 2009 (status in July 2021). Data download is free of charge but user registration (compliant with the latest European Union General Data Protection Regulation, EU GDPR, privacy standards) is required to prevent illegal redistribution of the data or theft of ground equipment and to track (undisclosed) statistics on data usage that are fed back to the data providers, e.g. by regular reports.
News feeds on the ISMN web page and a biannual newsletter inform the users about new networks, new data sets, data quality issues, important publications, workshops, and more. In addition, a dedicated forum and classical email exchange allow users to raise and receive response to issues. Moreover, open-source software packages are available for reading and plotting the data (see the data availability section).
3 Quality control

Quality flagging methodology
The wide variety of sensor types and installations, measurement protocols, calibration methods, preprocessing, and quality control procedures adopted by the data providers results in data sets with large differences in quality and filtering. In an attempt to harmonise the reliability of the data from different networks and sensors and to allow for the marking of spurious observations in near-real time, the ISMN has adopted automated quality control procedures which are applied to all observations feeding into the ISMN. These procedures use several approaches to detect dubious soil moisture measurements, which can be subdivided into geophysical dynamic range verification, geophysical consistency methods, and spectrum-based methods (Table 3). While the first category of methods applies simple threshold checks directly to the measurements, the geophysical consistency methods make use of ancillary data, which are either observed in situ at the same site or derived from model data (i.e. Global Land Data Assimilation System (GLDAS)-Noah). The spectrum-based flags are the result of a series of conditions applied to the soil moisture measurement time series and their first and second derivatives. The geophysical consistency and spectrum-based methods are only applied to the soil moisture observations, while geophysical dynamic range verification is applied to all dynamic variables in the database.
Recently, the following refinements of the original flagging procedures, as described and assessed in Dorigo et al. (2013), were implemented to increase the flagging accuracy:
- The outlier detection method (flag D06) now allows spikes to last two consecutive hourly time steps instead of only one. This occurs when all conditions of Eqs. (4)-(6) in Dorigo et al. (2013) are met, but the peak value remains unchanged for an additional hour. The overall impact is small (flagging numbers increase from 0.31 % to 0.34 %), but its impact on extreme values can be significant.
- Flag D07 (negative breaks or drops) was extended with an extra condition, which detects drops from values greater than 0.05 to exactly 0 m³ m⁻³ soil moisture as follows: $x_t \geq 0.05 \,\wedge\, x_{t+1} = 0$ (1). Since a spurious soil moisture drop is a precondition for a low plateau (D09; constant low values following a negative break), the latter is also affected. Adding the extra drop detection increased flagging numbers for D07 from 0.03 % to 0.05 % and for D09 from 1.1 % to 1.4 %.
- In case more than one soil temperature, air temperature, or precipitation sensor is available at a site, a flag is raised for the soil moisture measurement if the conditions of flags D01, D02, and D04, respectively, are met for at least one of these sensors. This has led to a small overall increase in flags D01, D02, and D04 (< 0.8 %).
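The additional drop condition of Eq. (1) is simple enough to sketch in a few lines of Python. This is an illustration only and not the code of the ISMN flagging package: the function name is hypothetical, missing values are assumed to be represented by None, and the flag is assumed to be attached to the zero-valued observation at t + 1.

```python
def flag_drops_to_zero(series, threshold=0.05):
    """Return the indices t+1 for which x_t >= threshold and x_{t+1} == 0.

    series: hourly soil moisture values in m3 m-3 (None marks a missing value).
    """
    flagged = []
    for t in range(len(series) - 1):
        current, following = series[t], series[t + 1]
        if current is None or following is None:
            continue  # a drop cannot be assessed across a gap
        if current >= threshold and following == 0:
            flagged.append(t + 1)
    return flagged
```

Applied to the series [0.30, 0.0, 0.0, 0.04, 0.0, 0.05, 0.0], the sketch marks positions 1 and 6; the drop from 0.04 to 0 is not marked because the preceding value is below the threshold.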
All quality control procedures adopted by the ISMN have been made available under an open-source licence (see the data availability section; https://github.com/TUW-GEO/flagit, last access: 28 October 2021).
We assessed the refined flagging procedures by applying them to 10 networks with hourly data that include stations with multiple soil temperature, air temperature, or precipitation sensors. Although the overall impact of the refined flags is very low, it is substantial for some networks (Fig. 4).
Measurements that are detected as erroneous by the quality control procedures are not deleted from the database but tagged as such (Table 3). The flag is provided as an additional attribute (ISMN quality flag) to the observation upon download and can take one of the main categories, namely C (exceeding plausible geophysical range), D (questionable/dubious), M (missing), or G (good). The D flag is raised when either a geophysical consistency or a spectrum-based check is positive. An additional number indicates the actual cause for flagging (Table 3). A soil moisture observation may receive multiple C- and D-type flags, but the good and missing flags are exclusive. Seven networks provide their own soil moisture quality flags, which are added to the ISMN database in addition to the ISMN flags common to all time series. Examples are flags for data quality (without further methodological description) or simple thresholds. For instance, Real-time In-Situ Soil Monitoring for Agriculture (RISMA) tags soil moisture observations for frozen soils when the average temperature of two adjacent soil layers is below 0 °C (Pacheco et al., 2014). By contrast, the ISMN flag D01 (soil temperature < 0 °C) only considers the corresponding depth. For RISMA, the frozen soil flags provided by the network and those computed by the ISMN (D01) agree in 87.8 % of cases.
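The combination rules for the flag categories can be summarised in a short Python sketch. The representation of multiple flags as a comma-separated string and the function names are assumptions made for illustration; only the categories (C, D, M, G) and their exclusivity are taken from the text.

```python
def combine_flags(raised, missing=False):
    """Build one flag string for an observation.

    raised: iterable of C- and D-type codes such as "C03" or "D01".
    Rules from the text: several C and D codes may coexist, whereas
    "M" (missing) and "G" (good) are exclusive.
    """
    if missing:
        return "M"
    codes = sorted(set(raised))
    for code in codes:
        if code[:1] not in ("C", "D"):
            raise ValueError(f"unexpected flag code: {code!r}")
    # Assumption: multiple codes are joined by commas.
    return ",".join(codes) if codes else "G"


def is_good(flag):
    """True only for observations that passed every check."""
    return flag == "G"
```

A user who wants to mask all spurious values would then keep only observations for which is_good returns True.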

Flagging occurrence
The most commonly raised flag is when one of the ancillary temperature observations, i.e. in situ soil temperature (D01), in situ air temperature (D02), or GLDAS soil temperature (D03), is < 0 °C (Table 3; Fig. 5). Since in situ temperature measurements are not available for all networks, and to keep consistency between networks, flags D01 and D02 are not shown in Fig. 5. The number of observations flagged as frozen soil is not an indicator of site quality in general but shows which networks are located in areas with a pronounced cold season, e.g. stations from the FMI, RISMA, MAQU, SCAN, and SNOTEL networks.
The second most common flag is C03 (soil moisture above the site-specific saturation point), which is computed from the HWSD soil properties. The site-specific saturation point is usually lower than the globally adopted, less conservative threshold of 0.6 m³ m⁻³ (flag C02), and the C03 flag is thus raised more often (Fig. 5). However, the HWSD soil properties are uncertain, and consequently, the C03 flag should be considered as being indicative rather than as an absolute quality indicator. Values exceeding the saturation point are often an indication of calibration biases or atypical site conditions. For example, the large number of C03 flags obtained for the BIEBRZA-S-1 network is because it is installed in a temporarily flooded marshland with peat porosity exceeding 80 %.
Constant values as a result of saturation plateaus (D10) or after a negative break (D09) are the most common spectrum-based flags (Fig. 6). The latter are often a sign of sensor dropouts and are mostly limited to the GROW, ARM, HYDROL-Net Perugia, and the U.S. Department of Agriculture (USDA) Agricultural Research Service (ARS) networks. For example, Xaver et al. (2020) evaluated the sensors used in the GROW network and found occasional drops in soil moisture to zero, which may be the result of corroding contacts. Spikes and breaks are, by nature, isolated events that do not occur over an extended period of time and, thus, appear less frequently in the flagging statistics. The relatively large share of spikes for the SNOTEL network (0.4 %) is due to some extremely noisy time series.

Effect of flagging on applications
For a selected ISMN site (SCAN network, Mayday station), Dorigo et al. (2013) showed that masking flagged values has a small but positive impact on the validation of GLDAS-Noah v1 modelled surface soil moisture and the remotely sensed VUA-NASA Advanced Microwave Scanning Radiometer for Earth Observing System (AMSR-E) soil moisture product. Here, we performed a more extensive assessment of the impact of excluding spurious observations automatically detected by the revised flagging methods (Sect. 3.1) by using ISMN observations available in the period 2001-2019 to validate both ERA5 top layer (0-0.07 m) water content (Hersbach et al., 2020) and ESA CCI Soil Moisture (v5.2; Gruber et al., 2017, 2019b; Dorigo et al., 2017).
While the impact of flagging is positive for temporal agreement (Pearson and Spearman correlation, R) between the ISMN and ERA5, the effect is negligible for ESA CCI (Table 4). On the other hand, the ISMN flagging reduces the unbiased root mean squared difference (ubRMSD) for both comparisons. The benefit of excluding spurious values is also obvious in Fig. F1a, where points are located below the 1:1 line. Again, the benefit is less clear for ESA CCI (Fig. F1c). Although validations of a satellite-based and a model-based product are not directly comparable, one reason for the different impact is that the ESA CCI retrievals were already flagged in the production process for values outside the valid geophysical range, inconsistencies, dense vegetation, freezing, and snow cover, while this is not the case for the ERA5 model data. Consequently, the ISMN flags have a larger positive effect on data sets that were not masked a priori than on data sets that were already filtered for spurious observations.
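To make the comparison concrete, the following Python sketch computes the Pearson correlation and the ubRMSD with and without masking flagged in situ observations. It uses the standard definitions of both metrics and is not the validation code used for Table 4; the function names are hypothetical, and the flag value "G" for good observations follows the categories described in Sect. 3.1.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = _mean(x), _mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ubrmsd(x, y):
    """Root mean squared difference after removing the mean of each series."""
    mx, my = _mean(x), _mean(y)
    return math.sqrt(_mean([((a - mx) - (b - my)) ** 2 for a, b in zip(x, y)]))


def mask_flagged(insitu, flags, product):
    """Keep only collocated pairs whose in situ observation is flagged good."""
    kept = [(a, b) for a, f, b in zip(insitu, flags, product) if f == "G"]
    return [a for a, _ in kept], [b for _, b in kept]
```

Computing both metrics once on the full collocated series and once on the output of mask_flagged gives the kind of before-and-after comparison discussed above.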

Other quality indicators
The automated quality control algorithms offer insight into the quality of the respective measurement time series but not necessarily into the usability of the data sets for specific applications. Gruber et al. (2013) adopted a triple collocation approach (TCA) to characterise the representativeness errors of ISMN data for coarse-scale (∼ 25 km) use. TCA is a statistical analysis using a combination of three data sets with independent error structures to estimate the random error variance in each of these data sets. Here, we apply the TCA to estimate the representativeness errors of the ISMN data of all networks with sufficient sampling in the period 2001-2019 for application at the coarse scale. ESA CCI SM passive soil moisture (v5.2; Gruber et al., 2019b; Dorigo et al., 2017) and top-layer ERA5 volumetric soil water content (Hersbach et al., 2020) are used to complement the triplets. Spatial collocation is carried out using a nearest-neighbour method, with a maximum distance of 30 km, while the temporal collocation uses a maximum time difference of 1:20 h between the triplets. An exception was solely made for the PBO_H2O network because its observations are provided daily at 12:00 UTC, while ESA CCI SM is given at 00:00 UTC daily. All measurements that cover, at least partly, the 0.00-0.07 m depth interval are used. Systematic differences between the data sets, i.e. multiplicative and additive biases, are removed by scaling the ERA5 and ESA CCI SM soil moisture data sets to the in situ data prior to the TCA. The results for different networks are quite diverse (Fig. 7). For example, the spread of errors is relatively large for ORACLE but small for others (e.g. African Monsoon Multidisciplinary Analysis - Coupling the Tropical Atmosphere and the Hydrological Cycle (AMMA-CATCH) or Sungkyunkwan University (SKKU)). The median errors per network vary between 0.03 and 0.05 m³ m⁻³, with some outliers in both directions.
Note that the triple collocation analysis estimates the combined representativeness and sensor errors, although the latter are assumed to be small compared to the natural spatiotemporal variability.
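The principle of triple collocation can be sketched in Python using the widely used covariance-based formulation. The text does not state which estimator was applied, so this formulation, like the function names, is an assumption. It presumes mutually independent, zero-mean errors and already collocated series of equal length; each error estimate is expressed in the units of its own data set.

```python
import math


def _cov(a, b):
    """Sample covariance of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (n - 1)


def tca_error_std(x, y, z):
    """Random error standard deviations of x, y, and z from their covariances.

    x could be the in situ series, y and z the satellite and model series.
    Negative variance estimates (possible with few samples) are clipped to 0.
    """
    cxy, cxz, cyz = _cov(x, y), _cov(x, z), _cov(y, z)
    var_x = _cov(x, x) - cxy * cxz / cyz
    var_y = _cov(y, y) - cxy * cyz / cxz
    var_z = _cov(z, z) - cxz * cyz / cxy
    return tuple(math.sqrt(max(v, 0.0)) for v in (var_x, var_y, var_z))
```

In this formulation the first returned value is expressed in the units of the first input, so passing the in situ series first yields an error estimate in in situ units.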
There is a clear trend of decreasing mean errors with increasing sensor depths (Fig. 8a), which is likely due to a reduction in high-frequency signals and sensor perturbations with depth. Note that this decreasing trend is observed despite the increasing discrepancy in depth support between the ISMN data and the two surface soil moisture data sets (ERA5 and ESA CCI SM). Thus, theoretically, in situ representativeness errors are expected to be even lower than computed. A potential explanation for the slight increase in the median error for the deepest layer (100-255 cm) may be small sample size and the poor soil parameterisation of the land surface model at this depth. Similar patterns were observed by Gruber et al. (2013). Concerning the sensors used, there is a large spread in computed representativeness errors for time domain reflectometry (TDR) and capacitance sensors. While the costly hygrometric sensors have the lowest mean error, the mean error of resistance probes is the highest (Fig. 8b). Since cosmic-ray and Global Navigation Satellite System (GNSS)/GPS reflectometry sensors integrate over larger horizontal and vertical domains, one would expect lower representativeness errors for these sensors compared to the point observations. However, this is not confirmed by our triple collocation results. Possibly, the advantage of the larger spatial support of these systems is counteracted by their lower signal-to-noise ratio. Error information at the site, sensor, or data set level is currently not routinely available for the stations in the ISMN but would be required for a proper weighting of individual stations in large-scale validations (Gruber et al., 2018(Gruber et al., , 2019b. 4 Impact of the ISMN on Earth system sciences 4.1 User uptake As mentioned earlier, over 3000 users have registered to the ISMN, while 20-30 new users register each month. Most users are based in the USA, China, India, and Europe ( Fig. 9). 
When asked for the intended use of the data, users most often name the GEO benefit areas of water, disasters, agriculture, and climate, all with a similar share between 16 % and 30 %. Most users come from research organisations (41 %), higher or secondary education (32 %), and non-profit organisations (19 %). Only a few users come from public bodies (6 %) or private companies (2 %).
The large uptake of the ISMN for soil moisture studies is particularly due to the simplicity of accessing and using multiple data sets from a wide variety of networks. Initially, the ISMN was established to facilitate the calibration and validation of SMOS-based soil moisture products (Dorigo et al., 2011b) and, still, satellite soil moisture calibration, validation, and algorithm improvement are the primary applications served by the ISMN (Table 5). In the following, we discuss the major purposes the ISMN has been used for, based on a comprehensive review of all studies that used and cited the ISMN in peer-reviewed scientific publications.

Scientific studies
Soil moisture measurements from the ISMN have been widely used as reference data sets for the development and evaluation of satellite soil moisture products, which are mostly global coarse-scale surface soil moisture products (Table E1; n = 212). Although initially the goal was to support algorithm development and validation of the SMOS satellite, which was indeed the case for 85 studies, many other satellite missions profited from the ISMN data, most notably SMAP (n = 52), Advanced Scatterometer (ASCAT; n = 50), AMSR-E (n = 46), and ESA CCI (n = 34). Fortuitously, the data have also been discovered for the evaluation of soil moisture products from less used sensors, including the Chinese Feng-Yun 3B, HY-2, and Gaofen-1 satellites (Parinussa et al., 2014a; Zhao et al., 2014; Xing et al., 2017), Meteosat Second Generation (MSG) Spinning Enhanced Visible and Infrared Imager (SEVIRI; Leng et al., 2015, 2017), MODIS (Gumbricht et al., 2017; Gumbricht, 2018), Aquarius, and Landsat (Pradhan, 2019).
Recently, the ISMN was recognised as a validation source for testing algorithms to derive soil moisture from Global Navigation Satellite Systems (GNSS; e.g. Chew and Small, 2020). Similarly, there is an increasing trend in the use of ISMN data for the validation of novel high-resolution satellite soil moisture products, which either downscale coarse-resolution products through the use of other finer-resolution satellite or ancillary data (e.g. Sheng et al., 2019; Helgert and Khodayar, 2020) or directly derive soil moisture from high-resolution synthetic aperture satellites like Sentinel-1 (e.g. Rodionova, 2019b; Foucras et al., 2020). It should be noted that the ISMN and its contributing networks are mostly designed for analysing time series, thus lacking reference data to assess spatial patterns in the data, particularly in high-resolution products (de Jeu and .
Soil moisture measurements from the ISMN have been used as a training set for various data-driven approaches. In situ observations were ingested into machine learning frameworks together with several ancillary predictor variables, either to simulate soil moisture at a very high spatial resolution or to create long-term records (O and Orth, 2021). Greifeneder et al. (2021) developed a machine learning approach to estimate surface soil moisture from Landsat optical and thermal and Sentinel-1 SAR imagery in the Google Earth Engine, using ISMN soil moisture as target variable. Large-scale monitoring networks are necessary to build reliable models for spatially wide analysis, while dense networks are ideal for accurate localised models (Senanayake et al., 2019; Abbaszadeh et al., 2019).
Usage-oriented evaluation studies have focussed on the intercomparison of multiple coarse-scale satellite products, using the ISMN data as a reference, either to select the best performing sensor for a specific application or geographic region (e.g. Beck et al., 2021; Karthikeyan et al., 2017) or to combine them in an optimal way to build a superior product (e.g. Liu et al., 2011; Hagan et al., 2020).
Since direct satellite observations of soil moisture are only possible for the surface layer, most studies concentrate on the evaluation of surface soil moisture. Yet, various studies also focus on products derived from surface soil moisture that represent moisture in deeper (root zone) layers, either through exponential filtering (e.g. Paulik et al., 2014; Tobin et al., 2017), empirical models (Pradhan, 2019), or by assimilating them in land surface models (e.g. Pablos et al., 2018; Blyverket et al., 2019a). Occasionally, the ISMN is used for the evaluation of other satellite data sets, e.g. soil temperature data to validate the freeze/thaw state (Hu et al., 2019).
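To make the exponential filtering step concrete, the sketch below shows a commonly used recursive form of the filter, which propagates a surface soil moisture series to a smoothed estimate for deeper layers. The recursive formulation, the function name, and the characteristic time length `t_char` (in days) are assumptions chosen for illustration; the cited studies may use different implementations and parameter values.

```python
import math


def exponential_filter(times, surface_sm, t_char):
    """Recursive exponential filter of a surface soil moisture series.

    times      -- observation times in days, strictly increasing
    surface_sm -- surface soil moisture at those times
    t_char     -- characteristic time length T in days (assumed value)

    Returns the filtered series, which responds more slowly to rainfall
    and drydowns than the surface signal.
    """
    if len(times) != len(surface_sm):
        raise ValueError("times and surface_sm must have the same length")
    if not times:
        return []
    filtered = [surface_sm[0]]  # initialise with the first observation
    gain = 1.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        # The gain shrinks for short gaps and approaches 1 after long gaps,
        # so stale information is forgotten.
        gain = gain / (gain + math.exp(-dt / t_char))
        filtered.append(filtered[-1] + gain * (surface_sm[i] - filtered[-1]))
    return filtered


if __name__ == "__main__":
    days = [0.0, 1.0, 2.0, 3.0, 4.0]
    surface = [0.10, 0.35, 0.30, 0.22, 0.18]
    print(exponential_filter(days, surface, t_char=5.0))
```

A larger `t_char` yields a smoother series and is typically associated with deeper layers; in an evaluation against ISMN profile data, it would be chosen per depth, for example by maximising the correlation with the in situ series.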

ISMN as part of operational services
Because of its operational nature and advanced quality control procedures, the ISMN has been identified as the primary reference data source for future operational validation systems for global satellite-based soil moisture products. But, already today, the ISMN is an integral part of several operational satellite soil moisture production chains.
- In 2005, the Satellite Application Facility on support to operational Hydrology and water management (H SAF) started to operationally produce and validate precipitation, soil moisture, and snow products from satellites operated by EUMETSAT (Rinollo et al., 2013). ISMN data are used for the calibration and validation of various soil moisture products produced in H SAF from the Metop Advanced Scatterometers, including global climate data records and near-real time products of surface soil moisture and a root zone soil moisture product over Europe.
- The Climate Change Initiative of the European Space Agency (ESA CCI) uses data from the ISMN to assess, each year, the quality of new soil moisture climate data record releases and their improvements with respect to forerunner versions (Gruber et al., 2019b; Dorigo et al., 2015). Within ESA CCI, ISMN data were also used to quantify the spatial representativeness of ESA CCI satellite and climate model data sets (Nicolai-Shaw et al., 2015b). The validation is systematically performed through the use of QA4SM (see below) and the results and validation settings are published (e.g. https://doi.org/10.5281/zenodo.4120205; Scanlon, 2020).
- C3S produces authoritative, quality-assured climate data records of soil moisture and other essential climate variables (ECVs). The satellite soil moisture products produced within C3S are routinely updated every 10 d with the latest available satellite observations. C3S uses the ISMN soil moisture data in combination with the metadata provided to categorise product performance per land cover and climate type (Scanlon et al., 2019). The validation is systematically performed through the use of QA4SM, and the results are transparently published (e.g. https://doi.org/10.5281/zenodo.4736927; Preimesberger, 2021).
- Copernicus Global Land Service (CGLS) produces, within 1-2 d after satellite overpass, soil moisture data sets from Sentinel-1 and from a combination of Sentinel-1 and ASCAT (SCATSAR; Bauer-Marschallinger et al., 2018). The latter propagates the surface observations to deeper layers by means of the so-called soil water index (SWI). Both products are provided at 1 km resolution for Europe. ISMN data are used for validating moisture in surface and deeper soil layers.
Recently, the Quality Assurance for Soil Moisture (QA4SM) service (https://qa4sm.eu, last access: 28 October 2021) was initiated to bring together validation methodologies, community protocols, reference data, and satellite observations to evaluate and intercompare soil moisture data products in a coherent, standardised, and transparent way. In situ data sourced from the ISMN are an integral part of this validation system (Scanlon et al., 2019; Gruber et al., 2020). The combination of ISMN, QA4SM, and enhanced quality control protocols and selection procedures to establish a set of fiducial reference measurements is expected to become the standard for satellite soil moisture validation in the next few years.

Model development and validation
In situ measurements are the most important reference source when assessing the performance of land surface models, reanalysis products, and hydrological models. Although satellite observations are also a valuable validation source (e.g. Szczypta et al., 2014), these measure only the upper ∼ 5 cm of the soil and, hence, do not allow for the validation of deeper layers. Besides, state-of-the-art reanalysis products like ERA5 (Hersbach et al., 2020) assimilate satellite soil moisture observations so that, for these products, in situ data remain the only truly independent validation reference. Since the sampling rate (hourly) of the ISMN observations is generally higher, and their spatial support lower, than that of most models (Reichle et al., 2004), model evaluation is, in principle, not limited by the spatial and temporal resolution of the in situ data, although representativeness issues often remain.
In particular for global assessments, the availability of harmonised data over multiple networks makes the ISMN a preferred reference source over data from individual networks. Table E2 shows that many well-established, state-of-the-art land surface model and reanalysis products have been evaluated against data from the ISMN, not only by users of these products but also by the developer teams themselves (e.g. Reichle et al., 2017). Also, a suite of new products has been assessed against the ISMN, including multi-model ensembles (Schellekens et al., 2017; Cammalleri et al., 2015), data-driven evaporation models (Lievens et al., 2017), and statistical infiltration models (Pal and Maity, 2019). As for remote sensing products, a trend towards higher-resolution regional to global model-based products can be observed. Among others, ISMN data have been used to validate new high-resolution reanalysis products over the USA (McDonough et al., 2018) and Europe (Naz et al., 2020). Apart from soil moisture, soil temperature data have also been drawn from the ISMN for model validation purposes (e.g. Wang et al., 2016; Albergel et al., 2015).
Besides the evaluation of hydrological or land surface model improvements, the ISMN has also frequently served model development in a more fundamental way. For example, Hartmann et al. (2015) developed a large-scale karstic groundwater recharge model over Europe and the Mediterranean and calibrated and evaluated this model with observations of actual evapotranspiration from FLUXNET (Baldocchi et al., 2001) and soil water content data from the ISMN.  used data from the ISMN to assess the fraction of precipitation that is stored in the soil profile. Calvet et al. (2016) used soil moisture and temperature data from the SMOSMANIA (soil moisture observing system -meteorological automatic network integrated application) network contained in the ISMN to derive pedotransfer functions for the soil quartz fraction in southern France. Pal et al. (2016) developed a statistical model to estimate the vertical soil moisture profile using SCAN data of the ISMN as source, while, on a similar note, Shin et al. (2018) developed a non-parametric evolutionary algorithm to predict soil moisture dynamics using ISMN data over Oklahoma and Illinois. Similarly, Jalilvand et al. (2018) estimated the drainage rate from surface soil moisture drydowns from ISMN data. A bridge between soil moisture and vegetation dynamics was made by Sawada (2018), who used ISMN data to validate a new ecohydrological land reanalysis to better simulate the link between sub-surface soil moisture and vegetation dynamics. Finally, Brocca et al. (2015) developed a water balance approach to estimate rainfall from soil moisture observations based on reverse modelling and evaluated this at several ISMN sites in Europe. Later refinements of this approach were also evaluated against data from the ISMN (Hoang and Lu, 2019).

Meteorological applications
Soil moisture from the ISMN has often been used to validate the land surface representations of meteorological forecasting models. However, as meteorological forecasts often rely on the latest generation of land surface models, in practice there is often no strict distinction between meteorological and land surface model development as described in the previous section. Notable examples are the various generations of the Tiled ECMWF Scheme for Surface Exchanges over Land (TESSEL) models used both in the Integrated Forecasting Systems and reanalysis products of ECMWF, the development of which greatly profited from soil moisture and temperature data from the ISMN (Albergel et al., 2012a). Dirmeyer et al. (2016) assessed the skill and soil moisture memory effects of various weather and climate models with ISMN data over the USA. Similarly, Angevine et al. (2014) assessed the soil moisture skill of the Weather Research and Forecasting Model (WRF; Table E2).
Several studies have used the ISMN data to assess the forecast skill or new implementations of numerical weather prediction models. For example, de Rosnay et al. (2019) and Rodriguez-Fernandez et al. (2019) used in situ soil moisture observations from several ISMN networks to validate the impact of assimilating SMOS brightness temperatures and soil moisture, respectively, to predict soil moisture up to 5 d ahead. Similarly, Pu (2019, 2020) used ISMN data to examine the impact of SMAP soil moisture assimilation in WRF for near-surface, short-range weather forecasts. Boussetta et al. (2015) used over 500 ISMN sites to assess the impact of assimilating surface albedo and vegetation states from satellite observations on numerical weather prediction.
From a more methodological, land-atmosphere perspective, Lee (2018) used the ISMN data to study the role of soil moisture in triggering rainfall over West Africa. Conversely, S.  studied the role of rainfall on soil cooling using data from the SMOSMANIA network in southern France.

Drought monitoring
In a drought monitoring context, ISMN data have frequently been used in a convergence of evidence approach in combination with other drought-related variables or indicators. For example, Scaini et al. (2015) compared the variability in the in situ soil moisture measurements from the ISMN, SMOS surface soil moisture, and two drought indices based on climatic information to study droughts in Spain. Mu et al. (2019) used visible and near-infrared (VNIR) satellite data and soil moisture distributed by the ISMN to monitor drought in the southern USA. On a more technical level, Gruber et al. (2018) assessed the use of spatially sparse ISMN in combination with a continuous model for operational agricultural drought monitoring over the USA.
ISMN data have also been used to classify new and more traditional drought indices, such as the Standardised Precipitation (Evaporation) Index and the Palmer drought severity index (Vicente-Serrano et al., 2012; Krueger et al., 2019). Sadri et al. (2020) used the ISMN data from the RISMA network to assess a global near-real time soil moisture index monitor for food security using SMOS and SMAP data. Likewise, Fang et al. (2021a) used ISMN data to evaluate a new drought monitoring approach based on high spatial resolution soil moisture data from downscaled SMOS and SMAP data over Australia. Chen et al. (2019) used precipitation distributed through the ISMN to evaluate a drought index derived from Sentinel-2 in Spain. Due to relatively good coverage over Europe, Cammalleri et al. (2015) used ISMN data to assess various model soil moisture products as input to the European Drought Observatory (EDO; http://edo.jrc.ec.europa.eu, last access: 28 October 2021). Also, Mishra et al. (2018) used data from the ISMN to assess a suite of land surface models used to reconstruct drought events in India since the mid-1900s.

Other applications
ISMN data have been used for various other purposes, going beyond what the ISMN was originally developed for. In addition to supporting satellite and land surface model soil moisture product development, the ISMN has played a fundamental role in the validation of a wide range of hybrid observation-based products, including a Soil Moisture Saturation Index (Campo et al., 2011), apparent thermal inertia surface estimates (Notarnicola et al., 2012), an improved antecedent precipitation index (API) formulation (Ramsauer et al., 2021), data-driven surface and root zone soil moisture predictions (Kornelsen and Coulibaly, 2014; Manfreda et al., 2014; L. Wang et al., 2020; O and Orth, 2021), improved satellite albedo products, and estimates of effective permittivity and brightness temperature of organic soils.
The ISMN has frequently been used to assess the impact of assimilating satellite observations into hydrological models (Khaki et al., 2019; Nair et al., 2020; Gruber et al., 2015; Shin et al., 2016), land surface models (Nair and Indu, 2016; Zhao and Yang, 2018; Nair et al., 2020), and carbon models (Scholze et al., 2016). On a more methodological level, Q.  assessed a new data assimilation scheme against ISMN observations. Gruber et al. (2018) even assimilated spatially sparse ISMN observations directly into a spatially continuous land surface model over the continental USA and tested the preconditions (e.g. requirements for the signal-to-noise ratio and number of sites) for having a significant positive impact.
The ISMN has been used to develop and test new statistical validation approaches (Gruber et al., 2016; Afshar et al., 2019). Finally, the standards and quality control procedures adopted by the ISMN have served as a guideline for establishing and benchmarking new networks (Skierucha et al., 2012b, a; Petropoulos and McCalmont, 2017), soil moisture metadatabases (Liao et al., 2019; Xia et al., 2015), or validation services and to assess or improve alternative sensing techniques and constellations (Kapilaratne and Lu, 2017; Nguyen et al., 2017; Mahecha et al., 2017).

Added scientific value of ISMN over other data sources
Although the ISMN has facilitated hundreds of scientific studies, it is impossible to quantify precisely to what degree the ISMN has contributed to product improvements and new insights that could not have been obtained otherwise. Admittedly, data from several networks are also distributed through other portals, typically providing access to a single network (e.g. SNOTEL data through https://www.wcc.nrcs.usda.gov/snow/, last access: 28 October 2021). The major contribution of the ISMN to scientific advances is primarily given by the access to the many networks and data sets that are uniquely distributed through the ISMN and by the easy access to thousands of harmonised data sets with a single click. The tracked download statistics reveal that approximately one-third (34 %) of the studies making use of the ISMN use multiple networks, the choice of which depends on the scope of the study, the geographical region, period of interest, and the year the study was performed (with more networks becoming available over time). Examples of such studies using data from more than 20 different networks are Paulik et al. Although it is impossible to pin down the exact contribution of the ISMN to process understanding and product improvements, it is very likely that, mainly, studies have discovered flaws in satellite products that would not have been detected without the use of the ISMN. The main reason is that the large number of networks gives insight into the product skill in a large variety of climate zones, land cover types, and so on, which no single network could have provided. This allows for the development of, globally, more robust models and products. Second, the data harmonisation and unified quality control minimise the chance that differences in skill potentially observed between locations with different environmental characteristics are an artefact of a different treatment of the data.
Several studies were identified that not only used the multitude of ISMN data to expose flaws in existing products but also to directly improve data products and models, e.g. by calibrating hydrological model parameters for local site conditions (Beck et al., 2021; Kang et al., 2019; Moradizadeh and Srivastava, 2021; Grillakis et al., 2021), by merging them with other observations and models (Gruber et al., 2018; Xu et al., 2018), or using them as training data in machine learning approaches (Eroglu et al., 2019; O and Orth, 2021; Greifeneder et al., 2021).
The in situ measurements of soil moisture data in the ISMN have been collected by a large variety of sampling techniques. The early networks contained in the ISMN (e.g. RUSWET, CHINA, and MONGOLIA) are based on gravimetric sampling, which is still considered the most accurate approach (Romano, 2014). However, it is labour intensive and invasive and, thus, measurements are infrequent (weekly or even coarser sampling), taken from a slightly different location every time, and may contain systematic errors between sampling dates.
Nowadays, the most commonly used techniques for systematic in situ sampling are based on the contrasting dielectric properties of soil and water and comprise time domain reflectometry (TDR) and frequency domain reflectometry (FDR; Robinson et al., 2008), with FDR sensors being more widespread because of their lower cost compared to TDR sensors, despite their lower accuracy (Romano, 2014; Brocca et al., 2017). However, great technical improvements are being achieved with capacitance sensors, thus improving their reliability. Yet, TDR and FDR techniques only provide point measurements, i.e. they are representative of small volumes of soil.
Slightly larger soil volumes (diameter of 15 to 60 cm) are observed with the neutron scattering method, where the density of thermal neutrons produced by scattering of fast neutrons on soil hydrogen can be related to soil moisture through a calibration curve (Gardner and Kirkham, 1952; Romano, 2014; e.g. IOWA network). The cosmic-ray method has been developed based on similar principles of neutron scattering (Zreda et al., 2008, 2012). Cosmic-ray neutron sensors (CRNSs) are located on or above the soil surface and count the cosmogenic neutrons in the air above the land surface that are in equilibrium with the soil. CRNS measurements are representative of significantly larger volumes compared to neutron probes (i.e. radius of a few 100 m and depths between 0.12 and 0.70 m, depending on the soil wetness), and thus can be used to bridge the scales between point observations and coarse resolutions of satellite observations and models. CRNS measurements, which are used in the COSMOS network, are sensitive to vegetation and have relatively high noise levels. To reduce this noise, the data are resampled to daily mean values.
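The resampling of noisy sub-daily readings to daily means can be sketched as follows. This is a minimal illustration, not the procedure applied by the COSMOS network or the ISMN: the function name and the `min_count` completeness threshold are hypothetical, and missing readings are assumed to be encoded as `None`.

```python
from collections import defaultdict
from datetime import datetime


def daily_means(timestamps, values, min_count=1):
    """Average sub-daily readings into one value per calendar day.

    timestamps -- datetime objects, one per reading
    values     -- readings; None marks a missing or flagged value
    min_count  -- hypothetical minimum number of valid readings a day
                  needs before a mean is reported
    """
    buckets = defaultdict(list)
    for stamp, value in zip(timestamps, values):
        if value is None:
            continue  # skip missing readings instead of averaging them
        buckets[stamp.date()].append(value)
    return {
        day: sum(vals) / len(vals)
        for day, vals in sorted(buckets.items())
        if len(vals) >= min_count
    }


if __name__ == "__main__":
    stamps = [datetime(2020, 1, 1, h) for h in (0, 6, 12, 18)]
    stamps.append(datetime(2020, 1, 2, 0))
    readings = [0.21, 0.25, None, 0.23, 0.30]
    print(daily_means(stamps, readings, min_count=2))
```

Averaging reduces random noise roughly with the square root of the number of readings per day, at the cost of losing sub-daily variability.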
Also, the measurements of the PBO_H2O network, which uses global positioning system (GPS) receivers, can be used to bridge the scales between point and satellite sensors. GPS sensors, initially used for geophysical and geodetic applications, have been found to be well suited for measuring soil moisture (Larson et al., 2008). The GPS signals are representative of an area of approximately 300 m² and are L band, hence making them ideal for comparison against satellite missions like SMAP and SMOS (Larson et al., 2008).
As shown, all measurement techniques have their strengths and limitations (Bogena et al., 2015; Dorigo et al., 2011b), which complicates their combined, bulk use. Depending on the application and process scales, the user needs to carefully consider which networks to use or exclude and how to interpret the results obtained. A task of the ISMN could be to translate community-based guidelines (Gruber et al., 2020) into recommendations for the use of individual data sets.

Spatial and temporal representativeness and scaling
Soil moisture is highly variable in space as a result of complex interactions between soil characteristics, topography, vegetation, and meteorological conditions. Depending on the spatial scale considered, the dominant controlling factor(s) can be different (Crow et al., 2012; Western et al., 2002). Hence, validation of satellite-derived products is hindered by the spatial mismatch between ground observations and satellite footprints (Gruber et al., 2020; Molero et al., 2018; Gruber et al., 2013). Ideally, to reduce the representativeness error of in situ references, enough stations should be deployed within a satellite footprint to develop robust areal soil moisture estimates (Brocca et al., 2007; Famiglietti et al., 2008; Colliander et al., 2017). However, this is a costly solution, and therefore, only a limited number of sites provide such a set-up (Crow et al., 2012; Colliander et al., 2017).
In the ISMN, the networks of VDS, BIEBRZA_S-1, RSMN, UDC_SMOS, LAB-net, FMI, and USDA-ARS have been set up in this way. When available, in situ stations within the same satellite pixel should be averaged, either through arithmetic mean or weighted average. Higher weights should be given to stations expected to be more representative of the satellite grid average, e.g. by using Voronoi diagrams, the inverse footprint method (Nicolai-Shaw et al., 2015a), the time stability concept (Vachaud et al., 1985), or landscape properties such as land cover and/or soil texture (Bircher et al., 2012). Alternatively, triple collocation analysis can be used to quantify and correct for spatial sampling errors of in situ stations (Miralles et al., 2010; Gruber et al., 2013). The resulting pixel-scale soil moisture ground reference, i.e. the averaged value from dense networks with several stations per satellite grid or the original time series in the case of sparse networks with a single station per pixel, should undergo a statistical rescaling (Gruber et al., 2020). Indeed, a direct comparison of in situ and satellite products would be subject to representativeness errors, which may dominate the total soil moisture retrieval errors (Gruber et al., 2013; Molero et al., 2018).
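The averaging of several stations within one satellite pixel can be sketched as below. The example is a minimal illustration under stated assumptions: the station series are already aligned on common time steps, missing values are encoded as `None`, and the weights are supplied by the user (for instance from Voronoi areas or a time stability analysis). The function name and the example weights are hypothetical.

```python
def pixel_average(station_series, weights=None):
    """Weighted average of collocated station series within one pixel.

    station_series -- list of equally long lists, one per station,
                      with None marking missing values
    weights        -- one non-negative weight per station; equal
                      weights (arithmetic mean) if omitted
    """
    if weights is None:
        weights = [1.0] * len(station_series)
    if len(weights) != len(station_series):
        raise ValueError("one weight per station is required")
    n_times = len(station_series[0])
    averaged = []
    for t in range(n_times):
        weighted_sum = 0.0
        weight_total = 0.0
        for series, weight in zip(station_series, weights):
            value = series[t]
            if value is not None:
                weighted_sum += weight * value
                weight_total += weight
        # Renormalise by the weights of the stations actually reporting.
        averaged.append(weighted_sum / weight_total if weight_total > 0 else None)
    return averaged


if __name__ == "__main__":
    stations = [
        [0.20, 0.22, None],
        [0.40, 0.38, 0.36],
        [0.30, None, 0.28],
    ]
    print(pixel_average(stations))                   # arithmetic mean
    print(pixel_average(stations, [0.5, 0.2, 0.3]))  # hypothetical weights
```

Renormalising at each time step keeps the average defined when a station drops out, but it also changes the effective station mix over time, which can introduce artificial jumps in the pixel-scale series.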
Rescaling accounts for systematic representativeness errors arising from different spatial resolution and different vertical measurement support, i.e. penetration depths of microwave sensors and in situ sensor placement depths (Gruber et al., 2020), but does not correct for random representativeness errors. One way to address this is the triple collocation described above. Also, systematic representation errors may have a time-varying (e.g. seasonal) component, which, unless explicitly accounted for, may lead to temporally aliased results. In any case, even though differences in spatial representativeness between ground and satellite measurements impact the evaluation metrics, single stations are still a valuable source for assessing the relative skill of soil moisture products with a similar footprint (Dong et al., 2020). Also, temporal representativeness issues may exist, but due to the hourly sampling of most data sets, ISMN data usually have a denser sampling than most remote sensing or model data sets. Thus, for most applications, the ISMN data can be downsampled to the process or observation timescale of interest. However, some of the older, manually sampled data sets have sampling intervals of about 2 weeks and, thus, may miss many higher-frequency wet or dry spells. On a similar note, data sets with a daily sampling or averaging (e.g. cosmic-ray or GPS reflectometry observations) may miss rainfall peaks and are unsuitable for studying sub-daily variability.
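One simple form of statistical rescaling is to match the mean and standard deviation of one series to those of a reference. The sketch below illustrates this linear variant only; the text does not prescribe a specific method, and alternatives such as CDF matching exist. The function name is hypothetical, and both series are assumed to be collocated in time and free of missing values.

```python
import statistics


def rescale_mean_std(source, reference):
    """Linearly rescale `source` to the mean and spread of `reference`.

    Removes additive and multiplicative systematic differences between
    the two series; random errors are left untouched.
    """
    mean_src = statistics.mean(source)
    std_src = statistics.pstdev(source)
    mean_ref = statistics.mean(reference)
    std_ref = statistics.pstdev(reference)
    if std_src == 0:
        raise ValueError("source series has no variability to rescale")
    return [(value - mean_src) / std_src * std_ref + mean_ref for value in source]


if __name__ == "__main__":
    in_situ = [0.12, 0.18, 0.25, 0.21, 0.15]
    satellite = [0.30, 0.33, 0.41, 0.37, 0.31]
    print(rescale_mean_std(in_situ, satellite))
```

Because the transformation is linear, it leaves correlation-based metrics unchanged, while bias and differences in variance between the two series are removed by construction.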

Integration of low-cost sensors
The development and use of low-cost sensing technologies, especially in the environmental sciences, have seen a pronounced increase during the last decade. Such a rise is driven by several factors, e.g. the reduced cost of micro-controllers, electric components, and sensors (Mao et al., 2019). Even though a rich variety of low-cost soil moisture sensors based on different measurement principles has been developed (Chawla et al., 2019; González-Teruel et al., 2019; Kumar et al., 2016), capacitance sensors gained the most popularity because they are relatively inexpensive, easy to operate, and provide reliable observations (Kojima et al., 2016).
The considerably lower cost of these sensors compared to traditional probes makes them suitable for high-density and/or large-scale monitoring of soil moisture. The possibility to map soil moisture (and other environmental variables) with an unprecedented spatial coverage can generate new insights into its dynamics and create new opportunities. For instance, high-density networks of low-cost sensors can be used to reduce spatial representativeness errors by providing numerous observations within a satellite footprint (Teuling et al., 2006). Similarly, one can deploy a temporary low-cost sensor network to identify the most suitable location(s) for long-term monitoring. Such locations could then be equipped with professional sensors, while moving the low-cost network to other sites. Another exciting opportunity offered by low-cost sensors is the deployment of networks in low-income countries.
The ISMN has integrated low-cost sensor measurements from the GROW observatory (see Sect. 5.1.4), but a number of practical challenges arise when integrating such data.
A key aspect to consider is the lifetime. Depending on the robustness of the sensor (of both electronic components, such as micro-controllers, and sensor housing), reliable measurements can be recorded for a period ranging from a few months to years (Xaver et al., 2020), but this period may be even shorter in extreme environments. Another aspect, particularly affecting automation, is data storage and transmission. Some low-cost sensors allow for wireless communication with a main server (Bogena et al., 2007; Majone et al., 2013), while other sensors have a limited internal storage, so that data must be collected regularly (e.g. every 80 d; Xaver et al., 2020). Furthermore, it is necessary to assess their accuracy and robustness (Castell et al., 2017). Therefore, low-cost sensors should always undergo thorough evaluation to (i) quantify the agreement of low-cost sensor measurements with gravimetric samples and/or professional probes, ideally considering a wide range of soil and climatic conditions, (ii) assess the inter-sensor variability, and (iii) test the suitability for usage in field conditions (Domínguez-Niño et al., 2019; Kizito et al., 2008; Adla et al., 2020).

Integration of citizen observations
Citizen science is defined as the involvement of non-experts in collecting data (Bonney et al., 2009). Crowd-sourced measurements of soil moisture are now possible because of the development of low-cost sensors (Sect. 5.1.3). Crowdsourcing has the potential to overcome some of the most challenging issues of soil moisture monitoring, such as the use of many sensors to address scaling issues, and the fact that the observations can be carried out anywhere, as long as there are citizens willing to collect the data.
An outstanding example of a citizen science project focusing on soil moisture is the GROW Observatory (https://growobservatory.org/, last access: 28 October 2021), from which data sets spanning at least a full year of observations have been added to the ISMN. Within GROW, thousands of low-cost sensors have been distributed to farmers, gardeners, and growers across Europe (Kovács et al., 2019). Focus areas have been identified based on a number of scientific criteria and the presence of active and engaged communities. Within each focus area, covering spatial scales from 20 to 200 km, hundreds of sensors have been distributed. Some sensors served as back-ups for potential failures, so that malfunctioning sensors could be promptly replaced, enabling long-term continuity of observations. In order to ensure high standards of the measurements, citizens have been trained in the selection of the correct locations in which to install sensors and how to properly install and maintain them through field manuals, online courses, meet-ups, and remote support (Kovács et al., 2019). Overall, more than 6000 sensors were deployed to provide soil moisture measurements across 13 European countries, demonstrating that citizen observatories can be integrated in Earth observation activities and contribute to validation of remotely sensed products.
The integration of citizen observations in the ISMN is challenging for multiple reasons. The long-term engagement of citizens is crucial and needs to be thoroughly addressed from the early stages of designing a citizen observatory. It is necessary to create long-lasting communities that go beyond the duration of the contributory projects (Grainger, 2017). Successful citizen observatories have been those where citizens were able to benefit directly from the data collected by others (Fritz et al., 2017). Additionally, education and resources are essential to boost individual motivation to continuously participate and to sustain membership renewal as natural participation cycles change. Even if proper training is provided to citizens, uncertainty in the quality of the data is the main limitation for the use of crowd-sourced observations in science (Lukyanenko et al., 2020). The reliability of measurements from individual sensors is unknown because of, e.g., the selection of unsuitable locations, incorrect sensor installation, and the existence of defective sensors. However, increasing the number of sensors within the same satellite pixel reduces this uncertainty. Additionally, a visual inspection of the time series and the application of automated quality flagging controls, such as those developed within the ISMN, could be used to mask suspicious observations. Overall, it has been shown that well-organised citizen science projects can provide trustworthy contributions to the scientific community (Kosmala et al., 2016; Palmer et al., 2017).
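The two mitigation strategies mentioned above, automated flagging and combining several sensors within one satellite pixel, can be sketched as follows. This is an illustrative example only and does not reproduce the ISMN quality control procedures: the function names, the plausible range of 0 to 0.6 m³ m⁻³, and the spike threshold of 0.15 m³ m⁻³ are assumptions chosen for the sketch.

```python
def flag_series(values, lower=0.0, upper=0.6, max_jump=0.15):
    """Return one flag per reading: 'ok', 'range', or 'spike'.

    The thresholds are illustrative assumptions, not ISMN settings.
    """
    flags = []
    for i, v in enumerate(values):
        # Missing or physically implausible readings fail the range check.
        if v is None or not (lower <= v <= upper):
            flags.append("range")
            continue
        prev = values[i - 1] if i > 0 else None
        nxt = values[i + 1] if i + 1 < len(values) else None
        # A spike jumps away from both neighbours in the same direction.
        is_spike = (
            prev is not None
            and nxt is not None
            and abs(v - prev) > max_jump
            and abs(v - nxt) > max_jump
            and (v - prev) * (v - nxt) > 0
        )
        flags.append("spike" if is_spike else "ok")
    return flags


def pixel_mean(series_by_sensor):
    """Average the unflagged readings of all sensors at each time step."""
    flagged = [flag_series(series) for series in series_by_sensor]
    means = []
    for t in range(len(series_by_sensor[0])):
        good = [s[t] for s, f in zip(series_by_sensor, flagged) if f[t] == "ok"]
        means.append(sum(good) / len(good) if good else None)
    return means
```

If the errors of individual sensors were independent, the uncertainty of such a pixel mean would shrink roughly with the square root of the number of unflagged sensors; correlated installation errors would weaken this effect.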

Automation
Part of the contributions to the ISMN (7 networks with around 900 stations) are inserted into the ISMN in NRT. This process is fully automated, including data download from data providers, harmonisation, flagging, insertion into the database, and updating of metadata tables. However, because of different data recording and handling mechanisms at the provider side and differences in data sharing mechanisms and policies, the ISMN is also partly operated manually and will probably have to continue in this way in the future. The differences between the fully automated and the manual approach are confined to the data download and data ingestion into the processing chain. Data harmonisation and quality control are automated for all data sets (see Fig. B1).
A major challenge in the automation process is the enormous heterogeneity of input data formats. Moreover, these change over time for individual networks, stations, and even sensors, as sensors may fail or the method for data logging is changed. Thus, error detection and handling is of utmost importance, and frequent adaptation of the system is required to cover changes in input data.
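The combination of provider-specific parsing and explicit error handling described above might look as follows. This is a minimal sketch under stated assumptions: the two provider formats, their field names (`time`, `sm_percent`, `epoch`, `vwc`), and the output record layout are hypothetical and do not describe the actual ISMN processing chain.

```python
from datetime import datetime, timezone


def parse_provider_a(row):
    """Hypothetical provider A: ISO 8601 time with UTC offset, moisture in percent."""
    ts = datetime.fromisoformat(row["time"]).astimezone(timezone.utc)
    return ts, float(row["sm_percent"]) / 100.0


def parse_provider_b(row):
    """Hypothetical provider B: Unix epoch seconds, moisture already in m3 m-3."""
    ts = datetime.fromtimestamp(int(row["epoch"]), tz=timezone.utc)
    return ts, float(row["vwc"])


# One parser per input format; a format change only requires a new entry here.
PARSERS = {"A": parse_provider_a, "B": parse_provider_b}


def harmonise(rows, provider):
    """Convert raw rows to UTC time and m3 m-3, collecting rows that fail."""
    parser = PARSERS[provider]
    records, errors = [], []
    for index, row in enumerate(rows):
        try:
            ts, sm = parser(row)
            records.append({"time_utc": ts, "soil_moisture": sm})
        except (KeyError, ValueError, TypeError) as exc:
            # Keep going, but remember which row failed and why.
            errors.append((index, repr(exc)))
    return records, errors
```

Returning the failed rows alongside the harmonised records, rather than aborting on the first problem, keeps a single malformed line from blocking an update while still making format changes visible to the operators.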
A potential way to promote the automated insertion of new data is to allow only data that comply with a strict, yet to be defined, data standard. At the same time, this may bear the risk of raising the barrier to contributing to the ISMN too high for some scientists.

Including new networks
Since data sharing with the ISMN is entirely built upon a voluntary basis and data usage is open and free, new networks may be reluctant to join. The ISMN is in contact with several network providers who are happy to collaborate but are restrained by data sharing policies, which do not allow data sharing at all or only after a certain time. Furthermore, governmentally operated networks are often not allowed to share data as open access, or it is unclear who can make these decisions.
Solving such issues can be supported by collaborative data-hosting facilities, like the Global Terrestrial Network - Hydrology (GTN-H; https://www.gtn-h.info/, last access: 28 October 2021), which have a strong connection to governmental bodies like the International Centre for Water Resources and Global Change under the auspices of UNESCO, the United Nations Environment Programme, the International Science Council, and the World Meteorological Organization (WMO).

Appropriate recognition of data providers
Not all users correctly cite the ISMN and the involved networks, as stated in the terms and conditions for ISMN data use (ISMN, 2020). Proper citation of single networks is important for giving data providers the recognition that is required to convince funding agencies to continue supporting the maintenance and operation of these networks. In the end, this not only affects the networks themselves but also the open-access availability of data through the ISMN as a whole. Some network providers have raised their concerns in this matter and, hence, continuous efforts are needed to maintain a strong visibility of network data providers towards users.

Transparency and traceability
The ISMN data collection is constantly evolving. New data records are added, and existing ones are extended or, if necessary, reprocessed and corrected. Not only the data but also the underlying code changes. These updates and any retroactive changes made to the data archive are tracked within the ISMN but not yet readily passed on to the users. The updates can lead to differences in analyses on the user side, e.g. when considering obvious changes such as temporal extensions or new stations, and lead to non-reproducible results. Therefore, any update must be traceable and clearly communicated, requiring a system to track and store these changes. Version tracking and digital object identifiers (DOIs) are ways of identifying each database access and, therefore, allow tracing back to past states of the ISMN. Such a mechanism is required to make the ISMN compliant with the FAIR (Findability, Accessibility, Interoperability, and Reusability) data principles (Wilkinson et al., 2016).
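One simple building block for such traceability is a content fingerprint that changes whenever the retrieved data change. The sketch below is an assumption-laden illustration, not the mechanism used or planned by the ISMN: the function name and the record fields (`station`, `time`, `soil_moisture`) are hypothetical, and a fingerprint of this kind would complement, not replace, formal version numbers and DOIs.

```python
import hashlib
import json


def snapshot_fingerprint(records):
    """Return a SHA-256 digest that identifies the content of a data download.

    Records are sorted and serialised canonically, so the digest depends on
    the data only, not on the order in which the records were retrieved.
    """
    ordered = sorted(records, key=lambda r: (r["station"], r["time"]))
    canonical = json.dumps(ordered, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

A user could store the digest together with the access date in a publication's data statement; a later download yielding a different digest would then indicate that the archive had been extended or reprocessed in the meantime.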

Conclusions
In this study, we reviewed the first decade of operations of the ISMN. Besides satisfactorily fulfilling its initial target, i.e. supporting satellite soil moisture product validation and calibration, many additional more or less foreseen uses have emerged. In addition, an increasing number of services and product development chains have routinely included the use of ISMN data in their operational structure. The ISMN started as a research activity funded by ESA, and ever since, ESA has provided continuous financial support for ongoing research, development, and operations. In spring 2021, a milestone was achieved when the German Ministry of Transport and Digital Infrastructure announced that it will commit to permanently fund the ISMN operations and development from late 2021. The execution will be with the German Federal Institute of Hydrology (BfG) and the International Centre for Water Resources and Global Change (ICWRGC) based in Koblenz, Germany. At the same time, all network data sets have always been freely contributed by dedicated researchers. To guarantee the availability of these resources for climate and environmental monitoring also for the next decade, we plead with governments and international bodies for systematic funding of its participating data-providing networks too.
To maximise geographic coverage and data usage, the policy of the ISMN has always been to integrate data sets without strict requirements on sensors, sampling protocol, or data quality. The resulting strongly heterogeneous data set characteristics call for far-reaching quantification and traceability of errors, from sensor calibration and data download to the point measurement and the spatiotemporal support of the application. Thus, besides expanding its coverage to data-poor regions and landscapes, the ISMN shall spend the next few years focusing on developing methodologies to fully characterise the uncertainties and usability associated with the individual data sets. An important step in this direction will be made within the ESA-funded Fiducial Reference Measurements for Soil Moisture initiative. With the foreseen developments, the ISMN will reach a new level of maturity in the coming decade.

Appendix A: Network overview
A summary of each network is given in Table A1, while more details are given in the subsequent paragraphs.

A1 AACES
AACES stands for the Australian Airborne Cal/val Experiments for SMOS. This campaign network covers a 500 × 100 km² study area located in southeastern Australia, covering a variety of topography, vegetation, and climate classes (Rüdiger et al., 2007; Peischl et al., 2012). Measurements of soil moisture, soil temperature, and precipitation were taken between May 2009 and September 2010. The AACES calibration and validation campaigns were a temporary project; therefore, no further data sets will be available.

A2 AMMA-CATCH
The AMMA-CATCH observatory gathered data from densely instrumented mesoscale sites in West Africa (Benin, Niger, and Mali). The network is devoted to long-term regional monitoring of global change impacts on the critical zone. Eight stations in Benin and four stations in Niger are included in the ISMN from 2006 to the present, including surface soil moisture and root zone soil moisture down to 1 m depth. For more information, see Galle et al. (2018).

A3 ARM
The Atmospheric Radiation Measurement (ARM) programme has three soil sensor networks across north-central Oklahoma and southern Kansas in the USA, including the Soil Water And Temperature System (SWATS), through 2016, and, presently, the Soil Temperature and Moisture Profile (STAMP) and the Surface Energy Balance System (SEBS). The SWATS and STAMP have profiles at 5-8 depths up to 175 cm, while the SEBS measures at 2.5 cm. All sites are co-located with a suite of meteorological and radiative measurements available from https://arm.gov/ (last access: 28 October 2021; Cook, 2016a; Cook and Sullivan, 2018).

A4 AWDN
The Automated Weather Data Network (AWDN) is located in Nebraska, USA, and consists of 50 stations. The data sets were collected by the High Plains Regional Climate Center, and data availability is from 1998 to 2010 but varies per station.

A5 BIEBRZA_S-1
The dense soil moisture sites suited for the validation of high-resolution Sentinel-1 soil moisture products were established in the Biebrza Wetlands in northeastern Poland in May 2015 (Musial et al., 2016). One site is located across drained grassland and the second one across natural temporarily flooded marshland. They are located within 7 km distance; therefore, weather conditions are similar but soil moisture regimes differ. Both sites are equipped with nine soil moisture stations with five soil probes each and a weather station. The sites are homogeneous regarding vegetation cover. The organic peat soils feature porosity values up to 82 %. The sites are maintained by the Remote Sensing Centre of the Institute of Geodesy and Cartography (IGiK).

A6 BNZ LTER
The Bonanza Creek Long-Term Ecological Research (BNZ LTER) network consists of 12 stations located in the boreal forest near Fairbanks, Alaska (Van Cleve et al., 2015). Soil moisture measurements start around the year 2000 for each station. In addition to soil moisture at several depths, observations are available of soil temperature, air temperature, precipitation, snow depth, and snow water equivalent.

A7 CALABRIA
The CALABRIA network operates five TDR stations measuring volumetric soil moisture at 30, 60, and 90 cm depths. The stations were installed in 2000 by the Centro Funzionale Decentrato of the Calabria region for hydrometeorological monitoring for flood and landslide risk mitigation. For more information on the performance of the network, see Brocca et al. (2011).

A8 CAMPANIA
The CAMPANIA network consisted of two stations located near the city of Naples in southern Italy. It was managed by the Centro Funzionale per la Previsione Meteorologica e il Monitoraggio Meteo-Pluvio-Idrometrico e delle Frane. The ISMN contains data from the operational start in 2000 until the end of 2008. The data sets include soil moisture measured at a depth of 0.30 m, precipitation, and air temperature. For more information on the performance of the network, see Brocca et al. (2011).

A9 CARBOAFRICA (recently renamed SD_DEM)
CARBOAFRICA/SD_DEM is located outside El Obeid in Kordofan, Sudan, and has been active since  (Liu et al., 2001). The data set was transferred to the ISMN from the Global Soil Moisture Data Bank (Robock et al., 2000) and contains measurements made on the 8th, 18th, and 28th of each month.

A11 COSMOS
The Cosmic-ray Soil Moisture Observing System (COSMOS; Zreda et al., 2012) started in 2009 with a grant from the U.S. National Science Foundation as a 4-year project for demonstration of the then-new technology of sensing soil moisture with cosmogenic neutrons (Zreda et al., 2008). On the completion of the project, the network had 60 sites, most of them in the USA and a few in South America, Europe, and Africa. The network produces hourly soil moisture data, available in real time to all without restrictions. After the project funding ended in 2013, the network operations continued with the support of Quaesta Instruments, a private company. The current status is active, but the sensors are being relocated and repurposed.

A12 CTP_SMTMN
CTP_SMTMN is a multiscale Soil Moisture and Temperature Monitoring Network on the central Tibetan Plateau (Yang et al., 2013). The network, with an average elevation of 4650 m a.s.l., consists of 56 stations that measure soil moisture and soil freeze/thaw status at three spatial scales (100, 25, and 9 km). The terrain is relatively flat and covered by sparse and short grasses; the annual precipitation is about 400-500 mm. The network has been in operation since 2010.

A13 DAHRA
The DAHRA field site is located in a typical low tree and shrub savanna environment in Senegal. To limit the uncertainty in the comparison of remote sensing products and models, the site was selected to be flat, with homogeneous vegetation cover within a radius of at least 3 km. The site is equipped with two towers, i.e. a meteorological tower with meteorological, hydrological, and radiation sensors and an eddy covariance flux tower. More information can be found in Tagesson et al. (2014).

A14 FLUXNET-AMERIFLUX
AMERIFLUX is the North American contribution to the global FLUXNET. At this moment, two sites close to Sacramento, California, i.e. Tonzi and Vaira ranches, are distributed through the ISMN. Both stations provide soil moisture measurements at eight different depths down to 0.60 m. Additionally, soil temperature, air temperature, and precipitation are provided.

A15 FMI
This distributed network of in situ measurement stations gathering information on soil moisture and soil temperature has been set up in recent years at the Finnish Meteorological Institute's (FMI) Sodankylä Arctic research station in northern Finland. Between 2010 and 2017, 16 stations were installed around Sodankylä and 3 further north at Saariselkä. Each station covers a vertical measuring profile and two additional horizontal measuring points. The vertical profiles have five sensors placed close to the station at the following depths: 5, 10, 20, 40, and 80 cm in mineral and semi-organic soils and 5, 10, 20, 30, and 40 cm in organic soils. The two additional horizontal measuring points, at depths of 5 and 10 cm, have been installed approximately 10 m from the station in opposing directions to catch small-scale variations in topsoil moisture. A more detailed description is provided in Ikonen et al. (2016, 2018).

A16 FR_AQUI
The FR_AQUI network was set up by INRAE in the Bordeaux-Aquitaine region (southwestern France) in 2012 (Al-Yaari et al., 2018a). Measurements taken at this five-station network (plus the nearby Integrated Carbon Observation System (ICOS) Bilos site) include soil moisture and temperature at various depths and the height of the groundwater table. There are four sites installed in the Landes forest, which is one of the largest coniferous forests in Europe, and one site is installed close to vineyards of the Bordeaux Graves region.

A17 GROW
GROW gathered crowd-sourced observations to assess the temporal and spatial consistency of various satellite-derived soil moisture products. In total, 6500 low-cost sensors were deployed in 24 GROW places in 13 countries across Europe. A subset of 150 sensors was transferred to the ISMN and contains measurements of soil moisture and air temperature. The complete data set is licensed and available at https://doi.org/10.15132/10000156. More information on the quality of the data can be found in Xaver et al. (2020).

A18 GTK
This network is operated by the Geological Survey of Finland (GTK) and contains seven stations throughout the country, with one station north of the polar circle. Measurements are taken from the upper soil layer until 0.9 m depth (soil moisture and soil temperature, as well as air temperature). The data are available from the years 2001 to 2012, but the availability varies per station.

A19 HiWATER_EHWSN
The HiWATER_EHWSN network is located on an irrigated farmland in the middle stream of the Heihe River basin close to the Gobi Desert, China. It consists of short time series between April and September 2012 collected at 174 stations by the Cold and Arid Regions Environmental and Engineering Research Institute (CAREERI) of the Chinese Academy of Sciences (Jin et al., 2014).

A20 HOAL
The Hydrological Open Air Laboratory - Soil Network (HOAL SoilNet) was set up in Petzenkirchen, Austria, as part of a concerted effort to advance the understanding of water-related flow and transport processes in a 66 ha agricultural catchment (Blöschl et al., 2016). Soil moisture has been monitored since 2013 at about 30 locations, at four depths, and at time intervals of 30 min using time domain transmission sensors. Measurements are taken from permanent stations, located in grassland, forest, or at field boundaries, as well as from stations that are temporarily installed in cropland.

A21 HOBE
HOBE is a hydrological observatory established in the western part of Denmark in the Skjern River catchment (Jensen and Refsgaard, 2018). Within the sub-catchment of Ahlergaarde, data have been collected from a network of 30 soil moisture stations distributed within the sub-catchment, according to respective fractions of classes representing mainly land cover and soil type (Bircher et al., 2012). At each station, Decagon 5TE capacitance sensors have been installed at 2.5, 22.5, and 52.5 cm depths. The sensors are logged every 30 min.

A22 HSC_SEOLMACHEON
HSC_SEOLMACHEON was a single station located in South Korea. Data were collected by the Hydrological Survey Center (HSC) and Water Resource and Remote Sensing Laboratory (WRRSL) and are available for the period August-September 2011.

A24 ICN
The former Illinois Climate Network (ICN) was operated by the Water and Atmospheric Resources Program of the Illinois State Water Survey and formerly integrated in the Global Soil Moisture Data Bank (Hollinger and Isard, 1994). Between 1983 and 2010, ICN covered 19 stations measuring soil moisture and soil temperature down to 2 m depth, as well as precipitation.

A25 IIT_KANPUR
The IIT_KANPUR network consisted of a single station and was managed by the Hydraulics and Water Resources Laboratory at the Indian Institute of Technology Kanpur, India. Soil moisture measurements were made at four depths (10, 25, 50, and 80 cm) from June 2011 to November 2012. The station was situated in the Ganges River basin, which is the largest river basin in India, and the soil type at the observation site is clayey silt.

A26 IMA_CAN1
The IMA_CAN1 network is operated by the Institute for Agricultural and Earthmoving Machines (IMAMOTER) of the Italian National Research Council (CNR), now part of STEMS-CNR. It is located in an experimental vineyard in Carpeneto, in the hilly Alto Monferrato region, which is a valuable vine growing and wine production area in the Piedmont region in northwestern Italy. The monitored vineyards are part of the Experimental Vine and Wine Centre of Agrion Foundation. The stations in the network provide measurements of soil moisture, precipitation, air temperature and humidity, hourly runoff, and event soil losses. Hourly volumetric soil moisture was measured by 12 Decagon 5TM sensors in the period 2011-2015, in both grassed and tilled vineyards, at track and no-track positions, and at both downhill and uphill locations (Biddoccu et al., 2016; Capello et al., 2019; Raffelli et al., 2018).

A27 IOWA
The IOWA network was located in two catchments in the southwestern part of Iowa, USA. The ISMN includes soil moisture observations from six stations, down to 2.6 m depth, taken twice a month between April and October from 1972 to 1994 (Robock et al., 2000). This network was formerly included in the Global Soil Moisture Data Bank.

A28 IPE
The Instituto Pirenaico de Ecologia (IPE) network runs two stations located in Aragon, northeastern Spain. The stations have been collecting meteorological data with at least an hourly time resolution (air and soil temperatures, soil moisture, relative humidity, radiation, and wind speed) since 2008 in a Mediterranean oak forest (Agüero) and a semi-arid pine forest (Peñaflor). These measurements are related to dendrometer hourly records of changes in the root and stem (Agüero; see Alday et al., 2020) or stem (Peñaflor) increment of the main tree species.

A29 iRON
The interactive Roaring Fork Observation Network (iRON) is a series of 10 stations operated by the Aspen Global Change Institute spread across the elevations of the Roaring Fork Watershed, located in the Southern Rocky Mountains of the USA. This data set includes soil moisture at 5, 20, and 50 cm, soil temperature at 20 cm, and additional weather measurements that vary by station. Further information can be found in Osenga et al. (2019) or at https://www.agci.org/iron/about (last access: 28 October 2021).

A30 KHOREZM
The KHOREZM network in Uzbekistan is located between the Amu Darya river and the border with Turkmenistan and was part of a project conducted by the University of Würzburg, Germany (Patrick Knöfel). There were seven stations that made soil moisture, soil temperature, air temperature, and surface temperature measurements from April 2010 to September 2011, and these data are included in the ISMN. Although soil moisture was not observed continuously, the measurements are still a valuable contribution since no other recent observations are available in this region.

A31 KIHS_CMC
The Korea Institute of Hydrological Survey (KIHS) has been running the Cheongmicheon (CMC) network since 2009, with annually returning measurements from March till December. It comprises 56 TDR Buriable Waveguide soil moisture sensors at 18 stations located on an area of approximately 50 × 50 m². All stations have a sensor installed at 0.1 m and additional sensors at varying depths, i.e. at 0.3, 0.4, 0.6, and 0.9 m.

A32 KIHS_SMC
KIHS_SMC is operated by the Environmental and Remote Sensing Lab of the Korea Institute of Hydrological Survey. The 51 soil moisture sensors (depths from 0.1 to 0.6 m) are located on a mountain slope distributed over 19 stations in close proximity to each other.

A33 LAB-net
LAB-net was created as the first soil moisture network in Chile to support remote sensing research on drought and water use conflicts (Mattar et al., 2014, 2016). The three stations measuring soil moisture, soil temperature, precipitation, and air temperature between 2014 and 2017 over various land cover types have been included in the ISMN.

A34 MAQU
The MAQU monitoring network (Dente et al., 2012) is situated on the northeastern fringe of the Tibetan Plateau, covering an area of approximately 40 × 80 km, with the elevation varying from 3200 m to 4200 m a.s.l. The network provides access to soil moisture and temperature profiles (5, 10, 20, 40, and 80 cm) measured at 15 min intervals. Soil moisture and temperature data from 2008 to 2018 are included in the ISMN.

A35 METEROBS
The METEROBS (MET European Research OBServations) network measured soil moisture at a single site between October 2011 and May 2012. It was located in the Apennine mountains in the rural area of Benevento, southern Italy. Soil moisture measurements from five layers down to 0.5 m depth have been included in the ISMN.

A36 MOL-RAO
The MOL-RAO network is operated by the German Meteorological Service (DWD) and is part of the operational measuring programme of the MOL-RAO (Meteorological Observatory Lindenberg - Richard Aßmann Observatory). The network is situated in the northeast of Germany and consists of two stations. While the station at Falkenberg has a grass-type vegetation, Kehrigk is situated in a pine forest (Beyrich and Adam, 2007). Volumetric soil moisture, soil temperature, air temperature, and precipitation have been provided in a half-hourly resolution since 2003.

A37 MONGOLIA
Soil moisture data sets for 44 stations were collected by the National Agency of Meteorology, Hydrology, and Environment Monitoring in Ulaanbaatar. All observations were taken using the gravimetric technique and initially provided as volumetric plant-available water (in percent). Volumetric soil moisture (m³ m⁻³) was calculated by first extracting texture properties of all sites from the Harmonized World Soil Database and subsequently calculating the wilting levels for all stations using the equations of Saxton and Rawls (2006). Soil moisture measurements are provided 3 times a month (on the 8th, 18th, and 28th) from 1964 to 2002 during the warm period of the year, which runs from April until the end of October (e.g. Robock et al., 2000).
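The final step of this conversion can be sketched as follows, assuming that plant-available water is defined as the water content in excess of the wilting point, so that volumetric soil moisture is their sum. The function name is hypothetical, and the wilting point is taken as a given input here; deriving it from soil texture with the equations of Saxton and Rawls (2006) is not reproduced in this sketch.

```python
def volumetric_soil_moisture(paw_percent, wilting_point):
    """Convert plant-available water (vol. %) to soil moisture (m3 m-3).

    paw_percent   -- volumetric plant-available water in percent
    wilting_point -- wilting level in m3 m-3, assumed to be supplied by a
                     texture-based pedotransfer function computed elsewhere
    """
    if paw_percent < 0 or not 0 <= wilting_point < 1:
        raise ValueError("inputs outside the physically meaningful range")
    # Plant-available water is assumed to be the water held above wilting.
    return paw_percent / 100.0 + wilting_point
```

For example, 12 % plant-available water at a site with a wilting level of 0.10 m³ m⁻³ would correspond to a volumetric soil moisture of about 0.22 m³ m⁻³ under this assumption.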

A38 MySMNet
The Malaysian Soil Moisture Network (MySMNet) has been operational since 2014. It deploys seven stations (four on an oil palm plantation, two on shrubland, and one on an orchard) that collect soil moisture at 5, 50, and 100 cm depth, soil temperature at 5 cm, air temperature, and relative humidity, all on an hourly basis. The soil moisture sensors used are the WaterScout SM100 (Kang et al., 2019).

A39 NAQU
The NAQU network is part of the Tibetan Plateau observatory of plateau-scale soil moisture and soil temperature (Tibet-Obs) and consists of 11 stations located in a cold semi-arid climate in Tibet at elevations over 4500 m. Soil moisture and soil temperature have been measured at five different depths (5, 10, 20, 40, and 80 cm) from 2010 onwards.

A40 NGARI
The NGARI network is part of the Tibetan Plateau observatory of plateau-scale soil moisture and soil temperature (Tibet-Obs) and consists of 23 stations located in a cold arid climate in Tibet at elevations between 4200 and 4700 m. Soil moisture and soil temperature have been measured at five different depths (5, 10, 20, 40, and 80 cm) from 2010 onwards.

A41 NVE
This is a network of monitoring stations of the Norwegian Water Resources and Energy Directorate (NVE). Soil moisture, soil temperature, and air temperature are measured. Currently, three out of eight stations are accessible through the ISMN. The stations are situated in the Trøndelag region in eastern Norway. Data are available from 2012 to 2019 at more than five different depth layers and down to 1.5 to 2 m depth.

A42 ORACLE
The ORACLE network includes six stations. The data sets reach back to the year 1985 and are available until 2013. ORACLE is a research observatory east of Paris used to study the Grand Morin and Petit Morin river catchments, particularly floods, low water periods, water quality, and the impact of human activities on the environment.

A43 OzNet
OzNet was established in 2001 with eight sites across the Murrumbidgee River catchment and a cluster of five further sites in the Adelong and Kyeamba catchments. This was further extended in 2003 to include a total of 11 sites at Kyeamba, for a GRACE validation experiment, and 13 sites at Yanco, for SMOS pre-launch algorithm development and post-launch calibration-validation. The Yanco site was further extended to have clusters of stations across 9 and 3 km grids in crop and grassland areas for SMAP algorithm development, calibration, and validation (Smith et al., 2012; Young et al., 2008).

A44 PBO_H2O
The PBO_H2O network is a former near-real-time network of the ISMN hosted by the University of Colorado Boulder, USA. It consisted of 159 stations distributed in the west of the USA, the Bahamas, the Dominican Republic, Puerto Rico, Colombia, South Africa, and Saudi Arabia. Soil moisture (measured using GPS reflections), precipitation, air temperature, and snow depth from 2004 to 2017 are stored in the ISMN database (Larson et al., 2008). The network was discontinued in 2018 owing to a lack of financial support.

A45 PTSMN
The Patitapu Soil Moisture Network (PTSMN) was established in 2016 on the hill country landscapes of the east coast of New Zealand's North Island. PTSMN was deployed to capture spatiotemporal soil moisture trends on various topographical positions distributed over a 13.8 km² area. The network is composed of 20 multi-sensor probes that were calibrated to the site-specific soils (Hajdu et al., 2019). The sensors provide readings at four consecutive depths down to 0.43 m.

A46 REMEDHUS
REMEDHUS is the University of Salamanca Soil Moisture Measurement Stations Network and was installed in March 1999 with a set of old TDR stations and manual measurements. It is one of the first soil moisture networks in Europe. The network was automated and upgraded with capacitance probes in 2005. The REMEDHUS data available in the ISMN cover the period since its automation. REMEDHUS is located in an agricultural area in the central part of the Duero basin (Spain). The network currently has 20 stations that measure soil moisture and soil temperature hourly in the 0-5 cm layer.

A47 RISMA
The Real-time In-Situ Soil Monitoring for Agriculture (RISMA) network was established in 2011 by Agriculture and Agri-Food Canada at agricultural locations in Ontario, Manitoba, and Saskatchewan (Ojo et al., 2015; L'Heureux, 2011; Canisius, 2011). There are currently 23 RISMA stations collecting hourly soil moisture/temperature data at depths to 1-1.5 m in combination with meteorological data. Calibrated, quality-controlled soil moisture and weather data are provided to the ISMN on an annual basis.

A48 RSMN
The Romanian Soil Moisture Network (RSMN) consists of 19 stations homogeneously distributed over Romania. The network is managed by the Romanian National Meteorological Administration and is part of the ASSIMO project, which aims to create a framework for the evaluation of current and future satellite microwave-derived soil moisture products.

A49 Ru_CFR
The Ru_CFR network includes two stations located on the territory of the Central Forest Reserve (CFR), Tver region, Russia. Since 2015, half-hourly continuous measurements of soil moisture have been carried out. Both stations provide measurements of soil moisture at four different depths and measurements of soil temperature, air temperature, and precipitation.

A50 RUSWET-AGRO, RUSWET-GRASS, and RUSWET-VALDAI
The three historical RUSWET networks were agricultural prediction campaigns conducted by the State Hydrological Institute of the former Soviet Union within the area of present-day Russia. Measurements were taken from 1952 until 2002 and initially distributed through the Global Soil Moisture Data Bank. Altogether, the networks operated 337 sites at which soil moisture was measured 3 times per month via gravimetric sampling. At RUSWET-VALDAI, soil temperature, precipitation, and air temperature were also collected. RUSWET contributes both the northernmost (on McClintock Island) and the earliest (8 June 1952) observations to the ISMN (Robock et al., 2000).

A51 SASMAS
The Scaling and Assimilation of Soil Moisture and Streamflow (SASMAS) monitoring network commenced commissioning in late 2002, with a total of 26 stations in operation by 2003 across a 6500 km² catchment (Rüdiger et al., 2007). Soil moisture is observed up to a depth of 90 cm (where possible) at depth intervals of 0-30, 30-60, and 60-90 cm. The network was developed as a nested catchment with three different scales (low resolution across the entire catchment, medium density across two smaller subcatchments, and very high density across a 175 ha single reach). This site also hosted the second National Airborne Field Experiment (NAFE) campaign, for which additional data were collected.

A52 SCAN
The Natural Resources Conservation Service (NRCS) operates the comprehensive, USA-wide Soil Climate Analysis Network (SCAN). SCAN supports natural resource assessments and conservation activities through its network of automated climate monitoring and data collection sites. SCAN focuses primarily on agricultural areas of the USA, Puerto Rico, and the Virgin Islands. The network consists of 216 stations located across the USA and reports soil moisture, soil temperature, precipitation, temperature, and other climatic variables hourly. Soil sensors are situated at 5, 10, 20, 50, and 100 cm depths (Schaefer et al., 2007).

A53 SKKU
The SKKU network was located at an evenly and moderately vegetated botanical garden in South Korea. It was operated by Sungkyunkwan University (SKKU) from 2014 to 2016, as part of a project for evaluating Cosmic-Ray Neutron Probe (CRNP) soil moisture (Nguyen et al., 2017). The network consisted of 10 stations, and soil moisture measurements were taken at four different depths (i.e. 10, 20, 30, and 40 cm).

A54 SMOSMANIA
The SMOSMANIA network was installed in southern France by Météo-France, the French national meteorological service, in order to monitor in situ soil moisture and soil temperature in contrasting soil and climatic conditions at operational automatic weather stations (Calvet et al., 2007). The SMOSMANIA network is composed of 21 stations, forming an Atlantic-Mediterranean transect, over a large variety of mineral soils ranging from sand and clay to silt loam (Calvet et al., 2016).

A55 SNOTEL
NRCS instals, operates, and maintains an extensive, automated data collection network called SNOTEL (short for snow telemetry; Leavesley et al., 2008). SNOTEL is part of the Snow Survey and Water Supply Forecasting (SSWSF) programme and is designed to collect snowpack and related climatic data in the western USA and Alaska. The programme operates under technical guidance from the NRCS National Water and Climate Center (NWCC). With the majority of the water supply in the west arriving in the form of snow, data on snowpack provide critical information to decision-makers and water managers. SNOTEL currently consists of a network of over 860 automated SNOTEL stations, of which 463 stations have soil moisture and soil temperature sensors.

A56 SoilSCAPE
The Soil moisture Sensing Controller And oPtimal Estimator (SoilSCAPE) is a wireless soil moisture sensor network for measurements of surface-to-root-zone profiles of soil moisture (Moghaddam et al., 2011, 2016; Shuman et al., 2010). It is designed for long-term, ultra-efficient, unattended field operations. Several networks are deployed in the USA, including in the states of California, Arizona, Colorado, Alaska, and New York. Additional deployments are planned in New Mexico, USA, and in New Zealand. The SoilSCAPE architecture includes a local coordinator (LC) and multiple end devices (EDs). The LC is the central command centre of the network, receiving data from the several soil moisture and temperature sensors connected to each ED. The sensors include Decagon 5TM and Teros 12, depending on how recently they were deployed.

A57 SWEX_POLAND
The Soil Water and Energy exchange - Poland (SWEX_POLAND) network was operated between 2000 and 2013 by the Institute of Agrophysics, Polish Academy of Sciences, in Lublin. The network consisted of six stations located in the wetlands of Poleski Park Krajobrazowy to support SMOS product calibration and validation. Soil moisture and temperature measurements were taken down to 1 m depth, along with precipitation observations (Marczewski et al., 2010).

A58 SW-WHU
The SW-WHU network was hosted by Wuhan University (Chen et al., 2015a, b). SW-WHU is a high-density network, with nearly 100 soil moisture and temperature sensors within 1 km². It adopts a Narrowband Internet of Things (NB-IoT) technique for data transmission at low power consumption. Therefore, the data of SW-WHU are particularly valuable for soil moisture validation, monitoring, and application at a very high spatiotemporal resolution (X. .

A59 TAHMO
The Trans-African Hydro-Meteorological Observatory (TAHMO; https://tahmo.org/, last access: 28 October 2021) presently runs a network of over 600 weather stations in more than 20 African countries. The stations measure all standard weather parameters, such as barometric pressure, wind speed, rainfall, and radiation. Each station also has five open ports that can be used for additional sensors, such as soil moisture sensors. Presently, 70 stations have soil moisture sensors, some of them at several depths. Ideally, many more stations will be equipped with soil moisture sensors in the near future.

A60 TERENO
TERENO consists of four terrestrial observatories that represent typical landscapes in Germany and central Europe and are considered to be highly vulnerable to the effects of global and climate change (Zacharias et al., 2011; Bogena et al., 2012). TERENO combines observations with comprehensive large-scale experiments, integrated modelling, remote sensing, and novel measurement technologies to increase our understanding of the functioning and feedbacks of terrestrial ecosystems (Bogena, 2016). The long-term observation platform of TERENO is composed of various measurement systems, including networks of climate and lysimeter stations, eddy covariance towers, and networks of soil, surface water, and groundwater sensors. Almost all online measurements are freely accessible via the TERENO data portal (http://www.tereno.net/ddp/, last access: 28 October 2021).

A61 UDC_SMOS
The former German UDC_SMOS network was hosted by the Department of Geography at the University of Munich, in cooperation with the Bavarian State Research Center for Agriculture, and funded by the German Aerospace Centre (DLR). It was located in grassland in the Bavarian region around Munich as an official European SMOS calibration/validation test site. In total, 11 stations provided soil moisture data from 2007 until 2011, up to 40 cm depth, as measured by several types of sensors (Loew et al., 2009; Schlenz et al., 2012a).

A62 UMBRIA
This soil moisture monitoring network in the north of the Umbria region, in the upper Tiber River basin, operates three stations in real time (Torre dell'Olmo, Petrelle, and Cerbara). Each station measures at 10, 20, and 40 cm depth. Additional stations of the network have not been operational since 2015 due to a lack of resources for their maintenance. Provided financial resources become available, the stations that are no longer functioning will be restored (Brocca et al., 2011, 2009, 2008).

A63 UMSUOL
UMSUOL (Umidità del Suolo) is a one-station network located close to Bologna, northern Italy. Soil moisture measurements at seven different depths are provided by the Agenzia Regionale Prevenzione Ambiente (ARPA). The ISMN contains data from the years 2009 and 2010.

A64 USCRN
The U.S. Climate Reference Network (USCRN) contains 114 stations sparsely distributed across the contiguous USA. Each station has three sets of soil moisture/temperature probes at five depths (5, 10, 20, 50, and 100 cm), in addition to air and surface temperature, precipitation, relative humidity, and solar radiation (Bell et al., 2013). The stations were installed between 2009 and 2011 and are still in operation in 2021. The stations are maintained by the National Oceanic and Atmospheric Administration Atmospheric Turbulence and Diffusion Division (NOAA ATDD), and their data are maintained by NOAA National Centers for Environmental Information (NOAA NCEI).

A65 USDA-ARS
The U.S. Department of Agriculture Agricultural Research Service (USDA ARS) operates a number of Long-Term Agroecosystem Research (LTAR) sites, some of which have spatial coverage of soil moisture and soil temperature. As experimental sites, the locations and configuration of the stations can change depending on the current scientific questions being addressed. A description of the sites can be found in Jackson et al. (2011).

A66 VAS
The Valencia Anchor Station (VAS) network is operated by the Climatology from Satellites Group and Jucar River Basin Authority of the University of Valencia, Spain. The network is located in Spain and consists of three stations. The data sets are available for the years 2010 and 2011.

A67 VDS
The VDS network is run by VanderSat, a Dutch company that specialises in providing global satellite-observed data and services over land. The VDS network consists of four stations located near the city of Bago in Myanmar. The network was installed to validate satellite soil moisture products in the tropics. The network has two measurement periods, i.e. one from June 2017 to July 2018 and one from March 2020 onwards.

A68 WegenerNet
The WegenerNet, located in the foreland of the southeastern Austrian Alps, is a long-term weather and climate monitoring facility comprising 155 hydrometeorological stations in a dense grid, with about one station every 2 km² (Kirchengast et al., 2014; Fuchsberger et al., 2021). Together with a range of meteorological variables, such as temperature, humidity, precipitation, and wind, it also measures soil moisture and temperature at 12 stations. These variables are measured at 0.2-0.3 m depth in diverse soil types representative of the region.

A69 WSMN
The Wales Soil Moisture Network (WSMN) was founded in July 2011. It consists of a total of nine monitoring sites located in mid-Wales representing a range of conditions typical of the Welsh environment, with climate ranging from oceanic to temperate and the most typical land use/cover types. The data set acquired in the network is composed of 0-5 (or 0-10) cm soil moisture, soil temperature, precipitation, as well as other ancillary data (Petropoulos and McCalmont, 2017).

Appendix D: Metadata structure

Figure D1. Overview of all mandatory metadata information available for each station within the ISMN database.

Code and data availability. Upon registration, all data and metadata described in this paper can be downloaded for free from https://ismn.earth (ISMN, 2021). A Python package to read and plot the data and metadata (TUW-GEO/ismn: v1.1) can be accessed at https://doi.org/10.5281/zenodo.855308 (Preimesberger et al., 2021). An example of output of this package is shown in Fig. 1. A Python package containing the ISMN quality control procedures for in situ soil moisture is available on GitHub (https://github.com/TUW-GEO/flagit; Aberer et al., 2021).
Author contributions. WD conceived the paper and wrote it together with IH, DA, LS, IP, LZ, and WP. All other authors contributed data to the ISMN and provided feedback on the paper.
Competing interests. The authors declare that they have no conflict of interest.
Disclaimer. Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Acknowledgements. The authors gratefully acknowledge the financial support provided by ESA through various projects including the following:
Review statement. This paper was edited by Thom Bogaard and reviewed by Jan Friesen and Mirko Mälicke.