Advances in soil moisture retrieval from multispectral remote sensing using unoccupied aircraft systems and machine learning techniques

This study investigates the ability of machine learning models to retrieve the surface soil moisture of a grassland area from multispectral remote sensing carried out using an unoccupied aircraft system (UAS). In addition to multispectral images, we use terrain attributes derived from a digital elevation model and hydrological variables of precipitation and potential evapotranspiration as covariates to predict surface soil moisture. We tested four different machine learning algorithms and interrogated the models to rank the importance of different variables and to understand their relationship with surface soil moisture. All the machine learning algorithms we tested were able to predict soil moisture with good accuracy. The boosted regression tree algorithm was marginally the best, with a mean absolute error of 3.8 % volumetric moisture content. Variable importance analysis revealed that the four most important variables were precipitation, reflectance in the red wavelengths, potential evapotranspiration, and topographic position indices (TPI). Our results demonstrate that the dynamics of soil water status across heterogeneous terrain may be adequately described and predicted by UAS remote sensing and machine learning. Our modeling approach and the variable importance and relationships we have assessed in this study should be useful for management and environmental modeling tasks where spatially explicit soil moisture information is important.

Abstract. This study investigates the ability of machine learning models to retrieve the surface soil moisture of a grassland area from multispectral remote sensing carried out using an unoccupied aircraft system (UAS). In addition to multispectral images, we use terrain attributes derived from a digital elevation model and hydrological variables of precipitation and potential evapotranspiration as covariates to predict surface soil moisture. We tested four different machine learning algorithms and interrogated the models to rank the importance of different variables and to understand their relationship with surface soil moisture. All the machine learning algorithms we tested were able to predict soil moisture with good accuracy. The boosted regression tree algorithm was marginally the best, with a mean absolute error of 3.8 % volumetric moisture content. Variable importance analysis revealed that the four most important variables were precipitation, reflectance in the red wavelengths, potential evapotranspiration, and topographic position indices (TPI). Our results demonstrate that the dynamics of soil water status across heterogeneous terrain may be adequately described and predicted by UAS remote sensing and machine learning. Our modeling approach and the variable importance and relationships we have assessed in this study should be useful for management and environmental modeling tasks where spatially explicit soil moisture information is important.

Introduction
The relatively small quantity of water stored in the upper layers of the soil plays a key role in terrestrial biology, biogeochemistry, and atmospheric water and energy fluxes. More than half of the solar energy absorbed by the land surface is used to evaporate water (Trenberth et al., 2009), and about 60 % of terrestrial precipitation is returned to the atmosphere by evapotranspiration (Seneviratne et al., 2010).
In most environments, soil water storage mainly depends on precipitation and evapotranspiration (Hillel, 1998;Rana and Katerji, 2000), but the distribution of water in the soil is also dependent on soil hydraulic properties, topography, and other environmental and belowground conditions (Korres et al., 2015;Vereecken et al., 2014;Western et al., 1999). Because of the complex interplay of these variables, it is a challenge to accurately estimate soil water. It is typically impractical to acquire data on soil water dynamics by direct measurement on scales larger than a small experimental plot, and there is no robust approach to predict it. The scarcity of soil moisture data is often cited as a major impediment for the investigation of soil moisture-climate interaction. Modern techniques for large-scale measurement of soil moisture include the cosmic-ray soil moisture observing system (COSMOS)-and the global positioning system interferometric reflectometry (GPS-IR)-based methods. The COS-MOS employs a network of probes across the United States that estimate soil moisture by measuring cosmic-ray neutron radiation intensity above the land surface (Zreda et al., 2012). GPS-based methods are also able to estimate soil moisture of a few square meters using a GPS signal reflected from the soil. These techniques, while very promising, still need to be refined for routine use (Ochsner et al., 2013).
Remote sensing methods of retrieving soil moisture provide an alternative to conventional methods of soil moisture measurement, which are impractical at large scales. Several remote sensing methods, particularly from spaceborne deployment, have been developed to retrieve soil moisture using optical, thermal infrared, and microwave sensors. Remote sensing methods enable spatially distributed and frequent observations over a large area, which is difficult to achieve when using conventional field measurements (Barrett and Petropoulos, 2014;Petropoulos et al., 2015). A critical challenge to the current remote sensing methods of retrieving soil moisture is the lack of imagery with optimum spatial resolutions appropriate for field-scale soil moisture studies and the low revisit frequency of satellites (Barrett and Petropoulos, 2014;Das and Mohanty, 2006). Alternatives based on occupied airborne platforms are limited due to their high operational costs. Another persistent challenge to soil moisture remote sensing is the difficulty of estimating root zone soil moisture from the surface observations (Nichols et al., 2011;Ochsner et al., 2013).
Water is one of the most significant chromophores in soils, and studies have shown that narrow band spectral information in the visible (0.4-0.7 µm), near-infrared (0.7-1.1 µm), and shortwave infrared (1.1-2.5 µm) regions can be used to estimate surface soil moisture (Ben-Dor et al., 2009;Malley et al., 2004). Soil reflectance in the visible to shortwave infrared spectral region generally decreases with an increase in soil moisture, with some parts of the spectrum showing a more pronounced decrease than others (Haubrock et al., 2008;Weidong et al., 2002). The hydroxide bond is the strongest absorber in the near-infrared region, and free water in soil pores has strong absorption at around the 1.4 and 1.9 µm wavebands (Malley et al., 2004). Several hyperspectral techniques for estimating soil moisture content have been developed, such as the soil moisture Gaussian model (SMGM;Whiting et al., 2004) and the normalized soil moisture index (NSMI; Haubrock et al., 2008).
In the presence of vegetation cover, however, the ability to use soil reflectance to measure soil moisture is limited (Muller and Décamps, 2000). In addition, the reflectance of solar radiation from soil surface represents only about 50 µm of the upper soil. This makes it challenging to estimate the moisture conditions beneath thin surface layers (Malley et al., 2004). Most soil moisture remote sensing approaches operating in the optical range rely on developing an empirical spectral vegetation index (Barrett and Petropoulos, 2014). Several soil moisture measurement methods based on vegetation index proxies have been suggested as vegetation indexes are extremely sensitive to water stress, and they allow indirect estimates of soil moisture (Zhang and Zhou, 2016).
Many studies have focused on deriving surface soil moisture content from the synergistic use of remote sensing data acquired simultaneously in the optical and thermal infrared spectrum. The so-called universal triangular relationship is a widely used method for estimating soil moisture (Nichols et al., 2011;Sobrino et al., 2014).
Retrieval of information from remote sensing measurements is based on the principle that changes in the chemical, physical, and/or structural characteristics of a target determine the variations in its electromagnetic response (Schanda, 1986). The task of retrieving information from remote sensing is complicated by several factors. Ali et al. (2015) summarize the retrieval challenges as being (i) the often complex and nonlinear relation between remote sensing measurement and target variables of interest; (ii) the ill-posed nature of the retrieval problem in that electromagnetic response of a target is typically the result of contributions from multiple target variables, and similar electromagnetic responses may be associated with different physical variables; (iii) the mixed contribution of multiple objects represented within elementary resolution cell; and (iv) the influence of external disturbing factors such as noise, radiation components coming from surrounding of the investigated area, and the atmosphere.
Soil moisture retrieval from remote sensing has traditionally been done by either empirical approaches or approaches based on an inversion of physical models. More recently, the use of machine learning techniques has gained increased attention because of their ability to tackle many of the limitations with the empirical approaches and physical-modelbased approaches.
Physical-model-based approaches depend on an understanding of the mechanisms involving the interaction of electromagnetic radiation and the target variable. A wide variety of analytic electromagnetic models have been proposed in the literature. The thermal inertia approach (Price, 1977) is one such method that is most commonly used for soil moisture retrieval using thermal infrared (wavelengths between 3.5 and 14 µm) observation (Barrett and Petropoulos, 2014;Zhang and Zhou, 2016). Many new soil thermal inertia estimation methods continue to be developed (Price, 1985;Tian et al., 2015;Zhang and Zhou, 2016). The advantages of such physically based models are that they can operate in more general scenarios that are difficult to represent through the collection of in situ measurements. However, such models rely on simplifying the representation of a real phenomenon, which can reduce reliability. A major drawback of analytical models is their complexity and requirement for a large number of input parameters (Zhang and Zhou, 2016).
Empirical modeling approaches, on the other hand, employ statistical regression techniques to develop a mapping function based on in situ measurements of the target variable and corresponding remote sensing measurement. The advantage of empirical relationships is that they are typically fast to derive and require fewer inputs. However, such models require a higher-quality ground measurement, which could be time consuming and expensive, and the derived relationship is typically site and sensor dependent, which limits the possibility of extending their use readily in a different area (Ali et al., 2015).
Specific drawbacks to soil moisture remote sensing, using the optical and thermal infrared spectrum, are their shallow soil penetration ability and the cloud-free atmospheric condition requirement. Many of the optical and thermal infrared synergistic approaches require a wide range of both vegetation index and soil moisture conditions within a study region, which cannot always be satisfied (Barrett and Petropoulos, 2014).
The advantages of machine learning techniques in remote sensing are their ability to learn and approximate complex nonlinear mappings and the fact that no assumptions need to be made about data distribution. They can, thus, integrate data from different sources with poorly defined or unknown probability density functions and have often been shown to outperform other parametric approaches (Ali et al., 2015;Paloscia et al., 2008). Some of the limitations of machine learning methods are the need for a large number of training data, which require extensive ground truth data sets, and that machine learning methods are black boxes, and only limited inference can be made about the relationships of different inputs.
Remote sensing from unoccupied aircraft systems (UAS) has the potential to address several limitations of traditional remote sensing. The most attractive feature is their high spatial resolution, frequent or on-demand image acquisition, and low operating costs (Anderson and Gaston, 2013;Berni et al., 2009;Colomina and Molina, 2014;Elarab, 2016;Manfreda et al., 2018;Tmušić et al., 2020). UAS is an umbrella term that refers to the unoccupied aircraft and the complementary ground control and communication systems necessary for air surveys (Singh and Frazier, 2018).

Objectives
The purpose of this research was to advance soil moisture change measurement, process understanding, and prediction using remote sensing products from UAS and machine learning methods. In this study, the spatial and temporal scale limitations were addressed by deploying multispectral remote sensing with small UAS and surface soil moisture changes were retrieved using machine learning.
The specific goals of this study were as follows: (1) to develop an adaptable method for retrieving information on surface soil moisture from small UAS remote sensing products and machine learning methods, (2) to identify important reflectance and surface characteristics for the prediction of soil moisture changes, (3) to identify appropriate spatial resolutions of reflectance images and terrain variables for estimating soil moisture, and (4) to explore the relation of soil moisture with surface properties.

Methods
Multispectral images of the study area were collected on 6 different days throughout the 2018 water year using a UAS equipped with a multispectral camera. A high-resolution, centimeter-scale digital elevation model (DEM) was generated from the stereo images using photogrammetric software, and multiple sets of terrain variables were calculated. Concurrently with the image acquisition flights, the moisture content of the top 3.8 cm of soil was measured at predefined sampling locations. The ground soil moisture measurements, multispectral reflectance, terrain variables, and rainfall and potential evapotranspiration (PET) data were then aggregated into a data table and used to train a machine learning model to predict the soil moisture. Figure 1 shows the model building process.

Study site
The study was conducted in a small grassland catchment at the Merced Vernal Pools and Grassland Reserve located about 5 km northeast of the city of Merced, California. The grassland is used for livestock grazing; it has a Mediterranean climate with hot, dry summers and cool, wet winters, with an average annual precipitation of 330 mm (Wong, 2014).
Our study site is a 0.6 km 2 area of land located within a subcatchment that contributes to the Avocet Pond, a large stock pond located in the northeastern corner of the reserve (Fig. 2). The catchment was selected because of an extensive hydrologic modeling study that was being conducted on the site at the time (Fryjoff-Hung, 2018).
The study area soils are dominated by Redding gravelly loam (fine, mixed, active, thermic Abruptic Durixeralfs) soils. The elevation of the study area ranges from 118 to 162 m above sea level, and the slope ranges from 0 to 31 • . The distributions of elevation and slope are shown in Fig. S1 in the Supplement.
The vernal pool's ecology is predominantly controlled by large seasonal shifts and high spatial variability in hydrology; this makes the study site attractive for our study. UAS remote sensing has the potential to provide information at appropriate spatial and temporal scales for vernal pool studies (Stark et al., 2015). The annual seasonal cycle of the study site is shown in Fig. 3.

Data collection and preparation
The imagery was acquired on 6 d during the 2018 water year green-up and brown-down, using a fixed-wing UAS with a multispectral camera onboard (see Table 3). Figure 4 shows a typical scene of the study site during the wet and dry seasons. Point soil moisture measurements (top 4 cm) were collected with a time domain reflectometry (TDR) probe across precise sampling transects identified with a real-time kine-S. N. Araya et al.: Advances in soil moisture retrieval from multispectral remote sensing  matic (RTK) positioning survey. Daily rainfall and PET values were acquired from nearby weather stations.

Image acquisition and processing
Multispectral images were acquired using a Parrot Sequoia sensor (Parrot SA, Paris, France) equipped with a sunshine sensor that measured irradiance at the sensor spectral wavebands for radiometric normalization. The camera is deployed on a fixed-wing unoccupied aircraft (Finwing Sabre; Finwing Technology) with an average flight height of 120 m above ground level. Images of a calibrated reflectance panel (Mi-caSense, Inc., Seattle, WA) were taken before each flight and used in the radiometric calibration of the images. The UAS remote sensing flights were conducted between late mornings and early afternoons during clear weather conditions. A single remote sensing-soil moisture collection campaign takes between 3 and 4 h. Images were acquired between 10:00 and 14:00 local time (LT).
The Parrot Sequoia sensor captures four separable bands in the green, red, red edge, and near-infrared bands, with a focal length of 3.98 mm and resolution of 1280 × 960 pixels. A  fifth channel captures a high-resolution image in the visible spectrum, with a focal length of 4.88 mm and a resolution of 4608 × 3456. Images for the study area were captured with a minimum of 85 % overlap and a ground pixel resolution of 10 to 15 cm. Images were scaled to a uniform 15 cm pixel resolution during post-processing. Image processing was done using Pix4D photogrammetry software (Pix4D, Lausanne, Switzerland). A DEM was photogrammetrically generated from the overlapping stereo images, and images were orthorectified and radiometrically calibrated.

Geometric and radiometric corrections
Between seven and nine ground control point (GCP) targets, with precise locations identified by RTK survey, were used for photo alignment. The mean georeferencing root mean square errors (RMSEs) of the GCPs ranged from 0.6 to 2 cm, and the mean reprojection errors ranged from 0.1 to 0.2 px, based on the bundle block adjustment error assessment report. DEM was generated using the structure-from-motion technique; noise filtering and mild surface smoothing (sharp smoothing) were applied to correct for the noisy and erroneous points of the point cloud. The inverse distance weighting algorithm was used to interpolate between points to create the raster DEM.
Radiometric calibration by the Pix4D software considers the positional data, solar irradiance measurements, and gain and exposure data from the camera to convert raw digital numbers into sensor reflectance values. Sensor reflectance represents the ratio of the reflected light to the incoming solar radiation and provides a standardized measure that is directly comparable between images. Finally, surface reflectance is calculated in post-processing, taking into account the cam-era's orientation, the angle of the Sun, and the known reflectance values of the calibration panel.

In situ soil moisture measurement
The moisture content of the top 4 cm of soil was measured simultaneously with UAS remote sensing flights using a Field-Scout TDR 300 soil moisture meter equipped with a 3.8 cm probe (Spectrum Technologies Inc., IL, USA). The Field-Scout TDR 300 measures volumetric water content, using time domain reflectometry, with a resolution of 0.1 % and an accuracy of ± 3 %.
Accurate geolocation of the in situ soil moisture measurement points is critical for an overlay analysis of the ground measurements with remote sensing products. To ensure accurate geolocations of the ground measurements, we identified six 90 m long sampling transects and recorded the survey grade geolocation of the transect ends using the RTK positioning survey. The transect ends were marked with a metal peg hammered into the ground. During each soil moisture measurement campaign, a 90 m tape measure was temporarily affixed between the two ends of the transect, and soil moisture measurements were taken about every 10 m, noting the exact distance of the sampling point from the transect ends.
The sampling transects were laid out in a way that ensured that they run over a variety of topographic variables in terms of flow accumulation, topographic wetness index, and stream networks. Furthermore, each sampling transect fell in a separate subbasin within the Avocet basin.

Hydrological variables
Daily precipitation data were retrieved from the University of California, Merced, weather station located approximately 6 km southwest of the study site (California Department of Water Resources, 2018). Daily reference evapotranspiration data were retrieved from the California Irrigation Management Information System's (CIMIS) Merced station located approximately 10 km south of the study site (California Irrigation Management Information System, 2018). The reference evapotranspiration is evapotranspiration from standardized grass calculated using the modified Penman (CIMIS Penman) and Penman-Monteith equations (California Irrigation Management Information System, n.d.). In this study, we will refer to the reference evapotranspiration as the potential evapotranspiration (PET).
S. N. Araya et al.: Advances in soil moisture retrieval from multispectral remote sensing

Data processing
To prepare the data for machine learning, we compiled all the information into a table with the measured soil moisture from each sampling point and date organized into one column. Each row contained the accompanying information for that sampling point and time.

Feature engineering
We calculated several variables based on the multispectral reflectance, terrain, and hydrological data to be used to train a machine learning model as predictor variables. A list of all the measured and calculated variables used in modeling soil moisture are given in Table 1.
Topographic variables derived from DEM are scale dependent; to account for this, we calculated all topographic variables on six DEMs with different resolutions. Prior to calculating topographic variables, we upscaled the DEM to 15, 30, 60, 100, 300, and 500 cm cell resolutions and then calculated topographic variables for all the resolutions. The calculation of the topographic position index (TPI) is a special case since it does not only depend on DEM resolution but also on the definition of the inner and out radii of the annulus (see Eq. 1). TPI = Elevation − focal mean (annulus (Inner Radius, Outer Radius)). (1) We calculated the TPI for different neighborhood sizes using the ArcGIS 10.5 Land Facet Corridor tool (Jenness et al., 2013). We calculated the TPI on three DEM resolutions (100, 300, and 500 cm), with two inner radii (1 and 3 cells) and three outer radii (3, 5, and 7 cells). A map of selected topographic variables is provided in Fig. S2.
Precipitation and PET are two important drivers of surface soil moisture. To account for antecedent conditions, we used the cumulative water year precipitation and PET and rolling sums of both variables with different time windows before the measurement dates. We calculated cumulative precipitation and PET for 1, 2, 3, 7, 15, and 30 d before the sampling dates and used those rolling sums as input.

Variable selection
The total number of soil moisture measurements was 406, that is, the total number of rows. For each soil moisture point, we added columns with the corresponding date, reflectance, topographic, and hydrological variables. The reflectance and topographic variables were extracted from the raster images using the raster-to-point data extraction tool in ArcGIS software, taking the average value of a 1 m diameter buffer around the points. The hydrological variables were taken to be the same for the entire study area and only changed based on the measurement days.
The data preparation resulted in 138 variables. We employed variable selection (or feature selections) methods to identify a subset of relevant variables (features) from the larger set of potential predictors. The benefits of variable selection include improvement of model performance, reducing training and utilization times, and facilitating data understanding (Guyon and Elisseeff, 2003;Weston et al., 2003). We employed the following three methods of variable selection: tests of linear correlation and linear dependencies among variables and recursive feature elimination. Recursive feature elimination involves removing the least important features whose omission has the least effect on training errors (Chen and Jeong, 2007;Guyon et al., 2002). We implemented a recursive feature elimination procedure during the coarse tuning of the boosted regression tree (BRT), random forest (RF), artificial neural networks (ANNs), and support vector regression (SVR) algorithm models.
Following the variable selection procedure, of the 138 variables, 76 variables were removed based on linear correlation and linear dependencies among variables. An additional 16 were removed following the recursive feature elimination procedure. The final data used for building the models had 46 variables (Table 2), of which five are hydrological, nine are reflectance, and 32 are topographic variables. Variable categories that had no importance included the topographic wetness index (TWI), the reflectance in the red-edge band, and normalized difference vegetation index (NDVI).

Data description
The 6 data collection days in the 2018 water year and summary site statistics are given in Table 3. Cumulative and 30 d rolling sums of precipitation and PET for the 2018 water year are shown in Figs. S3 and S4. Figure 5 shows the distribution of soil moisture and Thiam's transformed vegetation index (TTVI) during the 6 measurement days. The soil moisture measurement followed the general precipitation patterns but was also influenced by immediate rainfall events; the highest soil moisture occurred on the only measurement day on which it had rained the day before (23 January 2018). The vegetation greenness, as measured with TTVI, followed the 15 d cumulative rainfall well, with maximum greenness occurring on 4 April 2018 (day of water year 186) and sharply decreasing in the following 2 months (Fig. 5). Figure S5 shows the distribution of some terrain variables associated with the soil moisture sampling points and correlations between variables. The terrain variables for the ground sampling points show a reasonable distribution of values, while the distribution of elevation shows a bimodal distribution with ranges from 120 to 130 m; most of the other variables show a close to normal distribution. The only variables with Person's correlation above 0.5 are between TPI and curvature (Pearson's correlation is equal to 0.67). The distribution of some variables in the data is shown in Fig. S6. Commonly used index to quantify topographic control on the hydrological process * The TTVI is a transformation of the commonly used normalized difference vegetation index (NDVI). The reason for choosing the TTVI transformation is that it eliminates negative values and often transforms NDVI histograms into a more normal distribution. Normalizing machine learning inputs is considered good practice and aids models in converging faster.  ,7), (3,9), (5,15), (9,21), (15,35), (15,100) * Scale for raster products is pixel resolution in meters and cumulative days for the hydrological variables. Topographic position index scale is a combination of the inner and outer diameters in meters (see Eq. 1).

Machine learning procedure
The overall machine learning procedure is illustrated in Fig. 1. The computationally demanding steps of model training and testing were run at the Multi-Environment Research Computer for Exploration and Discovery (MERCED) highperformance computing cluster at the University of California, Merced. The caret R package (v6.0-78; Kuhn, 2008) was used to handle training and tuning procedures. The SVR and relevance vector regression (RVR) algorithms were implemented using the kernlab package (v0.9-26; Karatzoglou et al., 2004), RF algorithm was implemented using the ran-domForest package (v4.6-12; Liaw and Wiener, 2002), and the BRT algorithm was implemented using the xgboost package (v0.6.4.1; Chen and Guestrin, 2016). Prior to model training, all the predictor variables were standardized by centering to mean zero and scaling by the variable's standard deviation (Eq. 2) as follows: where x is the centered and scaled value of variable x, and x and σ x are the arithmetic mean and standard deviation of the variable. Standardizing variables prior to model training is a good practice that minimizes issues of scale among input variables and often leads to better and faster training (Brownlee, 2020).

Machine learning algorithms used
Several machine learning algorithms exist for multivariate regression modeling. Artificial neural networks (ANNs) are among the most commonly used algorithms for the retrieval of soil moisture from remote sensing (e.g., Hassan-Esfahani et al., 2015;Paloscia et al., 2008). In recent years, the support vector machine (SVM) and the similar support vector regression (SVR) algorithms have become popular in the retrieval of soil moisture (e.g., Ahmad et al., 2010;Zaman and Mckee, 2014;Zaman et al., 2012). Other popular machine learning algorithms include tree-based models such as the random forest (RF) and boosted regression trees (BRTs).

Artificial neural network (ANN)
ANN models have been widely used in the development of pedotransfer models (Matei et al., 2017;Pachepsky et al., 1996;Schaap et al., 2001;Zhang et al., 2018;Zhang and Schaap, 2017). ANNs are universal approximators that can approximate any nonlinear mapping. The feed-forward neural network is a popular variant of ANN. In this study, we implemented the feed-forward neural network with a single hidden layer, which is considered sufficient for the majority of problems (Reed and Marks II, 1999).

Support vector regression (SVR)
SVR is an adaptation of the support vector machine (SVM) for regression problems (Cortes and Vapnik, 1995;Drucker et al., 1997). The SVM learning is a generalization of a maximal margin classifier. The algorithm first maps the input variables into a high-dimensional space using a fixed mapping function, i.e., a kernel function. The algorithm then constructs hyperplanes, which can be used for classification or, in the case of SVR, for regression. In this study, we use the radial basis function kernel, which is one of the most-used kernels in SVR. Some advantages of SVR include the fact that they do not suffer from the problem of local minima, and that they have few parameters to tune when training the model.

Relevance vector regression (RVR)
Like SVM, the RVR was originally introduced as a classification machine (Tipping, 2000). RVR is a Bayesian treatment of the SVM prediction function which avoids some of the limitations of SVM algorithms, such as reducing the use of basis functions and the need for optimizing the cost and the insensitivity parameters (Ben-Shimon and Shmilovici, 2006). Torres-Rua et al. (2016) successfully used the RVR algorithm to estimate surface soil moisture from satellite images and energy balance products.

Random forest (RF)
RFs are popular models that are relatively simple to train and tune (Hastie et al., 2009). They apply ensemble techniques by averaging a large number of individual decision-tree-based models. Tree models are grown by searching for a predictor that ensures the best split that results in the smallest model error. The individual trees in the RF ensemble are built on a bootstrapped training sample, and only a small group of predictor variables are considered at each split; this ensures that trees are decorrelated with each other (Breiman, 2001;James et al., 2013).

Boosted regression trees (BRTs)
BRT is another form of the decision tree model ensemble enhanced by the gradient boosting approach. The gradient boosting algorithm constructs additive regression models by sequentially fitting simple base learner functions (i.e., decision trees) to current pseudo-residuals at each iteration (Friedman, 2002). These pseudo-residuals are the gradient of the loss function being minimized. BRT models have shown considerable success and often outperform other machine learning algorithms in many situations (Elith et al., 2008;Natekin and Knoll, 2013). BRT models are also particularly suitable for less-than-clean data (Friedman, 2001), which makes them particularly attractive in our work where the training data are compiled from various sources and different measurement methods, making them prone to some inconsistencies. Tree-based models, both the RF and BRT, have the advantage of being able to rank the predictor variable's relative importance. In these models, the approximate relative influence of a single predictor variable is calculated as the empirical improvement of predictions by splitting on that predictor at each node and then averaging the relative influence of the variable across all trees of the model (Ridgeway, 2012).

Training-testing set splits
The data was split into training and testing sets of approximately 75 %-25 %, respectively (i.e., approximately 300 and 100 records). The testing set was a hold-out set used only to evaluate final trained models.
The training-testing set splitting was done based on a random selection of the transect. For the testing set, two transects are randomly selected on four randomly selected sampling dates, and one transect is randomly selected on the remaining two sampling dates. To minimize the bias that may result from the training-testing set split, we generated 30 unique training-testing set splits and trained 30 separate models based on each separate training set. The performance of each model was assessed on its respective testing set. Similar performance of the individual models would indicate that bias due to the training-testing set split is minimal. The justification for this subsetting procedure is as follows: (1) the selection of entire transects as testing sets avoids possible data leakage between the training and testing sets due to spatial autocorrelation since samples in a transect are located close to each other -a simple random splitting would not avoid this potential problem; (2) all six sampling dates are represented in the training set -models are trained on the entire range of time and soil moisture changes; and (3) the testing set is between 25 % and 30 % of the data (between 100 and 125 samples).
The distribution of samples across the sampling dates and transects for the training and testing sets is shown in Figs. S7 and S8, respectively. On average, the training-testing split was 294 samples in the training sets and 113 samples in the testing sets. All the training sets have samples from all the six sampling dates and transects. While all sampling dates are represented in each testing set, on average, there are five transects in each testing set.

Cross-validation procedure
The selection of optimal model parameters in the model training process was done by the cross-validation method. Cross validation is done to estimate the test error rate by holding out a subset of the training data (i.e., a validation set) from the fitting process and then applying the fitted model to predict the validation subset. A 30-fold cross-validation set was generated by randomly splitting the training data into 80 %-20 % training-validation split by randomly selecting a single transect every day. Optimum model parameters were selected using a comprehensive grid search method.

Model assessment Performance
The final performance of models was assessed on the separate hold-out test data set that was not used in the model training. The performance of models is measured in terms of mean absolute error (MAE), mean bias error (MBE), and the coefficient of determination (R 2 ) determined as follows: where N is the number of observations, y is the measured value,ŷ is the predicted value, and y is the mean of measured values.
The MAE indicates the average deviation of predictions from the measured value, with smaller values indicating better performance. The MBE measures the average systematic bias, and positive or negative values indicate the average tendency of the predicted values to be larger or smaller than the measured values, respectively. The R 2 measures the correspondence between predicted and measured data, with higher values indicating stronger correspondence. The MAE was chosen over RMSE as it is a more appropriate measure when averaging (Willmott and Matsuura, 2005).

Variable importance
The predictor variable importance is the statistical significance of each predictor variable with respect to its effect on the generated model. For the tree-based models, RF and BRT, variable importance is calculated internally within the model algorithm. For the rest of the machine learning models, we calculated the predictor variable importance by the recursive feature elimination method, which is done by recursively removing predictors before training a model and evaluating the change in model performance. In this method, to account for possible bias in variable subset selection (Ambroise and McLachlan, 2002;Hastie et al., 2009), we included a separate layer of 10-fold cross validation in the entire sequence modeling steps.

Effect of predictor variables
The relationship between the predictor variables and outputs for a black box model can be analyzed using modelindependent methods such as partial dependence plots or accumulated local effects (ALEs) plots (Apley and Zhu, 2020;Greenwell, 2017). These plots help to explain the relationship between the outcome of the black-box-supervised machine learning models and the predictors of interest. We use the ALE plots to analyze the effect of selected predictor variables. Although similar, the ALE plots are preferred over partial dependence plots for their speed and their ability to produce unbiased plots when variables are correlated (Apley and Zhu, 2020). The value of the ALE is centered so that the mean effect is zero; it can be interpreted as being the effect of the variable on the outcome at a certain value compared to the average prediction of the data. For example, if an ALE estimate of −2 occurs when a variable of interest has a value of 3, then the prediction is lower by 2 compared to the average prediction (Molnar, 2019). 3 Results and discussion

Model performance
All the tested machine learning algorithms were able to predict soil moisture with good accuracy. Both BRT and RF algorithms, however, had a slightly better performance with a MAE of less than 4 % soil moisture content. Figure 6 shows the performances of the five machine learning algorithms in the testing set. The relatively better performance of BRT and RF models is consistent with other studies that find that ensemble decision-tree-based regression models perform better than many other machine learning algorithms (Caruana and Niculescu-Mizil, 2006), particularly in terrain and soil spatial predictions (Hengl et al., 2017(Hengl et al., , 2018Keskin et al., 2019;Szabó et al., 2019). Of the two best algorithms, BRT performs marginally better, and we present variable importance and predictor effect analysis done with the BRT model. Despite being marginally inferior to BRT, the RF model has several advantages over BRT and the other algorithms. RF is much easier and faster to train compared to the other machine learning algorithms used. Since the ensemble trees are independent in the RF model, the forest can be grown simultaneously, which dramatically increases the processing efficiency in parallel computing. In addition, the RF model has few hyperparameters to tune. In contrast, the ensemble trees in the BRT algorithm must be grown sequentially since each new tree is dependent on the previous ensemble (which makes parallel processing challenging). Training a BRT model requires tuning multiple hyperparameters -seven in our implementation of the BRT model compared to two for the RF model. The performance of individual models across the 30 different training-testing splits was comparable and seems consistent with minimum bias resulting from testing set selection. The performances of the individual 30 tuned BRT models on their respective testing sets are shown in Fig. S9. Measured vs. model-predicted soil moisture contents for the testing data sets are plotted in Fig. 7. The one-to-one comparison in Fig. 7 shows the individual predictions by all 30 models. The plot shows a general increase in error at higher soil moisture levels. In addition, the model prediction appears capped at around 40 % soil moisture content; this is likely the result of a lack of training data points with values above that soil moisture (see Fig. 5). The marginal box plot on the y axis of Fig. 7 illustrates this point as values above a 40 % soil moisture content are over 1.5 times the interquartile range above the upper quartile and are plotted as outliers.

Predictor variable importance
Precipitation and PET were among the top variables in terms of predictive importance. This is to be expected, given that these two variables represent the major source and loss pathway for surface soil moisture. Reflectance in the red band was by far the most important of all the reflectance bands. Following these three variables, topographic variables -TPI and curvature in particular -were most important in most models. Topography has a strong control on soil moisture distribution at landscape scales (Sørensen et al., 2006), while the TPI was the most important topographic variable in determining soil moisture. A surprising finding was that TWI was found not to be an important predictor despite numerous studies finding it important in explaining surface soil moisture (e.g. Moore et al., 1988;Western et al., 1999) Despite calculating TWI in multiple ways and at multiple scales, it consistently failed the variable selection procedure for all algorithms we tested. The reasons for this could be that TWI is of little significance when explaining soil moisture distribution at a small scale. Upon observing a similar lack of correlation between TWI and soil moisture, Famigliettiet et al. (1998) suggest that TWI is more appropriate for predicting the soil moisture of an entire unsaturated zone profile and not just the surface layer. Yet another reason could be that since slope and flow accumulation, i.e., the constituent parts of TWI, are already included in the models, it was deemed redundant (not providing unique information) to the models. This would be consistent with the models not finding NDVI important when the constituent bands (red and NIR bands) were found to be important.
When considering the importance of variables, it is worth noting that different variables may have different degrees of importance depend on how wet or dry the soil condition is. Western et al. (1999), for example, found that flow accumulation was the best predictor for soil moisture distribution during wet conditions, while the potential solar radiation index was a better predictor during dry conditions.
The relative importance of predictors for the BRT model is shown in Fig. 8. The predictors have been grouped by variable type (lumping the same variables regardless of variable specifications, such as summation window for precipitation or pixel resolution for the raster). The only temporally dynamic variables in our model are the hydrological and reflectance variables; the topographic variables are not time dependent, and their variables need only be generated once  for a study area. A more detailed variable importance plot is provided in Fig. S10.

Effect of predictor variables
We used the ALE plots to investigate the nature of the relationship between the predictor variables and soil moisture. Figure 9 shows the partial effect of red reflectance and three of the most important topographic variables, i.e., TPI, profile curvature, and flow accumulation. Given the high importance of these topographic variables, it is useful to understand how these variables relate to soil moisture and identify possible thresholds of significant changes. Predicted soil moisture generally increased with flow accumulation across all scales.
The relationship of curvature to soil moisture is a little more complex; soil moisture tends to decrease as surfaces became less convex. However, the trend reverses, and soil moisture increases as surface curvature transitions from convex to concave (approximately between profile curvature values −5 and +5) before the effect reverses again at higher concavity surfaces. A possible explanation for this behavior might be that more flat surfaces (with a near-zero curvature value) are associated with higher slope areas, such as those immediately following a ridgetop which is of higher convexity. The decreasing trend of soil moisture at increasing concavity (profile curvature value above +5) is harder to explain. However, at lower scales (3 and 5 m resolution DEM), soil moisture did peak at the convex to concave transition (near zero) but there was almost no noticeable decreasing pattern at higher curvature (concavity) values (not presented in this paper). At the lowest resolution (50 m DEM), soil moisture continuously increases with an increase in concavity of surface.
Of all the topographic variables we calculated, perhaps TPI is the most scale-dependent variable. Surprisingly, TPI across all scales had a similar relation with soil moisture. Negative TPI values indicate a surface that trends towards valleys, and zero values indicate flat areas, if the slope of the surface is shallow, or mid-slope areas. For areas with significant slopes, positive TPI values indicate surfaces that trend towards ridgetops (Jenness et al., 2013). Across all scales, there was a U-shaped relation between TPI and soil moisture, with soil moisture decreasing as negative TPI values moved towards zero and soil moisture increasing as TPI moved from zero to positive values. This pattern is consistent with valleys and ridgetops being wetter than mid-slope areas. We computed TPI at several scales, and across all machine learning algorithms, TPI with 15 and 35 m inner and outer diameters, i.e., TPI(15,35), had the highest variable importance among all topographic variables.
Although the machine learning models are considered non-spatial models, as they do not consider sampling location information and spatial autocorrelations (Georganos et al., 2019;Hengl et al., 2018), the inclusion of spatially dependent variables (specifically curvature, flow accumulation, and TPI) as predictors means that the models do account for a significant amount of spatial information. The inclusion of such variables should make the predictions more spatially relevant.
The red band was the most predictive of the three bands, and the red-edge band was found not to be an important predictor. The two spectral vegetation indices we tested, i.e., NDVI and TTVI, were found not to be important, but their constituent bands, the red and NIR, were important. The lower importance of NIR compared to the red band in the prediction of surface soil moisture was particularly surprising, given the higher sensitivity of the NIR band to plant moisture stress and the fact that our study area was almost entirely covered with vegetation.

Spatial prediction of soil moisture
The final utility of training a machine learning model was to be able to produce a spatially resolved soil moisture map. In Fig. 10, we have predicted surface soil moisture for the 6 sampling days for which we had multispectral images. Ideally, in the future, the only new inputs required to produce a soil moisture prediction map for our study site is UAV-based multispectral images and hydrologic variables of precipitation and PET which are available from nearby weather stations. As can be seen in Fig. 10, while the mean moisture content was largely driven by the day (which, in turn, is controlled by antecedent precipitation and PET), the distribution appears to closely follow topographic attributes. This is particularly visible in a magnified map (Fig. 11). Ridges appear drier, while valleys appear wetter; furthermore, northernfacing ridges appear slightly drier than south-facing slopes, which is probably due to slightly higher vegetation cover. Density plots showing the distribution of soil moisture predictions over the test area for each of the 6 d is provided in Fig. S11.
A magnified map of the soil moisture prediction for 23 January 2018 is shown in Fig. 11 and shows that soil moisture varies considerably with topography. Tracks made by the repeated passage of vehicles are, for example, clearly identifiable as areas of low soil moisture in the magnified map.

Conclusions
Our study addressed the following questions: how effectively can machine learning methods be employed to retrieve soil moisture from a combination of topographic and UAV remote sensing data? What are the most important predictors of surface soil moisture in our study area? And what is the na-  ture of the relationship between predictor variables and soil moisture? Our approach can be summarized as follows: we took multispectral images of grassland in the visible nearinfrared range using a UAS. Using a photogrammetric analysis of the images, we produced a high-resolution digital elevation model of the study site, which we used to calculate several topographic variables at different scales. Simultaneously with the UAS imaging flights, we took about 400 in situ surface soil moisture measurements. Using those ground truth measurements, we trained machine learning models to predict soil moisture from the multispectral images, topographic variables, and precipitation and evapotranspiration data. We finally interrogated the machine learning models to understand the importance of the different variables and elucidate the nature of the relationship between variables and soil moisture.
What makes our study stand out is that we used UASbased remote sensing to investigate soil water outside of the relatively homogenous farm plots. Our study site had un-even topography and an ecology dominated by large seasonal shifts in moisture. The fact that it is neither bare nor a monoculture makes the interpretation of multispectral images challenging and necessitates the use of machine learning methods that are able to generalize complex relationships. This research serves as a proof of concept that surface soil moisture can be interpreted with reasonable accuracy from multispectral UAS remote sensing using machine learning methods.
Based on our findings, we conclude that all the popular machine learning algorithms we tested (ANN, SVR, RVR, RF, and BRT) are acceptable for modeling surface soil moisture with the conditions of our study. We particularly found that decision-tree-based methods, especially RF, make for a versatile algorithm. Even though its prediction accuracy is only marginally better than the other machine learning algorithms we tested, RF has the advantage of being particularly simple to train and requiring the least amount of computational resources.
Analysis of the models revealed that hydrologic variables of precipitation and evapotranspiration are the most important predictor variables for surface soil moisture. However, spatial distribution of soil moisture is highly dependent on topographic variables. Topographic variables, such as flow accumulation and TPI, are particularly important in ameliorating the non-spatial nature of such machine learning models by including spatial variables. Based on our study, a DEM of about 1 m horizontal resolution is sufficient. Although we had resolution as high as 15 cm, we found that excluding such high-resolution topographic variables did not substantially reduce the model performances.
A partial dependence analysis of how the input variables relate to soil moisture is important for an understanding of the mechanisms that control soil moisture over the landscape. Using a type of partial dependency plot, we were able to investigate the nature of the relationships. Surface curvature showed a complex relationship with soil moisture. Curvature showed a negative relationship with soil moisture as convex surfaces became less convex. This relationship reverses as the convex surfaces transition to concave, with increases in concavity being positively related with soil moisture. This relation reverses again at higher concavities, indicating that the partial effect of the curvature variable is such that the mildly concave surfaces tend to be wetter than more concave surfaces. However, the magnitude of soil moisture changes associated with curvature are small. TPI was the most important of the topographic variables. The partial dependency plots in the form of ALE (Fig. 9) show that it affects soil moisture at a relatively higher magnitude. The general nature of the TPI-soil moisture relation indicates that valleys and ridgetops tend to be wetter surfaces compared to the areas in between. Furthermore, valleys tend to be wetter than ridgetops.

Outlook
As a data mining technique, machine learning model performance and reliability are closely tied to the quantity of data. Although the number and spatial coverage of ground sampling points were reasonable for the purposes of our research, future studies would benefit from significantly increasing the number of ground measurements and flight frequency. Multiyear studies are eventually needed to ensure that the model can be used reliably for future predictions. As data-based methods, machine learning methods are highly unreliable if used to forecast situations that they have not been trained on.
Another important consideration for future studies is to move beyond surface soil moisture to deeper layers. Although more challenging to implement, studies of root zone soil moisture are generally more ecologically relevant. Expanding the reflectance information beyond multispectral bands could lead to important improvements in soil moisture prediction. Although lightweight thermal or hyperspectral sensors are currently very expensive and may not be financially feasible for routine applications, the market trend is optimistic that those technologies will become more affordable in the future. Another interesting avenue worth investigating is the possibility of using the high-resolution topographic variables from UAS along with cheaper and more widely available satellite images.
Although the machine learning models used in this study have many advantages, one drawback is that they are nonspatial models and, as such, do not consider sampling location information and spatial autocorrelations. This potentially weakens these models' ability to appropriately address spatial heterogeneities (Georganos et al., 2019;Hengl et al., 2018). Hengl et al. (2018) introduce a novel method for incorporating spatial information into a non-spatial machine learning model by including distances between sampling points as predictor variables, and they show that this method (although still in its formative stage) has comparable accuracy to kriging methods. The potential to improve soil moisture predictions by using such spatially integrated methods should be considered in future research.  (Araya, 2021).
Author contributions. SA, TA, and JV contributed to the study design. The ground measurement site survey and setup was done by SA and AF. The ground measurement was conducted by SA. UAS remote sensing and image preprocessing were done by AA and SA. Image analysis and model building were done by SA, with supervision from TA and JV. SA prepared the paper with contributions from all coauthors.