(2019). Assessment of simulated soil moisture from WRF Noah, Noah-MP, and CLM land surface schemes for landslide hazard application. Hydrology and Earth


Abstract. This study assesses the usability of Weather Research and Forecasting (WRF) model simulated soil moisture for landslide monitoring in the Emilia Romagna region, northern Italy, during the 10-year period between 2006 and 2015. In particular, three advanced land surface model (LSM) schemes (i.e. Noah, Noah-MP, and CLM4) integrated with the WRF are used to provide detailed multilayer soil moisture information. Through the temporal evaluation with the single-point in situ soil moisture observations, Noah-MP is the only scheme that is able to simulate the large soil drying phenomenon close to the observations during the dry season, and it also has the highest correlation coefficient and the lowest RMSE at most soil layers. It is also demonstrated that a single soil moisture sensor located in a plain area has a high correlation with a significant proportion of the study area (even in the mountainous region 141 km away, based on the WRF-simulated spatial soil moisture information). The evaluation of the WRF rainfall estimation shows there is no distinct difference among the three LSMs, and their performances are in line with a published study for the central USA. Each simulated soil moisture product from the three LSM schemes is then used to build a landslide prediction model, and within each model, 17 different exceedance probability levels from 1 % to 50 % are adopted to determine the optimal threshold scenario (in total there are 612 scenarios). Slope degree information is also used to separate the study region into different groups. The threshold evaluation performance is based on the landslide forecasting accuracy using 45 selected rainfall events between 2014 and 2015. Contingency tables, statistical indicators, and receiver operating characteristic analysis for different threshold scenarios are explored. 
The results show that, for landslide monitoring, Noah-MP at the surface soil layer with a 30 % exceedance probability provides the best performance, with a hit rate of 0.769 and a false alarm rate of 0.289.
Tsai and Chen, 2010; Hawke and McConchie, 2011; Bittelli et al., 2012; Segoni et al., 2018b; Valenzuela et al., 2018; Bogaard and Greco, 2018).
For landslide applications, one potential soil moisture estimation method is satellite remote sensing. Although such technologies have improved significantly over the past decade, their retrieval accuracy is still largely affected by frozen soil conditions (Zhuo et al., 2015a) and dense vegetation coverage, particularly in mountainous regions (Temimi et al., 2010); furthermore, the acquired data only cover the top few centimetres of soil. Although more recently launched satellites such as Sentinel-1 (1 km and 3 d resolution) have shown some promising performance in soil moisture estimation (Gao et al., 2017; Paloscia et al., 2013), their availability only covers recent years (Geudtner et al., 2014). Those disadvantages restrict the full utilisation of satellite soil moisture products for landslide monitoring applications, as discussed in our previous study. In Zhuo et al. (2019), it is discussed that both the temporal and spatial resolutions of the ESA CCI satellite soil moisture product (Dorigo et al., 2017) are too coarse for landslide applications, and its data are mostly only available after the year 2002. Moreover, the shallow depth of the satellite soil moisture observations hinders the accuracy of landslide predictions. Therefore, alternative soil moisture estimation methods need to be explored.
One emerging area relies on modelling. Some studies have used modelled soil moisture data for landslide applications (Ponziani et al., 2012; Ciabatta et al., 2016; Zhao et al., 2019a, b). However, to our knowledge, there is a lack of existing studies using modelled soil moisture from state-of-the-art land surface models (LSMs) for landslide studies, such as the Noah LSM (Ek et al., 2003) and the Community Land Model (CLM) (Oleson et al., 2010). LSMs describe the interactions between the atmosphere and the land surface by simulating exchanges of momentum, heat, and water within the Earth system (Maheu et al., 2018). They are capable of simulating the most important subsurface hydrological processes (e.g. soil moisture) and can be integrated with an advanced numerical weather prediction (NWP) system like WRF (Weather Research and Forecasting) (Skamarock et al., 2008) for comprehensive soil moisture estimations (i.e. through the surface energy balance, the surface layer stability and the water balance equations) (Greve et al., 2013). NWP-based (i.e. with integrated LSM) soil moisture estimations have many advantages. For instance, their spatial and temporal resolution can be set at different scales depending on the input datasets to fit various application requirements; their coverage is global, and the estimated soil moisture data cover multiple soil layers (from the shallow surface layer to deep root zones); and a number of globally covered data products can provide the necessary boundary and initial conditions for running the models. Soil moisture estimated through such an approach has been widely recognised and demonstrated in many studies, which cover a broad range of applications from hydrological modelling (Srivastava et al., 2013a, 2015), drought studies (Zaitchik et al., 2013), and flood investigations (Leung and Qian, 2009) to regional weather prediction (Stéfanon et al., 2014).
Therefore, NWP-based soil moisture datasets could provide valuable information for landslide applications. However, to our knowledge, relevant research has never been carried out.
The aim of this study is hence to evaluate the usefulness of NWP-modelled soil moisture for landslide monitoring. Here the advanced WRF model (version 3.8) is adopted, because it offers numerous physics options such as microphysics, surface physics, atmospheric radiation physics, and planetary boundary layer physics (Srivastava et al., 2015), and it can be integrated with a number of LSM schemes, each varying in the complexity of its physical parameterisation. So far there is limited literature comparing the soil moisture accuracy of different LSM options in the WRF model. Therefore, in this study, we select three of WRF's most advanced LSM schemes (i.e. Noah; Noah-Multiparameterization, here Noah-MP; and CLM4) to compare their soil moisture performance for landslide hazard assessment. Furthermore, since all three schemes provide multi-layer soil moisture information, all of those simulations are included in the comparison so that the optimal soil moisture depth can be determined for the landslide monitoring application. To allow comparison with our previous study on using satellite soil moisture data, the same study area, Emilia Romagna, is used here. The study period covers 10 years from 2006 to 2015 to include a long-term record of landslide events. In addition, because slope angle is one of the major factors controlling slope stability, it is used in this study to divide the study area into several slope groups, so that a more accurate landslide prediction model can be built.
The description of the study area and the datasets used are included in Sect. 2. Methodologies regarding the WRF model, the related LSM schemes and the adopted landslide threshold evaluation approach are provided in Sect. 3. Section 4 shows the WRF soil moisture evaluation results against the in situ observations, and the WRF rainfall evaluations over the whole study area. Section 5 covers the comparison results of the WRF-modelled soil moisture products for landslide applications. The discussions and conclusions of the study are included in Sects. 6 and 7, respectively.
2 Study area and datasets

Study area
The study area is in the Emilia Romagna region, northern Italy (Fig. 1). Its population density is high. The region has high mountainous areas in the S-SW, and wide plain areas towards the NE, with a large elevation difference (i.e. 0 to 2125 m) across a distance of 50 km from the north to the south (Rossi et al., 2010). The region has a mild Mediterranean climate with distinct wet and dry seasons (i.e. dry season between May and October, and wet season between November and April). The study area is prone to landslides, with approximately one-fifth of the mountainous zone covered by active or dormant landslide deposits (Bertolini et al., 2005). Rainfall is by far the primary triggering factor of landslides in the region, followed by snowmelt: shallow landslides are mainly triggered by short but exceptionally intense rainfall, and by long and moderate rainfall events over saturated conditions, while deep-seated landslides have a more complex response to rainfall and are mainly caused by moderate but exceptionally prolonged (even up to 6 months) periods of rainfall. Due to the abundant data available in the region, several studies on regional-scale landslide prediction and early warning have been published (Berti et al., 2012; Martelloni et al., 2012; Lagomarsino et al., 2013, 2015; Segoni et al., 2018a, b). Interested readers can refer to those studies for more information.

Selection of the landslide events
The landslide catalogue is collected from the Emilia Romagna Geological Survey (Berti et al., 2012). The information included in the catalogue comprises location, date of occurrence, the uncertainty of the date of occurrence, landslide characteristics (dimensions, type, and material), triggering factors, damage, casualties, and references. Unfortunately, in many cases several pieces of information are missing from the records. In order to organise the data in a more systematic way so that only the relevant events are retained, a two-step event selection procedure is initially carried out based on (1) rainfall-induced events only; and (2) high spatial-temporal accuracy (exact date and coordinates). Finally, a revision of the information about the type of slope instabilities such as landslide, debris flow, and rockfall as well as the characteristics of the affected slope (natural or artificial) is also carried out using the selected records (Valenzuela et al., 2018). The catalogue period used in this study covers 2006 to 2015, in accordance with the WRF model run. After filtering the data records, only one-fifth of them (i.e. 157 events) are retained. The retained events are shown as single circles in Fig. 2, with slope information (calculated from the digital elevation model, DEM, data) also presented in the background. It can be seen that the spatial distribution of the landslide events is very heterogeneous, with nearly all of them occurring in the hilly regions.

Datasets
There are 19 soil moisture stations in total within the study area; however, based on our collected data, only one of them (at San Pietro Capofiume: latitude 44°39′13.59″, longitude 11°37′21.6″) provides long-term valid soil moisture retrievals (i.e. 2006 to 2017). We have checked the data from all the other stations: they are either absent (or have very large data gaps) or do not cover the research period at all. Therefore, only the San Pietro Capofiume station is used for the WRF soil moisture temporal evaluation. The soil moisture is measured at five depths from 10 to 180 cm by a time domain reflectometry (TDR) instrument. Data are recorded as volumetric water content (m³ m⁻³) at a daily time step (Pistocchi et al., 2008). The data used in this study are from between 2006 and 2015. Rainfall data over the whole study area are collected from over 200 tipping-bucket rain gauges, which are used to assess the quality of the WRF model's rainfall estimations in the study area, as well as for rainfall event selection during the years 2014 and 2015.
To drive an NWP model like WRF for soil moisture simulations, several globally covered data products can be chosen for extracting the boundary and initial condition information; for instance, the European Centre for Medium-Range Weather Forecasts (ECMWF) reanalysis (ERA-Interim) and the National Centers for Environmental Prediction (NCEP) reanalysis are two of the most commonly used data products. It has been found by Srivastava et al. (2013b) that the ERA-Interim datasets can provide better boundary conditions than the NCEP datasets for WRF hydro-meteorological predictions in Europe; ERA-Interim is therefore adopted in this study to drive the WRF model. The spatial resolution of ERA-Interim is approximately 80 km. The data are available from 1979 to present, containing 6-hourly gridded estimates of three-dimensional meteorological variables, and 3-hourly estimates of a large number of surface parameters and other two-dimensional fields. A comprehensive description of the ERA-Interim datasets can be found in Dee et al. (2011).
The Shuttle Radar Topography Mission (SRTM) 3 Arc-Second Global (∼ 90 m) DEM datasets are downloaded and used as the basis for the slope degree calculations. SRTM DEM data have been widely used for elevation-related studies worldwide due to their high-quality, near-global coverage and free availability (Berry et al., 2007).
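The slope degree calculation mentioned above can be illustrated with a minimal sketch. It assumes that the DEM has already been reprojected to a regular grid with a metric cell size (the raw SRTM product is on a geographic grid), and it uses a simple finite-difference gradient; the function name `slope_degrees` and the toy DEM are hypothetical and are not taken from the study.

```python
import numpy as np

def slope_degrees(dem, cell_size_m):
    """Slope angle (degrees) of a DEM on a regular grid with metric spacing.

    Assumption: the same cell size applies along both grid axes.
    """
    dem = np.asarray(dem, dtype=float)
    # np.gradient returns the derivative along axis 0 (rows) and axis 1 (columns).
    dz_dy, dz_dx = np.gradient(dem, cell_size_m)
    # Steepest-descent gradient magnitude converted to an angle.
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))

# Toy DEM (hypothetical): elevation rises 90 m per 90 m cell along the columns.
toy_dem = np.tile(np.arange(5) * 90.0, (4, 1))
toy_slope = slope_degrees(toy_dem, cell_size_m=90.0)
```

The resulting slope grid could then be binned into slope groups; the group boundaries are not stated in this section, so none are assumed here.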

WRF model and the three land surface model schemes
The WRF model is a next-generation, non-hydrostatic mesoscale NWP system designed for both atmospheric research and operational forecasting applications (Skamarock et al., 2005). The model is capable of simulating a broad range of meteorological applications at scales from tens of metres to thousands of kilometres. It has two dynamical solvers: the ARW (Advanced Research WRF) core and the NMM (Nonhydrostatic Mesoscale Model) core. The former has more complex dynamic and physics settings than the latter, which only has limited setting choices. Hence in this study WRF with the ARW dynamic core (version 3.8) is used to perform all the soil moisture simulations. The main task of the LSM within WRF is to integrate information generated through the surface layer scheme, the radiative forcing from the radiation scheme, the precipitation forcing from the microphysics and convective schemes, and the land surface conditions to simulate the water and energy fluxes (Ek et al., 2003). WRF provides several LSM options, three of which are selected in this study as mentioned in the introduction: Noah, Noah-MP, and CLM4. Table 1 gives a simple comparison of the three models. The models are described below in order of increasing complexity regarding the way they deal with thermal and moisture fluxes in various layers of soil, and their vegetation, root, and canopy effects (Skamarock et al., 2008).

Noah
Noah is the most basic of the three selected LSMs. It is one of the "second generation" LSMs that rely on both soil and vegetation processes for water budgets and surface energy closures (Wei et al., 2010). The model is capable of modelling soil and land surface temperature, snow water equivalent, and the general water and energy fluxes. The model includes four soil layers that reach a total depth of 2 m in which soil moisture is calculated. Its bulk layer of canopy-snow-soil (i.e. a layer lacking the ability to simulate photosynthetically active radiation, here PAR; vegetation temperature; correlated energy; and water, heat and carbon fluxes), "leaky" bottom (i.e. drained water is removed immediately from the bottom of the soil column, which can result in much less memory of antecedent weather and climate fluctuations), and simple snow melt-thaw dynamics are seen as the model's demerits (Wharton et al., 2013). Noah calculates the soil moisture from the diffusive form of the Richards equation for each of the soil layers (Greve et al., 2013), and the evapotranspiration from the Ball-Berry equation (considering both the water flow mechanism within the soil column and vegetation, as well as the physiology of photosynthesis; Wharton et al., 2013).

Noah-MP
Noah-MP (Niu et al., 2011) is an improved version of the Noah LSM, with better representations of terrestrial biophysical and hydrological processes. Major physical mechanism improvements directly relevant to soil water simulations include (1) the introduction of a more permeable frozen soil by separating permeable and impermeable fractions (Cai, 2015); (2) the addition of an unconfined aquifer immediately beneath the bottom of the soil column to allow the exchange of water between them (Liang et al., 2003); and (3) the adoption of a TOPMODEL (TOPography based hydrological MODEL)-based runoff scheme (Niu et al., 2005) and a simple SIMGM groundwater model (Niu et al., 2007), which are both important in improving the modelling of soil hydrology. Noah-MP is unique compared with the other LSMs, as it is capable of generating thousands of parameterisation schemes through the different combinations of "dynamic leaf, canopy stomatal resistance, runoff and groundwater, a soil moisture factor controlling stomatal resistance (the β factor), and six other processes" (Cai, 2015). The scheme options used in the study are the Ball-Berry scheme for canopy stomatal resistance, the Monin-Obukhov scheme for surface layer drag coefficient calculation, the Noah-based soil moisture factor for stomatal resistance, the TOPMODEL runoff with the SIMGM groundwater, the linear effect scheme for soil permeability, the two-stream method applied to vegetated fraction scheme for radiative transfer, the CLASS (Canadian Land Surface Scheme) scheme for ground surface albedo option, and the Jordan scheme (Jordan, 1991) for partitioning precipitation between snow and rain.

CLM4
CLM4 is developed by the National Center for Atmospheric Research (NCAR) to serve as the land component of its Community Earth System Model (formerly known as the Community Climate System Model) (Lawrence et al., 2012). It is a "third generation" model that incorporates the interactions of both nitrogen and carbon in the calculations of water and energy fluxes. Compared with its previous versions, CLM4 (Oleson et al., 2008) has multiple enhancements relevant to soil moisture computation. For instance, the model's soil moisture is estimated by adopting an improved one-dimensional Richards equation (Zeng and Decker, 2009); the new version allows the dynamic interchange of soil water and groundwater through an improved definition of the soil column's lower boundary condition that is similar to that of Noah-MP (Niu et al., 2007). Furthermore, the thermal and hydrologic properties of organic soil are included in the modelling, based on the method developed in Lawrence and Slater (2008). The total ground column is extended to 42 m depth, consisting of 10 soil layers unevenly spaced between the top layer (0.0-1.8 cm) and the bottom layer (229.6-380.2 cm), and 5 bedrock layers to the bottom of the ground column. Soil moisture is estimated for each soil layer.

WRF model parameterisation
The WRF model is centred over the Emilia Romagna region with three nested domains (D1-D3 with horizontal grid sizes of 45, 15, and 5 km, respectively), of which the innermost domain (D3, with 88 × 52 grid cells in the west-east and south-north directions, respectively) is used in this study. A two-way nesting scheme is adopted, allowing information from the child domain to be fed back to the parent domain. With atmospheric forcing, static inputs (e.g. soil and vegetation types), and parameters, the WRF model needs to be spun up to reach its equilibrium state before it can be used (Cai et al., 2014; Cai, 2015). In this study, WRF is spun up by running through the whole year of 2005. After the spin-up, the WRF model for each of the selected LSM schemes is executed at a daily time step from 1 January 2006 to 31 December 2015, using the ERA-Interim datasets.
The microphysics scheme plays a vital role in simulating accurate rainfall information, which in turn is important for modelling accurate soil moisture variations. WRF V3.8 supports 23 microphysics options ranging from simple to more sophisticated mixed-phase physical options. In this study, the WRF Single-Moment 6-class Microphysics Scheme is adopted, which considers ice, snow, and graupel processes and is suitable for high-resolution applications (Zaidi and Gisen, 2018). The radiation options used in the WRF setup are Dudhia shortwave radiation (Dudhia, 1989) and Rapid Radiative Transfer Model (RRTM) longwave radiation (Mlawer et al., 1997). Cumulus parameterisation is based on the Kain-Fritsch scheme (Kain, 2004), which is capable of representing sub-grid-scale features of the updraft and rain processes, and such a capability is beneficial for real-time modelling (Gilliland and Rowe, 2007). The surface layer parameterisation is based on the Revised fifth-generation Pennsylvania State University-National Center for Atmospheric Research Mesoscale Model (MM5) Monin-Obukhov scheme (Jiménez et al., 2012). The Yonsei University scheme (Hong et al., 2006) is selected to calculate the planetary boundary layer. The parameterisation schemes used in the WRF modelling are shown in Table 2. The datasets for land use and soil texture are available in the pre-processing package of WRF. In this study, the land use categorisation is interpolated from the MODIS 21-category data classified by the International Geosphere Biosphere Programme (IGBP). The soil texture data are based on the Food and Agriculture Organization of the United Nations Global 5-minute soil database.

Translation of observed and simulated soil moisture data to common soil layers
Since the soil moisture datasets have different soil depths, a direct comparison is difficult. The Noah and Noah-MP models include four soil layers, centred at 5, 25, 70, and 150 cm, respectively, whereas the CLM4 model has 10 soil layers, centred at 0.9, 3.2, 6.85, 12.85, 22.8, 39.2, 66.2, 110.65, 183.95, and 304.9 cm, respectively. Moreover, the in situ sensor measures soil moisture centred at 10, 25, 70, 135, and 180 cm. In order to make the datasets comparable at consistent soil depths, the simple linear interpolation approach described in Zhuo et al. (2015b) is applied in this study, and benchmark soil layers centred at 10, 25, 70 and 150 cm are adopted.
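As an illustration of the depth harmonisation, the sketch below linearly interpolates a single soil moisture profile onto the benchmark depths using the layer centres quoted above. It is only a minimal example of linear interpolation between layer centres; the exact procedure of Zhuo et al. (2015b) may differ in detail, and the function name and the profile values are hypothetical.

```python
import numpy as np

# Layer-centre depths (cm) as quoted in the text.
NOAH_DEPTHS = np.array([5.0, 25.0, 70.0, 150.0])
CLM4_DEPTHS = np.array([0.9, 3.2, 6.85, 12.85, 22.8, 39.2, 66.2, 110.65, 183.95, 304.9])
INSITU_DEPTHS = np.array([10.0, 25.0, 70.0, 135.0, 180.0])
BENCHMARK_DEPTHS = np.array([10.0, 25.0, 70.0, 150.0])

def to_benchmark_layers(depths_cm, soil_moisture):
    """Linearly interpolate one soil moisture profile onto the benchmark depths."""
    return np.interp(BENCHMARK_DEPTHS, depths_cm, soil_moisture)

# Hypothetical Noah profile (m3 m-3) for one day, for illustration only.
noah_profile = np.array([0.30, 0.34, 0.36, 0.38])
noah_on_benchmark = to_benchmark_layers(NOAH_DEPTHS, noah_profile)
```

In this toy case only the first layer changes (from 5 to 10 cm), because the remaining Noah layer centres already coincide with the benchmark depths.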

Soil moisture thresholds build up and evaluations
To build and evaluate the soil moisture thresholds for landslide forecasting, all datasets have been split into two portions: 2006-2013 for the establishment of thresholds, and 2014-2015 for the evaluation. The determination of soil moisture thresholds is based on finding the most suitable soil moisture triggering level for landslide occurrence by trying a range of exceedance probabilities (percentiles). For example, a 10 % exceedance probability is calculated by determining the 10th percentile of the soil moisture data that are related to the landslides that occurred. The exceedance probability method is commonly utilised in landslide early warning studies for calculating rainfall thresholds, and is therefore adopted here to examine its performance for soil moisture threshold calculations. To carry out the threshold evaluation, 45 rainfall events (during 2014-2015) are selected. The rainfall events are separated by at least 1 d of dry period (i.e. a period without rainfall). The rainfall data from each rain gauge station are first combined using the Thiessen polygon method, and the 45 events are then selected with visual analysis. The information about the selected rainfall events can be found in Sect. 5. The threshold evaluation is based on the statistical approach described in Gariano et al. (2015) and Zhuo et al. (2019), where the soil moisture threshold can be treated as a binary classifier of the soil moisture conditions that are likely or unlikely to cause landslide events. With this hypothesis, the likelihood of a landslide event can either be true (T) or false (F), and the threshold forecasting can either be positive (P) or negative (N). The combinations of those four conditions can lead to four statistical outcomes (Fig. 3a) that are true positive (TP), true negative (TN), false positive (FP), and false negative (FN) (Wilks, 2011). Using the four outcomes, two statistical scores can be determined.
The hit rate (HR) is the rate of the events that are correctly forecasted. It ranges between 0 and 1, with the best result as 1.
$$\mathrm{HR} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}$$
The false alarm rate (FAR) is the rate of false alarms when the event did not occur. It ranges between 0 and 1, with the best result as 0.
$$\mathrm{FAR} = \frac{\mathrm{FP}}{\mathrm{FP} + \mathrm{TN}}$$
For any soil moisture product, each threshold calculated is adopted to determine T, F, P, and N, respectively. Those values are finally integrated to find the overall scores of TP, FN, FP, TN, HR, and FAR. The threshold performance is then judged via the receiver operating characteristic (ROC) analysis (Hosmer and Lemeshow, 1989; Fawcett, 2006). As shown in Fig. 3b, the ROC curve is based on HR against FAR, and each point in the curve represents a threshold scenario (i.e. selected exceedance probabilities). The optimal result (the red point) can only be realised when the HR reaches 1 and the FAR reduces to 0. The closer the point is to the red point, the better the forecasting result is. To analyse and compare the forecasting performance numerically, the Euclidean distance (d) from each scenario to the optimal point is computed.
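The percentile-based threshold step can be sketched as follows. This is a minimal illustration, assuming that the threshold for an exceedance probability level p is simply the p-th percentile of the soil moisture values associated with landslides in the calibration period (2006-2013). The sample values and the list of levels are hypothetical; the 17 levels actually used between 1 % and 50 % are not listed in this section.

```python
import numpy as np

def soil_moisture_thresholds(sm_at_landslides, probability_levels):
    """Map each level p (%) to the p-th percentile of the landslide-related soil moisture."""
    values = np.asarray(sm_at_landslides, dtype=float)
    return {p: float(np.percentile(values, p)) for p in probability_levels}

# Hypothetical soil moisture values (m3 m-3) on landslide days in the calibration period.
calibration_sm = [0.28, 0.31, 0.33, 0.35, 0.36, 0.38, 0.40, 0.41, 0.43, 0.45, 0.47]
example_levels = [1, 10, 30, 50]  # illustrative subset, not the study's 17 levels
thresholds = soil_moisture_thresholds(calibration_sm, example_levels)
```

A lower level gives a lower (more easily exceeded) threshold, which tends to raise both the number of hits and the number of false alarms.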
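The two scores can be computed from a contingency table as in the sketch below. It assumes the standard definitions HR = TP / (TP + FN) and FAR = FP / (FP + TN), which match the verbal descriptions given here, and it assumes that a forecast is positive when soil moisture is at or above the threshold. All function names and data are hypothetical.

```python
def contingency_counts(forecast_positive, landslide_occurred):
    """Count TP, FP, FN and TN from paired boolean forecasts and observations."""
    tp = fp = fn = tn = 0
    for positive, occurred in zip(forecast_positive, landslide_occurred):
        if positive and occurred:
            tp += 1
        elif positive and not occurred:
            fp += 1
        elif not positive and occurred:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def hit_rate(tp, fn):
    """Fraction of observed events that were forecast (assumed HR definition)."""
    return tp / (tp + fn) if (tp + fn) else float("nan")

def false_alarm_rate(fp, tn):
    """Fraction of non-events that were forecast (assumed FAR definition)."""
    return fp / (fp + tn) if (fp + tn) else float("nan")

# Hypothetical evaluation data: soil moisture per event and whether a landslide occurred.
threshold = 0.35
event_sm = [0.40, 0.37, 0.30, 0.36, 0.33, 0.29]
occurred = [True, False, False, True, True, False]
forecast = [sm >= threshold for sm in event_sm]

tp, fp, fn, tn = contingency_counts(forecast, occurred)
hr = hit_rate(tp, fn)
far = false_alarm_rate(fp, tn)
```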
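The distance to the optimal ROC point can be written as d = sqrt(FAR² + (1 − HR)²), with the optimum at HR = 1 and FAR = 0. The short sketch below evaluates it for the best scenario reported in the abstract (HR = 0.769, FAR = 0.289) and for one made-up scenario; the scenario labels and the second pair of scores are hypothetical.

```python
import math

def distance_to_optimum(hr, far):
    """Euclidean distance in ROC space from (FAR, HR) to the optimal point (0, 1)."""
    return math.hypot(far, 1.0 - hr)

scenarios = {
    "Noah-MP, surface layer, 30 %": (0.769, 0.289),  # scores reported in the abstract
    "hypothetical scenario": (0.900, 0.600),         # made-up values for comparison
}
distances = {name: distance_to_optimum(hr, far) for name, (hr, far) in scenarios.items()}
best_scenario = min(distances, key=distances.get)
```

For the reported scores this gives d ≈ 0.370; the scenario with the smallest d is taken as the best compromise between hits and false alarms.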

WRF model evaluations
In this study, the evaluation is based on the daily mean soil moisture. The antecedent soil moisture condition plus the rainfall on the day is not used, because the purpose of this study is to explore the relationship between the different WRF-simulated soil moisture products and landslides only. In general, soil moisture is a predisposing factor for slope instability, while rainfall is the triggering factor. The same rainfall may or may not trigger a landslide, depending on the soil moisture content at the time of the rainfall event. The mean soil moisture on the day of the landslide implicitly accounts for both the initial soil moisture and the effective rainfall absorbed by the ground, and can be a robust indicator of the hydrological condition of the slope.

Soil moisture temporal comparisons
Although there is only one soil moisture sensor that provides long-term soil moisture data in the study region, it is still useful to compare it with the WRF-estimated soil moisture. In this study, we carry out a temporal comparison between all three WRF soil moisture products and the in situ observations (at a single soil moisture measuring point in the plain area). The comparison is implemented over the period from 2006 to 2015, and the WRF grid cell closest to the in situ sensor location is chosen. Figure 4 shows the comparison results at the four soil depths. The statistical performance (correlation coefficient r and root mean square error RMSE) of the three LSM schemes is summarised in Table 3. Based on the statistical results, Noah-MP surpasses the other schemes at most soil layers, except for Layer 2, where CLM4 shows a stronger correlation, and Layer 4, where Noah gives a smaller RMSE. For Noah-MP, the best correlation is observed at the surface layer (0.809), followed by the third (0.738), second (0.683) and fourth (0.498) layers; based on RMSE, the best performance is again observed at the surface layer, followed by the second, third and fourth layers in sequence (0.060, 0.070, 0.088, and 0.092 m³ m⁻³, respectively). From the temporal plots, it can be seen that at all four soil layers, all three LSM schemes can reproduce the seasonal cycle of soil moisture, with most upward and downward trends successfully represented. However, both Noah and CLM4 overestimate the variability at the upper two soil layers during almost the whole study period, and the situation is worst for Noah. Comparatively, Noah-MP can better capture the wet soil moisture conditions, especially at the surface layer; it is the only model of the three that is able to simulate the large soil drying phenomenon close to the observations during the dry season, except for some extremely dry days.
Towards 70 cm depth, although Noah-MP is still able to capture most of the soil moisture variability during the drying period, it significantly underestimates soil moisture values for most wet days. Similar underestimation can be observed for CLM4 and Noah during the wet season at 70 cm; furthermore, both schemes are again not capable of reproducing the extreme drying phenomenon and overestimate soil moisture for most of the dry season days. It is surprising to see that at the deep soil layer (150 cm), all soil moisture products underestimate the observations. In particular, the outputs from CLM4 and Noah-MP only show small fluctuations. However, the soil moisture measurements from the in situ sensor also draw our attention, as they show strange fluctuations with numerous sudden drops and rises. Such behaviour is not expected at such a deep soil layer (although groundwater capillary forces can increase the soil moisture, the rate is normally very slow). One possible reason we suspect is sensor failure in the deep zone. Therefore, the assessment result for the deep soil layer should be considered unreliable. Overall, in addition to producing the highest correlation coefficient and the lowest RMSE, Noah-MP gives simulated soil moisture variations that are the closest to the observations. The better performance of Noah-MP over the other two models agrees with the results found in Cai et al. (2014) (note: that paper uses stand-alone models, which are not coupled with WRF). It has also been reported that Noah-MP presents a clear improvement over Noah in simulating soil moisture globally. However, it should be noted that the evaluation results are only based on one soil moisture sensor located in the plain part of the study area.
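The two statistics used in this comparison can be computed as in the following minimal sketch, which assumes that the simulated and observed daily series have already been aligned in time and interpolated to the same depth, and that missing observations are stored as NaN. The function name and the sample values are hypothetical.

```python
import numpy as np

def correlation_and_rmse(simulated, observed):
    """Pearson correlation r and RMSE between two aligned series, skipping NaN pairs."""
    sim = np.asarray(simulated, dtype=float)
    obs = np.asarray(observed, dtype=float)
    valid = ~(np.isnan(sim) | np.isnan(obs))
    sim, obs = sim[valid], obs[valid]
    r = float(np.corrcoef(sim, obs)[0, 1])
    rmse = float(np.sqrt(np.mean((sim - obs) ** 2)))
    return r, rmse

# Hypothetical daily soil moisture (m3 m-3); the simulation has a constant +0.05 bias.
observed_sm = [0.20, 0.30, np.nan, 0.40]
simulated_sm = [0.25, 0.35, 0.50, 0.45]
r_value, rmse_value = correlation_and_rmse(simulated_sm, observed_sm)
```

The toy series shows why both scores are reported: a constant bias leaves r at 1 while the RMSE equals the size of the bias.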

Rainfall evaluations
Since soil moisture is related to rainfall, it is useful to evaluate the WRF rainfall estimations against the observations in the study area. The spatial plot of R for the three LSMs is shown in Fig. 5. It can be seen that the performances of the three models are very close to each other, with only small differences over the whole study region. In general, the performance is the best in the southeast region, with R reaching above 0.70. The poorest performance is observed in the northeast region and some parts of the mountain zone. Based on the spatial distribution of R, there is no clear correlation between the WRF rainfall performance and the topography of the region. The boxplot for the R performance is illustrated in Fig. 6a. It can be seen again that the performances of the three models are very similar. Generally, R ranges between around 0.10 and 0.80, with the majority of the region at around 0.40. The RMSE is also calculated. As with R, the RMSE spatial distributions are very similar among the three models, so the RMSE spatial distribution map is not included in this paper. The boxplot of the RMSE is shown in Fig. 6b. Generally, the RMSE ranges between around 4 and 12 mm, with some outliers between around 12 and 20 mm. The majority of the region has an RMSE of around 7 mm. The statistical calculations are summarised in Table 4. Based on the results of R and RMSE, the WRF rainfall estimation performance in Emilia is similar to the one found in central USA (Van Den Broeke et al., 2018).

Table 3. Statistical summary of the WRF performance in simulating soil moisture for different soil layers, based on comparison with the single-point in situ observations. Note: the bold values show the best performance within each of the soil layers.

The assessment of WRF soil moisture threshold for landslide monitoring
As discussed in the introduction, previous works have demonstrated that in complex geomorphologic settings (e.g. in Emilia Romagna), a rainfall threshold approach is too simple, and more hydrologically driven approaches need to be established. This section assesses whether the spatial distribution of soil moisture can provide useful information for landslide monitoring at the regional scale. Specifically, all three soil moisture products simulated through the WRF model are used to derive threshold models, and the corresponding landslide prediction performances are then compared statistically.
Here the threshold is defined as the critical soil moisture condition above which landslides are likely to happen. Among the different factors controlling slope stability, the slope angle is one of the most critical. From the slope angle map in Fig. 2, it can be seen that the region has a clear spatial pattern of high and low slope areas, with the majority of the high-slope areas (which can be as steep as around 40°) located in the mountainous southern part and the river valleys. Based on the event data analysed, the landslides that happened during the study period are mainly located in the high-slope region, with a particularly high concentration around the central southern part. The spatial distribution of the landslide events is also in line with the overall geological characteristics of the region, i.e. the southern part mainly constitutes the outcrop of sandstone rocks that make up the steep slopes and are covered by a thin layer of permeable sandy soil, which is highly unstable. Therefore, instead of using only one soil moisture threshold for the whole study area, it is useful to divide the region into several slope groups and build a threshold model within each group. There are different ways to group the slopes. In this study, the class-break values are chosen so that all groups have equal coverage areas, which gives three groups based on the slope angle (0.4-1.86; 1.87-9.61; 9.52-40.43; since no landslide events are recorded under the 0-0.39 group, that group is not considered here). In order to find the optimal threshold so that there are few missed alarms (i.e. the threshold is overestimated) and few false alarms (i.e. the threshold is underestimated), we test 17 different exceedance probabilities from 1 % to 50 %.
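One simple way to obtain class breaks that give equal coverage areas is to take quantiles of the slope values over equally sized grid cells. The sketch below illustrates that idea under stated assumptions: the function names are hypothetical, the slope field is synthetic, and only the lower cut-off of 0.4 is taken from the text; it is not necessarily how the class breaks in this study were derived.

```python
import numpy as np

def equal_area_breaks(slope_deg, n_groups=3, min_slope=0.4):
    """Class breaks that split cells with slope >= min_slope into equal-count groups."""
    slopes = np.asarray(slope_deg, dtype=float)
    slopes = slopes[slopes >= min_slope]          # drop the flattest, unused class
    probs = np.linspace(0.0, 1.0, n_groups + 1)   # e.g. 0, 1/3, 2/3, 1
    return np.quantile(slopes, probs)

def assign_group(slope_deg, breaks):
    """Return group 1..n for each cell, or 0 for cells below the lowest break."""
    slopes = np.asarray(slope_deg, dtype=float)
    idx = np.searchsorted(breaks[1:-1], slopes, side="right") + 1
    return np.where(slopes < breaks[0], 0, idx)

# Synthetic slope field (degrees) on equally sized cells, for illustration only.
rng = np.random.default_rng(0)
demo_slopes = rng.uniform(0.0, 40.0, size=9000)
breaks = equal_area_breaks(demo_slopes)
groups = assign_group(demo_slopes, breaks)
counts = np.bincount(groups, minlength=4)[1:]     # cells per slope group 1..3
print(breaks, counts)
```

Because every cell covers the same area in this sketch, equal cell counts correspond to equal coverage areas.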
For each LSM scheme, the total number of threshold models is 204, which results from the different combinations of slope groups, soil layers, and exceedance probability conditions. The calculated thresholds for all LSM schemes under three slope groups are plotted in Fig. 7. Overall there is a clear trend between the slope angle and the soil moisture threshold, i.e. the threshold becomes smaller for steeper areas. The correlation is more evident at the upper three soil layers (i.e. the top 1 m depth of soil), with only a few exceptions for Noah and CLM4 at the 1 % and the 2 % exceedance probabilities. At the deep soil layer centred at 150 cm, the soil moisture threshold difference between slope groups (SG) 2 and 3 becomes very small for all three LSM schemes. This could be partially because the change of soil moisture at the deep soil layer is much smaller than at the surface layer, and therefore the soil moisture values for SG 2 and 3 could be too similar to differentiate. However, for gentler slopes (SG 1), the higher soil moisture triggering level always applies, even down to the deepest soil layer, for all three LSM schemes. In other words, the results indicate that gentler slopes require wetter soil to trigger landslides than steeper slopes do.
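The scenario counts follow directly from the combinations described above. The short sketch below enumerates them; the individual exceedance probability levels are not listed in this section, so they are represented only by an index from 0 to 16, which is an assumption made purely for counting.

```python
from itertools import product

lsm_schemes = ["Noah", "Noah-MP", "CLM4"]
slope_groups = [1, 2, 3]             # SG 1 to SG 3
soil_layers = [1, 2, 3, 4]           # four WRF soil layers
n_probability_levels = 17            # levels between 1 % and 50 %; values not listed here
probability_index = range(n_probability_levels)

# Threshold models for one LSM scheme: slope group x soil layer x probability level.
per_scheme = list(product(slope_groups, soil_layers, probability_index))

# Threshold models across all three LSM schemes.
all_models = list(product(lsm_schemes, slope_groups, soil_layers, probability_index))

print(len(per_scheme), len(all_models))  # 204 612
```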
All the threshold models are then evaluated under the 45 selected rainfall events (Table 5) using the ROC analysis. The threshold determined for each slope class during the calibration is used for the evaluation. The selected rainfall events last between 1 and 18 d, and the average rainfall intensity ranges from 5.05 to 24.69 mm d⁻¹. The resultant Euclidean distances (d) between each scenario of exceedance probability and the optimal point for ROC analysis are listed in Table 6 for all three WRF LSM schemes at the tested exceedance probabilities. The best performance (i.e. lowest d) in each column (i.e. each soil layer of an LSM scheme) is highlighted. In addition, the d results are also plotted in Fig. 8 to give a better view of the overall trend amongst different soil layers and LSM schemes. From the figure, for all three LSM schemes at all four soil layers, there is an overall downward and then stabilised trend. Overall for Noah, the simulated surface layer soil moisture provides better landslide monitoring performance than the rest of the soil layers from 1 % to 35 % exceedance probabilities; the scheme's worst performance is observed at the third soil layer, centred at 70 cm. The values of d for Noah's second and fourth layers are quite close to each other. For Noah-MP, the simulated surface layer soil moisture gives the best performance amongst all four soil layers for most cases between the 1 % and 35 % exceedance probability range; the scheme's worst performance is observed at the fourth layer. Unlike Noah, the four soil layers of the Noah-MP scheme perform distinctly differently from each other (i.e. larger d difference). For CLM4, the performance for the surface layer is quite similar to the second layer's, and the differences between the four layers are small. From Table 6, it can be seen that for Noah the most suitable exceedance probabilities (i.e. the highlighted numbers) range between 35 % and 50 %; for Noah-MP they are between 30 % and 50 %, and for CLM4 it stays at 40 % for all four soil layers. For both Noah and Noah-MP, the best performance is observed at the surface layer (d = 0.392 and d = 0.369, respectively). For CLM4, the best performances show no distinct pattern amongst soil layers (i.e. the best performance is found at soil Layer 3, followed by Layers 2, 1, and 4). Of all the LSM schemes and soil layers, the best performance is found for Noah-MP at the surface layer with 30 % exceedance probability (d = 0.369). Based on the d results, WRF-modelled soil moisture provides better landslide prediction performance than the satellite ESA-CCI soil moisture products used in our previous study (d = 0.51). The ROC curve for the Noah-MP scheme at the surface layer is shown in Fig. 9. In the curve, each point represents a scenario with a selected exceedance probability level. It is clear that with various exceedance probabilities, FAR can be decreased without sacrificing the HR score (e.g. 4 % to 10 % exceedance probabilities). At the optimal point at the 30 % exceedance probability, the best results for HR and FAR are observed as 0.769 and 0.289, respectively.

Figure 9. ROC curve for the calculated thresholds using different exceedance probability levels (for Noah-MP at the surface layer). The "no gain" line and the optimal performance point (the red point) are also presented.
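The ROC quantities discussed above can be sketched as follows. The definitions are assumptions based on the usual contingency-table conventions in ROC analysis: HR = hits / (hits + misses), FAR = false alarms / (false alarms + correct negatives), and the optimal point is taken as (FAR, HR) = (0, 1). The function names are hypothetical.

```python
import math

def hit_rate(hits, misses):
    """Fraction of observed landslide cases that were correctly flagged."""
    return hits / (hits + misses)

def false_alarm_rate(false_alarms, correct_negatives):
    """Fraction of non-landslide cases that were wrongly flagged."""
    return false_alarms / (false_alarms + correct_negatives)

def distance_to_optimal(hr, far):
    """Euclidean distance d from the point (FAR, HR) to the assumed optimum (0, 1)."""
    return math.hypot(far, 1.0 - hr)

# HR and FAR reported for Noah-MP, surface layer, 30 % exceedance probability.
d = distance_to_optimal(hr=0.769, far=0.289)
print(f"d = {d:.4f}")
```

With the rounded HR and FAR values, this gives d of about 0.370, which agrees with the reported d = 0.369 to within the rounding of the inputs.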

Discussions
In this study, the best landslide prediction performance for Noah and Noah-MP follows a regular trend: the deeper the soil layer, the poorer the landslide monitoring performance. There are several potential reasons for such an outcome. First, the simulated soil moisture accuracy at the shallower layers is better than that in the deeper zones. Second, although the wetness conditions at the sliding surface are important, the soil moisture above it is also important (i.e. the loading should be heavier with more water in the upper soil layer). Third, the landslides occurring in the region are mainly in the top shallow soil layer. Fourth, the WRF-modelled soil moisture at the deeper layers may not be accurate enough for assessing the landslide events in the study region. In order to find out the exact reasons, comprehensive studies with more detailed landslide event datasets are needed in the future.
For the WRF soil moisture evaluation, an assessment based on a single soil moisture sensor located in a plain area is clearly not sufficient to derive conclusions about the model's performance over the whole study region. Therefore, the results here are preliminary. However, in this study, by introducing the WRF spatial soil moisture information into the landslide prediction model, the performance has indeed been improved in comparison with our previous study using the satellite remote sensing soil moisture data. A similar approach was taken by Segoni et al. (2018b), who implemented the soil moisture information simulated from a hydrological model into a regional landslide early warning system with clear improvements in performance with regard to false alarms or missed alarms (i.e. when a hazard occurred but no early warning was provided). Although the results shown in this study are preliminary and confined to the study area, an improvement in landslide prediction performance has already been obtained. Therefore, it is hoped that with more globally available and dense soil moisture network data and further refinements of the method, the results could be improved further.
In addition, it would ideally be useful to have a dense soil moisture sensing network covering the whole study area. In reality, that is not practical, so we have to obtain the spatial soil moisture information by other means. So far, the soil moisture data with the best spatial and temporal resolution are from the WRF model. One question that arises is how representative a single soil moisture sensor can be for the whole study area. We have carried out a correlation study of a single sensor with the whole study region (using the Noah-MP top-layer soil moisture data). As seen in Fig. 10a, the study region is divided into 44 equally spaced grids (30 km apart), with the grid centres marked as black crosses. The initial assumption was that the soil moisture sensor could only represent its adjacent area, but the result shows otherwise (Fig. 10b). Based on the outcome, a single-point sensor can represent a significant proportion of the region. Admittedly, there are some areas where the correlations are poor, in particular Grid 27, which has been compared with its four surrounding grids as shown in Fig. 11. It can be seen that the soil moisture variation at Grid 27 is totally different from that of the four surrounding grids. The unique soil moisture variation pattern observed in Grid 27 may be caused by different land use and soil type in that area, but clearly further studies are needed to find out the exact reasons. The aforementioned work has prompted us to carry out a future study on the optimal soil moisture sensor network design for landslide applications. Although there are numerous studies on rain gauge network design, soil moisture sensor network design has been largely ignored by the research community. Hence, this study lays a foundation for such research.
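The representativeness check described above amounts to correlating the soil moisture time series at one reference grid cell with the series at every other grid cell. A minimal sketch is given below; the function name, the array layout (time by grid cell), and the synthetic series are assumptions for illustration and do not reproduce the Noah-MP data used in Fig. 10.

```python
import numpy as np

def correlation_with_reference(series, ref_index):
    """Pearson r between the reference cell and every cell.

    series: array of shape (n_times, n_cells), one column per grid cell.
    """
    data = np.asarray(series, dtype=float)
    anomalies = data - data.mean(axis=0)              # remove each cell's mean
    ref = anomalies[:, ref_index]
    covariance = anomalies.T @ ref                    # unnormalised covariance
    norm = np.sqrt((anomalies ** 2).sum(axis=0) * (ref ** 2).sum())
    return covariance / norm

# Synthetic example: four cells share a seasonal cycle, one cell does not.
rng = np.random.default_rng(1)
days = np.arange(730)
seasonal = 0.25 + 0.08 * np.cos(2 * np.pi * days / 365.0)
demo = np.column_stack(
    [seasonal + rng.normal(0.0, 0.01, days.size) for _ in range(5)]
)
demo[:, 4] = rng.normal(0.25, 0.02, days.size)        # cell without seasonal signal
r_map = correlation_with_reference(demo, ref_index=0)
print(np.round(r_map, 2))
```

Cells sharing the seasonal cycle of the reference cell correlate strongly with it regardless of distance, whereas a cell with a different variation pattern stands out with a low r.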
For the WRF rainfall evaluations, the agreement with the observations is modest (R is around 0.40 for the majority of the region). Rainfall is one of the main drivers of soil moisture change, and it is logical to think soil moisture and rainfall are highly linked. However, since rainfall is high-frequency data while soil moisture is low-frequency data, they behave differently. The results suggest that for landslide study, it is better to use the WRF soil moisture data than its rainfall data. Clearly more studies are needed to confirm this assumption.
Here, WRF is driven by the ERA-Interim datasets; however, it has been found in Albergel et al. (2018) that the performance of ERA5 has surpassed that of ERA-Interim. Therefore, the ERA5 datasets will be tested in our future studies. Model-based soil moisture estimations could be affected by error accumulation issues, especially in the real-time forecasting mode. A potential solution is to use data assimilation methodologies to correct such errors by assimilating soil moisture information from other data sources. Since in situ soil moisture sensors are only sparsely available in limited regions, soil moisture measured via satellite remote sensing technologies could provide a useful alternative. Another issue is with the landslide record data, as most of them are based on human reports (e.g. newspapers and victims), and thus many incidents could be unreported. Therefore, the conclusions made here could be biased. Other ways of expanding the current landslide catalogue include automatic landslide detection methods based on remote sensing images (Nichol and Wong, 2005; Chen et al., 2018), internet news sources (as all landslides with a relevant impact on society will be reported on internet news sources), and automatic web data mining methods (Battistini et al., 2013; Goswami et al., 2018).

Conclusions
In this study, the usability of WRF-modelled soil moisture for landslide monitoring has been evaluated in the Emilia Romagna region based on the research duration between 2006 and 2015. Specifically, the four-layer soil moisture information simulated through the WRF's three most advanced LSM schemes (i.e. Noah, Noah-MP, and CLM4) is compared for the purpose. Through the temporal comparison with the in situ soil moisture observations, it has been found that all three LSM schemes at all four soil layers can produce the general soil moisture's seasonal cycle. However, only Noah-MP is able to simulate the large soil drying phenomenon close to the observations during the drying season, and it also has the highest correlation coefficient and the lowest RMSE at most soil layers amongst the three LSM schemes. However, it should be noted, the soil moisture evaluation is only based on a single-point-based soil moisture sensor that is available in the plain region of the study area. Therefore, the WRF soil moisture performance over the whole study region, in particular at the mountainous zone, cannot be evaluated in this study. Since soil moisture is related to rainfall, we have carried out the WRF rainfall assessments, based on the comparison with the dense rainfall network in the region.
The results have shown that there is no distinct difference between the three LSM schemes. The WRF rainfall performance is found to be similar to that of a study carried out in the central USA (Van Den Broeke et al., 2018). A landslide prediction model based on soil moisture and slope angle conditions is built, and 17 different exceedance probability levels between 1 % and 50 % are adopted to find the optimal threshold scenario. Through the ROC analysis of 612 threshold models, the best performance is obtained by Noah-MP at the surface soil layer with 30 % exceedance probability. In summary, this study provides an overview of the soil moisture performance of three WRF LSM schemes for landslide hazard assessment. Based on the results, we demonstrate that the surface soil moisture (centred at 10 cm) simulated through the Noah-MP LSM scheme is useful in predicting landslide occurrences in the Emilia Romagna region. With the hit rate of 0.769 and the false alarm rate of 0.289 obtained in this study, such soil moisture information has the potential to provide landslide predictions through the use of rainfall data. A further study on how well a single soil moisture sensor represents a large region has also been carried out. The results demonstrate that although there is a significant elevation difference in the region, a single soil moisture sensor has a high correlation with a significant proportion of the study area. Although there is still a small proportion of areas where the correlation is poor, this has prompted us to carry out a future study on the optimal design of soil moisture sensor networks for landslide study.
One must bear in mind that although the results demonstrated in this study are only valid for the selected region, the methodology could be generalised to derive site-specific calibrations in other sites using the proposed approach. In order to make a general conclusion, more research is needed using the methodology described in this paper. Particularly, a considerable number of catchments with a broad spectrum of climate and environmental conditions and dense soil moisture sensor networks will need to be investigated.
Data availability. The in situ soil moisture and rainfall data can be downloaded from http://www.smr.arpa.emr.it/dext3r/ (DEXT3R, 2019); the Landslide inventory data were kindly provided by Dr Matteo Berti, University of Bologna.
Author contributions. LZ carried out the modelling of WRF, evaluated its soil moisture performance in landslide prediction, and prepared the paper with contributions from all co-authors. QD and BZ processed the in situ rain gauge datasets (> 200 rain gauge stations). DH and NC provided guidance on the paper's main research direction and are the funding holders of this project in the UK and China, respectively.