the Creative Commons Attribution 4.0 License.
High-resolution soil moisture mapping in northern boreal forests using SMAP data and downscaling techniques
Abstract. Soil moisture plays an important part in predicting different forest-related phenomena, such as tree growth or forest fire risk. As these phenomena influence the carbon storage capacity of boreal forest ecosystems, it is crucial to provide soil moisture information at high temporal and spatial scales. Current satellite-based soil moisture products often have high temporal resolution at the expense of spatial resolution. Therefore, we developed a machine-learning-based model to estimate soil moisture at high temporal and high spatial resolution over boreal forested areas for the annual time period from May to October. The basis data of the model is the enhanced 9 km spatial resolution soil moisture data from the Soil Moisture Active Passive (SMAP) mission. Additionally, soil and vegetation properties, reanalysis-based parameters, and measured in situ soil moisture data are used to guide the model construction process. The analysis of the developed model shows that the model retains the temporal and large-scale spatial variability of SMAP soil moisture. Furthermore, comparisons with the independent in situ soil moisture data show that the soil moisture values predicted by the developed model have a better agreement with in situ values than SMAP soil moisture, as RMSE decreases from 0.097 m³/m³ to 0.065 m³/m³, and correlation increases from 0.30 to 0.52 over forest sites. Therefore, this machine-learning-based model can be used to predict high-resolution soil moisture over boreal forested areas.
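The two agreement metrics quoted in the abstract, RMSE and correlation, can be illustrated with a minimal sketch. It assumes the correlation is Pearson's r, which the abstract does not state, and the sample values are hypothetical placeholders rather than data from the study.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between paired samples (same units as the inputs)."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def pearson_r(x, y):
    """Pearson correlation coefficient; assumed here, not confirmed by the source."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant series
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical volumetric soil moisture values in m3/m3, for illustration only.
    in_situ = [0.20, 0.25, 0.30, 0.35]
    coarse = [0.30, 0.28, 0.33, 0.31]
    downscaled = [0.22, 0.24, 0.33, 0.33]
    for name, series in (("coarse", coarse), ("downscaled", downscaled)):
        print(name, round(rmse(series, in_situ), 3), round(pearson_r(series, in_situ), 2))
```

A lower RMSE together with a higher correlation against the same in situ series is the kind of improvement the abstract reports.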
Status: open (until 05 Mar 2025)
RC1: 'Comment on hess-2024-390', Anonymous Referee #1, 28 Jan 2025
The manuscript presents a machine-learning-based approach to downscale SMAP soil moisture data from 9 km to finer resolutions of 1 km and 250 m for boreal forests. The model integrates SMAP data with soil, vegetation, and weather inputs to provide higher spatial resolution soil moisture estimates, addressing the limitations of SMAP's coarse coverage in northern latitudes. Validation against in situ measurements shows improved accuracy, with reduced RMSE and increased correlation compared to raw SMAP data. However, the methodology is limited to forested areas, excluding peatlands and other land types. While the approach demonstrates the potential for high-resolution soil moisture mapping, several areas require substantial improvement before publication.
Major Comments:
- SMAP Mission provides SMAP-Sentinel 3 km and 1 km soil moisture (https://doi.org/10.1016/j.rse.2019.111380), and it is very strange to see that these are not discussed in the literature section.
- One of the key advantages of this study is its complement to the SMAP Sentinel Soil Moisture product, particularly by addressing NASA’s limitation in providing soil moisture data over northern latitudes. However, while this contribution is acknowledged, the paper could have been strengthened significantly by demonstrating a more direct comparison with SMAP Sentinel dense time series in areas where such data are available. I would suggest that the authors replicate the same method over the mainland, where SMAP Sentinel retrieval is available, and compare the results for multiple locations. This can help a wider audience understand how reliable the discussed method is when compared to the operational product. This would provide a robust validation framework and establish the superiority or limitations of the proposed methodology.
- The reliance on static inputs such as bulk density and silt content raises concerns about the adaptability of the model to regions beyond the boreal forests of Northern Finland. The training set’s limited geographic and environmental variability suggests that the model may not perform well in regions with differing soil or vegetation characteristics. This could potentially undermine the generalizability of the approach, and expanding the training dataset to include diverse boreal forest sites would address this shortcoming.
- The exclusion of peatlands from the study is a significant limitation, especially given their critical role in carbon storage in boreal ecosystems. Although the authors briefly discuss this gap, they fail to propose a concrete pathway for integrating peatlands into future models. More effort should have been made to outline how the methodology could be adapted to incorporate such essential land cover types.
- The discussion around uncertainty analysis highlights the model’s heavy dependence on soil properties, which dominate the prediction outcomes. While these are undoubtedly critical inputs, the relative insensitivity of the model to weather-related inputs like precipitation suggests a potential flaw in the approach. The coarse resolution of ERA5-Land data used for precipitation might be a contributing factor, and exploring higher-resolution meteorological datasets could refine the model’s sensitivity to short-term climatic variations.
- The use of a machine-learning-based gradient boosting model (LightGBM) is appropriate for capturing complex relationships, but the small training dataset limits the robustness of the approach. Therefore, it’s important to discuss the limitations of the method used and how to overcome them.
- The SMAP L3_SM_P_E spatial resolution is 33 km, which is gridded to 9 km, but this is not mentioned anywhere in the manuscript. Typically, downscaling should consider the original 33 km resolution rather than the 9 km gridded resolution. If the model directly uses the 9 km gridded data as the spatial resolution, I recommend reprocessing the model by considering the original 33 km resolution. Additionally, the revised version should clearly explain how this resolution is incorporated into the methodology.
- While the authors discuss future L-band missions like NISAR and ROSE-L, more focus should be given to how this method will be useful for these missions.
- Validation against in situ measurements shows promising accuracy improvements, but the exclusion of outliers like DIS0004 suggests sensitivity to anomalies that the model should handle better.
- The conclusion section needs attention as it does not read well. Please consider rewriting the section more scientifically.
Minor Comments:
- L2: “Phenomena” is not appropriate here.
- L4: “High spatio-temporal scale” would be more appropriate.
- L37: “Short distance” is not appropriate. Rephrase as “spatially heterogeneous.”
- L41: Citation of the operational 1 km SMAP soil moisture product is missing (https://doi.org/10.1016/j.rse.2019.111380).
- There are multiple sites in Alaska where in-situ soil moisture is available, and those should be included, such as the site from Delta Junction (NEON site).
- A description of the study site is required in the main text.
- It is unclear why CORINE land cover is used. The newer ESA 10 m land cover product offers more spatial detail than CORINE and would be better suited to this study.
- L336: The NISAR mission will provide a 200 m soil moisture product as an operational soil moisture product. It is suggested to include proper citations in the manuscript (https://doi.org/10.1016/j.rse.2023.113667 and https://doi.org/10.1016/j.rse.2024.114288).
- L338: NISAR will be launched in April 2025.
Citation: https://doi.org/10.5194/hess-2024-390-RC1
RC2: 'Comment on hess-2024-390', Anonymous Referee #2, 21 Feb 2025
In this manuscript, the authors present a machine-learning-based model to downscale SMAP soil moisture data from 9 km to 1 km (or 250 m). However, the method is limited to forested areas only, without considering peatlands or other land types. Similar to RC1, I believe the manuscript is well written and demonstrates good potential as a method for downscaled soil moisture mapping; however, major revisions should be made to improve its quality before publication.
Major comment:
- Model performance: While the validation statistics do suggest an improved correlation with the in situ data, the graphs in Figure 4 clearly show a consistent and considerable underestimation by the predicted data at 3 out of 4 sites compared to the in situ data (~30-60%). At sites B and D, the predicted data are even lower than the raw SMAP. Considering that soil moisture directly translates to “volumetric water content” as mentioned in L65, this underestimation is quite concerning, as it effectively removes approximately half of the soil water volume in the area, and it should be addressed. Thus, I urge the authors to conduct additional model calibration (if applicable), to report p-bias in the validation results, and to comment on the reasons for this underestimation.
- On a similar issue, considering that in situ soil moisture data are available starting from 1952 (as mentioned in L133), I believe it would greatly improve the manuscript and the novelty and trustworthiness of the method if the authors could generate a longer dataset and validate it with the available in situ data (despite gaps at some stations) rather than just the summer or 2019.
- Further examination of Figure 4 suggests that the raw SMAP has considerable noise compared to the in situ data, which is then transferred to the downscaled product, as mentioned by the authors in L295. While I agree that removing this noise is not a straightforward task, I believe it is not entirely out of scope for this study, considering that various forcing data such as precipitation and temperature are already included as inputs for the soil moisture modeling. Thus, I suggest the authors provide additional analyses investigating whether there are any linkages between the spikes in the in situ soil moisture data and the precipitation/temperature data from ERA5-Land. Both of these forcings clearly have a direct impact on soil moisture. Thus, if there are statistical linkages, a modification of how the forcing data are processed as input for the soil moisture model is needed; this should further improve the novelty of the new method by directly reducing the noise of the raw SMAP data.
- The abstract mentions that the method can provide “high temporal” soil moisture data; however, I could not find any information or a clear comparison of temporal coverage (time step, start, end, etc.) for any of the referenced datasets (including in situ) or the newly generated data within the introduction or data sections. Please consider adding this information to the introduction section, providing a paragraph in the discussion, and mentioning in the conclusion section how the temporal coverage of the new dataset improves on raw SMAP and on the currently available downscaled products (Zheng et al., 2023; Zhang et al., 2023; Fang et al., 2022; etc.).
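Two of the analyses requested in the major comments above, percent bias (p-bias) and a check for linkages between forcing data and soil moisture, could be sketched as follows. This is only an illustration under stated assumptions: the sign convention for p-bias (negative means underestimation) is a choice made here, and some hydrology references use the opposite sign; the lagged Pearson correlation is one possible way to look for linkages, not the method of the manuscript; all function names and sample values are hypothetical.

```python
import math


def percent_bias(predicted, observed):
    """Percent bias; with this (assumed) sign convention, negative = underestimation."""
    total_observed = sum(observed)
    if len(predicted) != len(observed) or total_observed == 0:
        raise ValueError("inputs must have equal length and a non-zero observed sum")
    return 100.0 * sum(p - o for p, o in zip(predicted, observed)) / total_observed


def pearson_r(x, y):
    """Pearson correlation coefficient; returns NaN if either series is constant."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def lagged_correlation(forcing, response, max_lag):
    """Correlate forcing[t] with response[t + lag] for lag = 0..max_lag (in time steps)."""
    if len(forcing) != len(response):
        raise ValueError("series must have equal length")
    result = {}
    for lag in range(max_lag + 1):
        x = forcing[: len(forcing) - lag]  # drop the last `lag` forcing values
        y = response[lag:]                 # drop the first `lag` response values
        result[lag] = pearson_r(x, y)
    return result


if __name__ == "__main__":
    # Hypothetical daily series: precipitation (mm) and day-to-day soil moisture change.
    precipitation = [0.0, 5.0, 0.0, 0.0, 12.0, 0.0, 0.0]
    moisture_change = [0.0, 0.0, 0.03, 0.0, 0.0, 0.07, 0.0]
    print(lagged_correlation(precipitation, moisture_change, max_lag=2))
    print(percent_bias([0.10, 0.12], [0.20, 0.24]))
```

A clear peak in the lagged correlation at a short lag would indicate that soil moisture spikes follow precipitation events, while spikes without such a link would more likely be noise.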
Minor comment:
- L20, consider removing “for example”.
- L45, “… are not specifically designed for boreal forests”: please provide references for this claim and additional information on how the presented method is more tailored to boreal forests.
- L278-294, “Even though SMAP… and agriculture”: this section clearly discusses the base input data and thus should be moved to the data section.
Citation: https://doi.org/10.5194/hess-2024-390-RC2