Accurate high-resolution estimates of precipitation are vital to improving the understanding of basin-scale hydrology in mountainous areas. The traditional interpolation methods or satellite-based remote sensing products are known to have limitations in capturing the spatial variability of precipitation in mountainous areas. In this study, we develop a fusion framework to improve the annual precipitation estimation in mountainous areas by jointly utilizing the satellite-based precipitation, gauge measured precipitation, and vegetation index. The development consists of vegetation data merging, vegetation response establishment, and precipitation remapping. The framework is then applied to the mountainous areas of the Nu River basin for precipitation estimation. The results demonstrate the reliability of the framework in reproducing the high-resolution precipitation regime and capturing its high spatial variability in the Nu River basin. In addition, the framework can significantly reduce the errors in precipitation estimates as compared with the inverse distance weighted (IDW) method and the TRMM (Tropical Rainfall Measuring Mission) precipitation product.
Precipitation plays an important role in hydrological processes, land–atmospheric processes, and ecological dynamics. Accurate high-resolution precipitation is crucial for streamflow prediction, flood control, and water resources management in data-sparse regions such as mountainous areas (Song et al., 2016). However, it is a great challenge to obtain accurate precipitation in mountainous areas due to the sparse gauge network and the remarkable spatiotemporal variability of precipitation. Conventional gauge networks can provide accurate rainfall measurements at point scales, which can be interpolated within the region of interest to give estimates of precipitation in ungauged areas. However, such interpolated estimates might not be reliable in mountainous areas considering the very limited gauges there (Phillips et al., 1992; Mair and Fares, 2011; Jacquin and Soto-Sandoval, 2013; Wang et al., 2014; Borges et al., 2016).
Recently, remote-sensing-based precipitation (RSBP) products, such as the
Global Precipitation Climatology Project (GPCP) (Schamm et al., 2014), the
Tropical Rainfall Measuring Mission (TRMM) (Council, 2005), and the Climate
Prediction Center Morphing Method (CMORPH) (Joyce et al., 2004), have been
extensively used in ungauged or sparsely gauged areas to bridge the gap
between the need for precipitation estimates and the scarcity in gauge
observations (Akbari et al., 2012; Kneis et al., 2014; Li et al., 2015;
Worqlul et al., 2015; Mourre et al., 2016; Wong et al., 2016). Also, data
fusion across satellite and gauge observations is being conducted to further
the application of RSBPs (Rozante et al., 2010; Woldemeskel et al., 2013;
Arias-Hidalgo et al., 2013; Chen et al., 2016; Zhou et al., 2016). However,
due to the relatively coarse spatial resolution (e.g., 0.25–5
Precipitation estimates can be influenced by a variety of ambient factors
(e.g., topography, vegetation). In order to correct effects of topography on
precipitation estimates, a digital elevation model (DEM) has been widely used
in spatial interpolation of precipitation over mountainous areas
(Marquínez et al., 2003; Lloyd, 2005). However, the relationship between
elevation and precipitation is not clear. Meanwhile, strong correlations
between the normalized difference vegetation index (NDVI) and precipitation
have been found by several studies (Li et al., 2002; Kariyeva and Van
Leeuwen, 2011; Li and Guo, 2012; Sun et al., 2013; Campo-Bescós et al.,
2013). As such, establishing statistical models between the NDVI and
precipitation so as to improve the spatial resolution of TRMM products in
mountainous areas is becoming popular (Immerzeel et al., 2009; Jia et al.,
2011; Duan and Bastiaanssen, 2013; Chen et al., 2014; Xu et al., 2015; Mahmud
et al., 2015; Jing et al., 2016). For instance, Immerzeel et al. (2009)
downscaled TRMM-3B43 to 1 km based on an exponential relationship between
NDVI and TRMM precipitation on the Iberian Peninsula of Europe. Jia et
al. (2011) established four multivariable linear regression models between
TRMM-3B43 precipitation and two other factors (i.e., DEM and NDVI) of
different resolutions (0.25, 0.5, 0.75, and 0.1
However, the present RSBP–NDVI-based schemes have several limitations: (1) significant errors can be introduced during the downscaling given the nonlinear relationship between RSBP and NDVI; (2) large uncertainties exist in the RSBP for mountainous areas; and (3) inter-comparison of existing NDVI datasets is missing in deriving the RSBP–NDVI relationships. In this study, we develop a fusion framework to obtain more accurate high-resolution estimates of precipitation in mountainous areas based on the relationship between precipitation and vegetation response. More specifically, in addition to RSBP, gauge measurements and different vegetation datasets will be used in this study to overcome the aforementioned limitations in current RSBP–NDVI-based schemes. The paper is organized as follows: Sect. 2 describes the development of the fusion framework; Sect. 3 documents the study area and related datasets; Sect. 4 presents the results of the fusion framework and discusses impacts of different determinants on the performance of the fusion framework; and Sect. 5 summarizes this work.
The satellite–gauge–vegetation fusion framework (Fig. 1) involves three stages of development: (1) vegetation data merging, (2) precipitation–vegetation regression, and (3) RSBP product remapping, whose details are described in the following subsections.
Flow chart of the satellite–gauge–vegetation fusion framework development.
Vegetation closely interacts with soil moisture and is recognized as a good proxy of precipitation. Remote sensing provides various high-resolution vegetation products such as NDVI, EVI (enhanced vegetation index), and LAI (leaf area index). Among the vegetation indices, NDVI, an indicator of plant density and growth, is chosen as the proxy of precipitation in this study due to its wide availability. Considering the crucial role of NDVI in deriving precipitation estimates under our framework, we conduct an inter-comparison in data accuracy between two NDVI datasets (termed datasets A and B hereinafter) to reduce the error. First, the systematic errors of both datasets are eliminated by multiplying by a reduction factor or by using a simple regression model. After the correction, the final dataset is obtained pixel by pixel: the better value between A and B is selected if the quality criteria are satisfied; otherwise, an anomaly value is filled in.
It should be noted that since the vegetation growth is suppressed or promoted on some land covers (e.g., rivers, lakes, snow and ice, and urban areas), the vegetation data of these land covers are excluded by filling anomaly values. Besides, due to the strong influence of farming activities (e.g., irrigation, fertilization, and harvest) on the crop growth, vegetation data of farmland are excluded as well. We note that although Moran's index (Li et al., 2007) is widely employed to detect anomalies in vegetation data (Jia et al., 2011; Duan and Bastiaanssen, 2013), it is not used in this study for its inapplicability in large areas with continuous anomaly pixels (e.g., farmland). As such, we identify anomaly pixels simply by land-use type: pixels categorized as water, wetland, urban, cropland, snow/ice, and barren will be identified as anomalies. The detected anomaly pixels are excluded from the original NDVI dataset and then filled with interpolated values using the IDW method so as to generate an optimized NDVI dataset.
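The land-use screening and IDW filling described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the numeric land-use codes, and the distance power of 2 are all assumptions, and a brute-force loop is used for clarity rather than speed.

```python
import numpy as np

# Hypothetical land-use codes standing for water, wetland, urban, cropland,
# snow/ice, and barren; the real codes depend on the land-cover product legend.
ANOMALY_CLASSES = (0, 11, 12, 13, 15, 16)

def mask_and_fill_idw(ndvi, land_use, anomaly_classes=ANOMALY_CLASSES, power=2.0):
    """Replace NDVI at anomaly pixels with IDW estimates from valid pixels."""
    ndvi = np.asarray(ndvi, dtype=float)
    # A pixel is an anomaly if its land-use class is excluded or its NDVI is missing.
    anomaly = np.isin(land_use, anomaly_classes) | np.isnan(ndvi)
    valid_rc = np.argwhere(~anomaly)      # (row, col) of valid pixels
    if valid_rc.size == 0:
        raise ValueError("no valid pixels to interpolate from")
    valid_vals = ndvi[~anomaly]           # same row-major order as valid_rc
    filled = ndvi.copy()
    for r, c in np.argwhere(anomaly):
        # Distances are never zero: anomaly pixels are not in the valid set.
        dist = np.hypot(valid_rc[:, 0] - r, valid_rc[:, 1] - c)
        weights = 1.0 / dist ** power
        filled[r, c] = np.sum(weights * valid_vals) / np.sum(weights)
    return filled
```

For large rasters, restricting the search to a local window or to the k nearest valid pixels would be the more practical choice.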
Based on the optimized NDVI dataset, the NDVI data at the gauge locations are retrieved with the neighbor-average method (i.e., the value of a certain grid is determined as the average of all its eight neighboring grids) and will be used for the precipitation–vegetation regression.
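A minimal sketch of the neighbor-average retrieval is given below. The function name is hypothetical, and the treatment of grid edges (using whatever neighbors exist) and of missing values (ignoring NaNs) are assumptions that the text does not specify.

```python
import numpy as np

def neighbor_average(grid, row, col):
    """Mean of the (up to) eight cells surrounding (row, col), ignoring NaNs."""
    grid = np.asarray(grid, dtype=float)
    r0, r1 = max(row - 1, 0), min(row + 2, grid.shape[0])
    c0, c1 = max(col - 1, 0), min(col + 2, grid.shape[1])
    window = grid[r0:r1, c0:c1].copy()
    window[row - r0, col - c0] = np.nan   # exclude the centre cell itself
    if np.all(np.isnan(window)):
        return np.nan                     # no usable neighbors
    return float(np.nanmean(window))
```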
As far as we know, there is no widely accepted form of the
precipitation–vegetation relationship. Therefore, the final regression form
will be determined from several candidate relationships, including
polynomial, exponential, logarithmic, and linear forms, according to the five
metrics: correlation coefficient (
Also, considering the annual variability of precipitation, the regression model is further determined for two temporal scales: (1) the entire period covering all the study years and (2) each individual year within the study period. The regression models for the entire study period and for individual years are termed RME and RMI, respectively. RME can utilize the full knowledge of precipitation characteristics of the entire study period, whereas RMI reflects the inter-annual variability. Besides, RME can reasonably reconstruct the precipitation series of the years when data gaps exist.
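The distinction between RME and RMI can be illustrated with a short sketch. It assumes a polynomial form fitted by ordinary least squares (degree 2 by default, an assumption at this point of the text) and gauge records supplied as flat arrays; the function and argument names are hypothetical.

```python
import numpy as np

def fit_rme_rmi(ndvi, precip, years, degree=2):
    """Fit one pooled model (RME) and one model per year (RMI).

    ndvi, precip, years: 1-D arrays with one entry per gauge-year record.
    Coefficients are returned highest power first (np.polyfit convention).
    """
    ndvi, precip, years = map(np.asarray, (ndvi, precip, years))
    rme = np.polyfit(ndvi, precip, degree)            # all years pooled
    rmi = {
        int(y): np.polyfit(ndvi[years == y], precip[years == y], degree)
        for y in np.unique(years)                     # one fit per year
    }
    return rme, rmi
```

Each yearly fit needs at least `degree + 1` gauge records, which is one practical reason the pooled RME is more robust when some years have data gaps.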
The calibration–validation procedure for each candidate model is conducted
under three scenarios with different numbers of gauges and/or years:
(a) fully random: a random number of gauges and a random number of years
are independently used for calibration and validation;
(b) all gauges, partial period: all the gauges will be involved in both
procedures, but only
(c) partial gauges, entire period: all years will be used, but only
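One possible way to draw the calibration and validation subsets for the three scenarios is sketched below. The sampling fractions are not recoverable from the text, so `frac` and the 0.2–0.8 range used for the fully random case are placeholders; the function name and the simplification that all non-calibration records go to validation are likewise assumptions.

```python
import numpy as np

def calibration_mask(gauges, years, scenario, frac=0.5, seed=0):
    """Boolean mask of calibration records; the remainder is used for validation.

    gauges, years: 1-D arrays with one entry per gauge-year record.
    scenario: "a" (fully random), "b" (all gauges, partial period),
              "c" (partial gauges, entire period).
    """
    rng = np.random.default_rng(seed)
    gauges, years = np.asarray(gauges), np.asarray(years)

    def pick(values, fraction):
        # Randomly select a share of the distinct gauges or years.
        uniq = np.unique(values)
        k = max(1, int(round(fraction * uniq.size)))
        return rng.choice(uniq, size=k, replace=False)

    if scenario == "a":
        frac_g, frac_y = rng.uniform(0.2, 0.8, size=2)   # assumed range
        return np.isin(gauges, pick(gauges, frac_g)) & np.isin(years, pick(years, frac_y))
    if scenario == "b":
        return np.isin(years, pick(years, frac))         # every gauge, some years
    if scenario == "c":
        return np.isin(gauges, pick(gauges, frac))       # some gauges, every year
    raise ValueError("scenario must be 'a', 'b', or 'c'")
```

Repeating the draw with different seeds gives a set of independent tests for each scenario.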
With the optimized vegetation dataset and the precipitation–vegetation regression model, the RSBP product is then remapped over the study region. Because the NDVI dataset has a finer resolution than the RSBP product and the gauges provide accurate precipitation estimates, the remapped RSBP product is expected to provide more detailed spatial characteristics of precipitation over mountainous areas.
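The core of the remapping step, applying a fitted regression to every pixel of the high-resolution NDVI grid, might look like the sketch below. It covers only this step and does not represent how the RSBP product itself enters the remapping. The function name is hypothetical, and masking pixels outside an NDVI range of 0.2–0.7 is an assumption borrowed from the range in which the regression is later reported to perform best.

```python
import numpy as np

def remap_precipitation(ndvi_grid, coeffs, valid_ndvi=(0.2, 0.7)):
    """Apply polynomial coefficients (highest power first) to an NDVI grid.

    Pixels with missing NDVI or NDVI outside valid_ndvi are returned as NaN.
    """
    ndvi_grid = np.asarray(ndvi_grid, dtype=float)
    precip = np.polyval(coeffs, ndvi_grid)
    outside = (
        np.isnan(ndvi_grid)
        | (ndvi_grid < valid_ndvi[0])
        | (ndvi_grid > valid_ndvi[1])
    )
    return np.where(outside, np.nan, precip)
```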
The Nu-Salween basin (Fig. 2a), home to 6 million people, is one of the
largest river basins in Southeast Asia and spreads across three countries with an
area of 324 000 km
Considering the limited number of gauges (i.e., 13) in the Nu River basin, an
enlarged area covering 23–33
In this study, we use two MODIS (MODerate resolution Imaging Spectroradiometer) vegetation products, MOD13A3 (termed MOD hereafter) and MYD13A3 (termed MYD hereafter), in the application of the fusion framework. Both the MOD and MYD datasets contain 10 sub-datasets, including NDVI, EVI, and pixel reliability. The temporal and spatial resolutions of the MOD13A3 and MYD13A3 products are 1 month and 1 km, respectively. The pixel reliability layer summarizes the data quality of each pixel and has four valid values: 0 for good accuracy, 1 for marginal accuracy, 2 for snow/ice, and 3 for cloud. Based on the pixel reliability information, NDVI values with reliability levels of 0 or 1 are retained, and all others are discarded as anomalies.
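The reliability screening can be expressed in a few lines. This is a sketch under the assumption that the NDVI and pixel reliability layers have already been read into arrays of the same shape; the function name is hypothetical and NaN is used here as the anomaly value.

```python
import numpy as np

def screen_by_reliability(ndvi, reliability, accepted=(0, 1)):
    """Keep NDVI where pixel reliability is 0 (good) or 1 (marginal); else NaN."""
    ndvi = np.asarray(ndvi, dtype=float)
    keep = np.isin(reliability, accepted)
    return np.where(keep, ndvi, np.nan)
```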
The MOD dataset is used as a benchmark while MYD is taken as the alternative
for occasions when MOD data are missing or have large uncertainties. Since
the MOD and MYD datasets are acquired by different satellites at
different transit times, systematic differences may exist between
the two datasets. As such, we construct two regressions to remove
their systematic errors: one is based on a subset with both MOD and MYD of
good reliability (
The annual MMD dataset is then calculated by averaging the 12 monthly images.
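A simplified sketch of the MOD-first merging and the annual averaging follows. It deliberately omits the systematic-error regressions, whose details are not reproduced here, and it assumes that discarded pixels are stored as NaN and that missing months are skipped in the annual mean; the function name is hypothetical.

```python
import numpy as np

def merge_and_average(mod_monthly, myd_monthly):
    """Merge monthly MOD and MYD NDVI and compute the annual mean.

    Inputs: arrays of shape (12, rows, cols) with NaN for discarded pixels.
    MOD is the benchmark; MYD is used only where MOD is missing.
    """
    mod = np.asarray(mod_monthly, dtype=float)
    myd = np.asarray(myd_monthly, dtype=float)
    merged = np.where(np.isnan(mod), myd, mod)
    # Average over the months that have data; pixels with no data stay NaN.
    count = np.sum(~np.isnan(merged), axis=0)
    total = np.nansum(merged, axis=0)
    annual = np.where(count > 0, total / np.maximum(count, 1), np.nan)
    return merged, annual
```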
The MCD12Q1 Version 51 (MODIS/Terra
Box plots of
Comparison in annual precipitation between the gauged measurements
and predictions by the regression model for scenarios
Regression model performance and regression coefficients.
Statistics of regression models for validation and calibration under three scenarios.
The relationship between mean annual precipitation and elevation at
different elevation bands:
The relationship between mean annual precipitation and NDVI at
different elevation bands:
Datasets consisting of daily precipitation and air temperature collected at
the 59 gauges in the study area are obtained via the China Meteorological
Data Sharing Service system
(
Based on the results of six evaluation metrics for different regression form
candidates (Fig. 3a), the second-order polynomial is chosen as the regression
model form in this study:
The best performance of the regression model is found within
0.2 < NDVI < 0.7 and
400 mm yr
In general, the RMIs demonstrate better performance than RME, which can be
attributed to the lower variability of precipitation within a single year than
over the whole study period. It is also noted that the
The performance of regression models is assessed under three scenarios as
described in Sect. 2.2. A total of 300 tests are conducted and performance
metrics (i.e.,
Scenario a is designed to examine inter-annual stability in the performance of regression models; the good performance under this scenario indicates that the RME model can acceptably estimate precipitation during periods when precipitation measurements are not available. Scenarios b and c investigate the impacts of spatial and temporal coverages of measurements, respectively. It is noteworthy that the regression models perform better under Scenario b than under Scenario c, implying the greater importance of spatial coverage of measurements in conducting the regressions. In addition, as expected, the calibration results are better than the validation results for all metrics. However, the differences between calibration and validation are not significant, implying the consistent performance of regression models under various scenarios.
The performance of RME is further assessed by comparing the estimates against observations (Fig. 5), and good agreement between estimates and observations is observed. It should be noted that the RME shows difficulty in estimating precipitation higher than 2000 mm (cf. the dashed line in Fig. 5), implying the limitation of the fusion framework inherited from the oversaturation effect of the vegetation index.
The effect of elevation on the relationship between precipitation and NDVI
also needs to be considered. An overall negative relationship is found between
precipitation and elevation for the whole elevation range (i.e., 0–5000 m)
with the
The spatial characteristics of the precipitation of the study area are
investigated with RME for the whole study period (Fig. 8). Annual
precipitation in the Nu River basin is observed to decrease from south to
north and from west to east with prominent spatial variability. Two “hot-spot”
regions, whose annual precipitation exceeds 1500 mm, can be identified in
the study area: one near the southern border and the other close to the
southwestern mountain border. The eastern part of the Nu River basin,
featuring a dry and warm climate, receives an average annual precipitation of
800 mm with large inter-annual variability. A precipitation product (DEMP)
based on a precipitation–elevation relationship is used for comparison with RME.
The DEMP product shows no obvious distribution pattern of precipitation
(Fig. 9a) and a smaller spatial variability than RME, indicating
the advantage of RME in representing the spatial variability of annual
precipitation. Moreover, an overall underestimation of precipitation is observed
in the DEMP product across the whole study area (Fig. 9b). In addition, the
pixels in Fig. 8 with a value out of the valid range (i.e., 400 mm
yr
Average annual precipitation distribution of 2003–2012 from RME.
The performance of the IDW approach, the TRMM product, and the fusion framework is compared in this section. IDW is one of the most popular methods for spatial interpolation of rainfall due to its easy implementation and flexibility in incorporating other auxiliary information (e.g., elevation). In general, the IDW approach is unable to reproduce the high spatial variability, though it can capture the general spatial distribution over the whole basin (Fig. 10a), as does TRMM (Fig. 10b). Due to its coarse spatial resolution, TRMM cannot capture the high variability in the river valley, where the elevation varies significantly. Although large rainfall (> 1800 mm) is observed in both our product and the TRMM product in the southwest of the study area, our product gives lower rainfall than TRMM. As discussed above, the regression model tends to underestimate rainfall when the annual rainfall exceeds a certain threshold because the water supply is no longer a determinant of vegetation growth.
Spatial distribution of mean annual precipitation of 2003–2012
estimated by
To demonstrate the advantage of the fusion framework, a cross-validation is
conducted against the randomly sampled gauge observations by varying the
number of samples (1–40). The cross-validation shows a higher
Performance of
Performance comparison between IDW, RME, and TRMM.
To further evaluate the performance of RME, the annual averages of precipitation of five hydrological stations (Fig. 12a) and the whole basin estimated by the three approaches (IDW, RME, and TRMM) are compared. At the whole basin scale, the estimate by RME is 5.2 % higher than that of IDW but 7.9 % lower than TRMM. Although the difference between the three approaches is minimal at the basin scale, the difference at the sub-basin scale is remarkable. In the upstream region (i.e., the Gongshan sub-basin) located on the Tibetan Plateau, TRMM overestimates precipitation by 13.2 %, while IDW underestimates it by 7.6 % as compared with RME. In the other four downstream sub-basins, estimates by RME are larger than those by IDW and TRMM. In general, in the midstream and downstream regions with large variability in terrain height, RME gives larger estimates of precipitation than IDW and TRMM.
Regression model performance and coefficients of regression.
Results of two regression models established with extra independent
variables: RME
To validate the accuracy of different precipitation estimates, we utilize
the monthly MODIS (MOD16) global ET (evapotranspiration) product with 1 km spatial resolution (Mu et al., 2011) (i.e., ET
We also compared our products with the Multi-Source Weighted-Ensemble
Precipitation (MSWEP) product. The dataset takes advantage of a wide range of
data sources, including gauges, satellites, and atmospheric reanalysis
models, to obtain the best possible precipitation estimates at the global
scale with a high 3-hourly temporal and 0.25
Considering the possible degradation in model performance caused by oversaturation of NDVI in high biomass areas, another vegetation indicator, the enhanced vegetation index (EVI), is suggested as an alternative for estimating vegetation growth (Matsushita et al., 2007; Liao et al., 2015). As such, we also test the fusion framework with EVI in addition to NDVI and the results are assessed against the gauge observations.
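For reference, the two indices can be computed from surface reflectance as sketched below. The formulas and the EVI coefficients (gain 2.5, aerosol terms 6 and 7.5, canopy background 1) are the standard definitions used for the MODIS vegetation index products rather than values given in this text, and the function names are illustrative.

```python
import numpy as np

def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    nir, red = np.asarray(nir, dtype=float), np.asarray(red, dtype=float)
    return (nir - red) / (nir + red)

def evi(nir, red, blue, gain=2.5, c1=6.0, c2=7.5, canopy=1.0):
    """Enhanced vegetation index with the standard MODIS coefficients."""
    nir, red, blue = (np.asarray(b, dtype=float) for b in (nir, red, blue))
    return gain * (nir - red) / (nir + c1 * red - c2 * blue + canopy)
```

The additive terms in the EVI denominator keep the index from approaching its upper bound as quickly as NDVI over dense canopies, which is the motivation for testing it as an alternative proxy.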
Based on the chosen metrics, EVI is found to outperform NDVI with better
regression quality (Table 4): the EVI-based regression model gives higher
One major assumption of the proposed framework is that precipitation is the
only determinant of vegetation growth, and thus NDVI is regarded as a proxy
for precipitation. However, other ambient factors, such as soil properties,
solar radiation, air temperature, and elevation, may significantly influence
the vegetation growth as well as NDVI values. Considering the data
availability of various ambient factors, air temperature and elevation, in
addition to NDVI, are adopted as extra determinants to establish the
regression models, which are thus termed RME
The differences in
Comparison in mean annual precipitation between the gauged measurements and predictions by the MSWEP, RMM and RME.
Regression relationship between annual precipitation and normalized NDVI/EVI.
Spatial precipitation difference between RME and
In this study, a satellite–gauge–vegetation fusion framework has been developed for estimating the precipitation in mountainous areas by establishing a regression relationship between gauge-based precipitation observations and a satellite-based vegetation dataset. The fusion framework was then applied in the Nu River basin of Southwest China for estimating precipitation between 2001 and 2012.
The fusion framework for the Nu River basin adopted a second-order polynomial
form and demonstrated promising ability in capturing the high spatial
variability of precipitation in the river valley. Five evaluation metrics, including
The successful application of the fusion framework in the Nu River basin sheds light on precipitation estimation in mountainous areas using multi-source datasets. However, this framework does have certain limitations that are important to appreciate. First, the framework is applied only in the Nu River basin. More mountainous areas under different climates need to be examined to further test the robustness of this framework. In addition, although the RME model can utilize the full knowledge of precipitation in the entire study period compared with RMI models, the difference in the coefficients suggests apparent inter-annual variability of precipitation that should be considered when applying these models. Depending on the duration of the study period and the purpose, we suggest that the RME model be used for long-term climatology identification and the RMI models for inter-annual variability examination. Also, to fully verify the theoretical basis of this framework, namely that vegetation actively interacts with precipitation in mountainous areas, future work is required to refine the spatiotemporal resolution of this study to enable better scrutiny of vegetation–precipitation interactions at sub-monthly scales across more detailed vegetation species.
The MODIS data used in our study (MOD13A3, MYD13A3, and MCD12Q1) are supplied
by NASA (The National Aeronautics and Space Administration) and can be
accessed at
The merging of NDVI datasets improves the accuracy as expected (Fig. A1); the
monthly error rate (i.e., the proportion of pixels whose quality value is
over 1) of MMD is generally lower than that of MOD, by 5 % on average and by
over 20 % in several months. Figure A2 shows that the accuracy of MMD is
significantly improved in a ridge area covering
23
Monthly error rate of MOD, MYD, and MMD.
Comparison of three NDVI products over a ridge area in June 2006,
Comparison of three NDVI monthly time series over one gauge.
The authors declare that they have no conflict of interest.
The study is supported by the NSFC under grants U1202231, 51679119, and 91647107, the National Key Technology Support Program under grant 2011BAC09B07-3, and by the China Postdoctoral Science Foundation under grant 2015T80093. The authors thank the China Meteorological Administration, Yunnan University, MODIS NDVI, the Tropical Rainfall Measuring Mission (TRMM), and the Shuttle Radar Topography Mission (SRTM) for providing the data used in this study.
Edited by: L. Wang
Reviewed by: three anonymous referees