HESS

Hydrology and Earth System Sciences

Hydrol. Earth Syst. Sci.

1607-7938

Copernicus Publications

Göttingen, Germany

10.5194/hess-21-999-2017

Remapping annual precipitation in mountainous areas based on vegetation patterns: a case study in the Nu River basin

Zhou

Xing

https://orcid.org/0000-0003-1963-4671

Guang-Heng

Shen

Chen

Sun

Ting

sunting@tsinghua.edu.cn State Key Laboratory of Hydro-Science and Engineering, Department of Hydraulic Engineering, Tsinghua University, Beijing 100084, China

Ting Sun (sunting@tsinghua.edu.cn)

16 February 2017

Volume 21, issue 2, pages 999–1015. Dates: 20 November 2016, 23 November 2016, 21 January 2017, 23 January 2017

This work is licensed under a Creative Commons Attribution 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/

This article is available from https://hess.copernicus.org/articles/21/999/2017/hess-21-999-2017.html

The full text article is available as a PDF file from https://hess.copernicus.org/articles/21/999/2017/hess-21-999-2017.pdf

Accurate high-resolution estimates of precipitation are vital to improving the understanding of basin-scale hydrology in mountainous areas. The traditional interpolation methods or satellite-based remote sensing products are known to have limitations in capturing the spatial variability of precipitation in mountainous areas. In this study, we develop a fusion framework to improve the annual precipitation estimation in mountainous areas by jointly utilizing the satellite-based precipitation, gauge measured precipitation, and vegetation index. The development consists of vegetation data merging, vegetation response establishment, and precipitation remapping. The framework is then applied to the mountainous areas of the Nu River basin for precipitation estimation. The results demonstrate the reliability of the framework in reproducing the high-resolution precipitation regime and capturing its high spatial variability in the Nu River basin. In addition, the framework can significantly reduce the errors in precipitation estimates as compared with the inverse distance weighted (IDW) method and the TRMM (Tropical Rainfall Measuring Mission) precipitation product.

Introduction

Precipitation plays an important role in hydrological processes, land–atmospheric processes, and ecological dynamics. Accurate high-resolution precipitation is crucial for streamflow prediction, flood control, and water resources management in data-sparse regions such as mountainous areas (Song et al., 2016). However, it is a great challenge to obtain accurate precipitation in mountainous areas due to the sparse gauge network and the remarkable spatiotemporal variability of precipitation. Conventional gauge networks can provide accurate rainfall measurements at point scales, which can be interpolated within the region of interest to give estimates of precipitation in ungauged areas. However, such interpolated estimates might not be reliable in mountainous areas considering the very limited gauges there (Phillips et al., 1992; Mair and Fares, 2011; Jacquin and Soto-Sandoval, 2013; Wang et al., 2014; Borges et al., 2016).

Recently, remote-sensing-based precipitation (RSBP) products, such as the Global Precipitation Climatology Project (GPCP) (Schamm et al., 2014), the Tropical Rainfall Measuring Mission (TRMM) (Council, 2005), and the Climate Prediction Center Morphing Method (CMORPH) (Joyce et al., 2004), have been extensively used in ungauged or sparsely gauged areas to bridge the gap between the need for precipitation estimates and the scarcity in gauge observations (Akbari et al., 2012; Kneis et al., 2014; Li et al., 2015; Worqlul et al., 2015; Mourre et al., 2016; Wong et al., 2016). Also, data fusion across satellite and gauge observations is being conducted to further the application of RSBPs (Rozante et al., 2010; Woldemeskel et al., 2013; Arias-Hidalgo et al., 2013; Chen et al., 2016; Zhou et al., 2016). However, due to the relatively coarse spatial resolution (e.g., 0.25–5°) and uncertainties of RSBPs, their applications in mountainous basins, where the precipitation shows large spatial variability, are still very limited (Krakauer et al., 2013; Chen and Li, 2016).

Precipitation estimates can be influenced by a variety of ambient factors (e.g., topography, vegetation). In order to correct effects of topography on precipitation estimates, a digital elevation model (DEM) has been widely used in spatial interpolation of precipitation over mountainous areas (Marquínez et al., 2003; Lloyd, 2005). However, the relationship between elevation and precipitation is not clear. Meanwhile, strong correlations between the normalized difference vegetation index (NDVI) and precipitation have been found by several studies (Li et al., 2002; Kariyeva and Van Leeuwen, 2011; Li and Guo, 2012; Sun et al., 2013; Campo-Bescós et al., 2013). As such, establishing statistical models between the NDVI and precipitation so as to improve the spatial resolution of TRMM products in mountainous areas is becoming popular (Immerzeel et al., 2009; Jia et al., 2011; Duan and Bastiaanssen, 2013; Chen et al., 2014; Xu et al., 2015; Mahmud et al., 2015; Jing et al., 2016). For instance, Immerzeel et al. (2009) downscaled TRMM-3B43 to 1 km based on an exponential relationship between NDVI and TRMM precipitation on the Iberian Peninsula of Europe. Jia et al. (2011) established four multivariable linear regression models between TRMM-3B43 precipitation and two other factors (i.e., DEM and NDVI) of different resolutions (0.25, 0.5, 0.75, and 1.0°) to get 1 km estimates of precipitation in the Qaidam basin of China. Duan and Bastiaanssen (2013) used a nonlinear relationship between TRMM-3B43 and NDVI to downscale precipitation to 1 km in a humid area and a semi-arid area. Chen et al. (2014) established a spatially varying relationship between TRMM, NDVI, and DEM by using a local regression analysis approach known as geographically weighted regression (GWR) in South Korea. Xu et al. (2015) also used the GWR method to explore the spatial heterogeneity of the RSBP–NDVI and RSBP–DEM relationships over two mountainous areas in western China.

However, the present RSBP–NDVI-based schemes have several limitations: (1) significant errors can be introduced during the downscaling given the nonlinear relationship between RSBP and NDVI; (2) large uncertainties exist in the RSBP for mountainous areas; and (3) inter-comparison of existing NDVI datasets is missing in deriving the RSBP–NDVI relationships. In this study, we develop a fusion framework to obtain more accurate high-resolution estimates of precipitation in mountainous areas based on the relationship between precipitation and vegetation response. More specifically, in addition to RSBP, gauge measurements and different vegetation datasets will be used in this study to overcome the aforementioned limitations in current RSBP–NDVI-based schemes. The paper is organized as follows: Sect. 2 describes the development of the fusion framework; Sect. 3 documents the study area and related datasets; Sect. 4 presents the results of the fusion framework and discusses impacts of different determinants on the performance of the fusion framework; and Sect. 5 summarizes this work.

Framework development

The satellite–gauge–vegetation fusion framework (Fig. 1) involves three stages of development: (1) vegetation data merging, (2) precipitation–vegetation regression, and (3) RSBP product remapping, whose details are described in the following subsections.

Flow chart of the satellite–gauge–vegetation fusion framework development.

(a) Terrain map of the study area (the Nu-Salween basin and its adjacent areas). (b) The distribution of rainfall during the year across the Nu River.

Vegetation data merging

Vegetation closely interacts with soil moisture and is recognized as a good proxy of precipitation. The remote sensing technique provides us with various high-resolution vegetation products such as NDVI, EVI (enhanced vegetation index), and LAI (leaf area index). Among the vegetation indices, NDVI, an indicator of plant density and growth, is chosen as the proxy of precipitation in this study due to its wide availability. Considering the crucial role of NDVI in deriving precipitation estimates under our framework, we conduct an inter-comparison in data accuracy between two NDVI datasets (termed datasets A and B hereinafter) to reduce the error. First, the systematic errors of both datasets are eliminated by multiplying the reduction factor or using the simple regression model. After the correction, the final dataset is then obtained by selecting a better element between A and B if the quality criteria are satisfied, otherwise filling an anomaly value.

It should be noted that since the vegetation growth is suppressed or promoted on some land covers (e.g., rivers, lakes, snow and ice, and urban areas), the vegetation data of these land covers are excluded by filling anomaly values. Besides, due to the strong influence of farming activities (e.g., irrigation, fertilization, and harvest) on the crop growth, vegetation data of farmland are excluded as well. We note that although Moran's index (Li et al., 2007) is widely employed to detect anomalies in vegetation data (Jia et al., 2011; Duan and Bastiaanssen, 2013), it is not used in this study for its inapplicability in large areas with continuous anomaly pixels (e.g., farmland). As such, we identify anomaly pixels simply by land-use type: pixels categorized as water, wetland, urban, cropland, snow/ice, and barren will be identified as anomalies. The detected anomaly pixels are excluded from the original NDVI dataset and then filled with interpolated values using the IDW method so as to generate an optimized NDVI dataset.
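The anomaly-filling step can be sketched as follows. This is a minimal illustration, not the authors' implementation: anomaly pixels are assumed to be marked as NaN, distances are measured in pixel units, and the inverse-distance power of 2 is an assumed default (the text does not state the exponent). The function name `idw_fill` is hypothetical.

```python
import math

def idw_fill(grid, power=2.0):
    """Return a copy of `grid` with NaN cells replaced by IDW estimates.

    Every valid (non-NaN) cell contributes with weight 1 / distance**power.
    The power of 2 is an assumption; the source does not give the exponent.
    """
    known = [(r, c, v) for r, row in enumerate(grid)
             for c, v in enumerate(row) if not math.isnan(v)]
    filled = [list(row) for row in grid]
    if not known:
        return filled  # nothing to interpolate from
    for r, row in enumerate(grid):
        for c, v in enumerate(row):
            if math.isnan(v):
                weights = [(1.0 / math.hypot(r - kr, c - kc) ** power, kv)
                           for kr, kc, kv in known]
                total = sum(w for w, _ in weights)
                filled[r][c] = sum(w * kv for w, kv in weights) / total
    return filled

# One anomaly pixel halfway between two valid NDVI values.
ndvi_row = [[0.2, math.nan, 0.6]]
filled_row = idw_fill(ndvi_row)
print(filled_row[0][1])
```

A practical implementation would restrict the search to a local neighbourhood rather than using every valid pixel, which matters for large continuous anomaly areas such as farmland.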

Based on the optimized NDVI dataset, the NDVI data at the gauge locations are retrieved with the neighbor-average method (i.e., the value of a certain grid is determined as the average of all its eight neighboring grids) and will be used for the precipitation–vegetation regression.
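A minimal sketch of the neighbor-average retrieval is given below. The handling of grid edges and of missing neighbours is an assumption (they are simply skipped), since the text only defines the average over the eight surrounding grids. The function name `neighbor_average` is hypothetical.

```python
import math

def neighbor_average(grid, row, col):
    """Mean of the (up to) eight cells surrounding grid[row][col].

    Cells outside the grid or holding NaN are skipped; this edge handling
    is an assumption, not something stated in the text.
    """
    values = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # the centre cell itself is not a neighbour
            r, c = row + dr, col + dc
            if 0 <= r < len(grid) and 0 <= c < len(grid[0]):
                v = grid[r][c]
                if not math.isnan(v):
                    values.append(v)
    return sum(values) / len(values) if values else math.nan

# Illustrative NDVI values; the gauge is assumed to sit in the centre cell.
ndvi = [
    [0.40, 0.50, 0.60],
    [0.30, 0.90, 0.50],
    [0.20, 0.40, 0.70],
]
print(neighbor_average(ndvi, 1, 1))
```

Because the centre value is excluded, an isolated unrepresentative pixel at the gauge location (0.90 in the example) does not affect the retrieved NDVI.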

Precipitation–vegetation regression

As far as we know, there is no widely accepted form of the precipitation–vegetation relationship. Therefore, the final regression form will be determined from several candidate relationships, including polynomial, exponential, logarithmic, and linear forms, according to the five metrics: correlation coefficient (R), coefficient of determination (R2), root-mean-square error (ERMS), mean relative error (EMR), and mean absolute relative error (EMAR), which are given as follows:

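$$R=\frac{\sum_{i=1}^{n}(P_i-\overline{P})(O_i-\overline{O})}{\sqrt{\sum_{i=1}^{n}(P_i-\overline{P})^{2}\,\sum_{i=1}^{n}(O_i-\overline{O})^{2}}},$$

$$R^{2}=1-\frac{\sum_{i=1}^{n}(P_i-O_i)^{2}}{\sum_{i=1}^{n}(O_i-\overline{O})^{2}},$$

$$E_{\mathrm{RMS}}=\sqrt{\frac{\sum_{i=1}^{n}(P_i-O_i)^{2}}{n}},$$

$$E_{\mathrm{MR}}=\frac{1}{n}\sum_{i=1}^{n}\frac{P_i-O_i}{O_i},$$

$$E_{\mathrm{MAR}}=\frac{1}{n}\sum_{i=1}^{n}\frac{|P_i-O_i|}{O_i},$$

where $\overline{O}$ is the mean annual precipitation of all gauges, $O_i$ the mean annual precipitation of gauge $i$, $P_i$ the estimated precipitation at gauge $i$, $\overline{P}$ the mean of the estimates over all gauges, and $n$ the total number of gauges.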
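The five metrics can be computed as in the sketch below. It assumes the standard forms: the coefficient of determination is taken as one minus the ratio of the residual sum of squares to the total sum of squares, and both relative errors are normalized by the observed value. The function name `evaluation_metrics` and the sample numbers are illustrative only.

```python
import math

def evaluation_metrics(pred, obs):
    """Return R, R2, ERMS, EMR and EMAR for paired estimates and observations."""
    n = len(obs)
    p_mean = sum(pred) / n
    o_mean = sum(obs) / n
    cov = sum((p - p_mean) * (o - o_mean) for p, o in zip(pred, obs))
    var_p = sum((p - p_mean) ** 2 for p in pred)
    var_o = sum((o - o_mean) ** 2 for o in obs)
    sse = sum((p - o) ** 2 for p, o in zip(pred, obs))
    return {
        "R": cov / math.sqrt(var_p * var_o),
        "R2": 1.0 - sse / var_o,                      # assumed standard form
        "ERMS": math.sqrt(sse / n),                   # same unit as the input (mm)
        "EMR": sum((p - o) / o for p, o in zip(pred, obs)) / n,
        "EMAR": sum(abs(p - o) / o for p, o in zip(pred, obs)) / n,
    }

# Hypothetical annual precipitation (mm) at three gauges.
observed = [800.0, 1000.0, 1200.0]
estimated = [900.0, 1000.0, 1100.0]
result = evaluation_metrics(estimated, observed)
print(result)
```

In this example the estimates are perfectly correlated with the observations (R = 1) while R2 is only 0.75, which shows why a high correlation alone does not guarantee an unbiased fit.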

Also, considering the annual variability of precipitation, the regression model is further determined for two temporal scales: (1) the entire period covering all the study years and (2) the individual year of the entire study period. The regression models for the entire study period and for individual years are thus termed RME and RMI, respectively. RME can utilize the full knowledge of precipitation characteristics of the entire study period, whereas RMI implies the inter-annual variability. Besides, RME can reasonably reconstruct the precipitation series of the years when data gaps exist.

The calibration–validation procedure for each candidate model is conducted under three scenarios with different numbers of gauges and/or years:

Scenario a (fully random): a random number of gauges and a random number of years are independently used for calibration and validation;

Scenario b (all gauges, partial period): all the gauges will be involved in both procedures, but only 2/3 of years will be randomly chosen for calibration, and the other years for validation;

Scenario c (partial gauges, entire period): all years will be used, but only 1/3 of gauges will be randomly chosen for calibration, and other gauges for validation.

For each scenario, the calibration–validation procedure will be performed for 100 samples determined based on the above criteria and the five evaluation metrics (i.e., R, R2, ERMS, EMR, and EMAR) will be calculated for each sample accordingly. The best model is then determined based on the metrics.
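The sampling for Scenarios b and c can be sketched as follows. This is an illustration under stated assumptions: the rounding of the 2/3 and 1/3 fractions, the fixed random seed, and the integer gauge identifiers are all hypothetical choices, and Scenario a is omitted because the text does not specify how its random counts are drawn.

```python
import random

def split(items, fraction, rng):
    """Randomly split `items` into (calibration, validation) lists."""
    k = round(len(items) * fraction)          # rounding rule is an assumption
    chosen = set(rng.sample(items, k))
    calibration = [x for x in items if x in chosen]
    validation = [x for x in items if x not in chosen]
    return calibration, validation

def make_samples(gauges, years, scenario, n_samples=100, seed=0):
    """Build (calibration, validation) pairs of (gauges, years) subsets."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        if scenario == "b":    # all gauges, 2/3 of the years for calibration
            cal_years, val_years = split(years, 2 / 3, rng)
            samples.append(((gauges, cal_years), (gauges, val_years)))
        elif scenario == "c":  # all years, 1/3 of the gauges for calibration
            cal_gauges, val_gauges = split(gauges, 1 / 3, rng)
            samples.append(((cal_gauges, years), (val_gauges, years)))
        else:
            raise ValueError("only scenarios 'b' and 'c' are sketched here")
    return samples

years = list(range(2001, 2013))   # 12 study years
gauges = list(range(59))          # 59 gauges, identified here by index
samples_b = make_samples(gauges, years, "b")
samples_c = make_samples(gauges, years, "c")
print(len(samples_b), len(samples_c))
```

Each sample would then be passed to the regression fit and to the metric calculation, giving 100 sets of metrics per scenario.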

RSBP product remapping

With the optimized vegetation dataset and the precipitation–vegetation regression model, the RSBP product is then remapped over the study region. Thanks to the finer resolution of the NDVI dataset than the RSBP product and the accurate estimate of precipitation by gauges, the remapped RSBP product is expected to provide more detailed spatial characteristics of precipitation over mountainous areas.

(a) Different regression form between annual precipitation and NDVI; (b) the NDVI–precipitation relationships for RME and RMI.

Study area and datasets for framework application

Study area

The Nu-Salween basin (Fig. 2a), where 6 million people live, is one of the largest river basins in South Asia and spreads across three countries with an area of 324 000 km2. This study focuses on the Chinese part of the Nu-Salween basin (termed the Nu River basin hereafter), where the elevation ranges from 446 to 6134 m and the narrowest part is only 24 km. The annual precipitation of the Nu River basin ranges from 400 to 2000 mm with an average of 900 mm, and the mean annual runoff is 69 km3. The precipitation of the Nu River basin generally decreases from southwest to northeast and demonstrates high variability due to mountain weather systems (e.g., the difference in annual precipitation between the mountaintop and valley of Gongshan is larger than 1000 mm). Annual rainfall varies significantly across this region. Figure 2b shows the annual rainfall distributions of seven stations located in the upstream, middle, and downstream of the Nu River basin. The upstream and downstream have similar rainfall distributions, with larger rainfall occurring in summer compared to winter, while the middle part observes relatively large rainfall in winter and spring. Thanks to the adequate rainfall and minimal human perturbation, the Nu River basin has an extensive vegetation coverage, with the dominant types grassland in the Qinghai–Tibetan Plateau (upper basin) and mixed forest in Yunnan Province (lower basin). However, the dense vegetation cover increases the difficulty in conducting precipitation observations and only 13 gauges are very unevenly distributed over the whole basin of 142 479 km2, which makes it highly challenging to obtain the accurate spatial precipitation characteristics with traditional interpolation approaches. Although the RSBP products are available for this area, they are too coarse (usually with a spatial resolution of ∼ 50 km) to capture the high spatial variability of precipitation.

Considering the limited number of gauges (i.e., 13) in the Nu River basin, an enlarged area covering 23–33° N and 91–101° E is chosen for the application of the fusion framework, where 59 gauges are available and the climatic and topographic conditions are similar: both regions are characterized as mountainous areas under the subtropical climate influenced by the southeast and southwest monsoons. Besides, given no rain gauges are available outside of China in this study region, the non-Chinese region is excluded from the study area.

Datasets

Vegetation data

In this study, we use two MODIS (MODerate resolution Imaging Spectroradiometer) vegetation products, MOD13A3 (termed MOD hereafter) and MYD13A3 (termed MYD hereafter), in the application of the fusion framework. Both the MOD and MYD datasets contain 10 sub-datasets consisting of NDVI, EVI, and pixel reliability. The temporal and spatial resolutions of the MOD13A3 and MYD13A3 products are 1 month and 1 km, respectively. The pixel reliability layer ranks the data quality of each pixel and has four valid values: 0 for good accuracy, 1 for marginal accuracy, 2 for snow/ice, and 3 for cloud. Based on the pixel reliability information, the NDVI values are either selected for corresponding pixel reliability levels of 0 and 1, or discarded as anomalies otherwise.

The MOD dataset is used as a benchmark while MYD is taken as the alternative for occasions when MOD data are missing or have large uncertainties. Since both the MOD and MYD datasets are extracted from different satellites at different transit times, systematic errors may exist in the difference between the two datasets. As such, we construct two regressions to remove their systematic errors: one is based on a subset with both MOD and MYD of good reliability (= 0), and the other on a subset with MOD of marginal reliability (= 1) and MOD of good reliability (= 0). After the removal of systematic errors, a merged dataset of MOD and MYD (termed MMD hereafter) is generated under the criteria given as follows:

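$$\mathrm{MMD}=\begin{cases}
\mathrm{MOD}, & \mathrm{MOD}=0,\\
\mathrm{MYD}, & \mathrm{MOD}>1 \text{ and } \mathrm{MYD}=0,\\
\mathrm{MOD}, & \mathrm{MOD}=1 \text{ and } \mathrm{MYD}=1,\\
\mathrm{NULL}, & \mathrm{MOD}>1 \text{ and } \mathrm{MYD}>0.
\end{cases}$$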

The annual MMD dataset is then calculated by averaging the 12 monthly images.
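The merging rule and the annual averaging can be sketched per pixel as below. The conditions are read as tests on the pixel reliability of each product. Two points are assumptions: reliability combinations not listed in the rule are treated as NULL, and NULL months are skipped when averaging. The function names are hypothetical.

```python
def merge_pixel(mod_ndvi, mod_rel, myd_ndvi, myd_rel):
    """Merge one MOD/MYD NDVI pair using the pixel reliability flags.

    Returns None for NULL. Combinations not covered by the stated criteria
    also return None, which is an assumption of this sketch.
    """
    if mod_rel == 0:
        return mod_ndvi
    if mod_rel > 1 and myd_rel == 0:
        return myd_ndvi
    if mod_rel == 1 and myd_rel == 1:
        return mod_ndvi
    return None

def annual_mean(monthly_values):
    """Average the monthly merged values, skipping NULL months (assumed)."""
    valid = [v for v in monthly_values if v is not None]
    return sum(valid) / len(valid) if valid else None

# Three illustrative months for one pixel: (MOD NDVI, MOD rel, MYD NDVI, MYD rel)
months = [(0.40, 0, 0.45, 0), (0.50, 3, 0.60, 0), (0.55, 2, 0.50, 3)]
merged = [merge_pixel(*m) for m in months]
print(merged, annual_mean(merged))
```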

Land-use data

The MCD12Q1 Version 51 (MODIS/Terra+Aqua Land Cover Type Yearly L3 Global 500 m SIN Grid V051) land-use dataset in the period of 2001–2013 is used to identify the outliers of MMD, while the IGBP (International Geosphere Biosphere Programme) classification is adopted for its wide applications. Due to mismatch in spatial resolutions between the MMD and MCD12Q1 datasets, the MCD12Q1 dataset is upscaled to 1 km as MMD for outlier identification. It should be noted that for any of the four 500 m pixels in MCD12Q1 classified as water, urban, snow or ice and cropland, the upscaled 1 km pixel will be assigned with a missing value (i.e., -9999) and the corresponding NDVI pixel will be identified as an outlier.
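The upscaling rule can be sketched as a 2 × 2 block test. The numeric class codes below (0 water, 12 croplands, 13 urban, 15 snow and ice) follow the usual IGBP legend of MCD12Q1 and are an assumption here, as is the exact alignment of each 1 km pixel with four 500 m pixels. The function name `upscale_outlier_mask` is hypothetical.

```python
# Assumed IGBP codes: 0 water, 12 croplands, 13 urban, 15 snow and ice.
EXCLUDED_CLASSES = {0, 12, 13, 15}

def upscale_outlier_mask(landcover):
    """Flag each 1 km pixel whose 2 x 2 block of 500 m pixels contains
    at least one excluded land-cover class.

    Returns a grid of booleans (True = outlier, i.e. the missing value
    -9999 would be assigned). Trailing odd rows or columns are dropped.
    """
    rows, cols = len(landcover), len(landcover[0])
    mask = []
    for r in range(0, rows - 1, 2):
        mask_row = []
        for c in range(0, cols - 1, 2):
            block = (landcover[r][c], landcover[r][c + 1],
                     landcover[r + 1][c], landcover[r + 1][c + 1])
            mask_row.append(any(v in EXCLUDED_CLASSES for v in block))
        mask.append(mask_row)
    return mask

# Illustrative 500 m land-cover tile (4 x 4), giving a 2 x 2 grid at 1 km.
tile = [
    [10, 10, 8, 8],
    [10, 12, 8, 8],
    [5, 5, 0, 1],
    [5, 5, 1, 1],
]
outliers = upscale_outlier_mask(tile)
print(outliers)
```

Flagging a block as soon as one of its four pixels is excluded is deliberately conservative: it removes mixed pixels at the cost of discarding some usable vegetation signal.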

Box plots of R, R2, and ERMS of the RME model under three scenarios: (a) fully random; (b) all gauges, partial period; and (c) partial gauges, entire period. Details of the three scenarios refer to Sect. 2.2. The triangle marker corresponds to the value (R, R2, RMSE) of the RME model. Plus signs represent the outlier of the sample used to draw the box diagram whose value is out of the range from (Q1-1.5IQR) to (Q3 + 1.5IQR). Q1 and Q3 represent the lower and upper quartiles, IQR = Q3–Q1.

Comparison in annual precipitation between the gauged measurements and predictions by the regression model for scenarios (a) fully random; (b) all gauges, partial period; and (c) partial gauges, entire period. Details of the three scenarios refer to Sect. 2.2.

Regression model performance and regression coefficients.

| Year | Mean (mm) | R2 | ERMS (mm) | EMAR (%) | a | b | c |
|---|---|---|---|---|---|---|---|
| 2001 | 961 | 0.91 | 138 | 10.6 | 3038.1 | -345.3 | 359.8 |
| 2002 | 887 | 0.90 | 119 | 10.2 | 1354.7 | 687.5 | 212.0 |
| 2003 | 828 | 0.75 | 155 | 14.0 | 1700.2 | -115.5 | 472.7 |
| 2004 | 1018 | 0.89 | 171 | 12.4 | 3784.3 | -1047.7 | 517.4 |
| 2005 | 810 | 0.93 | 97 | 9.5 | 2465.4 | -265.0 | 363.2 |
| 2006 | 737 | 0.88 | 122 | 11.4 | 2065.2 | -112.2 | 287.5 |
| 2007 | 928 | 0.84 | 184 | 14.6 | 2306.9 | 53.5 | 286.4 |
| 2008 | 960 | 0.91 | 121 | 9.4 | 2504.0 | -258.1 | 433.5 |
| 2009 | 726 | 0.89 | 119 | 13.2 | 2091.3 | -168.0 | 294.5 |
| 2010 | 937 | 0.94 | 124 | 9.1 | 4094.8 | -1293.3 | 512.6 |
| 2011 | 824 | 0.84 | 167 | 14.2 | 4697.8 | -2613.7 | 792.7 |
| 2012 | 791 | 0.89 | 114 | 10.6 | 1966.4 | 3.5 | 308.1 |
| RME | 848 | 0.83 | 174 | 15.2 | 2670.4 | -471.2 | 409.2 |

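Statistics of regression models for validation and calibration under three scenarios.

| Scenario | Statistics | Calibration R | Calibration R2 | Calibration ERMS (mm) | Calibration EMAR (%) | Validation R | Validation ERMS (mm) | Validation EMAR (%) |
|---|---|---|---|---|---|---|---|---|
| a | mean | 0.91 | 0.83 | 175 | 16.6 | 0.91 | 173.9 | 16.8 |
| a | max | 0.92 | 0.85 | 186.2 | 17.8 | 0.94 | 211.8 | 19.9 |
| a | min | 0.9 | 0.81 | 161.1 | 15.7 | 0.88 | 141 | 13.2 |
| b | mean | 0.92 | 0.84 | 166.6 | 15.8 | 0.91 | 186.1 | 17.8 |
| b | max | 0.94 | 0.89 | 207 | 19.7 | 0.95 | 229.7 | 23.3 |
| b | min | 0.89 | 0.8 | 126.2 | 12.8 | 0.89 | 148.6 | 12.9 |
| c | mean | 0.91 | 0.82 | 172.7 | 16.5 | 0.91 | 180.8 | 17.3 |
| c | max | 0.95 | 0.91 | 207.9 | 19.1 | 0.94 | 204.8 | 24.4 |
| c | min | 0.85 | 0.73 | 144.6 | 13.9 | 0.85 | 143.4 | 13.9 |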

The relationship between mean annual precipitation and elevation at different elevation bands: (a) whole elevation bands; (b) elevation band: < 1000 m; (c) band: 1000–2000 m; (d) band: 2000–3000 m; (e) band: 3000–4000 m; (f) band: > 4000 m.

The relationship between mean annual precipitation and NDVI at different elevation bands: (a) elevation band: < 2000 m; (b) band: 2000–3500 m; (c) band: > 3500 m; (d) whole bands; (e) comparison of the precipitation–NDVI relationship for different bands.

Weather data

Datasets consisting of daily precipitation and air temperature collected at the 59 gauges in the study area are obtained via the China Meteorological Data Sharing Service system (http://data.cma.cn/data/detail/dataCode/SURF_CLI_CHN_MUL_DAY_V3.0/keywords/v3.0.html). The air temperature measurements will be used for dependence analysis later in Sect. 4.5. The streamflow data provided by Yunnan University will be used for calculating sub-basin-scale precipitation based on water balance. The five hydrological stations are Gongshan, Liuku, Jiucheng, Gulaohe, and Dawanjiang, with drainage areas of 101146, 106681, 6308, 4185, and 7986 km2, respectively. MODIS evapotranspiration (ET) product MOD16 (http://www.ntsg.umt.edu/project/mod16) with the spatiotemporal resolution of 1 km / 1 weekly will also be used in calculating precipitation based on water balance.

Results and discussion

Model calibration and validation

Based on the results of the five evaluation metrics for the different candidate regression forms (Fig. 3a), the second-order polynomial is chosen as the regression model form in this study: $p = a\,\mathrm{NDVI}^{2} + b\,\mathrm{NDVI} + c$, where $p$ denotes the precipitation amount in millimeters, and $a$, $b$, and $c$ are regression coefficients. The results of regression coefficients and evaluation metrics are given in Table 1, and the NDVI–precipitation relationships for the study period are demonstrated in Fig. 3b.

The best performance of the regression model is found within 0.2 < NDVI < 0.7 and 400 mm yr⁻¹ < p < 1500 mm yr⁻¹. Larger errors are found at pixels with NDVI larger than 0.7 or annual rainfall higher than 1500 mm, implying the water supply is no longer a determinant of vegetation growth as annual rainfall exceeds a certain threshold.
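As a worked illustration, the second-order polynomial can be evaluated with the three RME coefficients reported in Table 1, taken here as a, b, and c in that order (an assumption about the column order). The function names and the range check are hypothetical helpers, not part of the original framework.

```python
# RME coefficients from Table 1, assumed to be listed in the order a, b, c.
A, B, C = 2670.4, -471.2, 409.2

# Ranges in which the regression is reported to perform best.
NDVI_RANGE = (0.2, 0.7)
P_RANGE_MM = (400.0, 1500.0)

def rme_precipitation(ndvi):
    """Annual precipitation (mm) from NDVI: p = a*NDVI^2 + b*NDVI + c."""
    return A * ndvi ** 2 + B * ndvi + C

def in_best_range(ndvi):
    """True when both NDVI and the estimate fall inside the reported ranges."""
    p = rme_precipitation(ndvi)
    return (NDVI_RANGE[0] < ndvi < NDVI_RANGE[1]
            and P_RANGE_MM[0] < p < P_RANGE_MM[1])

for ndvi in (0.3, 0.5, 0.8):
    print(ndvi, round(rme_precipitation(ndvi), 1), in_best_range(ndvi))
```

With these coefficients an NDVI of 0.5 maps to about 841 mm, close to the 848 mm mean listed for RME, and the NDVI interval 0.2 to 0.7 maps to roughly 420 to 1390 mm, consistent with the stated precipitation range.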

In general, the RMIs demonstrate better performance than RME, which can be attributed to the lower variability of precipitation within a single year than over the whole study period. It is also noted that the R2 values of the RMIs for drier years (2003, 2009, and 2011) are lower than those for wetter years, indicating a weaker coupling between vegetation growth and precipitation in dry years.

The performance of regression models is assessed under three scenarios as described in Sect. 2.2. A total of 300 tests are conducted and performance metrics (i.e., R, R2, ERMS, and EMAR) are calculated accordingly (Fig. 4 and Table 2). The high R values (> 0.85) indicate a strong correlation between NDVI and precipitation independent of sampling method. Also, the regression models demonstrate good performance, with R2 larger than 0.75 and EMAR less than 20 %. In addition, the metrics of regression models fluctuate around that of the RME, with narrow inter-quartile ranges, indicating the regression models have remarkable consistency with the RME model.

Scenario a is designed to examine inter-annual stability in the performance of regression models, where the good performance indicates the acceptable ability of the RME model in estimating precipitation during periods when precipitation measurements are not available. Scenarios b and c investigate the impacts of the temporal and spatial coverages of measurements, respectively. It is noteworthy that better performance in regression models is observed under Scenario b (all gauges) than under Scenario c (partial gauges), implying the greater importance of the spatial coverage of measurements in conducting the regressions. In addition, the results of calibration are better than those of validation for all metrics, as expected. However, the differences between calibration and validation are not significant, implying the consistent performance of regression models under various scenarios.

The performance of RME is further assessed by comparing the estimates against observations (Fig. 5), and good agreement between estimates and observations is observed. It should be noted that the RME shows difficulty in estimating precipitation higher than 2000 mm (cf. the dashed line in Fig. 5), implying the limitation of the fusion framework inherited from the oversaturation effect of the vegetation index.

The effect of elevation on the relationship between precipitation and NDVI also deserves attention. An overall negative relationship is found between precipitation and elevation for the whole elevation range (i.e., 0–5000 m) with an R2 value of 0.62 (Fig. 6a), whereas only a weak relationship is found within the individual elevation bands (Fig. 6b–f). Given the spatial heterogeneity of orographic effects on precipitation (Brunsdon et al., 2001; Daly et al., 2008) and the insufficient data of this study, a more thorough investigation of the relationship between precipitation and elevation needs to be conducted with more information that might be available in the future. Positive precipitation–NDVI relationships are found at different elevation bands (Fig. 7), with the best and worst fits observed at elevation band 2000–3500 m with an R2 value of 0.94 and at elevation band 0–2000 m with an R2 value of 0.62, respectively. By comparing the three band-specific regressions with the global regression, we notice that the regression at band 0–2000 m gives markedly higher precipitation estimates than the other three regressions for lower NDVI values (< 0.4), whereas the regression at band > 3500 m gives markedly higher estimates than the other three regressions for higher NDVI values (> 0.5).

Spatial characteristics of precipitation

The spatial characteristics of the precipitation of the study area are investigated with RME for the whole study period (Fig. 8). Annual precipitation in the Nu River basin is observed to decrease from south to north and from west to east with prominent spatial variability. Two “hot-spot” regions, whose annual precipitation exceeds 1500 mm, can be identified in the study area: one near the southern border and the other close to the southwestern mountain border. The eastern part of the Nu River basin, featuring a dry and warm climate, receives an average annual precipitation of 800 mm with large inter-annual variability. A precipitation product (DEMP) based on a precipitation–elevation relationship is used for comparison with RME. The DEMP product shows no obvious distribution pattern of precipitation (Fig. 9a) and a smaller spatial variability than RME, indicating the advantage of RME in representing the spatial variability of annual precipitation. An overall underestimation of precipitation is also observed in the DEMP product across the whole study area (Fig. 9b). In addition, the pixels in Fig. 8 with a value out of the valid range (i.e., 400 mm yr⁻¹ < P < 1500 mm yr⁻¹) may have a relatively large error as discussed in Sect. 4.1. As there is no justifiable method for such a correction and given the limited fraction of invalid pixels (10 % in the whole study area and 7 % in the Nu River basin), the figure can be used to demonstrate a full picture of the spatial precipitation pattern in the study area, but we note those pixels are of large uncertainties and should be interpreted with caution.

Average annual precipitation distribution of 2003–2012 from RME.

(a) The map of precipitation estimates of DEMP; (b) difference in precipitation estimates between RME and DEMP.

Model performance comparison

The performance of the IDW approach, the TRMM product, and the fusion framework is compared in this section. IDW is one of the most popular methods for spatial interpolation of rainfall due to its easy implementation and flexibility in incorporating other auxiliary information (e.g., elevation). In general, the IDW approach is unable to reproduce the high spatial variability, though it can capture the general spatial distribution over the whole basin (Fig. 10a), as does TRMM (Fig. 10b). Due to the coarse spatial resolution, TRMM cannot capture the high variability in the river valley, where the elevation varies significantly. Although large rainfall (> 1800 mm) is observed in both our product and the TRMM product in the southwest of the study area, our product gives lower rainfall compared to TRMM. As discussed above, the regression model tends to underestimate rainfall as the annual rainfall exceeds a certain threshold because the water supply is no longer a determinant of vegetation growth.

Spatial distribution of mean annual precipitation of 2003–2012 estimated by (a) IDW and (b) TRMM.

To demonstrate the advantage of the fusion framework, a cross-validation is conducted against the randomly sampled gauge observations by varying the number of samples (1–40). The cross-validation shows the highest ERMS for the IDW approach, followed by TRMM and RME (Fig. 11a). A higher mean EMR of 15 % is observed for TRMM than for IDW (8 %) and RME (5 %), while the differences in EMAR are minimal between TRMM and IDW. The results indicate an overestimated precipitation by TRMM as compared to gauge observations. Table 3 summarizes the maximum, minimum, and mean values of each method and shows the relative difference between RME and the other two methods. On average, the ERMS of RME is smaller than that of IDW and TRMM by 20.4 and 17.4 %, respectively. In general, the fusion framework demonstrates better performance than the other approaches.

Performance of ERMS, EMR, and EMAR for three methods in different removed numbers.

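Performance comparison between IDW, RME, and TRMM.

| Method | Statistics | ERMS (mm) | EMR | EMAR |
|---|---|---|---|---|
| IDW | max | 273 | 0.1 | 0.26 |
| IDW | min | 249 | 0.08 | 0.23 |
| IDW | mean | 223 | 0.05 | 0.21 |
| TRMM | max | 220 | 0.17 | 0.24 |
| TRMM | min | 213 | 0.16 | 0.23 |
| TRMM | mean | 203 | 0.15 | 0.22 |
| RME | max | 183 | 0.07 | 0.18 |
| RME | min | 177 | 0.05 | 0.17 |
| RME | mean | 168 | 0.04 | 0.16 |
| RME–IDW (%) | max | -32.9 | -33 | -30.5 |
| RME–IDW (%) | min | -26.3 | -9.8 | -21.4 |
| RME–IDW (%) | mean | -20.4 | -1.2 | -18.9 |
| RME–TRMM (%) | max | -16.8 | -59.5 | -23.8 |
| RME–TRMM (%) | min | -16.6 | -66 | -25.9 |
| RME–TRMM (%) | mean | -17.4 | -71.5 | -28.3 |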

To further evaluate the performance of RME, the annual averages of precipitation of five hydrological stations (Fig. 12a) and the whole basin estimated by the three approaches (IDW, RME, and TRMM) are compared. At the whole basin scale, the estimate by RME is 5.2 % higher than that of IDW but 7.9 % lower than TRMM. Although the difference between the three approaches is minimal at the basin scale, the difference at the sub-basin scale is remarkable. In the upstream region (i.e., the Gongshan sub-basin) located on the Tibetan Plateau, TRMM overestimates precipitation by 13.2 %, while IDW underestimates it by 7.6 % as compared with RME. In the other four downstream sub-basins, estimates by RME are larger than those by IDW and TRMM. In general, in the midstream and downstream regions with large variability in terrain height, RME gives larger estimates of precipitation than IDW and TRMM.

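Regression model performance and coefficients of regression.

| Vegetation index | R2 | ERMS (mm) | EMAR (%) | a | b | c |
|---|---|---|---|---|---|---|
| NDVI | 0.83 | 174.7 | 14.8 | 2670.4 | -471.2 | 409.2 |
| EVI | 0.87 | 143.8 | 12.4 | 5129.6 | 702.5 | 254.7 |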

Results of two regression models established with extra independent variables: RME + T for temperature, RME + H for elevation.

| Model | R2 | ERMS (mm) | EMAR (%) | a | b | c | Extra coefficient |
|---|---|---|---|---|---|---|---|
| RME | 0.83 | 174.7 | 15 | 2670.4 | -471.2 | 409.2 | – |
| RME + T | 0.84 | 172.6 | 15 | 2728.8 | -496 | 407.3 | -0.2 |
| RME + H | 0.84 | 172.6 | 15 | 2838.4 | -638.7 | 492.9 | -0.02 |

To validate the accuracy of different precipitation estimates, we derive a water-budget-based precipitation (i.e., ET + R) from the monthly MODIS (MOD16) global ET (evapotranspiration) product with 1 km spatial resolution (Mu et al., 2011) and the observed streamflow, and compare it with five products, including RME, BandP (rainfall based on the precipitation–NDVI relationship with consideration of elevation bands), DEMP, TRMM, and IDW (Fig. 12b). Although all five products underestimate the sub-basin-scale precipitation, RME and BandP give the closest estimates to the water-budget-based precipitation, indicating the effectiveness of the precipitation–NDVI relationship in precipitation remapping.
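The water-budget reference can be illustrated with a short sketch. It assumes that the long-term change in storage is negligible, so that precipitation is approximated by evapotranspiration plus runoff depth. The runoff depth uses the mean annual runoff (69 km3) and basin area (142 479 km2) quoted in the study-area description; pairing these two figures is an assumption, and the ET value is purely hypothetical.

```python
def runoff_depth_mm(volume_km3, area_km2):
    """Convert an annual runoff volume to an equivalent depth over the basin."""
    return volume_km3 / area_km2 * 1.0e6   # km3 / km2 = km; 1 km = 1e6 mm

def water_balance_precipitation_mm(et_mm, runoff_mm):
    """P = ET + R, neglecting storage change (assumption of this sketch)."""
    return et_mm + runoff_mm

runoff_mm = runoff_depth_mm(69.0, 142479.0)
et_mm = 450.0                              # hypothetical basin-mean annual ET
precip_mm = water_balance_precipitation_mm(et_mm, runoff_mm)
print(round(runoff_mm, 1), round(precip_mm, 1))
```

The same two functions apply at the sub-basin scale once the gauged streamflow of a hydrological station and its drainage area are used in place of the basin totals.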

We also compared our products with the Multi-Source Weighted-Ensemble Precipitation (MSWEP) product. The dataset takes advantage of a wide range of data sources, including gauges, satellites, and atmospheric reanalysis models, to obtain the best possible precipitation estimates at the global scale with a high 3-hourly temporal and 0.25° spatial resolution (Beck et al., 2016). A comparison of the annual mean precipitation between the gauge measurements and the predictions by the MSWEP and TRMM products (Fig. 13) shows acceptable performance of both MSWEP and TRMM, with an overall overestimation. The RMSE values for MSWEP, TRMM, and RME are 241, 196, and 174 mm, respectively, indicating that RME gives the best prediction among the three products. A possible reason why MSWEP shows no superiority over TRMM in predicting annual precipitation is that very few gauges are available in this region, which might limit the applicability of the MSWEP methodology. However, the MSWEP methodology does provide insights into the production of high temporal resolution (3-hourly) rainfall, which we believe will be helpful to our future work.
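For reference, the error metrics used in such comparisons can be sketched as below. The definitions are assumptions based on the metric names (root mean square error, mean relative error, and mean absolute relative error); the paper's own definitions are given outside this section and may differ in detail. The gauge values are hypothetical.

```python
import math


def e_rms(est, obs):
    """Root mean square error, in the units of the data (mm)."""
    return math.sqrt(sum((e - o) ** 2 for e, o in zip(est, obs)) / len(obs))


def e_mr(est, obs):
    """Mean relative error in percent (signed; errors can cancel out)."""
    return 100.0 * sum((e - o) / o for e, o in zip(est, obs)) / len(obs)


def e_mar(est, obs):
    """Mean absolute relative error in percent (errors cannot cancel)."""
    return 100.0 * sum(abs(e - o) / o for e, o in zip(est, obs)) / len(obs)


# Hypothetical annual precipitation at three gauges (mm).
obs = [1000.0, 1500.0, 2000.0]
est = [1100.0, 1350.0, 2000.0]

rmse = e_rms(est, obs)
mr = e_mr(est, obs)
mar = e_mar(est, obs)
```

In this example the +10 % and -10 % errors cancel in the signed metric but not in the absolute one, which is why the two are usually reported together.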

Influence of different vegetation indices

Considering the possible degradation in model performance caused by oversaturation of NDVI in high biomass areas, another vegetation indicator, the enhanced vegetation index (EVI), is suggested as an alternative for estimating vegetation growth (Matsushita et al., 2007; Liao et al., 2015). As such, we also test the fusion framework with EVI in addition to NDVI and the results are assessed against the gauge observations.

Based on the chosen metrics, EVI is found to outperform NDVI with better regression quality (Table 4): the EVI-based regression model gives a higher R2 and smaller ERMS and EMAR than the NDVI-based model. Also, a remarkable difference is observed in the precipitation estimates based on the two vegetation indices (Fig. 14). The curvature of the EVI-based model is larger than that of the NDVI-based model, suggesting higher sensitivity of the EVI-based model in a humid environment. Although the EVI-based model demonstrates better performance than the NDVI-based one, it should be noted that NDVI is the most popular vegetation index used in operational applications among the available vegetation index products. Besides, NDVI has a relatively longer temporal coverage than other vegetation index products. For instance, the AVHRR (Advanced Very High Resolution Radiometer) NDVI data have been available since 1982 with global coverage. As such, when EVI is unavailable, NDVI is a satisfactory index that can be used in the fusion framework.
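One way to read the curvature statement is sketched below. It assumes the second-order polynomial form adopted by the framework and writes the normalized vegetation index as x with generic coefficients c_0, c_1, and c_2; these symbols are introduced here for illustration and are not the paper's notation.

```latex
P(x) = c_0 + c_1 x + c_2 x^2, \qquad
\frac{\mathrm{d}P}{\mathrm{d}x} = c_1 + 2 c_2 x, \qquad
\frac{\mathrm{d}^2 P}{\mathrm{d}x^2} = 2 c_2 .
```

Under this reading, a larger c_2 makes the slope dP/dx grow faster with x, so the estimated precipitation responds more strongly to a given index increment at high index values, which is where humid, densely vegetated pixels are expected to lie.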

(a) Sub-basins based on hydrological stations. (b) Comparison between precipitation based on the basin water balance (R + ET) and different annual rainfall products: DEMP (P–elevation relationship), BandP (P–NDVI relationship with consideration of elevation bands), RME, TRMM, and IDW. GS, JC, GLH, DWJ, and LK-GS are the abbreviations for Gongshan, Jiuchen, Gulaohe, Dawanjing, and Liuku-Gongshan, respectively.

Influence of other ambient determinants

One major assumption of the proposed framework is that precipitation is the only determinant of vegetation growth, and thus NDVI is regarded as a proxy for precipitation. However, other ambient factors, such as soil properties, solar radiation, air temperature, and elevation, may significantly influence the vegetation growth as well as NDVI values. Considering the data availability of various ambient factors, air temperature and elevation, in addition to NDVI, are adopted as extra determinants to establish the regression models, which are thus termed RME + T and RME + H for air temperature and elevation, respectively. We note that, for simplicity, the extra determinants are assumed to have a linear relationship with precipitation.

The differences in R2, ERMS, and EMAR between the three models are minimal, and the regression coefficients of the three models are very close to each other (Table 5). The negative regression coefficient of temperature in RME + T indicates that precipitation and temperature vary in opposite directions. Since temperature decreases with increasing elevation, RME + T and RME + H essentially provide consistent estimates of precipitation, as is also shown in Fig. 15. The information added by the extra determinants (i.e., air temperature and elevation) is in fact minimal. Overall there is little difference between RME and the other two products. As such, we consider RME, which is based on the vegetation index only, to be a simple and efficient model for precipitation estimation.
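The argument that RME + T and RME + H carry nearly the same information can be illustrated with an idealized sketch. It assumes a second-order polynomial in the normalized NDVI plus one linear extra term, and a temperature that is an exact linear function of elevation (a hypothetical lapse rate of 6.5 °C per km); all data, names, and the solver are made up for illustration and are not the authors' implementation.

```python
def solve(a, b):
    """Solve a square linear system by Gauss-Jordan elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit(ndvi, extra, precip):
    """Least-squares fit of P = c0 + c1*N + c2*N**2 + b*extra
    via the normal equations; returns coefficients and fitted values."""
    rows = [[1.0, v, v * v, e] for v, e in zip(ndvi, extra)]
    k = 4
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    atp = [sum(r[i] * p for r, p in zip(rows, precip)) for i in range(k)]
    coef = solve(ata, atp)
    pred = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    return coef, pred


# Hypothetical gauges: normalized NDVI, elevation (km), precipitation (mm).
ndvi = [0.2, 0.35, 0.5, 0.6, 0.7, 0.8, 0.9]
elev_km = [4.0, 3.2, 2.5, 2.8, 1.5, 1.9, 0.8]
precip = [520.0, 760.0, 1010.0, 1150.0, 1480.0, 1600.0, 2050.0]
# Idealized assumption: temperature is an exact linear function of elevation.
temp_c = [25.0 - 6.5 * h for h in elev_km]

coef_h, pred_h = fit(ndvi, elev_km, precip)   # analogue of RME + H
coef_t, pred_t = fit(ndvi, temp_c, precip)    # analogue of RME + T
max_gap = max(abs(a - b) for a, b in zip(pred_h, pred_t))
```

In this idealized case the two fits produce the same predictions up to rounding, because temperature and elevation span the same regression space once an intercept is included. With real data the relationship is only approximate, so small differences such as those in Fig. 15 remain.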

Comparison of mean annual precipitation between the gauge measurements and predictions by MSWEP, TRMM, and RME.

Regression relationship between annual precipitation and normalized NDVI/EVI.

Spatial precipitation difference between RME and (a) RME + H; (b) RME + T.

Conclusion

In this study, a satellite–gauge–vegetation fusion framework has been developed for estimating the precipitation in mountainous areas by establishing a regression relationship between gauge-based precipitation observations and a satellite-based vegetation dataset. The fusion framework was then applied in the Nu River basin of Southwest China for estimating precipitation between 2001 and 2012.

The fusion framework for the Nu River basin adopted a second-order polynomial form and demonstrated promising ability in capturing the high spatial variability of precipitation in the river valley. Five evaluation metrics, including R, R2, ERMS, EMR, and EMAR, indicated good performance of the fusion framework in precipitation estimation. The performance of the fusion framework was also compared with the IDW approach and the TRMM product, and the comparison indicated that the fusion framework generally outperformed the other approaches in estimating precipitation in mountainous areas. On average, the ERMS of the fusion framework is 20.4 % and 17.4 % smaller than those of IDW and TRMM, respectively. The EMR of the fusion framework is 1.2 % and 71.5 % smaller than those of IDW and TRMM, respectively. The EMAR of the fusion framework is 18.9 % and 28.3 % smaller than those of IDW and TRMM, respectively.

The successful application of the fusion framework in the Nu River basin sheds light on precipitation estimation in mountainous areas using multi-source datasets. However, this framework does have certain limitations that are important to appreciate. First, the framework has been applied only in the Nu River basin. More mountainous areas under different climates need to be examined to further test the robustness of this framework. In addition, although the RME model can utilize the full knowledge of precipitation in the entire study period compared with RMI models, the difference in the coefficients suggests apparent inter-annual variability of precipitation that should be considered when applying these models. Given the duration and purpose of a study, we suggest that the RME model be used for long-term climatology identification and the RMI models for inter-annual variability examination. Also, to fully verify the theoretical basis of this framework, namely that vegetation actively interacts with precipitation in mountainous areas, future work is required to refine the spatiotemporal resolution of this study to enable better scrutiny of vegetation–precipitation interactions at sub-monthly scales across more detailed vegetation species.

Data availability

The MODIS data used in our study (MOD13A3, MYD13A3, and MCD12Q1) are supplied by NASA (the National Aeronautics and Space Administration) and can be accessed at https://reverb.echo.nasa.gov/reverb/url_hashes/61rh6164. The meteorological data (including precipitation and air temperature) are supplied by the China Meteorological Administration and can be downloaded at http://data.cma.cn/data/detail/dataCode/SURF_CLI_CHN_MUL_DAY_V3.0/keywords/v3.0.html. The river mask file of the Nu River is available on request from the corresponding author via sunting@tsinghua.edu.cn.

Merging of NDVI datasets

The merging of NDVI datasets improves the accuracy as expected (Fig. A1); the monthly error rates (i.e., the ratio of pixels whose quality value is over 1) of MOD and MMD are generally reduced, by 5 % on average and by over 20 % in several months. Figure A2 shows that the accuracy of MMD is significantly improved in a ridge area covering 23°10′–23°40′ N and 98°30′–99°0′ E. Figure A2b shows that the NDVI value near the right and left boundaries is underestimated by MOD. Figure A2c shows that the NDVI value in the middle boundary is underestimated by MYD. The underestimates in both products near the boundary of MOD and MYD are amended in MMD (Fig. A2a). Figure A3 shows the three NDVI series for one rain gauge. Compared with the MOD series, the improved accuracy in MMD is mainly observed in the wet season (from May to October), when the NDVI values are often underestimated due to overcast conditions.
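The error rate defined above, and one plausible way of merging two NDVI products, can be sketched as follows. The merging rule shown here (per pixel, keep the value with the lower quality flag, preferring MOD on ties) is an assumption made for illustration and is not necessarily the rule used to build MMD; all pixel values are hypothetical.

```python
def error_rate(quality):
    """Ratio of pixels whose quality value is over 1."""
    return sum(q > 1 for q in quality) / len(quality)


def merge(ndvi_a, qa_a, ndvi_b, qa_b):
    """Per-pixel merge of two products: keep the NDVI value carrying the
    lower (better) quality flag; product a wins ties (assumed rule)."""
    merged_ndvi, merged_qa = [], []
    for va, qa, vb, qb in zip(ndvi_a, qa_a, ndvi_b, qa_b):
        if qa <= qb:
            merged_ndvi.append(va)
            merged_qa.append(qa)
        else:
            merged_ndvi.append(vb)
            merged_qa.append(qb)
    return merged_ndvi, merged_qa


# Hypothetical four-pixel scene; cloudy pixels carry high quality flags
# and depressed NDVI values.
mod_ndvi, mod_qa = [0.62, 0.30, 0.71, 0.55], [0, 2, 1, 3]
myd_ndvi, myd_qa = [0.60, 0.58, 0.40, 0.57], [1, 0, 2, 3]

mmd_ndvi, mmd_qa = merge(mod_ndvi, mod_qa, myd_ndvi, myd_qa)
rates = {
    "MOD": error_rate(mod_qa),
    "MYD": error_rate(myd_qa),
    "MMD": error_rate(mmd_qa),
}
```

A pixel stays unreliable in the merged product only when both inputs are flagged, so the merged error rate can never exceed that of either input under this rule.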

Monthly error rate of MOD, MYD, and MMD.

Comparison of three NDVI products over a ridge area in June 2006: (a) MMD, (b) MOD, and (c) MYD.

Comparison of three NDVI monthly time series over one gauge.

The authors declare that they have no conflict of interest.

Acknowledgements

The study is supported by the NSFC under grants U1202231, 51679119, and 91647107, the National Key Technology Support Program under grant 2011BAC09B07-3, and by the China Postdoctoral Science Foundation under grant 2015T80093. The authors thank the China Meteorological Administration, Yunnan University, MODIS NDVI, the Tropical Rainfall Measuring Mission (TRMM), and the Shuttle Radar Topography Mission (SRTM) for providing the data used in this study.

Edited by: L. Wang

Reviewed by: three anonymous referees

References

Akbari, A., Abu Samah, A., and Othman, F.: Integration of SRTM and TRMM date into the GIS-based hydrological model for the purpose of flood modelling, Hydrol. Earth Syst. Sci. Discuss., 9, 4747–4775, 10.5194/hessd-9-4747-2012, 2012.

Arias-Hidalgo, M., Bhattacharya, B., Mynett, A. E., and van Griensven, A.: Experiences in using the TMPA-3B42R satellite data to complement rain gauge measurements in the Ecuadorian coastal foothills, Hydrol. Earth Syst. Sci., 17, 2905–2915, 10.5194/hess-17-2905-2013, 2013.

Beck, H. E., van Dijk, A. I. J. M., Levizzani, V., Schellekens, J., Miralles, D. G., Martens, B., and de Roo, A.: MSWEP: 3-hourly 0.25° global gridded precipitation (1979–2015) by merging gauge, satellite, and reanalysis data, Hydrol. Earth Syst. Sci. Discuss., 10.5194/hess-2016-236, in review, 2016.

Brunsdon, C., McClatchey, J., and Unwin, D. J.: Spatial variations in the average rainfall-altitude relationship in Great Britain: an approach using geographically weighted regression, Int. J. Climatol., 21, 455–466, 10.1002/joc.614, 2001.

Campo-Bescós, M. A., Muñoz-Carpena, R., Southworth, J., Zhu, L., Waylen, P. R., and Bunting, E.: Combined Spatial and Temporal Effects of Environmental Controls on Long-Term Monthly NDVI in the Southern Africa Savanna, Remote Sensing, 5, 6513–6538, 10.3390/rs5126513, 2013.

Chen, F. and Li, X.: Evaluation of IMERG and TRMM 3B43 Monthly Precipitation Products over Mainland China, Remote Sensing, 8, 472, 10.3390/rs8060472, 2016.

Chen, F., Liu, Y., Liu, Q., and Li, X.: Spatial downscaling of TRMM 3B43 precipitation considering spatial heterogeneity, Int. J. Remote Sens., 35, 3074–3093, 10.1080/01431161.2014.902550, 2014.

Chen, J., Yong, B., Ren, L., Wang, W., Chen, B., Lin, J., Yu, Z., and Li, N.: Using a Kalman Filter to Assimilate TRMM-Based Real-Time Satellite Precipitation Estimates over Jinghe Basin, China, Remote Sensing, 8, 899, 10.3390/rs8110899, 2016.

National Research Council: Assessment of the Benefits of Extending the Tropical Rainfall Measuring Mission: A Perspective from the Research and Operations Communities, Interim Report, available from: https://www.nap.edu/catalog/11195/assessment-of-the-benefits (last access: 18 November 2016), 2005.

Daly, C., Halbleib, M., Smith, J. I., Gibson, W. P., Doggett, M. K., Taylor, G. H., Curtis, J., and Pasteris, P. P.: Physiographically sensitive mapping of climatological temperature and precipitation across the conterminous United States, Int. J. Climatol., 28, 2031–2064, 10.1002/joc.1688, 2008.

Borges, P. A., Franke, J., da Anunciação, Y. M. T., Weiss, H., and Bernhofer, C.: Comparison of spatial interpolation methods for the estimation of precipitation distribution in Distrito Federal, Brazil, Theor. Appl. Climatol., 123, 335–348, 10.1007/s00704-014-1359-9, 2016.

Duan, Z. and Bastiaanssen, W. G. M.: First results from Version 7 TRMM 3B43 precipitation product in combination with a new downscaling–calibration procedure, Remote Sens. Environ., 131, 1–13, 10.1016/j.rse.2012.12.002, 2013.

Immerzeel, W. W., Rutten, M. M., and Droogers, P.: Spatial downscaling of TRMM precipitation using vegetative response on the Iberian Peninsula, Remote Sens. Environ., 113, 362–370, 10.1016/j.rse.2008.10.004, 2009.

Jacquin, A. P. and Soto-Sandoval, J. C.: Interpolation of monthly precipitation amounts in mountainous catchments with sparse precipitation networks, Chil. J. Agr. Res., 73, 406–413, 10.4067/S0718-58392013000400012, 2013.

Jia, S., Zhu, W., Lü, A., and Yan, T.: A statistical spatial downscaling algorithm of TRMM precipitation based on NDVI and DEM in the Qaidam Basin of China, Remote Sens. Environ., 115, 3069–3079, 10.1016/j.rse.2011.06.009, 2011.

Jing, W., Yang, Y., Yue, X., and Zhao, X.: A Spatial Downscaling Algorithm for Satellite-Based Precipitation over the Tibetan Plateau Based on NDVI, DEM, and Land Surface Temperature, Remote Sensing, 8, 655, 10.3390/rs8080655, 2016.

Joyce, R. J., Janowiak, J. E., Arkin, P. A., and Xie, P.: CMORPH: A Method that Produces Global Precipitation Estimates from Passive Microwave and Infrared Data at High Spatial and Temporal Resolution, J. Hydrometeorol., 5, 487–503, 10.1175/1525-7541(2004)005<0487:CAMTPG>2.0.CO;2, 2004.

Kariyeva, J. and Van Leeuwen, W. J. D.: Environmental Drivers of NDVI-Based Vegetation Phenology in Central Asia, Remote Sensing, 3, 203–246, 10.3390/rs3020203, 2011.

Kneis, D., Chatterjee, C., and Singh, R.: Evaluation of TRMM rainfall estimates over a large Indian river basin (Mahanadi), Hydrol. Earth Syst. Sci., 18, 2493–2502, 10.5194/hess-18-2493-2014, 2014.

Krakauer, N. Y., Pradhanang, S. M., Lakhankar, T., and Jha, A. K.: Evaluating Satellite Products for Precipitation Estimation in Mountain Regions: A Case Study for Nepal, Remote Sensing, 5, 4107–4123, 10.3390/rs5084107, 2013.

Li, B., Tao, S., and Dawson, R. W.: Relations between AVHRR NDVI and ecoclimatic parameters in China, Int. J. Remote Sens., 23, 989–999, 10.1080/014311602753474192, 2002.

Li, D., Ding, X., and Wu, J.: Simulating the regional water balance through hydrological model based on TRMM satellite rainfall data, Hydrol. Earth Syst. Sci. Discuss., 12, 2497–2525, 10.5194/hessd-12-2497-2015, 2015.

Li, H., Calder, C. A., and Cressie, N.: Beyond Moran's I: Testing for Spatial Dependence Based on the Spatial Autoregressive Model, Geogr. Anal., 39, 357–375, 10.1111/j.1538-4632.2007.00708.x, 2007.

Li, Z. and Guo, X.: Detecting Climate Effects on Vegetation in Northern Mixed Prairie Using NOAA AVHRR 1-km Time-Series NDVI Data, Remote Sensing, 4, 120–134, 10.3390/rs4010120, 2012.

Liao, Z., He, B., and Quan, X.: Modified enhanced vegetation index for reducing topographic effects, J. Appl. Remote Sens., 9, 096068–096068, 10.1117/1.JRS.9.096068, 2015.

Lloyd, C. D.: Assessing the effect of integrating elevation data into the estimation of monthly precipitation in Great Britain, J. Hydrol., 308, 128–150, 10.1016/j.jhydrol.2004.10.026, 2005.

Mahmud, M. R., Numata, S., Matsuyama, H., Hosaka, T., and Hashim, M.: Assessment of Effective Seasonal Downscaling of TRMM Precipitation Data in Peninsular Malaysia, Remote Sensing, 7, 4092–4111, 10.3390/rs70404092, 2015.

Mair, A. and Fares, A.: Comparison of Rainfall Interpolation Methods in a Mountainous Region of a Tropical Island, J. Hydrol. Eng., 16, 371–383, 10.1061/(ASCE)HE.1943-5584.0000330, 2011.

Marquínez, J., Lastra, J., and García, P.: Estimation models for precipitation in mountainous regions: the use of GIS and multivariate analysis, J. Hydrol., 270, 1–11, 10.1016/S0022-1694(02)00110-5, 2003.

Matsushita, B., Yang, W., Chen, J., Onda, Y., and Qiu, G.: Sensitivity of the Enhanced Vegetation Index (EVI) and Normalized Difference Vegetation Index (NDVI) to Topographic Effects: A Case Study in High-density Cypress Forest, Sensors, 7, 2636–2651, 10.3390/s7112636, 2007.

Mourre, L., Condom, T., Junquas, C., Lebel, T. E., Sicart, J., Figueroa, R., and Cochachin, A.: Spatio-temporal assessment of WRF, TRMM and in situ precipitation data in a tropical mountain environment (Cordillera Blanca, Peru), Hydrol. Earth Syst. Sci., 20, 125–141, 10.5194/hess-20-125-2016, 2016.

Phillips, D. L., Dolph, J., and Marks, D.: A comparison of geostatistical procedures for spatial analysis of precipitation in mountainous terrain, Agr. Forest Meteorol., 58, 119–141, 10.1016/0168-1923(92)90114-J, 1992.

Rozante, J. R., Moreira, D. S., de Goncalves, L. G. G., and Vila, D. A.: Combining TRMM and Surface Observations of Precipitation: Technique and Validation over South America, Weather Forecast., 25, 885–894, 10.1175/2010WAF2222325.1, 2010.

Schamm, K., Ziese, M., Becker, A., Finger, P., Meyer-Christoffer, A., Schneider, U., Schröder, M., and Stender, P.: Global gridded precipitation over land: a description of the new GPCC First Guess Daily product, Earth Syst. Sci. Data, 6, 49–60, 10.5194/essd-6-49-2014, 2014.

Song, J., Xia, J., Zhang, L., Wang, Z.-H., Wan, H., and She, D.: Streamflow prediction in ungauged basins by regressive regionalization: a case study in Huai River Basin, China, Hydrol. Res., 47, 1053–1068, 10.2166/nh.2015.155, 2016.

Sun, J., Cheng, G., Li, W., Sha, Y., and Yang, Y.: On the Variation of NDVI with the Principal Climatic Elements in the Tibetan Plateau, Remote Sensing, 5, 1894–1911, 10.3390/rs5041894, 2013.

Wang, S., Huang, G. H., Lin, Q. G., Li, Z., Zhang, H., and Fan, Y. R.: Comparison of interpolation methods for estimating spatial distribution of precipitation in Ontario, Canada, Int. J. Climatol., 14, 3745–3751, 10.1002/joc.3941, 2014.

Woldemeskel, F. M., Sivakumar, B., and Sharma, A.: Merging gauge and satellite rainfall with specification of associated uncertainty across Australia, J. Hydrol., 499, 167–176, 10.1016/j.jhydrol.2013.06.039, 2013.

Wong, J. S., Razavi, S., Bonsal, B. R., Wheater, H. S., and Asong, Z. E.: Evaluation of various daily precipitation products for large-scale hydro-climatic applications over Canada, Hydrol. Earth Syst. Sci. Discuss., 10.5194/hess-2016-511, in review, 2016.

Worqlul, A. W., Collick, A. S., Tilahun, S. A., Langan, S., Rientjes, T. H. M., and Steenhuis, T. S.: Comparing TRMM 3B42, CFSR and ground-based rainfall estimates as input for hydrological models, in data scarce regions: the Upper Blue Nile Basin, Ethiopia, Hydrol. Earth Syst. Sci. Discuss., 12, 2081–2112, 10.5194/hessd-12-2081-2015, 2015.

Xu, S., Wu, C., Wang, L., Gonsamo, A., Shen, Y., and Niu, Z.: A new satellite-based monthly precipitation downscaling algorithm with non-stationary relationship between precipitation and land surface characteristics, Remote Sens. Environ., 162, 119–140, 10.1016/j.rse.2015.02.024, 2015.

Zhou, L., Chen, Y., Liang, N., and Ni, Y.: Daily rainfall model to merge TRMM and ground based observations for rainfall estimations, in 2016 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), 601–604, 2016.

</app></app-group></back> </article>