Global component analysis of errors in five satellite-only global precipitation estimates

Revealing the error components of satellite-only precipitation products (SPPs) helps algorithm developers and end-users understand their error features and is fundamental to customizing retrieval algorithms and error adjustment models. Two error decomposition schemes were employed to explore the error components of five SPPs (i.e., IMERG-Late, IMERG-Early, GSMaP-MVK, GSMaP-NRT, and PERSIANN-CCS) over different seasons, rainfall intensities, and topography classes. First, this study depicted, for the first time, global maps of the total bias (total mean squared error) and its three (two) independent components for these five SPPs over four seasons. We found that the evaluation results for one region could not be extended to another, similar region. Hit and/or false biases are the major components of the total bias over most of the global land areas. In addition, the systematic error accounts for less than 20% of the total error in most areas. Note that each SPP has larger systematic errors in several regions (i.e., Russia, China, and the conterminous United States) in all four seasons; these larger systematic errors from the retrieval algorithms are primarily due to missed precipitation. Furthermore, the IMERG suite and GSMaP-NRT display smaller systematic errors for rain rates below 40 mm/day, while the systematic errors of GSMaP-MVK and PERSIANN-CCS increase with increasing rainfall intensity. Given that mean elevation cannot reflect the complexity of the terrain, we introduced the standard deviation of elevation (SDE) in place of mean elevation to better describe topographic complexity. Compared with the other SPPs, the GSMaP suite shows a stronger topographic dependency in the four bias scores. A novel metric, the normalized error component (NEC), was proposed to fairly evaluate the impact of topography alone on the systematic (random) error. It is found that these products show different topographic dependency patterns in systematic (random) error. Meanwhile, the pattern of the impact of topography alone on the systematic (random) error is similar to the relationship between the systematic (random) error and topography, because the average precipitation of all topography categories is very close. Finally, potential directions for improving satellite precipitation retrieval algorithms and error adjustment models were identified in this study.


Precipitation is one of the crucial inputs of the hydrological cycle, and obtaining accurate precipitation data is therefore of great significance for the study of the global water cycle (Hou et al., 2014; Kidd et al., 2017; Chen et al., 2019a). Traditional methods depend on rain gauges to obtain precise point-scale precipitation observations (Kidd and Huffman, 2011b). In addition, ground-based radars can provide accurate precipitation estimates over a range of approximately 250 km (Chen et al., 2019b). However, these two measurement methods are affected by the local environment, economy, and other factors, and it is difficult to obtain spatiotemporally continuous precipitation estimates over many regions of the world, especially over complex mountainous regions and developing countries (Baez-Villanueva et al., 2020).

Satellite-based instruments can overcome the limitations of rain gauges and ground-based radars and provide precipitation estimates with high spatiotemporal resolution and even global coverage (Kidd et al., 2011a). However, satellite precipitation products contain substantial random errors, systematic errors, and large uncertainties, especially over complex mountains (Tian et al., 2010a; Maggioni et al., 2016a; Chen et al., 2020). Therefore, it is necessary to comprehensively analyze the errors of satellite precipitation products, especially for their satellite-only versions.

Over the past 20 years, many studies have investigated the error features of satellite precipitation products at the global scale (e.g., Yong et al., 2015; Liu et al., 2011; Yong et al., 2010, 2013, 2016; Takido et al., 2016; Tan et al., 2017; Prakash et al., 2018; Gebregiorgis et al., 2018; Beck et al., 2019; Chen et al., 2019b). These studies provided a great deal of valuable information for algorithm developers and end-users. However, most studies used relative bias/mean error to analyze the error features of SPPs, which could be misleading because error components of opposite sign can cancel in the average. In some cases, the relative bias/mean error is small even though the absolute values of its error components are large (Chen et al., 2019b).

Tian et al. (2009) proposed an error decomposition scheme to separate the total bias into three independent components (i.e., hit bias, miss bias, and false bias). This scheme effectively avoids the above-mentioned problem and is a fairer method for analyzing errors.
To date, several evaluation studies have investigated the major components of the total bias of satellite precipitation products in several regions, such as mainland China (Yong et al., 2016; Xu et al., 2016; Su et al., 2018; Chen et al., 2020), the contiguous United States (Tian et al., 2009), and central Asia (Guo et al., 2017). In terms of systematic error, AghaKouchak et al. (2012) used an error decomposition technique proposed by Willmott (1981) to separate the total mean squared error into systematic and random errors, and analyzed the systematic and random errors of three satellite precipitation products (i.e., CMORPH, PERSIANN, and real-time TMPA) over the entire conterminous United States (COUNS). Maggioni et al. (2016b) further investigated the systematic errors of TMPA products over COUNS. However, these studies concentrated on limited regions and lacked investigations at the global scale. Meanwhile, the transferability of regional evaluation results to other similar areas still needs to be investigated, a question that has long troubled algorithm developers and users. Besides, it remains to be determined which component of the total bias tends to produce larger systematic errors.

https://doi.org/10.5194/hess-2020-294 Preprint. Discussion started: 25 September 2020. © Author(s) 2020. CC BY 4.0 License.
Topography is a crucial factor affecting satellite precipitation retrievals (Tapiador et al., 2012; Xu et al., 2017; Chen et al., 2019b). Several studies have investigated the total bias of satellite precipitation retrievals under different terrains (e.g., Takido et al., 2016; Guo et al., 2017; Xu et al., 2017; Chen et al., 2019b). Nevertheless, an analysis of the error components of satellite precipitation estimates under different topography categories is lacking in previous studies. In particular, no study has investigated the potential link between systematic (random) error and terrain, so the impact of topography alone on systematic and random errors remains unclear. These limitations hinder the characterization of satellite precipitation errors. Furthermore, previous studies used mean elevation to describe the terrain of a grid cell, yet the mean elevation of a pixel often cannot objectively represent the complexity of the topography. A more reasonable metric is needed to describe the topography of a grid cell.
Precipitation intensity is also an important factor associated with the errors of satellite precipitation estimates (Chen et al., 2020). Previous efforts found that satellite precipitation products overestimate precipitation in light rainfall events and underestimate it in heavy rainfall events (Tian et al., 2009; Kirstetter et al., 2013; Chen et al., 2013). Tian et al. (2009) investigated the major components of the total bias for different rainfall intensities, and Maggioni et al. (2016b) revealed the relationship between the systematic (random) error and rainfall intensity for TMPA products. Nevertheless, the potential link between the systematic (random) error components of the five evaluated SPPs and precipitation intensity has not yet been examined.
Consequently, the objectives of this study are fivefold: (1) to investigate the major components of errors (including total bias and total mean squared error) for five SPPs; (2) to investigate the potential for the transferability of regional assessment results to other similar regions; (3) to analyze the major components of the total bias and total mean squared error for five SPPs under different rainfall intensities; (4) to analyze the major components of the total error for five SPPs under different terrains and study the impact of topography alone on systematic and random errors; (5) to determine which component of the total bias tends to produce larger systematic errors.

Study area, datasets and methodology

Study area
Our study area covers the global land areas between 60°N and 60°S. Fig. 1a shows the topographic relief across the global land areas; the standard deviation of elevation (SDE; see the methodology section for more information) was introduced to better describe the terrain of each grid cell. Topographic complexity increases with increasing color depth. Areas with rather complex terrain mainly include western COUNS, the Andes, southern Europe, Turkey, Iran, Afghanistan, the Tibetan Plateau (TP), most humid regions of mainland China, and Japan. Furthermore, the global land areas can be divided into four climate regions, namely humid regions (average annual precipitation (AAP) > 800 mm/yr), semi-humid regions (AAP of 400-800 mm/yr), semi-arid regions (AAP of 200-400 mm/yr), and arid regions (AAP < 200 mm/yr) (see Fig. 1b). Detailed information about the climate regions can be found in Fig. 1c.
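The four climate classes above amount to a simple threshold rule on AAP. A minimal sketch follows; the function name is hypothetical, and the handling of values lying exactly on 200, 400, or 800 mm/yr is an assumption, since the text does not say which class owns each boundary.

```python
def classify_climate_region(aap_mm_per_yr: float) -> str:
    """Map average annual precipitation (mm/yr) to a climate region.

    Thresholds follow the text (800, 400, 200 mm/yr). Assigning a value
    that sits exactly on a boundary is an assumption of this sketch.
    """
    if aap_mm_per_yr < 0:
        raise ValueError("AAP cannot be negative")
    if aap_mm_per_yr > 800:
        return "humid"
    if aap_mm_per_yr >= 400:  # assumption: exactly 400 counts as semi-humid
        return "semi-humid"
    if aap_mm_per_yr >= 200:  # 200 is not "< 200", so it is not arid
        return "semi-arid"
    return "arid"


# Illustrative values only, not taken from the paper.
regions = [classify_climate_region(v) for v in (1200.0, 650.0, 300.0, 90.0)]
```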

Reference products
To achieve the objectives of this study, three high-accuracy rain gauge data sets are employed as references. The Climate Prediction Center Unified (CPCU) data set was used as the benchmark over the global land areas except for mainland China. CPCU provides continuous daily precipitation at 0.5° spatial resolution, produced by optimal interpolation (OI) of observations from > 17,000 gauges (Xie et al., 2007; Chen et al., 2008). For the benchmark over mainland China, the China Gauge-based Daily Precipitation Analysis (CGDPA) was employed as one of the references. This data set, with 0.25° spatial and daily temporal resolution, was developed from ~2400 rain gauges using the OI method. Assessment results indicated that this ground-based precipitation data set outperforms the CPCU data and the East Asia gauge analysis (EA_Gauge; Xie et al., 2007) over mainland China (Shen and Xiong, 2016). For the component analysis of errors of the five SPPs over different topographies, high-accuracy ground observations with high spatiotemporal resolution (hourly, 0.1°) from 26326 rain gauges were used as the benchmark. The spatial distribution of these rain gauges can be found in our published papers (i.e., Chen et al., 2019b; Chen et al., 2020). However, this product has large uncertainties in cold seasons due to freezing weather. The analysis is executed at a finer spatial resolution (0.1°), which avoids smoothing the topographic relief as much as possible. In this study, only pixels with at least one rain gauge are considered; the spatial distribution of the rain gauges (including CPCU and CGDPA) is shown in Fig. 1d.

Satellite-only precipitation products
The main focus of this study is to analyze the error components of five SPPs, namely IMERG-Late V6, IMERG-Early V6, GSMaP-MVK V7, GSMaP-NRT V6/V7, and PERSIANN-CCS, over the global land areas. Because the gauge-adjusted satellite precipitation products (e.g., IMERG Final run, gauge-adjusted GSMaP, and PERSIANN Climate Data Record) incorporate ground-based rain gauge observations, they were not employed in this study. This is because the overlap between the gauge-adjusted products and the benchmark results in some potential evaluation [...] information about the production processes of these five SPPs can be found in our previous paper (i.e., Chen et al., 2020).
In the global analysis, all SPPs were resampled to 0.5° spatial resolution and aggregated to daily temporal resolution for consistency with the CPCU data (0.5°, daily). Information on these five SPPs is listed in Table 1.

Error decomposition technique
Tian et al. (2009) proposed an error decomposition scheme that separates the total bias (TB) into hit bias (HB), miss bias (MB), and false bias (FB). This technique is effective in identifying the major error components of the total bias, which provides valuable information for customizing retrieval algorithms and removing errors. The four bias scores are defined following Tian et al. (2009).
Another error decomposition technique decomposes the total mean squared error into systematic and random error components. This strategy was used by Willmott (1981) to separate the errors of numerical weather prediction models into systematic and random components. The two resulting terms represent the systematic and random components of the error, respectively; the slope and intercept of the underlying linear fit are computed by the least-squares method. Note that the systematic error component plus the random error component equals the total mean squared error.

Normalized error component
Differences in precipitation intensity between topography categories may be inevitable, which hinders the study of the influence of topography alone on the systematic (random) error. Thus, a novel metric called the normalized error component (NEC) was proposed to strictly explore the impact of topography alone on the systematic and random errors. The metric is defined in terms of the error component and the mean value of the ground-based observations.
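As a rough illustration of the two ideas above, the sketch below splits the mean squared error into systematic and random parts using a least-squares linear fit of the satellite estimates against the reference, in the manner of Willmott (1981), and then normalizes each part by the mean of the reference. The function names, the synthetic data, and in particular the choice to divide by the squared mean are assumptions made for illustration; the paper's exact NEC formula is not reproduced here.

```python
import numpy as np


def decompose_mse(sat, ref):
    """Split the MSE of `sat` against `ref` into (systematic, random) parts."""
    sat = np.asarray(sat, dtype=float)
    ref = np.asarray(ref, dtype=float)
    # Least-squares line through the (ref, sat) pairs: sat_hat = slope * ref + intercept
    slope, intercept = np.polyfit(ref, sat, 1)
    sat_hat = slope * ref + intercept
    mse_systematic = float(np.mean((sat_hat - ref) ** 2))  # departure of the fit from the 1:1 line
    mse_random = float(np.mean((sat - sat_hat) ** 2))      # scatter around the fitted line
    return mse_systematic, mse_random


def normalized_error_component(component, ref):
    """ASSUMED normalization: divide by the squared mean of the reference."""
    mean_ref = float(np.mean(ref))
    return component / mean_ref ** 2


# Synthetic example data (hypothetical, not from the paper).
rng = np.random.default_rng(0)
ref = rng.gamma(shape=2.0, scale=2.0, size=500)
sat = 0.8 * ref + 0.5 + rng.normal(0.0, 0.5, size=500)

mse_sys, mse_rand = decompose_mse(sat, ref)
mse_total = float(np.mean((sat - ref) ** 2))
systematic_share = mse_sys / mse_total
nec_sys = normalized_error_component(mse_sys, ref)
```

Because the residuals of an ordinary least-squares fit are uncorrelated with the predictor, the cross term vanishes and the two parts add up to the total mean squared error, which is why a "proportion of systematic error" is well defined.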

Index of topography complexity
Mean elevation cannot reflect the complexity of the terrain, so analyzing the errors of SPPs as a function of mean elevation to study the relationship between errors and topography is unreasonable. To better describe the topographic complexity of each grid cell, we propose the standard deviation of elevation (SDE) in place of mean elevation. This score effectively reflects the topographic relief: the larger the SDE value, the greater the relief. SDE is calculated as the standard deviation of the elevation values within each grid cell.
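A minimal sketch of this idea, assuming a fine-resolution elevation array that tiles exactly into coarser analysis cells. The function name, the block size, and the use of the population standard deviation (`ddof=0`) are assumptions, and the toy elevations are invented to show two cells with the same mean elevation but very different relief.

```python
import numpy as np


def sde_per_cell(dem, block):
    """Standard deviation of elevation inside each block x block grid cell."""
    dem = np.asarray(dem, dtype=float)
    rows, cols = dem.shape
    if rows % block or cols % block:
        raise ValueError("DEM shape must be a multiple of the block size")
    # Axes 1 and 3 index the fine pixels that belong to one coarse cell.
    tiles = dem.reshape(rows // block, block, cols // block, block)
    return tiles.std(axis=(1, 3))  # population standard deviation (assumed)


flat_plateau = np.full((2, 2), 1000.0)                      # no relief
rugged = np.array([[500.0, 1500.0], [1500.0, 500.0]])       # strong relief
dem = np.hstack([flat_plateau, rugged])                     # two cells side by side

mean_elevation = dem.reshape(1, 2, 2, 2).mean(axis=(1, 3))
sde = sde_per_cell(dem, block=2)
```

Both cells have a mean elevation of 1000 m, so mean elevation cannot tell them apart, while SDE separates the flat cell from the rugged one.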

Spatial analysis of error components for the total bias over different seasons
It is well known that the errors of SPPs have a strong seasonal dependency, so the analysis of the total bias and its major error components needs to be performed for different seasons. We implemented the following seasonal division scheme: (1) [...]

Finally, a summary of the total bias and its major error components for these SPPs in the main regions of the world is listed in the supplementary materials to help readers quickly find the needed information (see Table S1).

Error components of different precipitation intensities
The three bias scores (i.e., total bias, hit bias, and miss bias) of the five SPPs for different rainfall intensities are shown in Fig. 7. Note that the false error component does not exist here because the rainfall intensity categories are defined from the benchmark. Generally speaking, these SPPs show a high degree of consistency in the three bias scores across precipitation intensities. In addition, hit bias is the major error component for most rainfall intensities.
Compared with the other SPPs, GSMaP-NRT shows relatively larger biases in light rainfall events (1-2 mm/day). This may be due to the lack of backward-propagated PMW estimates in the morphing process, which leads to serious overestimation of precipitation in light rainfall events.
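To make the point about the absent false component concrete, the sketch below splits the satellite-minus-reference differences into hit, miss, and false parts and then repeats the split inside intensity bins defined from the reference. The rain/no-rain threshold of 0, the sign convention (a miss counted as a negative contribution), the bin edges, and the toy data are all assumptions for illustration, not the paper's settings.

```python
import numpy as np


def bias_components(sat, ref, threshold=0.0):
    """Sum of (sat - ref), split by event type (hit, miss, false)."""
    sat = np.asarray(sat, dtype=float)
    ref = np.asarray(ref, dtype=float)
    sat_rain = sat > threshold
    ref_rain = ref > threshold
    diff = sat - ref
    return {
        "total": float(diff.sum()),
        "hit": float(diff[sat_rain & ref_rain].sum()),     # both detect rain
        "miss": float(diff[~sat_rain & ref_rain].sum()),   # only the reference has rain
        "false": float(diff[sat_rain & ~ref_rain].sum()),  # only the satellite has rain
    }


def components_by_reference_intensity(sat, ref, edges):
    """Apply the split within bins [lo, hi) of the reference intensity."""
    sat = np.asarray(sat, dtype=float)
    ref = np.asarray(ref, dtype=float)
    result = {}
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (ref >= lo) & (ref < hi)
        result[(lo, hi)] = bias_components(sat[in_bin], ref[in_bin])
    return result


# Toy daily values in mm/day (hypothetical).
sat = np.array([0.0, 3.0, 0.0, 1.5, 10.0, 30.0, 0.0])
ref = np.array([0.0, 2.0, 1.5, 0.0, 12.0, 50.0, 0.0])

overall = bias_components(sat, ref)
binned = components_by_reference_intensity(sat, ref, edges=[1.0, 2.0, 40.0, np.inf])
```

Every bin requires a nonzero reference value, so no false event (reference equal to zero) can fall into any bin and the false component is zero by construction. In this toy data the miss and false parts also cancel in the overall total, which illustrates why a small total bias can hide sizeable components.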
On the other hand, the variations of the systematic error of these five SPPs across six rainfall intensities are depicted in Fig. 8. Each SPP shows a unique variation of systematic error with increasing precipitation intensity. It can be seen that IMERG [...] proportions of the systematic error when rainfall intensity exceeds 40 mm/day. Besides, all SPPs underestimate the precipitation volume in heavy rainfall events with intensity exceeding 40 mm/day (see Fig. 7). The underestimation by these SPPs in such heavy rainfall events might generate large systematic errors.

[...] observations. Consequently, whether the evaluation results for similar regions can be extended to one another is a scientific question that urgently needs to be answered.

Error components of different topographies
The comparison of errors between the humid regions of COUNS and China is taken as a representative example because these two areas are located at the same latitude and have similar AAP (see Figs. 1b, c). In addition, both areas are dominated by a monsoon climate. All evaluated SPPs exhibit relatively large discrepancies in the spatial maps of the four bias scores for all four seasons between these two study areas (see [...]). Similarly, these two humid areas also differ considerably in the spatial maps of the systematic error (see Fig. 6). In addition, no two areas in the remaining land areas of the world were found whose assessment results can be extended to one another. Our previous study (i.e., Chen et al., 2019b) found that large performance differences exist between various sensors. Meanwhile, the sensors onboard different satellites differ significantly in the spatial maps of sampling frequency (see Fig. 2 [...]

Impact of topography on the systematic error
In Section 3.2, the results indicated that systematic errors are related to rainfall intensity to some extent. Although we used the humid regions of China as the study area and restricted the analysis to the summer season to alleviate the interference of climate and season with the systematic error of these five SPPs, discrepancies in precipitation intensity between topography categories are inevitable and affect the proportions of the systematic error of these products. Thus, we proposed the NEC metric to exclude the impact of precipitation intensity on the systematic error and subsequently assessed the influence of topography alone on the systematic error. [...] complexity when the SDE value is less than 300 m. However, they are negatively correlated from 300 m onward.

Which component of the total bias tends to produce larger systematic error?
Generally speaking, the proportions of the systematic error of the five evaluated SPPs are below 20% in all four seasons over most of the global land areas. However, these SPPs have larger systematic errors in several regions, such as parts of COUNS, China, and Russia (see Fig. 6), which cannot be ignored. In addition, we found that the areas with larger systematic errors always have relatively larger miss biases (see Figs. 2-6). This raises the question of whether miss bias tends to produce larger systematic errors. According to the definitions of systematic and random errors (see equations (7-8)), missed precipitation tends to produce larger systematic error than hit and false biases. We believe that missed precipitation is a decisive factor in producing larger systematic errors.

[...] Baez-Villanueva et al., 2020). In practice, we found that the errors of these five evaluated SPPs show significant regional features. Meanwhile, the impact of several crucial factors (i.e., topography, season, climate, and rainfall intensity) on the errors of satellite precipitation estimates is remarkable, as shown by the results of this study and our previous studies (i.e., Chen et al., 2019b, 2020). Consequently, there are reasons to believe that incorporating all four factors (i.e., topography, season, climate region/different areas, and rainfall intensity) into error adjustment models and blending algorithms would further reduce the errors of satellite precipitation estimates.

Potential directions of the improvement in satellite retrieval algorithms and
Second, the global maps of the total bias (and total mean squared error) and its three [...]

Finally, the results of this paper are useful for improving the adjustment algorithm of the gauge-adjusted version of GSMaP (GSMaP-Gauge), because this product is produced by adjusting GSMaP-MVK with CPCU data.

This paper investigated the major error components of the total error (including total bias and total mean squared error) for five SPPs (i.e., IMERG-Early, IMERG-Late, GSMaP-NRT, GSMaP-MVK, and PERSIANN-CCS) over different seasons, rainfall intensities, and terrains. The major conclusions are summarized as follows:
1. This paper is the first to depict global maps of the total bias (total mean squared error) and its three (two) independent components for five SPPs over four seasons. We found that the errors of these five SPPs have remarkable regional features, and the evaluation results for one region could not be copied to another, similar region. This is due to the differences between areas in the satellite samples used in satellite precipitation retrieval systems. This finding also indicates that the assessment of satellite precipitation products remains necessary over various regions of the world. Future efforts should focus on the areas still lacking evaluation and on investigating novel evaluation techniques that do not rely on ground-based observations.
2. Hit and/or false errors are the major components of the total bias for the five SPPs evaluated over most areas of the world except for China (see Table S1), while [...] the influence of topography alone on the systematic error. It is found that the pattern of the impact of topography alone on the systematic error is almost the same as the relationship between the systematic error and topography, primarily because the mean precipitation of the ground-based observations (see equation (10)) is around 0.24 mm/h in all terrain categories.
We hope that the new findings reported in this paper will be useful for improving satellite precipitation retrieval algorithms and error adjustment models and for improving the potential applications of these products.

Competing interests
The authors declare that they have no conflict of interest.

Fig. 1. (a) Global map of topography; (b) mean annual precipitation of the global land