Validation of precipitation estimates from various products is a challenging
problem, since the true precipitation is unknown. However, with the increased
availability of precipitation estimates from a wide range of instruments
(satellite, ground-based radar, and gauge), it is now possible to apply the
triple collocation (TC) technique to characterize the uncertainties in each
of the products. Classical TC takes advantage of three collocated data
products of the same variable and estimates the mean squared error of each,
without requiring knowledge of the truth. In this study, triplets among
NEXRAD-IV, TRMM 3B42RT, GPCP 1DD, and GPI products are used to quantify the
associated spatial error characteristics across a central part of the
continental US. Data are aggregated to biweekly accumulations from January
2002 through April 2014 across a 2

Precipitation is one of the main drivers of the water cycle; therefore, accurate precipitation estimates are necessary for studying land–atmosphere interactions as well as linkages between the water, energy, and carbon cycles. Surface precipitation is also a principal driver of hydrologic models with a wide range of applications. A wide suite of instruments (in situ and remote sensing) monitor precipitation incident at the Earth's surface. Specifically, there has been a great effort during the last 2 decades to use microwave radar and radiometer instruments on board low-earth-orbit satellites to accurately estimate precipitation over large areas. These estimates, when combined with infrared-based cloud-top temperature observations from geostationary satellites, provide high spatial and temporal resolution precipitation estimates that are appropriate for hydrological and climatological studies.

However, precipitation estimation is inevitably subject to error. The errors are caused by different factors depending on the measurement instrument. For gauge measurements, the sparse distribution of gauges, environmental conditions such as wind and evaporation, and topography contribute to the errors. For ground-based radars, beam blockage in mountainous regions, the empirical backscatter–rain rate relationship (and the simplifications embedded in its functional form), and clutter are among the sources of error. Lastly, for satellite retrievals (both radiometer and radar), assumptions about the surface emissivity, the neglect of evaporation below clouds, and the use of empirical relationships are the main sources of error.

The new Global Precipitation Measurement (GPM) mission aims to integrate
precipitation estimates from a constellation of satellites to provide high
spatial and temporal resolution estimates of precipitation over the Earth

Several studies investigate and model the uncertainties in remotely-sensed
precipitation estimates; however, they all depend on assuming the
ground-based (gauge and/or radar) observations or models representing the
zero-error precipitation (

Triple collocation (TC) provides a platform for quantifying the
root mean square error (RMSE) in three or more products that estimate the
same geophysical variable. Developed by

While TC has been used extensively to estimate errors in soil moisture
products

New variants of TC with wider applications have been introduced in recent years.

In this study, we estimate the spatial RMSE between triplets of precipitation
products across a central part of the US. Unlike

This paper is organized as follows: Sect.

In this section, we review the TC formulation and introduce the
multiplicative error model. In the multiplicative error model for
precipitation, the true precipitation is related to the estimation as

In this study, we use the multiplicative model to relate the precipitation
estimates to the true value; however, without having the truth or making any
assumptions about the distribution of the error, we estimate the RMSE of each
estimate. Taking the logarithm of Eq. (
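As an illustration of the log-transformed formulation, the following Python sketch assumes that taking the logarithm of the multiplicative model yields a linear relation r_i = alpha_i + beta_i t + eps_i between each log-estimate and the log-truth, with mutually independent errors. It then applies the standard covariance-based TC and ETC expressions. The function names, the synthetic data, and the error levels are hypothetical and are not taken from the MATLAB code of this study.

```python
import math
import random


def covariance(x, y):
    """Sample covariance of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def multiplicative_tc(p1, p2, p3):
    """Return (rmse, rho2) for three strictly positive products.

    rmse[i] is the error standard deviation of product i in log scale;
    rho2[i] is its squared correlation with the (unknown) log-truth.
    """
    logs = [[math.log(v) for v in p] for p in (p1, p2, p3)]
    c = [[covariance(a, b) for b in logs] for a in logs]
    rmse, rho2 = [], []
    for i, j, k in ((0, 1, 2), (1, 0, 2), (2, 0, 1)):
        # Under the assumed model this equals beta_i^2 * var(log-truth).
        signal = c[i][j] * c[i][k] / c[j][k]
        rmse.append(math.sqrt(max(c[i][i] - signal, 0.0)))
        rho2.append(signal / c[i][i])
    return rmse, rho2


# Synthetic check (assumed values): log-truth plus independent log-errors.
rng = random.Random(0)
log_truth = [rng.gauss(3.0, 0.5) for _ in range(5000)]
true_sd = (0.2, 0.3, 0.4)
products = [[math.exp(t + rng.gauss(0.0, sd)) for t in log_truth]
            for sd in true_sd]
rmse, rho2 = multiplicative_tc(*products)
```

With independent errors, as in this synthetic setup, the recovered log-scale RMSE values should lie close to the prescribed `true_sd`, and no reference to the truth is needed to obtain them.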

Study domain. The six numbered pixels are used in
Sect.

Based on the ETC introduced by

Figure

Precipitation estimates from five products (NEXRAD-IV, TRMM 3B42RT, TRMM 3B42,
GPI, and GPCP 1DD) are evaluated. NEXRAD-IV is the nationally mosaicked
precipitation estimate from the National Weather Service ground-based
WSR-88D radar network

TRMM 3B42RT is a multi-satellite precipitation estimate from the Tropical
Rainfall Measuring Mission (TRMM) together with other low-earth-orbit
microwave instruments

The GOES Precipitation Index (GPI) is a rainfall retrieval algorithm that
only uses cloud-top temperatures from IR-based observations of geostationary
satellites to estimate rain rate

Climatology of precipitation across the study domain from each of the products.

RMSE of the precipitation rate in logarithmic scale estimated from
TC using triplets in group 1;

RMSE of the precipitation rate in logarithmic scale estimated from
TC using triplets in group 2;

The Global Precipitation Climatology Project (GPCP) is a globally merged daily
precipitation rate at 1

The NEXRAD, TRMM 3B42, and TRMM 3B42RT data are upscaled to a 1

The time domain for this error estimation study is January 2002 through April 2014. All the data products have a complete record within this window, which spans more than a decade. To generate temporally uncorrelated samples that are nonzero, the data from each product are temporally aggregated to biweekly values. Precipitation is a bounded variable and can only take values greater than or equal to zero. If the precipitation estimate at a specific time and location is equal to zero, the error in that estimate can only come from a limited set of values (essentially any number greater than zero). The error is therefore dependent on the measurement (or, equivalently, on the truth). As a result, if all products in a triplet report zero precipitation, the error of each product depends on the measurement and, therefore, the errors depend on each other. This dependence would violate the assumption that all errors are independent and identically distributed. The error dependence decreases as the measurement value moves away from zero. Among the aggregated data, a small percentage of samples have zero biweekly precipitation accumulation; these are removed from the analysis. The percentage of zero-valued samples is less than 2 % across most of the region, except for eight pixels in the southwest (the driest part of the domain), where up to 8 % of the samples are equal to zero. In the accumulation algorithm, any biweekly value with missing hourly or daily measurements is treated as missing.
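The aggregation and screening steps described above can be sketched as follows. This is a minimal Python illustration, not the processing code of this study: the function names are hypothetical, missing values are represented by `None`, a biweekly window is assumed to be 14 consecutive daily values, and a time step is assumed to be discarded when any of the three products is missing or zero (a requirement of the logarithmic transform).

```python
def biweekly_totals(daily, window=14):
    """Sum daily values into consecutive windows.

    A window that contains a missing value (None) is itself marked as
    missing; a trailing incomplete window is dropped.
    """
    totals = []
    for start in range(0, len(daily) - window + 1, window):
        chunk = daily[start:start + window]
        totals.append(None if any(v is None for v in chunk) else sum(chunk))
    return totals


def valid_triplets(a, b, c):
    """Keep time steps where all three products are present and nonzero."""
    return [(x, y, z) for x, y, z in zip(a, b, c)
            if None not in (x, y, z) and min(x, y, z) > 0]


# Four windows: wet, completely dry, one missing day, wet.
daily = [1.0] * 14 + [0.0] * 14 + [2.0] * 13 + [None] + [0.5] * 14
totals = biweekly_totals(daily)
kept = valid_triplets(totals, totals, totals)
```

In this toy series the dry window and the window with a missing day are both excluded, so only two of the four biweekly samples remain for the TC analysis.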

This data aggregation reduces the number of samples across the temporal
domain of this study. TC analysis needs enough samples to be able to provide
an accurate estimation of the error. Therefore, we combine the estimates from
four neighboring 1

In the main analyses of the paper, the four products NEXRAD, TRMM 3B42RT,
GPI, and GPCP 1DD are used. The TRMM 3B42 is used in Sect.

In this section, we apply the multiplicative TC technique to the
precipitation products introduced in Sect.

Figures

The RMSE reported in these figures is based on bootstrap analysis. We run
1000 bootstrap simulations (i.e., sampling with replacement from the original
data time series) and estimate the RMSE using Eqs. (
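The resampling step can be illustrated with a short Python sketch. It assumes that time steps are resampled jointly, so that each bootstrap replicate keeps the three products of a triplet paired, and that the replicates are summarized by their mean and standard deviation; both choices, as well as the function name and the toy data, are assumptions made for illustration only.

```python
import random
import statistics


def bootstrap(samples, estimator, n_boot=1000, seed=0):
    """Apply `estimator` to n_boot resamples drawn with replacement.

    `samples` is a list of collocated triplets; whole triplets are drawn
    so that the pairing between products is preserved.
    Returns the mean and standard deviation of the replicate estimates.
    """
    rng = random.Random(seed)
    n = len(samples)
    estimates = []
    for _ in range(n_boot):
        resample = [samples[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resample))
    return statistics.mean(estimates), statistics.stdev(estimates)


# Toy example: the estimator is simply the mean of the first product.
triplets = [(float(i), float(i) + 1.0, float(i) + 2.0) for i in range(50)]
boot_mean, boot_sd = bootstrap(
    triplets, lambda s: statistics.mean([x for x, _, _ in s]))
```

In practice the estimator would be a TC error estimate computed from the resampled triplets; the spread of the replicates then gives an indication of the sampling uncertainty of that estimate.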

The first observation and control check from Figs.

The RMSE estimates, shown in Figs.

Equation (

RMSE of the precipitation rate estimated from TC using triplets in
group 1;

RMSE of the precipitation rate estimated from TC using triplets in
group 2;

There is, again, consistency between the results of NEXRAD and TRMM 3B42RT in
both groups. The RMSE of the TRMM 3B42RT product in both triplets and
in the majority of pixels is small compared to the other two products, and it
is also relatively small compared to the mean precipitation from the climatology
maps in Fig.

Comparing the pattern of RMSE in NEXRAD, TRMM 3B42RT, and GPCP 1DD with the
climatology maps (Fig.

A recent study by

Figure

Correlation coefficient between the truth and each precipitation product. The left column shows the results for triplets in group 1, and the right column shows the results for triplets in group 2.

The combined and quantitative analyses of the RMSE estimate and the correlation coefficients show that the TRMM 3B42RT product has the best performance among the four products considered here. The RMSE for TRMM 3B42RT has relatively less variation across the domain. This means that the TRMM 3B42RT product has better performance in diverse climatic and geographical conditions. However, the correlation coefficients in TRMM 3B42RT decrease in the west side of the domain. This region is the coldest and snowiest part of the domain and it is covered with snow during the winter. The accuracy of microwave-based precipitation retrievals, which are the input measurements to the TRMM 3B42RT product, is affected by the snow on the ground. Some of the retrieval algorithms for these instruments cannot appropriately distinguish the snow on the ground from the falling precipitation. This phenomenon can contribute to the low correlation coefficient between the TRMM 3B42RT and the truth in the west part of the domain.

The NEXRAD product has a distinct error pattern. Both the RMSE and
correlation coefficient of the NEXRAD estimates are small toward the west of
the domain. However, comparing the error estimates from NEXRAD with the
climatology values reveals that the errors are sometimes on the same order as
the climatology toward the west of the domain. This is also revealed by the
correlation coefficient values, which have a smaller value in the west side
of the domain for NEXRAD. This pattern is consistent with the NEXRAD coverage
maps provided by

The GPI and GPCP 1DD products are, in general, of lower quality than TRMM 3B42RT and NEXRAD. They have higher RMSE and lower correlation coefficients with the truth. They both show the east–west pattern in the correlation coefficient; however, the GPI product has a sharper gradient and is poorly correlated with the truth toward the west of the study domain. Precipitation events in this region are mostly driven by frontal systems that generate clouds not necessarily well correlated with precipitation; therefore, the GPI estimates, which are based solely on cloud-top temperature, are not well correlated with the truth. GPCP 1DD also uses IR-based observations of the clouds, but those are merged with more accurate microwave observations from low-earth-orbit satellites. Therefore, the resulting correlation coefficients are generally higher, especially in the west side of the study domain. If the analysis were limited to the RMSE estimates, GPI might be considered to perform uniformly well across the entire domain. With the correlation coefficients, however, the change in quality of the GPI estimates across the domain is clear.

In this section, we review the assumptions that are embedded in TC
estimates of RMSE and evaluate them using in situ gauge data. Gauge data are
used as a proxy for truth. As mentioned in Sect.

Decomposition of TC-based estimates of RMSE in the NEXRAD product
across the six pixels shown in Fig.

For this evaluation analysis, we need accurate ground-based observations in
order to avoid errors due to differences in the spatial coverage between the
gauges and the other products. The six pixels shown in Fig.

It is understood that gauge data also have errors, including
representativeness error (they are point measurements, unlike the other
products that provide an average value over each pixel); however, as
shown in

Figure

Here, we compare the ranking of the products based on the TC-derived errors
and the ones based on the gauge analysis (

To further evaluate the impact of error cross covariance, we replace the
TRMM 3B42RT product with the TRMM 3B42 product and estimate the RMSEs in each
triplet again. As mentioned in Sect.

This study presents, for the first time, error estimates of four precipitation products across a central part of the continental US using triple collocation (TC). A multiplicative error model, which is a more realistic error model for precipitation, is introduced to the TC analysis. Furthermore, an extended version of TC is used to estimate not only the standard deviation of the random errors in each product but also the correlation coefficient of each product with respect to an underlying truth. The results show that the TRMM 3B42RT product performs relatively better than the other three products. TRMM 3B42RT has the lowest RMSE across the domain and the highest correlation coefficient with the underlying truth. Meanwhile, NEXRAD performs relatively poorly in the west side of the study domain, which is probably caused by terrain beam blockage. The performance of the GPCP 1DD and GPI products is lower than that of TRMM 3B42RT and NEXRAD. GPI has significantly lower performance in the west side of the study domain, which is likely caused by the simple retrieval algorithm used in this product. In the east side of the domain, however, GPI has a reasonably good correlation with the underlying truth.

In the second part of the paper, the assumptions built into TC are evaluated using surface gauge data as a proxy for the truth across selected pixels. These pixels have a dense coverage of in situ gauges. The results of this evaluation reveal that the TC error estimates underestimate the true error in different products due to a violation of the zero-error cross-covariance assumption. Moreover, replacing TRMM 3B42RT with TRMM 3B42 revealed that the gauge correction in TRMM 3B42 violates the zero-error cross-covariance assumption and leads to smaller RMSE estimates. Nevertheless, the RMSE estimates from TC have considerable potential to be incorporated into data assimilation and data-merging algorithms.

Triple collocation analysis has a lot of potential to be applied to various precipitation products at a wide range of spatial and temporal resolutions. This will provide a better understanding of the true error patterns in different products. Error quantification of precipitation products is a necessity if one aims to merge precipitation estimates from several instruments/models. However, care should be taken in choosing triplets that have zero- or small-error cross covariance. Otherwise, the error variances will be underestimated.
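The effect of nonzero error cross covariance can be illustrated with a small synthetic experiment. The Python sketch below is not based on the data of this study: it assumes an additive model in (log-)space, a unit-variance truth, and two products whose errors share a common component. All names and numerical values are hypothetical.

```python
import random


def covariance(x, y):
    """Sample covariance of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def tc_error_variances(x1, x2, x3):
    """Covariance-based TC estimates of the three error variances."""
    c12, c13, c23 = covariance(x1, x2), covariance(x1, x3), covariance(x2, x3)
    return (covariance(x1, x1) - c12 * c13 / c23,
            covariance(x2, x2) - c12 * c23 / c13,
            covariance(x3, x3) - c13 * c23 / c12)


rng = random.Random(1)
n = 20000
truth = [rng.gauss(0.0, 1.0) for _ in range(n)]
shared = [rng.gauss(0.0, 0.5) for _ in range(n)]  # error common to 1 and 2

# Products 1 and 2 share an error component; product 3 is independent.
x1 = [t + s + rng.gauss(0.0, 0.5) for t, s in zip(truth, shared)]
x2 = [t + s + rng.gauss(0.0, 0.5) for t, s in zip(truth, shared)]
x3 = [t + rng.gauss(0.0, 0.5) for t in truth]

true_var_12 = 0.5 ** 2 + 0.5 ** 2  # total error variance of products 1 and 2
estimated = tc_error_variances(x1, x2, x3)
```

Under these assumptions the shared error component is attributed to the signal rather than to the error, so the TC estimates for products 1 and 2 fall well below `true_var_12`. This mirrors the underestimation expected when the products of a triplet have positively correlated errors.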

The multiplicative error model used in this study is shown to be an
appropriate choice relative to the additive model. However, it would be
beneficial to investigate more complex models that can take into account any
higher-order dependence of the estimate on the truth. A modification to this
study would be to include a gauge-only precipitation product. This would
reduce the error cross covariance between the products, since the gauge
measurement system is different from the remote-sensing instruments. Although
gauge estimates have representativeness error, this error will be part of the
total error in the gauge product, resulting in higher RMSE values for the gauge
product. Furthermore, conducting TC analysis on precipitation data with
different temporal resolutions will provide valuable insight into the
performance of different products at different temporal scales. However, this
should be carried out with care, as precipitation errors at certain temporal
resolutions are highly correlated and are not appropriate for TC analysis. The code for implementing multiplicative
triple collocation in MATLAB is available at

In this section, we derive Eqs. (

Equations (

The authors wish to thank Wade Crow and an anonymous reviewer for their constructive feedback that led to improvements in this paper. The authors also thank all the producers and distributors of the data used in this study. The TRMM 3B42 and TRMM 3B42RT data used in this study were acquired as part of the NASA Earth-Sun System Division and archived and distributed by the Goddard Earth Sciences (GES) Data and Information Services Center (DISC). The GPCP 1DD data were provided by the NASA/Goddard Space Flight Center's Mesoscale Atmospheric Processes Laboratory, which develops and computes the 1DD as a contribution to the GEWEX Global Precipitation Climatology Project. The GPI data are produced by science investigators, Drs. Phillip Arkin and John Janowiak of the Climate Analysis Center, NOAA, Washington, D.C., and distributed by the Distributed Active Archive Center (Code 610.2) at the Goddard Space Flight Center, Greenbelt, MD, 20771. The Oklahoma Mesonet data are provided courtesy of the Oklahoma Mesonet, a cooperative venture between Oklahoma State University and The University of Oklahoma and supported by the taxpayers of Oklahoma.

Edited by: E. Morin