A new uncertainty estimation technique for multiple datasets and its application to various precipitation products

The uncertainty among climatological datasets can be characterized as the variance in space and time between various estimates of the same quantity. However, some current uncertainty estimates evaluate variations in only one dimension (time or space), because the estimation methodology requires the variation in the other dimension to be averaged out. The influence of ignoring the variations in one dimension on the uncertainty assessment is not well studied. This study introduces a new three-dimensional variance partitioning approach that avoids the averaging and provides a new uncertainty estimation (Ue) technique that considers both temporal and spatial variations. Comparisons of Ue with classic uncertainty estimations show that the classic metrics underestimate the uncertainty because of the averaging of the variation in the time or space dimension, and Ue is around 20% higher than the classic estimations. The deviation between the new and classic metrics is higher for regions with strong spatial heterogeneity and where the spatial and temporal variations differ significantly. Decomposing the new metric demonstrates that Ue is a comprehensive assessment of model uncertainty that includes the model variations identified by the classic metrics. Multiple precipitation products of different types (gauge-based products, merged products and GCMs) are used to explain and understand the peculiarity of the new methodology. The new uncertainty estimation technique is flexible in its structure and particularly suitable for a comprehensive assessment of multiple datasets over large regions within any given period.


Hydrol. Earth Syst. Sci. Discuss., https://doi.org/10.5194/hess-2019-49. Manuscript under review for journal Hydrol. Earth Syst. Sci. Discussion started: 5 February 2019. © Author(s) 2019. CC BY 4.0 License.

Introduction
With the technical development of monitoring for natural climate variables and the increasing knowledge of the physical mechanisms in the climate system, many institutes are able to provide different kinds of climate datasets. Taking precipitation, the dominant variable in the land water cycle, as an example, there are point measurements such as GHCN-D (Global Historical Climatology Network-Daily, Menne et al., 2012), gridded products based on gauge measurements and interpolation (e.g., CRU, Harris et al., 2014), products derived from remote sensing (e.g., the Tropical Rainfall Measuring Mission, TRMM), reanalysis datasets (e.g., NCEP) and estimates from models (e.g., GCMs). These products are developed using different original data, different technologies or different model settings for various purposes (Phillips and Gleckler, 2006; Tapiador et al., 2012; Beck et al., 2017; Sun et al., 2018). Therefore, there are differences (including systematic bias and uncertainty) among the various products, and the uncertainty can be regarded as the deviation around what is believed to be the truth.
Although many studies have attempted to understand the causes of the uncertainties in different products, the uncertainties are very difficult to remove from the datasets. Thus, using ensembles of multiple datasets to generate a weighted average has become popular in climate-related research. For example, the IPCC uses 42 CMIP5 (Coupled Model Intercomparison Project Phase 5) models to show the historical temperature change and 39 CMIP5 models to average the temperature projection in the future RCP 8.5 scenario (Figure SPM.7 in IPCC, 2013b). Schewe et al. (2014) use nine global hydrological models to evaluate global water scarcity under climate change. GLDAS (Global Land Data Assimilation System) involves four different land surface models (Rodell et al., 2004), and GRACE (Gravity Recovery and Climate Experiment) provides estimates from three independent institutes (Landerer and Swenson, 2012). Using multiple datasets reduces the dependence on a single dataset and decreases the risk of relying on a single dataset that might contain undiscovered uncertainties.
Extra uncertainty information has to be provided along with the ensemble means because the uncertainty influences the significance or the reliability of the ensemble result. In general, the uncertainty can be quantified as the range between the maximum and minimum values (i.e., $V_{\max} - V_{\min}$), the range between values at different quantiles (e.g., $V_{95\%} - V_{5\%}$), the consistency of models, expressed as the ratio of the number of models following a certain pattern to the total number of models, the variance ($\sigma^2$) or the standard deviation ($\sigma$); each of these represents different characteristics of the multiple datasets. Among these uncertainty metrics, the standard deviation ($\sigma$) is the most used because it has the same unit and order of magnitude as the original data, it is less influenced by extreme samples and it is less sensitive to the number of datasets used for the investigation. The ratio of the standard deviation ($\sigma$) to the mean value ($\mu$), called the coefficient of variation ($CV$), represents the dispersion or spread of the distribution of the ensemble members (Everitt, 2013); it is a unit-less value that also shows the degree of uncertainty.
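As a minimal sketch of the metrics listed above, the following Python snippet computes them for a small ensemble of estimates of one quantity. The sample values, the threshold used for the consistency ratio and all variable names are hypothetical and serve only as an illustration.

```python
import numpy as np

# Hypothetical ensemble: five estimates of the same quantity (e.g., mm per year).
values = np.array([520.0, 560.0, 600.0, 640.0, 680.0])

value_range = values.max() - values.min()       # V_max - V_min
q05, q95 = np.quantile(values, [0.05, 0.95])
quantile_range = q95 - q05                      # range between the 5% and 95% quantiles
variance = values.var()                         # sigma^2 (population form, ddof=0)
std = values.std()                              # sigma, same unit as the data
cv = std / values.mean()                        # coefficient of variation, unit-less

# Consistency: share of members following a given pattern
# (here, as an assumed pattern, members exceeding 550).
consistency = np.mean(values > 550.0)
```

The range depends only on the two most extreme members, whereas the standard deviation and the coefficient of variation use all members, which is one reason why they are preferred for ensembles.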
Depending on the purpose of the evaluation, the uncertainty among datasets can be displayed over space to show the spatial heterogeneity of the consistency among multiple datasets. For example, the predicted future temperature increase has a higher significance among different models in the northern high latitudes than in the middle latitudes (Box TS.6 Figure 1 in IPCC, 2013a). The other typical implementation is to evaluate the evolution of the model uncertainty across time. In general, the uncertainty range decreases over time in the historical period because more observations are accessible in recent years, while the uncertainty increases for future projections because of the increasing spread of the model simulations (Figure SPM.7 in IPCC, 2013b). An increasing uncertainty range indicates decreasing consistency and increasing variations among the datasets.
The above metrics have been widely used, as they easily show the temporal evolution or the spatial distribution of the uncertainty. However, the shortcoming is obvious: the values have to be averaged in one of the dimensions (time or space, Figure 1) when either assessment is used for a specific purpose. For example, the average over a specific region (spatial mean) is estimated at each time step before the temporal evolution of the model uncertainty can be obtained (red flowcharts in Figure 1), and the average over a certain period (temporal mean) is estimated at each grid cell before the spatial distribution of the model uncertainty can be obtained (blue flowcharts in Figure 1). Averaging in either dimension, however, means a loss of information, especially of the data variability in that dimension. As a result, the differences among datasets may not be fully considered when estimating the uncertainties. In other words, neither of the uncertainty estimates can represent the full differences among datasets. Therefore, the uncertainty among datasets can be underestimated and the similarity among them overestimated with these two procedures. Current studies have not paid attention to the variation that is ignored because of the averaging, or to its influence on the uncertainty assessment.

In this study, we aim to introduce a new technique for uncertainty estimation among multiple datasets. The new technique avoids any averaging in the time or space dimension; thus, all the information across the two dimensions is maintained for the uncertainty assessment along with an ensemble of estimates. Multiple precipitation products are used to explain the peculiarity of the new methodology. In section 2, the detailed methodology of the three-dimensional variance partitioning approach is introduced. The characteristics of multiple precipitation datasets and estimations of the two classic uncertainty metrics are shown in section 3. The results of the new approach for the precipitation products are discussed in section 4. The differences between the new uncertainty estimation and the two classic metrics introduced previously are analyzed and their causes are discussed in section 5. The discussion and conclusions follow at the end of this article.
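The loss of information caused by averaging can be illustrated with a deliberately simple toy case. In the sketch below, which uses hypothetical values and names, two datasets have identical regional means at every time step, so the spread of the spatial means is zero, although the datasets differ in every grid cell.

```python
import numpy as np

# Hypothetical toy data: 2 datasets, 3 time steps, 4 grid cells.
# Dataset b is dataset a with the spatial pattern reversed, so both
# have the same regional mean at every time step.
a = np.array([[1.0, 2.0, 3.0, 4.0],
              [2.0, 3.0, 4.0, 5.0],
              [3.0, 4.0, 5.0, 6.0]])
b = a[:, ::-1]
z = np.stack([a, b], axis=-1)            # shape (time, space, ensemble)

# Classic temporal view: spatial mean first, then spread across datasets.
spread_of_spatial_mean = z.mean(axis=1).std(axis=-1)   # one value per time step

# Spread across datasets at every time step and grid cell, without averaging.
spread_per_cell = z.std(axis=-1)                        # shape (time, space)
```

Here `spread_of_spatial_mean` vanishes at all time steps while `spread_per_cell` does not, which is the kind of disagreement that a metric based on spatial means cannot see.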
2 Methodology and datasets

Mathematical Derivation
The multiple climatic datasets have to be organized in three dimensions: (1) time, with a regular time interval (e.g., monthly or annual); (2) space, with regular spatial units, where all the grid cells of the original latitude-longitude grid are re-organized in a single new dimension; and (3) ensemble, with the different ensemble datasets regarded as the third dimension. Thus, the dataset array can be written as

$$Z = \{z_{ijk}\},$$

with the $i$-th time step ($i = 1, 2, \ldots, m$), the $j$-th grid cell ($j = 1, 2, \ldots, n$), and the $k$-th ensemble member or ensemble model ($k = 1, 2, \ldots, l$).
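The re-organization described above can be sketched as follows. This is a minimal illustration, assuming that every dataset is already available as a NumPy array of shape (time, lat, lon) on a common grid and time axis; the function name and the boolean region mask are hypothetical helpers.

```python
import numpy as np

def to_time_space_ensemble(datasets, region_mask):
    """Stack datasets of shape (time, lat, lon) into an (m, n, l) array.

    datasets    : list of l arrays on the same grid and time axis
    region_mask : boolean (lat, lon) array selecting the n grid cells
    """
    # Boolean indexing on the last two axes flattens lat-lon into one space axis.
    members = [np.asarray(d)[:, region_mask] for d in datasets]   # each (m, n)
    return np.stack(members, axis=-1)                              # (m, n, l)
```

Because the space dimension is only a list of grid cells, the region does not need to be rectangular; any set of cells selected by the mask can be used.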
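We define the three dimensions as the time, space and ensemble dimensions, and the means for these three dimensions are called the temporal mean, spatial mean and ensemble mean, respectively. The corresponding variances are named the time variance, space variance and ensemble variance, respectively. The grand mean ($\mu$) and the grand variance ($\sigma^2$) across the time, space and ensemble dimensions, as well as the total sum of squares ($SST$), are defined as

$$\mu = \frac{1}{mnl}\sum_{i=1}^{m}\sum_{j=1}^{n}\sum_{k=1}^{l} z_{ijk},$$

$$\sigma^2 = \frac{1}{mnl}\sum_{i=1}^{m}\sum_{j=1}^{n}\sum_{k=1}^{l} \left(z_{ijk}-\mu\right)^2,$$

$$SST = \sum_{i=1}^{m}\sum_{j=1}^{n}\sum_{k=1}^{l} \left(z_{ijk}-\mu\right)^2 = mnl\,\sigma^2.$$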
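The total variation is contributed by the variations in all dimensions. Thus, it should be reformulated as an expression of the variations in the three dimensions. The derivation of the total sum of squares can start from the third (ensemble) dimension. For a specific $k$-th ensemble member, the grand mean is formulated as $\mu_{ts}[k] = \sum_{i=1}^{m}\sum_{j=1}^{n} z_{ijk}/(mn)$, leading to the total sum of squares rewritten as

$$SST = \sum_{k=1}^{l}\sum_{i=1}^{m}\sum_{j=1}^{n}\left(z_{ijk}-\mu_{ts}[k]+\mu_{ts}[k]-\mu\right)^2$$

and then expanded and rearranged as

$$SST = \sum_{k=1}^{l}\sum_{i=1}^{m}\sum_{j=1}^{n}\left(z_{ijk}-\mu_{ts}[k]\right)^2 + mn\sum_{k=1}^{l}\left(\mu_{ts}[k]-\mu\right)^2 = mn\sum_{k=1}^{l}\sigma^2_{ts}[k] + mnl\,\sigma^2(\mu_{ts}) \tag{8}$$

where $\sigma^2(\mu_{ts})$ is the variation of the grand mean for each member of the ensemble, and $\sigma^2_{ts}[k]$, the grand variance in space and time for ensemble member $k$, can be split using the mean of the spatial variation at each time step $\overline{\sigma^2_s}[k,:]$ and the variation of the spatial mean $\sigma^2(\mu_s[k,:])$, denoted as

$$\sigma^2_{ts}[k] = \overline{\sigma^2_s}[k,:] + \sigma^2(\mu_s[k,:]) \tag{9}$$

The detailed derivation of Eq. (9) is shown in Eqs. (10)-(17). For a specific dataset $k$, the grand mean $\mu_{ts}[k]$ over the space-time scale is

$$\mu_{ts}[k] = \frac{1}{mn}\sum_{i=1}^{m}\sum_{j=1}^{n} z_{ijk} \tag{10}$$

The total sum of squares of the differences from the grand mean is

$$SST_{ts}[k] = \sum_{i=1}^{m}\sum_{j=1}^{n}\left(z_{ijk}-\mu_{ts}[k]\right)^2 \tag{11}$$

and the grand variance $\sigma^2_{ts}$ is

$$\sigma^2_{ts}[k] = \frac{SST_{ts}[k]}{mn} \tag{12}$$

If the derivation is started from the space dimension, Eq. (11) can be rewritten by incorporating the spatial mean at each time step, $\mu_s[k,i] = \sum_{j=1}^{n} z_{ijk}/n$:

$$SST_{ts}[k] = \sum_{i=1}^{m}\sum_{j=1}^{n}\left(z_{ijk}-\mu_s[k,i]+\mu_s[k,i]-\mu_{ts}[k]\right)^2 \tag{13}$$

It can be expanded and then rearranged as follows, where the cross term vanishes because $\sum_{j=1}^{n}(z_{ijk}-\mu_s[k,i]) = 0$ at every time step:

$$SST_{ts}[k] = \sum_{i=1}^{m}\sum_{j=1}^{n}\left(z_{ijk}-\mu_s[k,i]\right)^2 + n\sum_{i=1}^{m}\left(\mu_s[k,i]-\mu_{ts}[k]\right)^2 = n\sum_{i=1}^{m}\sigma^2_s[k,i] + mn\,\sigma^2(\mu_s[k,:])$$

$$\sigma^2_{ts}[k] = \frac{1}{m}\sum_{i=1}^{m}\sigma^2_s[k,i] + \sigma^2(\mu_s[k,:]) = \overline{\sigma^2_s}[k,:] + \sigma^2(\mu_s[k,:]) \tag{17}$$

Here $\overline{\sigma^2_s}[k,:]$ is the mean of the spatial variation at each time step and $\sigma^2(\mu_s[k,:])$ is the variation of the spatial mean. Alternatively, the grand variance can be split using the average of the temporal variation from all grid cells $\overline{\sigma^2_t}[:,k]$ and the space variation of the temporal mean $\sigma^2(\mu_t[:,k])$ if we start from the time dimension:

$$\sigma^2_{ts}[k] = \overline{\sigma^2_t}[:,k] + \sigma^2(\mu_t[:,k]) \tag{18}$$

With Eq. (9) and Eq. (18), we have

$$\sigma^2_{ts}[k] = \frac{1}{2}\left[\overline{\sigma^2_s}[k,:] + \sigma^2(\mu_s[k,:]) + \overline{\sigma^2_t}[:,k] + \sigma^2(\mu_t[:,k])\right] \tag{19}$$

Substituting Eq. (19) into Eq. (8) results in

$$SST = \frac{mn}{2}\sum_{k=1}^{l}\left[\overline{\sigma^2_s}[k,:] + \sigma^2(\mu_s[k,:]) + \overline{\sigma^2_t}[:,k] + \sigma^2(\mu_t[:,k])\right] + mnl\,\sigma^2(\mu_{ts}) \tag{20}$$

The first term on the right-hand side of Eq. (20) can be transformed to:

$$\frac{mn}{2}\sum_{k=1}^{l}\left[\overline{\sigma^2_s}[k,:] + \sigma^2(\mu_s[k,:]) + \overline{\sigma^2_t}[:,k] + \sigma^2(\mu_t[:,k])\right] = \frac{mnl}{2}\left[\overline{\sigma^2_s} + \overline{\sigma^2_{t\_s}} + \overline{\sigma^2_t} + \overline{\sigma^2_{s\_t}}\right] \tag{21}$$

where $\overline{\sigma^2_{s\_t}}$ is the mean, across the ensemble members, of the space variation of the temporal mean, and $\overline{\sigma^2_s}$ represents the grand mean of $\sigma^2_s$, i.e., the space variance averaged across the time and ensemble dimensions. Then Eq. (20) becomes:

$$SST = \frac{mnl}{2}\left[\overline{\sigma^2_s} + \overline{\sigma^2_{s\_t}} + \overline{\sigma^2_t} + \overline{\sigma^2_{t\_s}}\right] + mnl\,\sigma^2_e(\mu_{ts}) \tag{22}$$

where $\overline{\sigma^2_{t\_s}}$ is the mean, across the ensemble members, of the time variation of the spatial mean, $\overline{\sigma^2_t}$ represents the grand mean of $\sigma^2_t$, i.e., the time variance averaged across the space and ensemble dimensions, and $\sigma^2_e(\mu_{ts})$ represents the variation of the spatial-temporal means ($\mu_{ts}$) across the ensemble. Similarly, the derivation can start from either of the other two dimensions. The $SST$ derived from the time and space dimensions are formulated, respectively, as

$$SST = \frac{mnl}{2}\left[\overline{\sigma^2_s} + \overline{\sigma^2_{s\_e}} + \overline{\sigma^2_e} + \overline{\sigma^2_{e\_s}}\right] + mnl\,\sigma^2_t(\mu_{se}) \tag{23}$$

$$SST = \frac{mnl}{2}\left[\overline{\sigma^2_t} + \overline{\sigma^2_{t\_e}} + \overline{\sigma^2_e} + \overline{\sigma^2_{e\_t}}\right] + mnl\,\sigma^2_s(\mu_{te}) \tag{24}$$

where each variable is defined in Appendix A. Averaging the three expressions of $SST$ in Eqs. (22)-(24) leads to

$$SST = \frac{mnl}{3}\left\{\left[\overline{\sigma^2_t} + \tfrac{1}{2}\overline{\sigma^2_{t\_s}} + \tfrac{1}{2}\overline{\sigma^2_{t\_e}} + \sigma^2_t(\mu_{se})\right] + \left[\overline{\sigma^2_s} + \tfrac{1}{2}\overline{\sigma^2_{s\_t}} + \tfrac{1}{2}\overline{\sigma^2_{s\_e}} + \sigma^2_s(\mu_{te})\right] + \left[\overline{\sigma^2_e} + \tfrac{1}{2}\overline{\sigma^2_{e\_t}} + \tfrac{1}{2}\overline{\sigma^2_{e\_s}} + \sigma^2_e(\mu_{ts})\right]\right\}$$

With the total degree of freedom ($m \times n \times l$), the grand variance is expressed as

$$\sigma^2 = \frac{SST}{mnl} = V_t + V_s + V_e \tag{26}$$

with

$$V_t = \tfrac{1}{3}\left[\overline{\sigma^2_t} + \tfrac{1}{2}\overline{\sigma^2_{t\_s}} + \tfrac{1}{2}\overline{\sigma^2_{t\_e}} + \sigma^2_t(\mu_{se})\right],\quad V_s = \tfrac{1}{3}\left[\overline{\sigma^2_s} + \tfrac{1}{2}\overline{\sigma^2_{s\_t}} + \tfrac{1}{2}\overline{\sigma^2_{s\_e}} + \sigma^2_s(\mu_{te})\right],\quad V_e = \tfrac{1}{3}\left[\overline{\sigma^2_e} + \tfrac{1}{2}\overline{\sigma^2_{e\_t}} + \tfrac{1}{2}\overline{\sigma^2_{e\_s}} + \sigma^2_e(\mu_{ts})\right]$$

where $V_t$, $V_s$ and $V_e$ represent the time, space and ensemble variances, respectively. To facilitate the understanding of the partitioning results, an illustration of the present approach is shown in Figure 2.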
Note that $V_e$ is based only on the combination of variations across the ensemble dimension. The four components are the variations of the temporal and spatial values ($\overline{\sigma^2_e}$, zone B3), of the temporal mean ($\overline{\sigma^2_{e\_t}}$, zone C3), of the spatial mean ($\overline{\sigma^2_{e\_s}}$, zone C6) and the grand variance of the spatiotemporal mean for a single ensemble member ($\sigma^2_e(\mu_{ts})$, zone F3). Similarly, the other variances rely only on the variances in the corresponding dimension, which shows the independence of the three dimensions.
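The partitioning can be checked numerically. The sketch below is a minimal NumPy illustration, assuming population variances (`ddof=0`) and assuming that the variance of each dimension is one third of the sum of four terms: the mean variance of the unaveraged values along that dimension, half of each of the two mean variances of single-dimension means, and the variance of the means taken over the two other dimensions. The function names and the array layout (time, space, ensemble) are illustrative choices and are not taken from the original method description.

```python
import numpy as np

def variance_components(z, axis):
    """Four variance terms along `axis` of z with shape (time, space, ensemble).

    Returns (raw, of_mean_a, of_mean_b, of_grand_mean), where a and b are the
    two remaining axes in increasing order. Population variances (ddof=0).
    """
    a, b = [ax for ax in range(3) if ax != axis]
    # Variance of the unaveraged values, averaged over the other two axes.
    raw = z.var(axis=axis).mean()
    # Variance of the means over axis a (resp. b), averaged over the last axis.
    of_mean_a = z.mean(axis=a, keepdims=True).var(axis=axis).mean()
    of_mean_b = z.mean(axis=b, keepdims=True).var(axis=axis).mean()
    # Variance of the means taken over both other axes.
    of_grand_mean = z.mean(axis=(a, b)).var()
    return raw, of_mean_a, of_mean_b, of_grand_mean

def partition_variance(z):
    """Return (V_t, V_s, V_e) using the assumed 1, 1/2, 1/2, 1 weights."""
    parts = []
    for axis in range(3):
        raw, mean_a, mean_b, grand = variance_components(z, axis)
        parts.append((raw + 0.5 * mean_a + 0.5 * mean_b + grand) / 3.0)
    return tuple(parts)
```

For any array `z`, the three returned values should add up to `z.var()` within floating-point precision, and for data that vary only between ensemble members the whole variance should be assigned to the ensemble dimension.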

Metrics definition for model uncertainty
Since the temporal evolution and the spatial heterogeneity are natural in climate variables and the purpose of this study is to evaluate the model uncertainty among datasets, we focus mainly on the variance in the ensemble dimension. The uncertainty among the ensemble members is normalized as the ratio of the square root of the ensemble variance ($V_e$) to the mean value of the datasets ($\mu$), i.e., $U_e = \sqrt{V_e}/\mu$.
For each basic spatial unit (grid cell in this study), we can estimate the long-term mean of the target variable for each dataset, $\mu_t[j,k]$, and then estimate the ensemble variation of these mean values across the datasets as $\sigma^2(\mu_t[j,:])$ (expressed as $\sigma^2_{e\_t}[j]$ in this study). The spatial distribution of $\sigma^2_{e\_t}$ shows the magnitude of the model uncertainty over space, and its square root $\sigma_{e\_t}[j]$ is the model deviation at each spatial unit. The overall estimation of the model uncertainty over the entire region can be expressed as:

$$\mathrm{N.s.std} = \frac{\sqrt{\overline{\sigma^2_{e\_t}}}}{\mu}, \qquad \overline{\sigma^2_{e\_t}} = \frac{1}{n}\sum_{j=1}^{n}\sigma^2_{e\_t}[j] \tag{28}$$

where $\sigma^2_{e\_t}[j]$ has a different value for each spatial unit, and the values for all the grid cells are averaged to obtain $\overline{\sigma^2_{e\_t}}$, which shows the general magnitude of the ensemble variation over space. The N.s.std is normalized as the ratio of the square root of the mean of the variations $\overline{\sigma^2_{e\_t}}$ to the average value of all the datasets $\mu$.
Similarly, the model uncertainty can also be expressed as the ratio of the square root of the averaged ensemble variation at all time steps $\overline{\sigma^2_{e\_s}}$ to the overall mean (Eq. 29):

$$\mathrm{N.t.std} = \frac{\sqrt{\overline{\sigma^2_{e\_s}}}}{\mu}, \qquad \overline{\sigma^2_{e\_s}} = \frac{1}{m}\sum_{i=1}^{m}\sigma^2_{e\_s}[i] \tag{29}$$

where $\sigma^2_{e\_s}[i]$, $i = 1, \ldots, m$, is the ensemble variation, across the different datasets, of the spatial mean of each product at time step $i$.

The two uncertainty estimates (Eqs. 28 and 29) correspond to the two classic metrics presented in the Introduction. We will compare $U_e$ with the two classic metrics (N.t.std and N.s.std) to show their relations and differences.
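The three normalized metrics can be computed side by side as sketched below. This is an illustrative NumPy sketch, assuming an input array of shape (time, space, ensemble), population variances, and the same assumed weighting of the four ensemble terms (1, 1/2, 1/2, 1, divided by 3) for $V_e$; the function name is hypothetical.

```python
import numpy as np

def classic_and_new_metrics(z):
    """z has shape (time, space, ensemble); returns (N.s.std, N.t.std, U_e)."""
    mu = z.mean()

    # Ensemble variance of the temporal mean, one value per grid cell.
    var_e_t = z.mean(axis=0).var(axis=-1)
    # Ensemble variance of the spatial mean, one value per time step.
    var_e_s = z.mean(axis=1).var(axis=-1)
    n_s_std = np.sqrt(var_e_t.mean()) / mu
    n_t_std = np.sqrt(var_e_s.mean()) / mu

    # Ensemble variance of the unaveraged values, averaged over time and space.
    var_e = z.var(axis=-1).mean()
    # Ensemble variance of the spatiotemporal mean of each member.
    var_e_grand = z.mean(axis=(0, 1)).var()
    v_e = (var_e + 0.5 * var_e_t.mean() + 0.5 * var_e_s.mean() + var_e_grand) / 3.0
    return n_s_std, n_t_std, np.sqrt(v_e) / mu
```

The two classic metrics each use only one of the averaged terms, whereas $U_e$ combines all four, so the relation between the three values depends on how the variation of the data is distributed between time and space.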

Study area and data descriptions
China is large in area, and different climate types are encountered on the mainland (Kottek et al., 2006). To facilitate comparisons and analyses of spatial variations, ten subregions are defined in Figure 3: (1) Songhua River Basin, (2) Liao River Basin, (3) Hai River Basin, (4) Yellow River Basin, (5) Huai River Basin, (6) Yangtze River Basin, (7) Southeast China, (8) South China, (9) Southwest China and (10) Northwest China. The entire Chinese mainland is numbered as the 11th region. Most of the regions are natural river basins, and this definition is more natural for water resources analysis than definitions using longitude-latitude grids or administrative regions.

Thirteen precipitation datasets from different sources are collected for comparison (Table 1). These datasets are categorized into three groups according to the methodologies used to generate the products, i.e., gauge-based products, merged products and GCMs. [...]

Among the merged precipitation products, CMAP, GPCP and MSWEP use different sources of precipitation data (e.g., gauge observations, satellite remote sensing, atmospheric model reanalysis). These different precipitation sources are averaged using different weights. Thus, the differences among the three merged products are associated with the precipitation sources and the weight of the gauge observations. ERA-Interim is a reanalysis product, but it uses near-real-time assimilation with data from global observations (Dee et al., 2011). Thus, the forecasting model is constrained by observations and forced to follow the real system to some degree. Because of this use of observations, ERA-Interim is also assigned to the merged products.
GCM precipitation is a model estimate; therefore, the physical and numerical choices affect the accuracy of the model results. In addition, observations are not used to constrain the simulations. Because of this lack of constraints, the GCMs do not follow the actual synoptic variability and explore other trajectories in the solution space. Kay et al. (2015) repeatedly ran the same GCM with very small differences in the initial conditions, and the model outputs spread after a number of time steps (see Figure 2 in Kay et al., 2015). Therefore, the estimated uncertainty is due to the differences in the model settings and the initial conditions. More than 20 GCM datasets are available, but only four are randomly selected to match the number of gauge-based products and merged products. All the datasets have been interpolated to a 0.5° spatial resolution to unify the spatial units, and the overlapping time span of all the datasets, from 1979 to 2005, is used for the maximum coverage of all products.

Spatial patterns of ensemble annual precipitation
The ensemble means of the long-term annual precipitation, obtained by averaging the multiple datasets in the corresponding precipitation group, are mapped in Figure 4. The long-term annual mean precipitation obtained from the CMA data is 589.8 mm yr⁻¹ (1.6 mm day⁻¹) over mainland China. The gauge-based precipitation has the smallest bias (-4.1 mm yr⁻¹, -0.7% in proportion) compared to the CMA precipitation. Precipitation in the merged products and GCMs is larger than in CMA by 63.1 and 232.0 mm yr⁻¹ (biases of +20.4% and +41.3%), respectively.
The spatial pattern of the annual precipitation shows a decreasing gradient from southeastern China (>1600 mm yr⁻¹) to northwestern China (<400 mm yr⁻¹). The ensemble means of all three precipitation groups capture the spatial gradient, but they differ in their ability to express some details. For instance, there are some isolated patches of varying size in the CMA precipitation, which could be caused by the topography (e.g., the northern Tienshan Mountains, the Qilian Mountains), while they are not shown in the gauge-based products. As we know, the precipitation gauges are mainly [...]

Spatial distribution of model uncertainties
In addition to the differences in the long-term annual precipitation, differences are found among datasets within the same precipitation group. The spatial distribution of the model uncertainty, which is expressed as the ensemble deviation of the annual precipitation across multiple products, is calculated for each group and mapped in Figure 5.
Among the datasets based on gauge observations (Figure 5-a), the ensemble deviation is small over most of the land area of China (<50 mm yr⁻¹). It is higher in the south of China (50-100 mm yr⁻¹), but the area is not continuous in space. The highest deviation occurs along the Himalayas, indicating a high variation among datasets. Regarding the merged precipitation products, the deviation shows high values (>200 mm yr⁻¹, Figure 5-c) [...]

The magnitude of the ensemble deviation demonstrates the model uncertainty among different precipitation products in the same precipitation group, and it shows the ability of the different methodologies to estimate precipitation. For all products, the ensemble deviation is relatively larger where the precipitation is higher, especially along the mountains and in the subtropical regions. The ratio of the ensemble deviation to the mean, which shows the uncertainty more clearly, is higher in northwestern China, where the precipitation is among the lowest in China. Particularly for the gauge-based products, the higher ratio occurs where the gauge density is low and the orographic effect is apparent (e.g., the Tibetan Plateau and the mountainous areas). For the merged products and the GCMs, the ratio increases especially in southeastern China, showing decreasing similarities among different GCMs. Because the ratio takes into account both the variation and the mean (which may have a systematic bias), the ratio represents the uncertainty better than the absolute ensemble deviation.

Temporal evolution of model uncertainties
Figure 5 shows the spatial distribution of the ensemble deviation of the annual precipitation among different products. However, the temporal evolution of the deviation among the various products is not captured because the temporal variation has been averaged before estimating the ensemble deviation in Figure 5. In this section, we examine the temporal evolution of the model uncertainty of the regional annual precipitation across different products. The analysis is based on the ten subregions defined in Figure 3 and the whole Chinese mainland.
The annual precipitation of each precipitation group has been normalized as the ratio to the long-term annual mean of CMA (black line in Figure 6). The magnitude of the annual precipitation in the gauge-based products (blue) is similar to that of CMA except in southwestern China (Figure 6-i), because of the overestimation along the Himalayas (Figure 4). The precipitation in the merged products (green) is higher in southwestern and northwestern China, in accordance with Figure 4-c. The annual precipitation of the GCMs (red) is clearly higher than that of the gauge-based products or merged products for almost all regions, which agrees with the spatial patterns in Figure 4-d.
The ensemble deviation (shaded area) shown in Figure 6 represents the variations of the products in the same precipitation group at each time step. The normalized deviation facilitates comparisons between regions: it is scaled by the mean of the corresponding group so that the width of the uncertainty range is on the same scale of the y-axis. High deviations [...] while it is among the highest in 8-South China and western China (regions 9 and 10), agreeing with the deviation maps in Figure 5.
The temporal evolution of the gauge-based products and merged products agrees well with that of the CMA dataset, while the temporal evolution of the GCM ensemble is weaker and not well correlated with that of CMA. The main reason is that GCMs are not constrained in their synoptic variability, and the sequence of wet and dry years can be very different from that of the observations. Thus, a smoother result is obtained when the ensemble mean is built from the GCMs. This is different for the gauge-based and merged products, as they have a strong covariance and the ensemble mean preserves this covariance.
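The smoothing effect of averaging members with unsynchronized variability can be illustrated with a deliberately idealized sketch. The sinusoidal "interannual variability", the 8-year period and the phase shifts below are hypothetical choices made only so that the effect is easy to see; they do not represent any of the datasets used in this study.

```python
import numpy as np

years = np.arange(40)
signal = np.sin(2 * np.pi * years / 8.0)          # hypothetical interannual variability

# Four members sharing the same sequence of wet and dry years.
in_phase = np.stack([signal] * 4, axis=-1)

# Four members with the same variability but shifted sequences (quarter-period lags).
shifts = [0, 2, 4, 6]
out_of_phase = np.stack(
    [np.sin(2 * np.pi * (years + s) / 8.0) for s in shifts], axis=-1
)

std_member = signal.std()
std_mean_in_phase = in_phase.mean(axis=-1).std()          # equals std_member
std_mean_out_of_phase = out_of_phase.mean(axis=-1).std()  # close to zero
```

When the members covary, the ensemble mean keeps the temporal variability of a single member; when their wet and dry years do not coincide, the variability cancels out in the ensemble mean.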
For the entire mainland of China (Figure 6-k), the ensemble deviation remains stable for the different precipitation groups. In contrast, the annual precipitation shows the largest spatial heterogeneity over the mainland compared with the individual subregions (Figure 4). However, the spatial variation has been collapsed when estimating the regional precipitation for the temporal analysis.
It is therefore interesting to see how the uncertainty estimate changes when the variations in the time dimension and in the space dimension are considered together in the precipitation datasets.

Variations in the time and space dimensions
The precipitation varies in time and space. However, it is averaged either in the time dimension to obtain the spatial patterns of the model uncertainty (Figure 5) or in the space dimension to obtain the temporal evolution of the model uncertainty (Figure 6), and the deviations in the time and space dimensions are very rarely compared. Herein, the standard deviations of the temporal and spatial variations in the precipitation datasets are compared in Figure 7 for the ten subregions and the Chinese mainland and for the different precipitation groups.
The gauge-based products provide annual regional precipitation similar to that of CMA over the Chinese mainland and the ten subregions, except for region 7-southeast China (Figure 7-g) and region 9-southwest China (Figure 7-i). It might indicate the decreased ability of remote sensing, the important data source in the merged products, to estimate the precipitation amount in storms, as storms contribute most of the total precipitation in these two subregions. The regional precipitation is larger in the merged products than in the observations, and the magnitude of the deviation in the GCMs is even larger, except in region 8-south China (Figure 7-h). These results indicate the reduced ability of the merged products and GCMs to reproduce the total annual precipitation.
Regarding the variations in the time and space dimensions, regions 9, 10 and 11 have the largest ratio of the spatial standard deviation to the mean, indicating the most significant spatial heterogeneity over these regions. 7-southeast China and the 3-Hai River basin have the smallest variations, either because of the small area or because of the homogeneity in the subregion, as the spatial correlation is high in the area. The ratio of the temporal standard deviation to the spatial standard deviation, denoted $k$, is among the smallest in regions 9, 10 and 11 ($k$ = 0.1, 0.12 and 0.05, respectively), showing an apparent difference between the variations in the time and space dimensions. In contrast, the difference between the variations in the two dimensions is small in the 3-Hai River basin ($k$ = 1.15) and 7-southeast China ($k$ = 0.90), mainly because of the relatively strong variability of the annual precipitation between years.
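The comparison of temporal and spatial variations can be sketched as follows. This is only one plausible reading of the two ratios, assuming that the temporal standard deviation is taken from the series of regional means and the spatial standard deviation from the map of long-term means; the function name and these definitions are assumptions, not the exact procedure behind Figure 7.

```python
import numpy as np

def time_space_variation(p):
    """p: annual precipitation of one product, shape (time, space).

    Returns the temporal and spatial standard deviations, both divided by
    the overall mean, and their ratio (temporal over spatial).
    """
    mean = p.mean()
    temporal_ratio = p.mean(axis=1).std() / mean   # variation of the regional mean over time
    spatial_ratio = p.mean(axis=0).std() / mean    # variation of the long-term mean over space
    return temporal_ratio, spatial_ratio, temporal_ratio / spatial_ratio
```

A ratio far below one, as reported for regions 9, 10 and 11, means that the precipitation differs much more between grid cells than between years.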
In addition to the differences across regions, the variations also differ between precipitation groups. Excluding the CMA dataset, which consists of a single product, the variations in the gauge-based products are higher than those of the other two groups. This difference indicates either that the gauge-based products have the largest variation over space or that the correlation among the different gauge-based products is high, so that the variation is preserved when building the ensemble. In contrast, the GCMs have the smallest variations, either because the precipitation estimated by the GCMs is more homogeneous over space, or because the spatial patterns in the different GCMs are not consistent and the spatial correlation is lower, since there is no constraint in the GCM simulations.

Variances in three dimensions
In the above section, we introduced the general spatial and temporal characteristics of the precipitation in the different groups and their variations in different dimensions. In this section, we present the results estimated by the newly proposed variance approach. As introduced in the methodology section, the input annual precipitation is re-organized into three dimensions: (1) time, 27 years from 1979 to 2005; (2) space, the number of 0.5° grid cells in a specific region; and (3) ensemble, the number of models in the same precipitation group (four models in all three groups).
The grand variance and the variances in the three dimensions (i.e., time, space and ensemble) for all the subregions are mapped in Figure 8. The grand variance (the total variance over all three dimensions) is similar for the gauge-based products and the merged products (Figure 8-a,b,c), while the grand variance in the GCMs is large, approximately twice that of the other two groups in regions 9-south China and 10-southwest China. The differences mainly come from the space variance and the ensemble variance (Figure 8-i,l).
The time variance ($V_t$) is the smallest of the three variance components, and there are very small differences in $V_t$ across northern China (Figure 8-d,e,f). $V_t$ in the gauge-based products is higher than that in the merged products and GCMs in regions 8-southeast China and 9-south China, indicating a relatively strong temporal variation in the annual precipitation series, which is consistent with the larger uncertainty ranges shown in Figure 6-h,i. Similar patterns of the space variance ($V_s$) are found in the gauge-based products and merged products (Figure 8-g,h), and the 7-Yangtze River basin and 9-southwest China have the largest $V_s$ because the precipitation varies significantly in space in these two regions. $V_s$ is higher in the GCM precipitation, especially in 9-southwest China, indicating strong spatial heterogeneity in the GCMs over the Himalayas (Figure 8-i). The ensemble variance ($V_e$) is relatively small in most regions in the gauge-based products (Figure 8 [...]

In conclusion, the grand variance and the individual variances of the three dimensions are generally larger in the dataset group consisting of GCMs. The variances of the gauge-based products and merged products are similar in value and spatial distribution. However, in addition to the variances, the uncertainty defined as the ratio of the square root of the variance to the mean (i.e., $U$, $U_t$, $U_s$, $U_e$) contains extra information on the regional means, and it is discussed in the next section.

Deviations in three dimensions
In contrast to the spatial patterns of the variance magnitudes in the ten subregions (Figure 8), larger values of the deviation ($U = \sqrt{V}/\mu$) generally occur in the northwest and lower values in southern China (Figure 9). A possible reason is the decreasing gradient of the precipitation magnitude from the southeast to the northwest (Figure 4). Although the variances are among the lowest in northwest China, the total deviation is the highest in this region ($U$ = 0.89, Figure 9-a,b,c) for all three precipitation groups because of the low precipitation rate in the northwest. $U$ is relatively small in the 1-Songhua River basin ($U$ = 0.27) in the northeast and in 8-South China ($U$ = 0.29) for the gauge-based products, and the 6-Yangtze River basin has a relatively low $U$ in the merged products and GCMs in the eastern part of China.
The variations in the time and space dimensions are natural, and they show the temporal evolution and spatial heterogeneity of the characteristics of the different precipitation products (Sun et al., 2010, 2012). $U_t$ is small and contributes very little to the total $U$, indicating the weak fluctuation of the annual precipitation compared with the spatial variations (Figure 9-d,e,f).
The $U_t$ values are the smallest for the GCMs, in accordance with the weak temporal variations in Figure 6. [...] For the gauge-based products, $U_e$ is smaller than 0.1 for the regions in eastern China, indicating that the model differences are relatively small compared to the annual means. The $U_e$ value is higher for 9-southwest China (0.30) and 10-northwest China (0.37), showing large variations even in the gauge-based products. For the merged products, $U_e$ is similar to that of the gauge-based products in western China (0.36), while it is larger in the east, especially for the 6-Yangtze River and 4-Yellow River basins (more than twice the $U_e$ of the gauge-based products).
For the GCM precipitation, the uncertainty increases compared to the other two groups in the eastern regions, corresponding to the higher ensemble variations of the GCMs over the eastern regions shown in Figure 5. However, it decreases in 10-northwest China ($U_e$ = 0.25); a possible reason is that the spatial homogeneity of the variations in region 10-northwest China (Figure 5-f) is stronger than that of the other groups (Figure 5-b,d). In the GCMs, the highest $U_e$ occurs in southwestern China, where both the means and the variations are higher (Figures 4 and 5). In conclusion, $U_e$ is linked with the magnitude of the model uncertainties in Figure 5 and Figure 6. This indicates that $U_e$ is to some degree correlated with the classic metrics, as higher $U_e$ values cover the grid cells or regions with higher model uncertainty.

Decomposition of the ensemble uncertainty
We now decompose the ensemble variance to explore possible reasons for the deviation of $U_e$ from N.s.std and N.t.std.
As shown in Eq. (26), the ensemble variance combines four elements which contribute to the variation of different values across the ensemble dimension, i.e., the variance of the original temporal and spatial values ($\sigma^2_e$), of the temporal mean ($\sigma^2_{e\_t}$), of the spatial mean ($\sigma^2_{e\_s}$) and of the grand mean ($\sigma^2_e(\mu_{ts})$). Among these, $\sigma^2_{e\_t}$ is the mean of the square of the spatial standard deviation in Figure 5-a,c,e over all grids in a specific region, and $\sigma^2_{e\_s}$ is the mean of the square of the temporal standard deviation in Figure 6 over all time steps in a specific region. These two components are related to the two classic metrics N.s.std (Eq. 28) and N.t.std (Eq. 29), respectively. By decomposing Eq. (30), the contributions of the four components to the ensemble variance ($V_e$) are obtained, as shown in Figure 11. For all three precipitation groups, $\sigma^2_e$ is the dominant component simply because all the information on variations among the original datasets is retained in the uncertainty estimation. The other three components, in contrast, are estimated after averaging in time, in space or over the full spatiotemporal dimensions, which implies a loss of information. The contributions of $\sigma^2_{e\_t}$ and $\sigma^2_{e\_s}$ are each approximately 0.15 for regions 1 to 8. The contribution of $\sigma^2_{e\_t}$ increases for regions 9, 10 and 11, indicating that the spatial heterogeneity is significant for these regions. In contrast, $\sigma^2_{e\_s}$ decreases because the spatial averaging has collapsed the spatial variations.
Although all the components can be used as metrics for evaluating the variations among multiple datasets, each of them has limitations. For the variation of the temporal mean ($\sigma^2_{e\_t}$) and of the spatial mean ($\sigma^2_{e\_s}$), the collapse of a dimension ignores part of the information (as also noted in the Introduction). Moreover, the variation of the grand mean ($\sigma^2_e(\mu_{ts})$) ignores both the temporal variability and the spatial heterogeneity, which further decreases its applicability in uncertainty assessment. The variation $\sigma^2_e$ is estimated from the original data without averaging and thus retains the most information. However, it cannot account for the systematic uncertainty (bias in the mean values), which is expressed by $\sigma^2_e(\mu_{ts})$. Therefore, the four elements represent the model variations from different aspects, and no single element is able to represent all the others. Integrating the different components into $V_e$ is therefore a way to reflect all of these metrics to different degrees. Interestingly, the proportions of $\sigma^2_{e\_t}$ and $\sigma^2_{e\_s}$ (or of $\sigma^2_e$ and $\sigma^2_e(\mu_{ts})$) vary in opposite directions, and the sum of their proportions is stable at around 0.3 (or 0.7). This indicates a complementary relation within each of the two pairs of elements ($\sigma^2_{e\_t}$ and $\sigma^2_{e\_s}$; $\sigma^2_e$ and $\sigma^2_e(\mu_{ts})$). In other words, some of the information ignored in one element is retained in the other element of the same pair. It therefore indicates that the variation in the time dimension and that in the space dimension should be considered together, as done in the estimation of the ensemble variance ($V_e$). The normalized metric ($U_e$) derived from the integrated variation ($V_e$), which is better able to represent the uncertainties than the classic metrics, should be a better choice for uncertainty analysis.
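The four ensemble-dimension components described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the array layout (time, space, ensemble), the use of population variances, the synthetic data and the equal weighting used to form proportions are all assumptions, and the exact weights of Eq. (30) are not reproduced here.

```python
import numpy as np

def ensemble_components(x):
    """Ensemble-dimension variances of x with shape (time, space, ensemble)."""
    return {
        # variance across members at every (time, space) point, then averaged
        "var_e": x.var(axis=2).mean(),
        # variance across members of the temporal means, averaged over grids
        "var_e_t": x.mean(axis=0).var(axis=1).mean(),
        # variance across members of the spatial means, averaged over time steps
        "var_e_s": x.mean(axis=1).var(axis=1).mean(),
        # variance across members of the grand (time-space) means
        "var_e_mu": x.mean(axis=(0, 1)).var(),
    }

# Synthetic data (hypothetical): 27 years, 50 grid cells, 4 ensemble members,
# with a different constant bias added to each member.
rng = np.random.default_rng(0)
x = rng.gamma(shape=2.0, scale=300.0, size=(27, 50, 4))
x = x + np.array([0.0, 50.0, -30.0, 20.0])

components = ensemble_components(x)
total = sum(components.values())  # equal weights: an assumption of this sketch
proportions = {name: value / total for name, value in components.items()}
print(proportions)
```

Because averaging can only reduce the spread across members, `var_e` is never smaller than `var_e_t` or `var_e_s`, and both of these are never smaller than `var_e_mu`, which is consistent with the loss of information described for the averaged components.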

Metrics differences in value and proportion
Figure 10 shows that $U_e$ is generally higher than the uncertainty identified by the two classic metrics, N.s.std and N.t.std.
Figure 12 summarizes the magnitude of the changes from the classic metrics to the new uncertainty estimate $U_e$. The two classic metrics generally underestimate the uncertainty by around 0.03 (Figure 12-a). The underestimation by N.t.std varies more than that by N.s.std, showing a larger deviation between $U_e$ and N.t.std. Applying the new uncertainty metric increases the estimated uncertainty by around 20-40% for half of the cases compared to N.s.std (Figure 12-b). For nearly 25% of the cases, $U_e$ increases the estimated uncertainty by more than 50%. In the extreme cases, $U_e$ is more than twice N.t.std (Figure 12-b). The results show that the two classic metrics, which have been widely applied in climatic analysis, underestimate the uncertainty among different models or datasets, as was assumed when introducing the new method.
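The two kinds of change summarized in Figure 12 can be computed as in the following sketch. The helper `metric_change`, the definition of the percentage relative to the classic metric and the input values are assumptions made for illustration; they are not results from this study.

```python
import numpy as np

def metric_change(u_e, classic):
    """Return the absolute and percentage change from a classic metric to U_e."""
    u_e = np.asarray(u_e, dtype=float)
    classic = np.asarray(classic, dtype=float)
    diff = u_e - classic              # change in value
    pct = 100.0 * diff / classic      # change relative to the classic metric
    return diff, pct

# Hypothetical regional values, not taken from the paper.
u_e = [0.10, 0.30, 0.36]
n_t_std = [0.08, 0.20, 0.15]

diff, pct = metric_change(u_e, n_t_std)
print(diff, pct)
```

A percentage above 100 in this sketch corresponds to the extreme cases in which $U_e$ is more than twice the classic metric.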
The underestimation may especially occur for assessment of temporal evolution of the uncertainties (N.t.std), which is very commonly seen in scientific reports and articles to illustrate the temporal evolution of the variables of interest.

6 Discussion and Conclusion

Features and applicability of the approach
The proposed variance partitioning approach works in three dimensions and is able to use all of the information in the time and space dimensions among the multiple ensemble members. The proposed $U_e$ estimation technique is especially suitable for an overall assessment of the variations among multiple datasets over a certain period and a specific area. The trade-off is that the $U_e$ technique cannot provide the temporal evolution or the spatial heterogeneity of the uncertainty. However, in most cases we would like to know the general performance of the ensemble with a single estimate. The two classic metrics (Eqs. 28 and 29) are also single values, but their estimation averages the variations, which means a loss of information.
The results of the partitioning approach can be affected by the choice of the time step. For example, the time variance, or its proportion, will increase significantly if the time interval is chosen as one month. The inter-annual variation of precipitation will result in higher $V_t$ and lower $V_s$ or $V_e$. It depends on how significant the inter-annual variability is compared to the intra-annual variations.
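The sensitivity to the time step can be illustrated with a small synthetic example. This sketch looks only at the temporal variance of a single regional series, not at the full three-dimensional partition, and the seasonal cycle and year-to-year anomalies are hypothetical.

```python
import numpy as np

years, months = 27, 12
rng = np.random.default_rng(1)

# Hypothetical seasonal cycle (mm/month) and year-to-year anomalies (mm/month).
seasonal = 80.0 + 60.0 * np.sin(2 * np.pi * (np.arange(months) - 3) / 12)
annual_anomaly = rng.normal(0.0, 10.0, size=years)

# Monthly series arranged as (years, months): seasonal cycle plus annual anomaly.
monthly = seasonal[None, :] + annual_anomaly[:, None]

# Temporal variance with a monthly step versus an annual step (annual means).
var_monthly = monthly.reshape(-1).var()
var_annual = monthly.mean(axis=1).var()
print(var_monthly, var_annual)
```

With a monthly step the variance contains both the intra-annual (seasonal) and the inter-annual parts, whereas the annual step retains only the inter-annual part, so the balance between the two determines how strongly the temporal variance changes.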
The proposed approach has a flexible structure that can potentially deal with problems from global to regional scales. The time dimension can use daily, monthly, annual or decadal intervals, depending on the scope of the analysis. The ensemble dimension is applicable from two members (i.e., model evaluation between simulations and observations) to any number of models (consensus evaluation, Tebaldi et al., 2011; McSweeney and Jones, 2013). The approach is applicable to any variables that are organized in the three dimensions, such as climatic variables (e.g., temperature, evaporation), hydrological variables (e.g., soil moisture, runoff) or environmental variables (e.g., drought index). Based on these advantages, the three-dimensional partitioning approach can be widely applied in hydro-climatic analysis.

Figure 1. The two classic uncertainty assessments in current research: the temporal evolution of the model uncertainty (red) and the spatial distribution of the model uncertainty (blue). Either uncertainty estimate requires averaging over one of the dimensions (space or time), which leads to a loss of information in the corresponding dimension.

Figure 2. Illustration of the time-space-ensemble variance partitioning method. The original dataset is reorganized into three dimensions of time, space and ensemble. The notations of the zones are listed to the right. The grand variance is denoted $\sigma^2$ and the grand mean $\mu$. The subscripts $t$, $s$ and $e$ represent time, space and ensemble, respectively. Zone A ($\mu_i$) indicates the means of the $i$ dimension; zone B ($\sigma^2_i$) indicates the variation for the $i$ dimension; zone C ($\sigma^2_{i\_j}$) indicates the variation across the $i$ dimension of the means $\mu_j$; zone D ($\mu_{ij}$) indicates the means across the $i$ and $j$ dimensions; zone E ($\sigma^2_{ij}$) indicates the variation across the $i$ and $j$ dimensions; zone F ($\sigma^2_i(\mu_{jk})$) indicates the variation across the $i$ dimension of the means across the $j$ and $k$ dimensions. The detailed definitions of these notations can be found in Appendix A.

Figure 3. Ten subregions are identified in this study. Regions 1-8 are mainly delineated by river basins, region 9 is southwestern China and region 10 is northwestern China. Region 11 represents the whole mainland.
and General Circulation Models (GCMs). The gauge-based products (i.e., GPCC, CRU, CPC and UDEL) use observed data from global atmospheric gauges, while the density of the ground observation gauges, the representativeness of the gauges and the interpolation algorithms for converting the gauge observations to gridded datasets vary from product to product. The CMA dataset (provided by the China Meteorological Administration) uses the densest gauge network and probably has the best quality for capturing the spatiotemporal variations of precipitation. CMA is therefore excluded when estimating the ensemble means of the gauge-based products and is used as the reference dataset for comparison.
UDEL: University of Delaware Air Temperature & Precipitation; global (land) precipitation and temperature
, Japan, NIES, Ibaraki, Japan, JAMSTEC, Kanagawa, Japan
distributed at lower altitudes and therefore have difficulty in capturing the precipitation events over mountains. The precipitation in the merged products and the GCMs is higher than in CMA over the Himalayas, and the GCMs in particular show higher precipitation in the northern Tibetan Plateau as well as the southern part of the Hengduan Mountains. These differences show the general characteristics of the three types of precipitation products and how they differ.

Figure 4. Long-term (1979-2005) annual precipitation in different precipitation groups: (a) annual precipitation of the CMA dataset, (b) ensemble mean of the annual precipitation of the gauge-based products excluding CMA, (c) ensemble mean of the annual precipitation of all merged products, (d) ensemble mean of the annual precipitation of all GCMs. The observations in Taiwan are not included in the CMA dataset.

Figure 5 .
Figure 5-e) in southern China, indicating a significant model uncertainty in the annual precipitation among the different GCMs.

Figure 6. The temporal evolution of the model uncertainty, expressed as the normalized ensemble deviation of annual precipitation across datasets in each precipitation group for specific subregions. The value at the top right of each panel is the annual regional precipitation estimated from the CMA dataset. The annual precipitation is normalized as the ratio to the CMA long-term annual precipitation. The shaded area represents the standard deviation of the annual precipitation in each year among the datasets within that group (divided by the annual precipitation of the corresponding group).

Figure 7. The spatial standard deviation (horizontal) and temporal standard deviation (vertical) of the annual precipitation in different precipitation groups for the ten regions and mainland China. The centre of each cross represents the long-term mean of the regional annual precipitation. The horizontal error bar represents the spatial standard deviation (spatial variation of the long-term annual precipitation over all grids). The vertical error bar represents the temporal standard deviation (temporal variation of the region-averaged annual precipitation across years). The P value at the bottom left is the annual precipitation of CMA.
-j), with the highest $V_e$ occurring in 9-southwest China. It indicates that the model variation between datasets in the observation group is

Figure 8. Maps of the grand variance ($V$) and the variances in different dimensions ($V_t$, $V_s$, $V_e$) for the three precipitation groups.
The relative variance in the space dimension ($U_s$) contributes the most to the total variance, especially in northwestern China ($U_s = 0.77$ for the gauge-based products, Figure 9-g). The high values indicate the strong spatial heterogeneity of precipitation in the region compared to the mean values. However, because the spatial variations characterized by the GCMs in northwestern China are less significant than in the other two groups, the $U_s$ for region 10-northwest China (0.51) is smaller than that of the gauge-based and merged products. The variations in the time and space dimensions show the natural precipitation patterns, whereas the deviation of the values at the same spatiotemporal points shows the ability of the products to consistently represent the spatiotemporal patterns. The relative variance in the ensemble dimension ($U_e$) shows the variations among different products in the same group. For the gauge-based
more information than that in the time dimension, the deviation between N.t.std and $U_e$ (Figure 10-b) is larger than that between N.s.std and $U_e$ (Figure 10-a). The priority among the precipitation types also changes from model dominated (the model uncertainty in the GCMs is larger than in the other groups) to region dominated (the uncertainty in regions 9, 10 and 11 is larger than in the other regions, regardless of the precipitation group). This indicates that the difference in model uncertainty over space is reflected in the new uncertainty estimate $U_e$.

Figure 11. The proportions of the four components in Eq. (30) of $V_e$ for the three precipitation groups: (a) gauge-based products, (b) merged products and (c) GCMs. The contributions are normalized so that they sum to 1.0 for each region. Among the four components, $\sigma^2_{e\_t}$ and $\sigma^2_{e\_s}$ are associated with the two classic metrics N.s.std and N.t.std, respectively.
collapsed the spatial variations. The very small contribution of $\sigma^2_{e\_s}$ related to N.t.std is the cause of the larger deviations between N.t.std and $U_e$ (Figure 10-b).

Figure 12. The changes in (a) value and (b) percentage when using $U_e$ as the new uncertainty metric compared to the classic metrics N.s.std (Eq. 28) and N.t.std (Eq. 29).

6.2 Conclusion
A new three-dimensional partitioning approach is proposed in this study to assess the model uncertainties among multiple datasets. The new uncertainty metric ($U_e$) is estimated with an overall consideration of the temporal and spatial variations among the ensemble products. Results show that $U_e$ is generally larger than the classic uncertainty metrics N.s.std and N.t.std, which require a collapse of either the time or the space dimension. The deviation occurs where the spatial variations are significant but are averaged out in the N.t.std estimation. The decomposition of $V_e$ shows the complementary relation of the two classic metrics, and therefore the new uncertainty estimation technique ($U_e$, derived from $V_e$) is a more comprehensive way of estimating uncertainty for multiple datasets.
Thirteen precipitation datasets generated by different methodologies are categorized into three groups (i.e., gauge-based products, merged products and GCMs), and the model uncertainty among the ensemble products in the same group is analyzed with the new and the two classic uncertainty metrics. With the classic metrics, the GCMs are identified as having the largest model uncertainty in most regions, while the new estimate $U_e$ indicates that the largest model uncertainty occurs in specific regions regardless of the precipitation group. The spatial heterogeneity of the model uncertainty is well represented in the new uncertainty metric. Thus, the overall model uncertainty ($U_e$) is a new uncertainty estimate which involves more information and should receive more attention in the field of uncertainty assessment.