A Tri-Approach for Diagnosing Gridded Precipitation Datasets for Watershed Glacio-Hydrological Simulation in Mountain Regions

In mountain regions, validation and local correction of gridded precipitation datasets (GPDs) are prerequisites for glacio-hydrological simulations. However, insufficient observed data and the involvement of glaciers make this a complicated task in glacierized watersheds. To diagnose the potential problems in GPDs from multiple perspectives and provide directions for their correction, a Tri-approach framework, consisting of statistical analysis, physical diagnosis, and practical simulation, is proposed. Turc-Budyko theory is introduced into this framework; it can identify the actual under- or over-estimation of GPDs based on the watershed water-energy balance, diagnose the possible causes, and provide directions for local correction. The framework was applied to the glacierized Upper Indus Basin (UIB) to evaluate GPDs, including APHRODITE, CFSR, PGMFD, TRMM, and HAR, against adjusted observed precipitation (OBS), specific runoff, and glacier mass balance over varying periods during 1951−2017. The Spatial Processes in HYdrology (SPHY) model was used to simulate the hydrology and glacier changes (2001−2007). The results suggest that (a) the patterns of inter- and intra-annual variations of OBS precipitation were better captured by APHRODITE (CC > 0.6), although it underestimated the amount (-40%); (b) UIB was characterized as a "Leaky" catchment based on the overestimated CFSR (106%) and HAR (77%), indicating positive glacier storage changes (0.37 and 0.21 m w.e. yr⁻¹, respectively), whereas it was characterized as a "Gaining" watershed for the remaining underestimated datasets, indicating negative storage changes (-0.42 to -0.34 m w.e. yr⁻¹); and (c) for constant mass balance, the simulated runoff was overestimated in SPHY_CFSR (66%) and SPHY_HAR (53%), whereas it was underestimated in SPHY_APHRODITE (-41%), SPHY_PGMFD (-26%), and SPHY_TRMM (-33%).
These results highlight that the evaluated GPDs generally could not produce rational outputs of glacier mass balance and streamflow concurrently. The physical diagnosis directs local correction based on under- and over-estimation. The practical simulation explores the extent of the expected uncertainties in the intra- and inter-annual characteristics of glacio-hydrology.

streamflow, the relationship among precipitation, evapotranspiration, and streamflow will go beyond the principles mentioned above. The potential problems of GPDs and the involvement of glaciers can thus be detected. A Tri-approach is proposed in the current study, which is a combination of (1) statistical validation: comparisons of GPDs against the observed precipitation based on climatology; (2) physical diagnosis: assessing the physical realism of GPDs to represent a plausible water-energy balance; and (3) practical simulation: simulation of hydrology and glacier changes using glacio-hydrological models. The purpose of this framework is to provide a way of evaluating a GPD from multiple perspectives, diagnose the potential problems in it, and suggest directions for its local correction. The Tri-approach is applied as a case study in the glacierized Upper Indus Basin (UIB), which is located in a high-elevation zone and covered by extensive glaciers (Bajracharya and Shrestha, 2011).
Some popular GPDs, such as APHRODITE, CFSR, HAR, PGMFD, and TRMM, are evaluated in UIB using the proposed Tri-approach as an example application. The performance of these datasets is evaluated by answering: (a) How do the GPDs perform against the observed precipitation? (b) Can the GPDs represent the real water-energy balance (physical realism)? and (c) Can the GPDs simulate rational outputs of hydrology and glacier changes in glacierized catchments simultaneously?
Based on the evaluation, suggestions for further correction of these datasets are made.

Study area
The Upper Indus Basin (UIB) covers an area of more than 173,000 km² shared among China, India, and Pakistan (31−37°N and 72−82°E). About 50% of the UIB area lies within Pakistan. UIB hosts the eastern Hindukush, western Himalaya, and Karakoram mountain ranges (Inman, 2010; Khan et al., 2015; Mukhopadhyay and Khan, 2014). The hydroclimatic characteristics differ among the spatial sub-regions of UIB. The westerlies and the summer monsoon precipitation systems (Figure 1a) are responsible for the annual precipitation in UIB; however, the effects and contributions of the two sources differ temporally and spatially. The total number of glaciers is about 12,000, with a glacier area of more than 15,000 km² and a glacier area ratio (GAR) of about 9% (Bajracharya and Shrestha, 2011) (Figure 1c). The percent snow-covered area of UIB varies from more than 10 to 70% (Hasson et al., 2014). In UIB, the snow cover has distinct seasonal patterns, with a maximum in spring and a minimum in summer (Gurung et al., 2017). It is difficult to treat the UIB as a single unit because of the influence of multiple climatic systems as well as unique interactions among the cryosphere, atmosphere, and hydrosphere (Palazzi et al., 2013). Therefore, in this study, UIB was divided into three sub-regions based on their spatial location in the three major mountain ranges: Himalaya, Hindukush, and Karakorum ([ and Figure 1b).

Data collection and preparation
The details of the hydrology, climate, soil, land use, DEM, glacier mass balance, and reanalysis datasets are provided in [. The annual average observed discharge varied between 140 and 2431 m³ s⁻¹ at the given hydrological stations (Figure 1e). The average annual temperature was 5.3 ºC in the entire domain of UIB for 1985−2014. The average annual minimum and maximum temperatures were -0.98 ºC and 11.58 ºC, respectively (Supplementary Fig. S1).
[Table 2]
The basic wrangling, analysis, and extraction of GPDs were done using the Climate Data Operators (CDO v1.9.7) package (https://code.mpimet.mpg.de/projects/cdo), GIS 10.2, and R (https://www.r-project.org). The spatial fields for adjusted observed precipitation (OBS) were generated by interpolating and resampling the point observations. The OBS and GPDs were resampled to a common resolution of 0.25º×0.25º using a simple nearest-neighbor technique.
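As an illustration of the nearest-neighbor step, the following minimal Python sketch resamples a small gridded field onto a regular 0.25º axis. It is not the CDO workflow used in the study; the function names (`nearest_index`, `regular_axis`, `resample_nearest`) and the toy 0.5º input field are hypothetical and chosen only for this example.

```python
def nearest_index(coords, target):
    """Index of the source coordinate closest to `target` (ties go to the lower index)."""
    return min(range(len(coords)), key=lambda i: abs(coords[i] - target))


def regular_axis(start, stop, step=0.25):
    """Regularly spaced coordinate axis from `start` to `stop`, inclusive."""
    n = int(round((stop - start) / step)) + 1
    return [start + i * step for i in range(n)]


def resample_nearest(field, src_lats, src_lons, dst_lats, dst_lons):
    """Assign each target cell the value of the nearest source cell."""
    rows = [nearest_index(src_lats, lat) for lat in dst_lats]
    cols = [nearest_index(src_lons, lon) for lon in dst_lons]
    return [[field[r][c] for c in cols] for r in rows]


# Toy 0.5-degree field (values are illustrative, not data from the study).
src_lats, src_lons = [31.0, 31.5], [72.0, 72.5]
field = [[1.0, 2.0], [3.0, 4.0]]

dst_lats = regular_axis(31.0, 31.5)  # 31.0, 31.25, 31.5
dst_lons = regular_axis(72.0, 72.5)  # 72.0, 72.25, 72.5
resampled = resample_nearest(field, src_lats, src_lons, dst_lats, dst_lons)
```

Nearest-neighbor resampling preserves the original cell values and does not smooth precipitation totals, which keeps the comparison among datasets free of interpolation artefacts at the cost of blocky fields.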

Evapotranspiration calculation
The potential evapotranspiration (ET_p) was calculated with the Hargreaves method (Hargreaves and Samani, 1985) using the Drought Indices Calculator (DrinC V1.7) software (Tigkas et al., 2014). The calculated ET_p was between 795 and 1015 mm yr⁻¹, with an average of 907 mm yr⁻¹ ([). The highest values were in the Hindukush and the lowest in the Karakorum sub-region.
The choice of formula is critical in calculating ET_p (Zhou et al., 2020) because it can affect the rest of the analysis; therefore, it is essential to validate the calculated ET_p. For validating the calculated ET_p and calculating the actual evapotranspiration (ET_a), reference ET_p data (Supplementary Fig. S2a) were also extracted from the Global Reference Evapo-Transpiration (Global-ET0) product at 1 km resolution over the period 1970−2000 (Antonio and Zomer, 2018), available at the CGIAR-CSI GeoPortal. In this product, the potential evapotranspiration was estimated using the FAO Penman-Monteith method. The actual evapotranspiration data (Supplementary Fig. S2b) were extracted from the Esri_hydro "average annual actual evapotranspiration" layer, derived by researchers at the University of Montana from the MOD16 Global Evapotranspiration Product (ESRI, 2019). The ET_a was estimated using the following relationship:
$ET_a = \dfrac{ET_a^{Esri}}{ET_p^{Global}} \times ET_p$ (1)
Here, $ET_a$ represents the estimated actual evapotranspiration, $ET_a^{Esri}$ is the actual evapotranspiration based on the Esri_hydro "average annual actual evapotranspiration", $ET_p^{Global}$ is the potential evapotranspiration based on Global Reference Evapo-Transpiration, and $ET_p$ is the calculated potential evapotranspiration. The ET_a was lower than $ET_a^{Esri}$ by 4% to 17% in the sub-regions and by 11% on average in the entire UIB ([).

Precipitation adjustment
Observed precipitation is subject to several uncertainties, including snow undercatch, the effect of wind on snow redistribution, catchments located on the leeward side (e.g., the Himalaya sub-region), low station density, the uneven distribution of the observation network, and differences in the overlapping periods among the meteorological stations. The low density and uneven distribution of meteorological stations is a serious, common, and unavoidable problem in most alpine regions (Isotta et al., 2015; Liu et al., 2019), while undercatch and measurement errors may also be amplified in particular seasons (Rasmussen et al., 2012). Furthermore, the elevation of UIB ranges between ~300 and 8569 masl; however, the meteorological stations are located below ~5000 masl, with an average elevation of 3100 m (Figure 1b). Hence, there are no observed data above 5000 masl, which is an unavoidable limitation in this region (Winiger et al., 2005). Previous studies (Basist et al., 1994; Bookhagen and Burbank, 2006; Hu et al., 2015; Immerzeel et al., 2015; Johansson and Chen, 2003; Yoon et al., 2019) have shown that precipitation is largely affected by topography, and this relationship is due to the vertical deflection of moist winds aloft, the hindrance or modification of low-pressure and frontal systems, and the promotion of local convection currents (Roe, 2005). Low-elevation meteorological stations might therefore not represent the precipitation at higher elevations, and using such station data for performance evaluation may induce uncertainty in the results. A number of previous studies in the region have corrected the precipitation using different reverse modeling approaches (Dahri et al., 2018; Immerzeel et al., 2015). In the current study, the observed precipitation is adjusted for the selected stations based on the corrected precipitation in Dahri et al. (2018).
The interpolated observed precipitation was resampled to 0.25º×0.25º resolution using nearest-neighbor resampling for comparison purposes. The adjusted precipitation was 73% higher than the uncorrected observed precipitation in UIB ([a). The average annual adjusted precipitation was 540±180 mm yr⁻¹ during 1951−2017 in UIB.
The spatial distribution of adjusted precipitation in UIB is presented in [b.

The Tri-Approach framework
The Tri-approach framework includes three approaches: (a) statistical performance evaluation of GPDs against OBS to investigate the ability of GPDs to represent the climatology, (b) testing the physical realism of GPDs to represent a plausible water-energy balance based on a hydrological alternative of Turc-Budyko theory, and (c) practical simulation using hydrological models to investigate the rationality of the simulated hydrology and glacier changes. A schematic diagram of the Tri-approach is provided in [a.

Statistical analysis
The statistical analysis is based on the comparisons between GPDs and OBS precipitation using different statistical indices.
The analysis was performed to identify the differences among the GPDs in representing the patterns of monthly and seasonal distribution and the inter-annual variations of OBS precipitation. The statistical indices include the correlation coefficient (CC) (Eq. 2), percent bias (PBIAS) (Eq. 3), root mean square error (RMSE) (Eq. 4), and standard deviation (SD) (Eq. 5). The similarity in spatial or temporal patterns between two datasets is indicated by CC, the mean magnitude of the difference between two datasets is measured by RMSE, and the systematic under- or over-estimation of a dataset is shown by PBIAS. SD measures the spread of the data about the mean value. These statistical parameters were calculated as follows:
$CC = \dfrac{\sum_{i=1}^{n} (P_i^{obs} - \bar{P}^{obs})(P_i^{gd} - \bar{P}^{gd})}{\sqrt{\sum_{i=1}^{n} (P_i^{obs} - \bar{P}^{obs})^2}\,\sqrt{\sum_{i=1}^{n} (P_i^{gd} - \bar{P}^{gd})^2}}$ (2)
$PBIAS = \dfrac{\sum_{i=1}^{n} (P_i^{gd} - P_i^{obs})}{\sum_{i=1}^{n} P_i^{obs}} \times 100$ (3)
$RMSE = \sqrt{\dfrac{1}{n} \sum_{i=1}^{n} (P_i^{gd} - P_i^{obs})^2}$ (4)
$SD = \sqrt{\dfrac{1}{n} \sum_{i=1}^{n} (P_i - \bar{P})^2}$ (5)
Here, $P_i$ is the i-th observation of precipitation (the superscripts 'obs' and 'gd' represent OBS and GPD, respectively), $\bar{P}^{obs}$ is the mean of the observed precipitation, $\bar{P}^{gd}$ is the mean of the precipitation in the GPD, and n is the total number of values in the corresponding dataset.
https://doi.org/10.5194/hess-2020-194 Preprint. Discussion started: 26 June 2020. © Author(s) 2020. CC BY 4.0 License.
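The four indices can be sketched in a few lines of plain Python. This is a minimal illustration rather than the scripts used in the study: the function names are hypothetical, PBIAS is assumed to be positive for overestimation (consistent with the positive bias reported for the overestimating datasets), SD is assumed to divide by n, and the sample values are invented for the example.

```python
import math


def mean(x):
    return sum(x) / len(x)


def cc(obs, gd):
    """Pearson correlation coefficient between OBS and a GPD series."""
    mo, mg = mean(obs), mean(gd)
    num = sum((o - mo) * (g - mg) for o, g in zip(obs, gd))
    den = math.sqrt(sum((o - mo) ** 2 for o in obs)) * math.sqrt(sum((g - mg) ** 2 for g in gd))
    return num / den


def pbias(obs, gd):
    """Percent bias; positive values mean the GPD overestimates OBS (assumed convention)."""
    return 100.0 * sum(g - o for o, g in zip(obs, gd)) / sum(obs)


def rmse(obs, gd):
    """Root mean square error between OBS and a GPD series."""
    return math.sqrt(sum((g - o) ** 2 for o, g in zip(obs, gd)) / len(obs))


def sd(x):
    """Standard deviation about the mean (population form, dividing by n; an assumption)."""
    m = mean(x)
    return math.sqrt(sum((v - m) ** 2 for v in x) / len(x))


# Illustrative annual totals in mm (not data from the study).
obs = [100.0, 200.0, 300.0]
gd = [50.0, 100.0, 150.0]
print(cc(obs, gd), pbias(obs, gd), rmse(obs, gd), sd(gd))
```

In this toy case the GPD reproduces the temporal pattern perfectly but delivers half the amount, which mirrors the situation described for a dataset with high CC and strongly negative PBIAS.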
The precipitation data were available for varying periods between 1951 and 2017 for the different datasets ([) and for OBS precipitation (Supplementary Table S1). The climatology was derived for the Himalaya, Hindukush, and Karakorum sub-regions using all the available data for each rescaled dataset. The spatial distribution of GPD and OBS precipitation was explored based on the annual average precipitation in the sub-regions of UIB. The temporal distribution of OBS and GPD precipitation was explored at monthly, seasonal, and annual time scales in the sub-regions of UIB. The trend analysis of the OBS and GPDs was performed with the Mann-Kendall test, which has been widely used for non-parametric analysis in hydrometeorological studies (Hirsch et al., 1991). Sen's non-parametric method (Sen, 1968) [...] Winter. It is important to mention that some previous studies considered only two seasons to represent the seasonal precipitation, i.e., winter (usually Oct−Mar) and summer (Jul−Sep) (e.g., Dahri et al., 2016; Hewitt, 2007).
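For readers unfamiliar with the two trend methods, the following minimal Python sketch computes the Mann-Kendall statistic with its normal approximation and the Sen's slope estimate. It is a simplified illustration: it assumes evenly spaced annual values and no tied values (no tie correction in the variance), and the function names are hypothetical.

```python
import math
from statistics import median


def mann_kendall(x):
    """Return (S, Z, two-sided p-value) of the Mann-Kendall trend test (no tie correction)."""
    n = len(x)
    s = sum((x[j] > x[i]) - (x[j] < x[i]) for i in range(n - 1) for j in range(i + 1, n))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value from the standard normal
    return s, z, p


def sens_slope(x):
    """Median of all pairwise slopes (change per time step)."""
    n = len(x)
    return median((x[j] - x[i]) / (j - i) for i in range(n - 1) for j in range(i + 1, n))


# Illustrative, strictly increasing series (not data from the study).
series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
s, z, p = mann_kendall(series)
slope = sens_slope(series)
```

A trend is usually called significant when the p-value falls below a chosen level such as 0.05; the Sen's slope then gives its magnitude in the units of the series per time step.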
The annual average precipitation for each dataset was compared with OBS precipitation for the corresponding overlapped period in sub-regions of UIB. Taylor's diagram (Taylor, 2001) was used to express the comparison results graphically.
ETCCDI indices (Peterson, 2005) are generally used to analyze the extreme precipitation characteristics in GPDs (Nastos et al., 2013). In this study, four ETCCDI indices, namely consecutive dry days (CDD), consecutive wet days (CWD), precipitation due to extremely wet days (R99pTOT), and the simple precipitation intensity index (SDII), were selected to compare the performance of GPDs in representing the precipitation extremes. CDD is the maximum length of dry spells with daily precipitation <1 mm, CWD is the maximum length of wet spells with daily precipitation ≥1 mm, R99pTOT is the annual total precipitation from days on which the wet-day amount exceeds the 99th percentile, and SDII is the mean precipitation amount on wet days. The RClimDex software package (https://github.com/ECCC-CDAS/RClimDex) was used to calculate the ETCCDI indices. The average values of the indices for the GPDs were compared with those for OBS in the sub-regions of UIB.
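The definitions of the four indices can be made concrete with a short Python sketch for a single daily series. This is not RClimDex; it is a simplified illustration in which a wet day is taken as a day with at least 1 mm, the 99th-percentile threshold `p99` is supplied by the caller (in ETCCDI practice it comes from a base period), and all function names and sample values are hypothetical.

```python
def longest_run(flags):
    """Length of the longest run of consecutive True values."""
    best = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best


def cdd(daily, wet=1.0):
    """Maximum number of consecutive dry days (precipitation below `wet` mm)."""
    return longest_run([p < wet for p in daily])


def cwd(daily, wet=1.0):
    """Maximum number of consecutive wet days (precipitation of at least `wet` mm)."""
    return longest_run([p >= wet for p in daily])


def sdii(daily, wet=1.0):
    """Mean precipitation on wet days (mm per wet day)."""
    wet_days = [p for p in daily if p >= wet]
    return sum(wet_days) / len(wet_days) if wet_days else 0.0


def r99ptot(daily, p99, wet=1.0):
    """Total precipitation from wet days exceeding the supplied 99th-percentile threshold."""
    return sum(p for p in daily if p >= wet and p > p99)


# Illustrative ten-day series in mm (not data from the study).
daily = [0.0, 0.0, 5.0, 2.0, 0.5, 0.0, 0.0, 0.0, 10.0, 1.0]
print(cdd(daily), cwd(daily), sdii(daily), r99ptot(daily, p99=8.0))
```

Datasets that spread precipitation over many light-rain days tend to show short CDD, long CWD, and low SDII, which is why these indices are useful for separating drizzle-prone reanalyses from gauge-based products.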

Physical diagnosis
The physical diagnosis was performed to identify the actual over- and under-estimation of the GPDs at watershed and annual scales based on the water-energy balance of the watershed. The hydrological alternative of the Turc-Budyko plot was used to diagnose whether a GPD reproduces a plausible water-energy balance. First, the water input was compared with the water output in the sub-regions of UIB. To do so, the precipitation from the selected datasets, including OBS, was compared with the specific runoff at monthly and annual scales to assess the quantitative relationship between the two. Then, a non-dimensional representation of the physical water-energy balance was applied to estimate the actual under- or over-estimation in the glacierized catchment. The most widely used representation of this type was proposed by Turc (1954) and Budyko (1974). Finally, the water balance equation of a glacierized catchment was used to estimate the change in glacier storage for each dataset.
In this study, the physical realism of each precipitation dataset was verified using a hydrological alternative of the non-dimensional Turc-Budyko plot (Andréassian and Perrin, 2012). In this approach, the realistic closure of the water-energy balance was tested using the precipitation from each dataset. The long-term water yield or runoff coefficient (Q/P) was plotted as a function of the long-term aridity index (P/ET_p) (Coron et al., 2015; Valéry et al., 2010), i.e.,
$\dfrac{Q}{P} = f\left(\dfrac{P}{ET_p}\right)$ (6)
Here, Q, P, and ET_p represent the specific runoff, precipitation, and potential evapotranspiration in a catchment, respectively. Plotting the aridity index on the x-axis also allows focusing on the wettest and driest catchments based on the input precipitation (a wetter catchment corresponds to a higher P/ET_p value). The physical interpretation of this hydrological representation is based on three assumptions: (1) Q ≥ 0, (2) Q ≥ P − ET_p, and (3) Q ≤ P ([b). All three limits are based on the water balance equation of a water-tight (conservative) catchment, which can be written as follows:
$P = Q + ET_a, \quad 0 \le ET_a \le ET_p$ (7)
A point (representing a catchment, with a different position in the plot for each precipitation dataset) that falls within the feasible domain is considered a realistic or "True" catchment ([b). The feasible domain is the area on or below the water limit and on or above the energy limit. If a data point falls above the water limit (i.e., Q > P) or below the energy limit (i.e., Q < P − ET_p), it is called a "Gaining" or "Leaky" catchment, respectively (Andréassian and Perrin, 2012). When a glacierized catchment falls in the "Gaining" zone (Q > P) ([b) based on the water-energy balance, there must be an additional water term that contributes to the total runoff. The meltwater contributions to total runoff highlight that higher precipitation is required to sustain such glacier systems in glacierized catchments.
Hence, the precipitation in that GPD is underestimated compared to the actual water input in a glacierized catchment. In the case of a "Leaky" catchment (Fig. 2b), where the runoff falls below the energy limit (Q < P − ET_p), either a part of the total runoff is missing from the water balance or the precipitation is overestimated. In glacierized catchments, the missing water can be stored in the form of a positive glacier mass balance. Therefore, adjusting the precipitation until both the mass balance and the streamflow are rational can lead to a corrected precipitation that is sufficient to sustain both the water balance and the mass balance.
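The three-way diagnosis described above reduces to two inequality checks, as the following minimal Python sketch shows. The function names are hypothetical. The precipitation and ET_p values are the UIB averages quoted in this paper for APHRO (323 mm yr⁻¹), CFSR (1115 mm yr⁻¹), and the calculated ET_p (907 mm yr⁻¹), whereas the runoff values are invented for the illustration and are not results of the study.

```python
def budyko_coordinates(p, q, et_p):
    """Return (P/ET_p, Q/P), the x and y coordinates in the non-dimensional plot."""
    return p / et_p, q / p


def classify_catchment(p, q, et_p):
    """Classify a catchment from long-term P, Q, and ET_p (same units, e.g. mm/yr)."""
    if q > p:
        return "Gaining"  # above the water limit: runoff exceeds precipitation
    if q < p - et_p:
        return "Leaky"    # below the energy limit: runoff deficit exceeds ET_p
    return "True"         # inside the feasible domain


ET_P = 907.0  # average calculated ET_p in UIB (mm/yr)

# Runoff values below are hypothetical, used only to exercise each branch.
print(classify_catchment(p=323.0, q=500.0, et_p=ET_P))   # underestimated precipitation
print(classify_catchment(p=1115.0, q=150.0, et_p=ET_P))  # overestimated precipitation
print(classify_catchment(p=700.0, q=500.0, et_p=ET_P))   # plausible closure
```

Because the same runoff and ET_p are paired with each dataset's precipitation, a change of class between datasets can only come from the precipitation input, which is what makes the plot a diagnostic for the GPDs rather than for the catchment.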
In glacierized catchments, the simplest water balance equation can be written as:
$P - Q - ET_a - MB = \Delta S$ (8)
Here, ET_a and MB represent the actual evapotranspiration and the mass balance in the watershed, respectively. The imbalance in Eq. (8) is the change in storage (∆S). When ∆S = 0, the catchment is a perfect "True" catchment. A "True" catchment can have a slight positive or negative change in storage depending on the quality of the observed mass balance and ET_a. However, "Gaining" catchments always have a negative change in storage (∆S < 0), and "Leaky" catchments always have a positive change in storage (∆S > 0). A negative change in storage corresponds to melting glaciers that contribute additional water to the total runoff, whereas a positive change in storage represents advancing glaciers in a catchment where heavy precipitation falls in solid form and is stored as glacier ice.

Practical simulation
The practical simulation is used to test whether the GPDs are capable of producing a balanced output of streamflow and glacier changes at the same time. The observed glacier, snow cover, and hydrology data are used to calibrate the glacio-hydrological model, and the simulated runoff and mass balance are then analyzed for rationality in a glacierized catchment. The Spatial Processes in HYdrology (SPHY) model (Terink et al., 2015) was used for the practical validation of the ability of all the precipitation datasets to simulate hydrology and glacier changes. The SPHY model is a fully distributed, raster-based, leaky-bucket-type glacio-hydrological model that has been used in different glacierized regions. It has been developed using key components of HydroS (Droogers and Immerzeel, 2010), PCR-GLOBWB (Bierkens and van Beek, 2009), SWAT (Arnold et al., 1998), HimSim (Immerzeel et al., 2012), and SWAP (Van Dam et al., 1997). The primary advantage of the SPHY model is its glacier module, which can distinguish between clean-ice and debris-covered glaciers. The debris-covered glacier area in UIB is ~18% (Khan et al., 2015), and debris-covered glaciers can affect the overall meltwater contributions (Kaab et al., 2012; Khan et al., 2015). The SPHY model allows different degree-day factors to be assigned to the two glacier types to differentiate between their melt rates.
In the SPHY model, total runoff is the sum of four possible components: glacier runoff, snow runoff, baseflow runoff, and rain runoff:
$R_{total} = R_{glacier} + R_{snow} + R_{base} + R_{rain}$
Here, R represents the runoff (mm) for a unit time step. The glacier runoff is composed of supraglacial snowmelt, ice melt, and direct rain-on-ice runoff. As the SPHY model is based on grid-cell spatial discretization, a sub-grid parameterization is used to differentiate the glacier cover, i.e., the clean-ice and debris-covered glacier fractions within a grid cell, using a debris cover mask starting from the lower elevations. The dynamic snow and soil water storages are solved within the remaining fraction of the grid cell. A detailed description of the SPHY model is given in Terink et al. (2015). Here, a brief description of snow and glacier runoff generation is provided.
Based on a threshold temperature, precipitation is differentiated into rain or snowfall (P = Snow when T_avg ≤ T_threshold).
Snowmelt is calculated using a degree-day model (Hock, 2003) as follows:
$M_{snowPot} = \max(T_{avg}, 0) \times DDF_{snow}$
$M_{snowAct} = \min(M_{snowPot}, \Delta Snow_{t-1})$
Here, M_snowPot is the potential snowmelt (mm), and the actual snowmelt (M_snowAct) (mm) is limited by the snow storage of the previous day (∆Snow_{t-1}). DDF_snow (mm ºC⁻¹ day⁻¹) is the degree-day factor for snow, and it is a calibration parameter.
Snow runoff is generated when the air temperature is above the melting point and the melted snow can no longer be refrozen within the snowpack. The snow runoff is the balance of the actual snowmelt, the liquid precipitation, and the refrozen meltwater.
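The snow routine described above can be illustrated with a single-cell, single-day Python sketch. It is a deliberately reduced version of the degree-day logic, not the SPHY source code: refreezing and the water-holding capacity of the snowpack are omitted, the melting point and the rain/snow threshold are both assumed to be 0 ºC by default, and the function name and numerical values are hypothetical.

```python
def snow_step(t_avg, precip, snow_store, ddf_snow, t_threshold=0.0):
    """Advance the snow store by one day.

    t_avg       daily mean air temperature (deg C)
    precip      daily precipitation (mm)
    snow_store  snow storage at the end of the previous day (mm w.e.)
    ddf_snow    degree-day factor for snow (mm per deg C per day)
    Returns (rain, actual_melt, new_snow_store), all in mm.
    """
    snowfall = precip if t_avg <= t_threshold else 0.0
    rain = precip - snowfall
    melt_pot = max(t_avg, 0.0) * ddf_snow  # potential melt from positive degrees
    melt_act = min(melt_pot, snow_store)   # cannot melt more than was stored yesterday
    new_store = snow_store + snowfall - melt_act
    return rain, melt_act, new_store


# Warm day: the potential melt (20 mm) exceeds the store (12 mm), so the store empties.
warm = snow_step(t_avg=5.0, precip=10.0, snow_store=12.0, ddf_snow=4.0)
# Cold day: all precipitation accumulates as snow and nothing melts.
cold = snow_step(t_avg=-2.0, precip=10.0, snow_store=12.0, ddf_snow=4.0)
```

The cap on actual melt is what lets a precipitation bias propagate into the simulation: an underestimating dataset exhausts its snow store early in the melt season, so the calibrated degree-day factors tend to compensate through glacier melt.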
Glacier melt is calculated separately for clean-ice and debris-covered glaciers using the same degree-day approach:
$M_{glacierCI} = \max(T_{avg}, 0) \times DDF_{glacierCI} \times F_{glacierCI}$
$M_{glacierDC} = \max(T_{avg}, 0) \times DDF_{glacierDC} \times F_{glacierDC}$
Here, M_glacierCI and M_glacierDC are the daily glacier melt from clean-ice and debris-covered glaciers, respectively; F_glacierCI and F_glacierDC are the fractions of debris-free and debris-covered glaciers, respectively; and DDF_glacierCI and DDF_glacierDC (mm ºC⁻¹ day⁻¹) are the degree-day factors for debris-free and debris-covered glaciers, respectively. The total glacier melt within a grid cell (M_glacierT) is calculated by multiplying the total glacier fraction (F_glacier) by the sum of the daily glacier melt from debris-free and debris-covered glaciers as follows:
$M_{glacierT} = F_{glacier} \times (M_{glacierCI} + M_{glacierDC})$
The glacier runoff (R_glacier) is calculated as the product of the glacier runoff factor (G_RF, a calibration parameter used to allow percolation) and the total glacier melt within a grid cell as follows:
$R_{glacier} = G_{RF} \times M_{glacierT}$
The remaining meltwater percolates into the soil layers and recharges groundwater, which, after the baseflow recession days (BF_days, a calibration parameter), is added to the total runoff as baseflow (Terink et al., 2015). A three-fold multi-objective calibration is adopted to avoid the issues of equifinality caused by the glacier compensation effect.
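The glacier melt and runoff partition for one grid cell and one day can be sketched in Python as follows. This is an illustration of the degree-day bookkeeping described above, not the SPHY implementation; the function name and all parameter values in the example are hypothetical, and melt is assumed to occur only for positive daily mean temperatures.

```python
def glacier_runoff(t_avg, ddf_ci, ddf_dc, f_ci, f_dc, f_glacier, g_rf):
    """Return (glacier_runoff, percolation) in mm for one grid cell and one day.

    ddf_ci, ddf_dc  degree-day factors for clean-ice and debris-covered ice
    f_ci, f_dc      clean-ice and debris-covered fractions of the glacier
    f_glacier       glacier fraction of the grid cell
    g_rf            glacier runoff factor (share of melt that becomes runoff)
    """
    positive_degrees = max(t_avg, 0.0)
    melt_ci = positive_degrees * ddf_ci * f_ci
    melt_dc = positive_degrees * ddf_dc * f_dc
    melt_total = f_glacier * (melt_ci + melt_dc)
    runoff = g_rf * melt_total
    percolation = melt_total - runoff  # remainder recharges groundwater
    return runoff, percolation


# Hypothetical parameter values chosen only to show the arithmetic.
runoff, percolation = glacier_runoff(
    t_avg=4.0, ddf_ci=7.0, ddf_dc=3.0, f_ci=0.8, f_dc=0.2, f_glacier=0.5, g_rf=0.9
)
```

Keeping separate degree-day factors for the two ice types is the design choice that lets the calibration lower the melt under debris without changing the clean-ice melt rate.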
In the first step, the degree-day factors for clean-ice and debris-covered glaciers were optimized ([) based on the area-weighted mean glacier mass balance. The observed mass balance data were extracted from the literature. The mass balance in the SPHY model was taken as the accumulation in the form of solid precipitation on the grid cells with a glacier fraction and on adjacent grid cells with a slope steeper than 0.2. In the second step, the parameters related to snow were calibrated based on the snow extent in the basin. The average monthly snow cover was compared with the MODIS snow cover, which was averaged for every month from the MODIS 8-day product. In the third step, the parameters related to baseflow and routing were calibrated based on the observed daily runoff at the Besham Qila gauge station.
After parameterizing the base SPHY project, six SPHY projects were developed using the calibrated set of sensitive parameters for each precipitation dataset in the entire domain of UIB. All SPHY projects had the same datasets and specifications, except for the precipitation data. To assess the rationality of the simulated glacio-hydrological results, either the runoff or the mass balance should be identical among all the SPHY projects. Therefore, the SPHY projects based on GPDs were retuned for the glacier and snow parameters to achieve a similar average mass balance for all the SPHY projects, and the rationality of the glacio-hydrological outputs was then investigated. The simulated glacio-hydrology of the SPHY projects was compared in terms of hydrological performance at the daily scale. The inter-annual variations in total runoff and mass balance and the PBIAS relative to the observed runoff were investigated. The contributions of the total runoff components were also compared among the outputs of the six SPHY projects.

Statistical validation based on climatology
The first component of the Tri-approach framework is the statistical comparison of the abilities of GPDs to represent the climatology of OBS precipitation. The GPDs were compared against OBS over varying periods from 1951 to 2017 to evaluate their spatiotemporal performance. In all the GPDs, the northeast part of UIB had low precipitation compared to the other parts, whereas the southwest part, with the minimum average elevation, had the maximum amount of precipitation ([a). The spatial distribution patterns in CFSR and HAR resembled that of OBS; however, the amount of precipitation was overestimated. In UIB, the average annual precipitation was 323±99 mm yr⁻¹, 1115±419 mm yr⁻¹, 955±218 mm yr⁻¹, 410±84 mm yr⁻¹, and 342±86 mm yr⁻¹ for APHRO, CFSR, HAR, PGMFD, and TRMM, respectively, over varying periods from 1951 to 2017. The trend analysis of the OBS precipitation data showed a significant positive trend in all sub-regions except the Hindukush (Supplementary Table S3). The Mann-Kendall test statistics revealed that all the GPDs showed random trends in UIB.

[Figure 5]
The evaluation results of the GPDs are presented graphically using Taylor's diagrams for the sub-regions and UIB ([). The proximity of a point to the reference (OBS) and its position along the correlation axis indicate the performance. CFSR had the lowest CC (<0.2) and the highest bias (106%) in UIB ([). The performance of HAR in representing the inter-annual variations of OBS was also unsatisfactory because of its high bias (77%) and low correlation. APHRO was identified as the best dataset for representing the pattern of annual variations in UIB, with a higher correlation (CC > 0.6) ([); however, it underestimated precipitation by 50% in UIB ([).
The annual precipitation cycle is represented using hyetographs for each dataset in the three sub-regions at the monthly time scale ([a). The annual cycle of OBS precipitation had a bi-modal hyetograph, with the first peak in April and the second in August in all sub-regions. The monthly distribution of area-weighted precipitation indicated a bi-modal weather system in UIB. The annual precipitation distribution pattern was associated with the westerlies in winter and the Indian monsoon in summer. The first peak of the OBS precipitation is due to the westerlies, as most of the precipitation occurs in the winter and spring seasons in solid form. The second peak is due to the summer monsoon in the region. Most datasets captured the second peak of OBS precipitation in the Himalaya sub-region, while the first peak was reproduced by most of the datasets in the Hindukush and Karakorum sub-regions. This highlights that the GPDs can represent the influence of the westerlies to some extent in the Karakorum and Hindukush sub-regions, and that of the monsoon in the Himalaya. However, the precipitation amount is underestimated in APHRO, TRMM, and PGMFD, whereas it is overestimated in CFSR and HAR compared to OBS. APHRO performed better (CC > 0.8) in representing the patterns of the monthly distribution of OBS (Supplementary Fig. S5).
The seasonal distribution of precipitation was explored and compared for the winter, spring, summer, and autumn seasons in the sub-regions of UIB ([b). Most (61%) of the annual OBS precipitation occurred in the winter and spring seasons in UIB. In the Himalaya, Hindukush, and Karakorum sub-regions, 63%, 62%, and 56% of the annual precipitation occurred in winter and spring, respectively ([b). Some previous studies combined the two seasons and labeled the total as winter precipitation. On average, the winter and spring precipitation was overestimated by CFSR and HAR by 13% and 22%, respectively, whereas it was underestimated by APHRO (-23%), PGMFD (-7%), and TRMM (-17%) compared to OBS in UIB. This highlights that the westerlies mostly influenced the distribution of annual precipitation in UIB.

[Figure 7]
The average values of the precipitation extremes in OBS and GPDs based on the selected ETCCDI indices are presented in [. CDD based on APHRO was the highest in the Himalaya and Karakorum sub-regions, while CDD based on TRMM was the highest in the Hindukush. Among the GPDs, the lowest CDD values were for CFSR in all the sub-regions. The average values of CDD for OBS, APHRO, CFSR, HAR, PGMFD, and TRMM were 30±20 days, 52±27 days, 22±6 days, 26±7 days, 46±25 days, and 45±18 days in all sub-regions, respectively. This shows that the average duration of dry spells (CDD) was overestimated in APHRO, PGMFD, and TRMM and underestimated in CFSR and HAR compared to OBS. In contrast, the maximum length of wet spells was the longest for CFSR and the shortest for APHRO. The average values of CWD for OBS, APHRO, CFSR, HAR, PGMFD, and TRMM were 10±4 days, 6±2 days, 22±13 days, 13±5 days, 8±3 days, and 8±4 days in UIB, respectively. On average, the highest value of R99pTOT was for OBS, and all GPDs underestimated this index compared to OBS. Among the GPDs, CFSR had the highest value of R99pTOT, whereas APHRO had the lowest. The SDII values for HAR, CFSR, and PGMFD were greater than that for OBS, whereas APHRO and TRMM showed smaller values. The greatest SDII was for the HAR dataset.

Precipitation versus specific runoff
First, the monthly specific runoff was compared to the monthly area-weighted region-wise precipitation. The intra-annual distribution of specific runoff was found to be quite different from that of precipitation ([). The comparisons among the datasets were made using the runoff coefficients (Q/P) for each month. Runoff coefficients greater than one (Q/P > 1) mean that the runoff is higher than the precipitation. Based on the values of the runoff coefficients, it was noted that the runoff peaks occurred during Dec−Apr in all sub-regions ([) because most of the precipitation falls in the winter and spring seasons ([). The runoff coefficients in the Hindukush were lower than those in the Himalaya and Karakorum sub-regions for all datasets. This relationship between runoff and precipitation arises because the winter and spring precipitation occurs mostly in solid form as snow and remains in place until it starts melting and contributes to the late-spring and early-summer flows.
The accumulation of snow during winter and the melting of snow and glaciers during summer create the difference between the distributions of precipitation and specific runoff.
[Figure 9]
Water-year precipitation totals based on all the selected datasets were compared with the annual runoff in the Himalaya, Hindukush, and Karakorum over varying periods from 1983 to 2010, depending on the data availability and the overlapping period of the respective dataset and sub-region ([). Considerable differences were found among the datasets when compared with the runoff in the sub-regions. The water-year total precipitation was lower than the runoff in the Hindukush and Karakorum sub-regions for APHRO, PGMFD, and TRMM. The correlation between annual runoff and precipitation was significant and satisfactory for APHRO, PGMFD, and TRMM in the Himalaya sub-region. Overall, TRMM and PGMFD showed a good and significant correlation with the annual runoff in UIB (0.68 and 0.54, respectively); however, their precipitation totals were lower than the annual runoff. The Indian summer monsoon probably played a significant role in the greater precipitation totals in the Himalaya sub-region.
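The monthly comparison can be reproduced with a very small Python sketch that computes the runoff coefficient for each month and flags the months in which runoff exceeds precipitation. The function names and the two-month example values are hypothetical and are not data from the study.

```python
def runoff_coefficients(q_monthly, p_monthly):
    """Monthly Q/P; months without precipitation yield NaN instead of a division error."""
    return [q / p if p > 0 else float("nan") for q, p in zip(q_monthly, p_monthly)]


def months_exceeding_one(coefficients):
    """Zero-based indices of the months in which runoff exceeds precipitation."""
    return [i for i, c in enumerate(coefficients) if c > 1.0]


# Illustrative specific runoff and precipitation in mm (not data from the study).
q = [10.0, 40.0]
p = [20.0, 20.0]
coeffs = runoff_coefficients(q, p)
melt_months = months_exceeding_one(coeffs)
```

Months with Q/P > 1 are those in which water released from storage (snow or ice) dominates the hydrograph, so they should be read as a storage signal rather than as a precipitation error by themselves.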

Physical realism to reproduce plausible water balance
The physical realism of each dataset in representing the water balance in each region was tested and plotted using a hydrological alternative of the Truc-Budyko plot ([). The aridity index (P/ETp) and runoff coefficient (Q/P) were plotted on the x-axis and y-axis, respectively. Each point represents a catchment, and the colors differentiate the datasets. Catchments falling within the feasible domain were considered physically realistic, or "True", catchments. In "True" catchments, precipitation was enough to reproduce the water balance; however, this amount of precipitation may or may not be sufficient to represent the mass balance in glacierized catchments.
The points above the Q/P = 1 line or below the energy limit (right side of the theoretical Budyko line) were considered physically unrealistic. Most of the points were above the Q/P = 1 line for OBS and the gridded datasets except CFSR. These points represent the "Gaining" catchments, in which precipitation was not sufficient to close the water balance. The points beyond the energy limit (i.e., Q < P − ETp) were characterized as "Leaky" catchments, i.e., the runoff deficit (P − Q) was greater than the potential evapotranspiration. In our case, HAR and CFSR represent the Hindukush, Karakorum, and the average of the entire UIB as "Leaky" ([). Such behavior and possible deviations can be explained by potential errors and uncertainties in the observed runoff and calculated ETp in the study area. Moreover, the theoretical Budyko curve (energy limit) is usually different for glacierized basins because of an additional water term in the water balance from glacier melting. Based on the aridity-index values, CFSR and HAR were identified as making the whole study area extremely wet.
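The two limits used in this diagnosis translate directly into a small classification rule. The sketch below is only an illustration of the water limit (Q > P) and energy limit (Q < P − ETp) stated in the text; the function names and the sample annual totals are hypothetical, and points lying exactly on a limit are assumed here to belong to the feasible domain.

```python
def classify_catchment(p_mm, q_mm, etp_mm):
    """Classify a catchment from annual P, Q, and potential ET (all in mm)."""
    if p_mm <= 0 or etp_mm <= 0:
        raise ValueError("precipitation and potential ET must be positive")
    if q_mm > p_mm:
        # Water limit violated: more runoff than precipitation
        return "Gaining"
    if q_mm < p_mm - etp_mm:
        # Energy limit violated: runoff deficit exceeds potential ET
        return "Leaky"
    return "True"


def budyko_coordinates(p_mm, q_mm, etp_mm):
    """Return (P/ETp, Q/P), the x and y coordinates used in the plot."""
    return p_mm / etp_mm, q_mm / p_mm


# Hypothetical annual totals in mm (not taken from the study)
label_low_precip = classify_catchment(400.0, 600.0, 500.0)
label_high_precip = classify_catchment(1500.0, 600.0, 500.0)
label_feasible = classify_catchment(800.0, 500.0, 500.0)
```

With the same runoff and potential evapotranspiration, a dataset with too little precipitation lands in the "Gaining" class and one with too much lands in the "Leaky" class, which is how the diagnosis separates under- from over-estimation.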
In the "Gaining" catchments, which break the water limit (Q > P), an additional water term is added to the water balance. This additional water is contributed by glacier melt in the glacierized catchments, and such behavior results in a negative change in glacier storage (∆S < 0). For example, in the Himalaya sub-region, all the datasets except CFSR were "Gaining", and meltwater contributed to the total runoff. On the other hand, in the "Leaky" catchments, which break the energy limit (Q < P − ETp), some quantity of water is missing from the water balance. This missing water is stored as a positive change in glacier storage (∆S > 0). For example, CFSR and HAR made the Hindukush and Karakorum sub-regions and the entire UIB domain "Leaky", where the water missing from the water balance may result in advancing glaciers.
https://doi.org/10.5194/hess-2020-194 Preprint. Discussion started: 26 June 2020 © Author(s) 2020. CC BY 4.0 License.

Practical validation based on simulated hydrology and glacier changes
This section provides the results of glacio-hydrological simulations based on the SPHY model to test the ability of GPDs to generate rational outputs of streamflow and glacier changes. These results may reveal problems of temporal distribution, water balance, and involvement of glaciers, as found in the previous sections.
The degree-day factors for debris-covered and debris-free glaciers were calibrated based on the observed mass balance data.
The degree-day factor for snow, water storage capacity, and threshold temperature were calibrated using MODIS snow cover data in UIB. The baseflow and routing-related parameters were calibrated using observed runoff data at the Besham Qila hydrological gauge station. The calibrated degree-day factors ([) fall within the range of observed degree-day factors in the Karakorum mountains (Zhang et al., 2006).
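For readers unfamiliar with degree-day factors, the generic temperature-index relation they parameterize can be sketched as below. This is the textbook form of a degree-day melt model, not the SPHY source code; the function name, argument names, and sample values are assumptions made for illustration.

```python
def degree_day_melt(temp_c, ddf_mm_per_degc_day, threshold_c=0.0):
    """Daily melt (mm) from a generic degree-day (temperature-index) relation."""
    if ddf_mm_per_degc_day < 0:
        raise ValueError("degree-day factor must be non-negative")
    # Melt occurs only when air temperature exceeds the threshold temperature
    positive_degrees = max(temp_c - threshold_c, 0.0)
    return ddf_mm_per_degc_day * positive_degrees


# Hypothetical values: a factor of 5 mm per degree C per day at +3 and -2 degrees C
melt_warm_day = degree_day_melt(3.0, 5.0)
melt_cold_day = degree_day_melt(-2.0, 5.0)
```

Calibrating separate factors for snow, debris-free ice, and debris-covered ice, as described above, amounts to choosing a different `ddf_mm_per_degc_day` for each surface type.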
The average simulated mass balance was -0.17 m w.e. yr⁻¹ ([), which was in very good agreement with the mass balance derived in previous studies (Brun et al., 2017; Gardelle et al., 2013; Kääb et al., 2012; Kääb et al., 2015; Muhammad et al., 2019). Previous studies indicated a minimum snow cover of less than 10% using MODIS data (Hasson et al., 2014). The difference from the values in our study is due to the study area size and the period selected for the evaluations. They included the Jhelum and Kabul basins in their evaluations, where the minimum snow cover drops to less than 5% of the total basin area. The simulated snow cover is a good match with a previous study in the region (Lutz et al., 2016).
To assess the rationality of the simulated glacio-hydrology in UIB for different precipitation datasets, the SPHY projects were calibrated to produce an average mass balance similar to that of the base calibrated model (i.e., SPHY_OBS). In UIB, the mass balance simulated by SPHY_OBS, SPHY_APHRO, SPHY_CFSR, SPHY_HAR, SPHY_PGMFD, and SPHY_TRMM was -0.17±0.17, -0.17±0.21, -0.17±0.48, -0.17±0.56, -0.17±0.18, and -0.17±0.10 m w.e. yr⁻¹ for 2002−2007, respectively ([e). This highlights that when the simulated mass balance is calibrated to the observed mass balance in the basin, the simulated runoff breaks the rationality of the glacio-hydrological outputs: it is either over- or under-estimated compared with the observed runoff in the basin. In the current study, SPHY_CFSR and SPHY_HAR overestimated runoff, whereas SPHY_APHRO, SPHY_PGMFD, and SPHY_TRMM underestimated runoff compared with the observed runoff in UIB ([d).
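The over- and under-estimation of simulated runoff can be expressed as a percent bias relative to the observed runoff. The sketch below assumes a volume-based definition with a positive sign for overestimation, which matches the sign convention of the percentages reported in this paper; the exact formula used by the authors is not stated, and the sample series are hypothetical.

```python
def percent_bias(simulated, observed):
    """Percent bias of simulated versus observed totals (positive = overestimation)."""
    if len(simulated) != len(observed):
        raise ValueError("series must have equal length")
    total_observed = sum(observed)
    if total_observed == 0:
        raise ValueError("observed total must be non-zero")
    return 100.0 * (sum(simulated) - total_observed) / total_observed


# Hypothetical runoff series in mm (not taken from the study)
bias_wet_forcing = percent_bias([150.0, 150.0], [100.0, 100.0])
bias_dry_forcing = percent_bias([60.0, 60.0], [100.0, 100.0])
```

Because the mass balance is held fixed across the SPHY projects, a bias of this kind can be attributed to the precipitation forcing rather than to compensating glacier melt.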
In UIB, the total runoff was contributed first by snow runoff in late spring to early summer, and then glacier runoff started contributing to generate maximum flows in summer. Meanwhile, the summer monsoon also played a role in producing peak values of total runoff during summer ([). Baseflow joined the total runoff with a recession of more than three and a half months ([) after percolation during the melting season, in addition to the running baseflow runoff. At the annual scale, glacier runoff, snow runoff, baseflow runoff, and rainfall-runoff contributed 44−49%, 30−35%, 14−20%, and 3−5% of total runoff, respectively, across the six SPHY projects forced with OBS and GPDs ([a).
For all the SPHY projects, the snow runoff contributions were highest in the spring season, while the glacier contributions were highest in the summer. In the spring season, snow runoff contributed 48%, 64%, 56%, 59%, 65%, and 67% of the total runoff simulated under SPHY_OBS, SPHY_APHRO, SPHY_CFSR, SPHY_HAR, SPHY_PGMFD, and SPHY_TRMM during 2002−2007 over UIB, respectively ([b). All the simulated snow runoff contributions based on GPDs were higher than that of SPHY_OBS. Similarly, in the summer season, glacier runoff contributed 51%, 62%, 59%, 55%, 61%, and 58% of the runoff in UIB for SPHY_OBS, SPHY_APHRO, SPHY_CFSR, SPHY_HAR, SPHY_PGMFD, and SPHY_TRMM, respectively ([b). Again, the simulated glacier runoff contributions during the summer season were higher for the GPDs than for OBS. It is also important to mention that the combined amount of water contributed by glacier and snow runoff during the summer and spring seasons was higher for CFSR (69%) and HAR (53%) than for OBS, whereas it was lower for APHRO (-44%), PGMFD (-27%), and TRMM (-35%) ([b). This also highlights the irrational behavior of the hydrological outputs, as contributions from meltwater were overestimated for overestimated GPDs and underestimated for underestimated GPDs while keeping the mass balance constant.
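The contribution percentages quoted above are shares of each runoff component in the total. A minimal sketch of that bookkeeping is given below; the component names and volumes are hypothetical and do not reproduce SPHY output.

```python
def runoff_shares(components_mm):
    """Return each runoff component as a percentage of total runoff."""
    total = sum(components_mm.values())
    if total <= 0:
        raise ValueError("total runoff must be positive")
    return {name: 100.0 * value / total for name, value in components_mm.items()}


# Hypothetical annual component volumes in mm (not taken from the study)
example_components = {"glacier": 45.0, "snow": 35.0, "baseflow": 15.0, "rain": 5.0}
example_shares = runoff_shares(example_components)
```

Shares computed this way always sum to 100%, so a dataset can show a plausible partitioning of runoff while its absolute meltwater volume is still too high or too low, which is why the combined volumes are also compared with OBS.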

Discussions
In this study, a Tri-approach framework is proposed to diagnose the potential issues in GPDs from multiple perspectives. This framework can identify the actual under- or over-estimation of GPDs on the basis of watershed water and energy balance, diagnose their possible causes, and provide directions for local correction. The approach was applied in UIB as a case study. It can investigate climatology, water-energy balance, and the rationality of simulated hydrology and glacier changes in a mountain glacierized watershed.

Ability to diagnose the problems in representing climatology
The statistical analysis component of the Tri-approach framework helps to investigate the performance of GPDs in representing the observed climatology. This component focuses on the monthly and seasonal distribution, inter-annual variations, and precipitation extremes in GPDs. A comprehensive diagnosis of how GPDs represent the observed climatology is useful for the temporal correction of GPDs and for analyzing the expected uncertainties in simulated glacio-hydrologic outputs. This statistical approach is common and has been applied in multiple previous studies to evaluate the performance of GPDs. However, the reliability of such statistical evaluation is questionable when the observed data are insufficient or of inferior quality due to the uneven distribution of meteorological stations, which is the case in high-elevation glacierized river basins. In the current study, the observed precipitation data were adjusted using the corrected precipitation in UIB. The adjusted precipitation is 73% greater than the uncorrected precipitation.
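The inter-annual agreement reported as CC can be illustrated with a short sketch. It assumes that CC denotes the Pearson correlation coefficient, which the text does not state explicitly; the function name and the sample annual totals are hypothetical.

```python
import math


def pearson_cc(x, y):
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least two")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / (spread_x * spread_y)


# Hypothetical annual totals in mm: the gridded series is OBS scaled by 0.6
obs_annual = [500.0, 600.0, 550.0, 700.0]
gpd_annual = [0.6 * value for value in obs_annual]
cc_scaled = pearson_cc(obs_annual, gpd_annual)
```

The example shows why a high CC alone is not sufficient: a dataset that is uniformly 40% too low still correlates perfectly with the observations, so the pattern and the magnitude have to be assessed separately.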
The GPDs differ in their spatiotemporal resolutions, time spans covered, and underlying methodologies; therefore, in this study, GPDs based on various sources and methods were selected for analysis (i.e., reanalysis: CFSR; interpolated observations: APHRO; a combination of reanalysis and interpolated observations: PGMFD; satellite observations: TRMM; and downscaled model output: HAR).
The spatial and temporal distributions of mean annual GPD precipitation show diverse differences in magnitudes and patterns when compared with OBS ([-8). APHRO represented the patterns of interannual variation better (CC > 0.6) ([); however, it was highly underestimated (-40%) ([). The bimodal annual cycle of OBS precipitation ([) indicates a multi-sourced weather system in UIB, which is influenced by westerlies during the winter and spring seasons, whereas the monsoon affects the distribution during the summer season. Previous studies have explained the bimodal weather system in this region (Dahri et al., 2016; Hasson et al., 2017). UIB receives >60% of total annual precipitation during the winter and spring seasons ([b), which is in good agreement with the arguments of Hewitt (2007) Dahri et al. (2016). The reanalysis datasets account for both solid and liquid precipitation more consistently, which may explain their overestimation in high mountain glacierized regions (Blacutt et al., 2015), whereas observation-interpolated and satellite-based datasets have difficulties in detecting snowfall (Rasmussen et al., 2012; Wang et al., 2013). It is important to note that continuous biases and changes in both models and observing systems can introduce spurious trends and variability into reanalysis outputs (Bengtsson, 2004); therefore, trends and variability from reanalysis datasets should be treated carefully in hydrological applications.
Although GPDs have captured the monthly distribution patterns of OBS precipitation ([), these datasets show significant differences in their monthly, seasonal, and annual magnitudes ([). Moreover, the large under- and over-estimations for the gridded datasets over the elevational profile may have been caused by the dynamic climatic system (Pang et al., 2014), precipitation dependency on altitude (Wortmann et al., 2018), and the approaches used to generate these datasets (Harris et al., 2014; Huffman et al., 2010; Saha et al., 2010). The APHRO dataset represents the OBS climatology better because observed data are used in its generation; however, precipitation at ungauged elevation ranges is not extrapolated in the APHRO dataset (Ji et al., 2020; Yatagai et al., 2012), which would affect its application in mountain glacierized catchments.
It must be noted that these findings over the selected glacierized mountain sub-regions may allow a general performance assessment of the presented datasets for glacierized alpine regions. It is essential to highlight that most datasets are not independent of each other, as most of them include the same station observations directly or assimilate them in some way. This is, however, a common problem in comparison studies that cannot be avoided.

Ability to diagnose the problems in representing the water-energy balance
The introduction of the Truc-Budyko theory into the Tri-approach framework is useful to identify the actual under- or over-estimation of GPDs on the basis of watershed water and energy balance, diagnose their possible causes, and provide directions for local correction.
The physical diagnosis ensures that a GPD represents a plausible water-energy balance in a glacierized catchment. The water limit helps to identify an additional water term in the water balance or a missing water input. This gives directions for correcting underestimated GPDs based on a rational water and mass balance in a glacierized catchment. For example, APHRO, TRMM, and PGMFD are outside the water limit ([), and the missing precipitation in these datasets may be due to undercatch and undetected solid precipitation at higher elevations (Rasmussen et al., 2012). If such datasets are used in hydrological simulations, they would result in a highly negative mass balance in the long term and simulate implausible conditions in the glacierized catchment. These datasets are insufficient to close a realistic water-energy balance in a glacierized watershed, and the hydrology simulated using them is underestimated in glacierized catchments ([d). On the other hand, CFSR and HAR are mostly outside the energy limit ([); their overestimated water input will result in higher storage ([) and may simulate implausible positive mass balance conditions in glacio-hydrological modeling. However, the higher inter-annual variations in these GPDs ([b) make the inter-annual variations in simulated mass balance very high ([e). The under- or over-estimated GPDs can be corrected based on the rational output of streamflow and glacier changes in a glacierized catchment.
The physical diagnosis of GPDs in UIB indicates that GPDs may not reproduce the true water balance, and most of them might be unsuitable for hydrological applications in such glacierized catchments. Similar concerns were highlighted by Dahri et al. (2016), who evaluated GPDs and concluded that they are not suitable for forcing hydrological models in UIB.
The runoff peak lags about four months behind the precipitation peak in the Himalaya, Hindukush, and Karakorum sub-regions ([). This distribution of runoff and precipitation highlights the higher solid precipitation in winter and the dominance of meltwater contributions during the summer season in UIB. Similar arguments have been made by several researchers in the region (e.g., Hewitt, 2007; Khan et al., 2015; Lutz et al., 2016; Mukhopadhyay and Khan, 2014). On average, the annual runoff is higher than precipitation for APHRO, TRMM, and PGMFD ([), which indicates that these datasets would cause a negative change in glacier storage, as an additional meltwater term may be needed to close the water balance in the region, which is the case in UIB.
The physical diagnosis component of the Tri-approach also helps in detecting the possible effects of meltwater on changes in glacier storage based on the water-energy and mass balance. The problems in GPDs can be diagnosed by analyzing the physical factors involved in these effects. The physical diagnosis identifies the "True", "Gaining", and "Leaky" catchments based on the water input into the catchment. UIB was identified as a "Gaining" catchment based on APHRO, PGMFD, and TRMM, whereas it was "Leaky" based on CFSR and HAR ([). There are three possible reasons for a "Gaining" catchment: (a) additional water contribution from glacier melt (characterized by a negative change in glacier storage in [), which is the case in UIB; (b) underestimated precipitation (Valéry et al., 2010), which is true for APHRO, PGMFD, and TRMM in UIB ([); and (c) errors in runoff measurements (Andréassian and Perrin, 2012).
Underestimation of precipitation and an additional water term in the water balance are evident from the conclusions of previous studies in UIB (Dahri et al., 2016; Immerzeel et al., 2015; Lutz et al., 2014). Similarly, Rasmussen et al. (2012) found that snow undercatch might be as high as 20−50% in high-altitude mountainous areas, which is the case in UIB, especially in the Hindukush and Karakorum sub-regions. Possible runoff measurement errors have been flagged in studies in the region (Mukhopadhyay and Khan, 2014).
On the other hand, for the "Leaky" catchments, there might be four reasons in addition to discharge measurement errors (underestimation): (a) errors in the estimation of ETp (underestimation; [), (b) overestimated precipitation, which is the case for CFSR and HAR ([), (c) higher infiltration or local aquifer recharge, or (d) underground water flow towards another aquifer (Andréassian and Perrin, 2012). The impact of inter-catchment groundwater flow on the behavior of "Leaky" catchments has been analyzed in France by Le Moine et al. (2007), who suggest that underground water affects the overall water balance. However, in UIB, the most probable reason for "Leaky" catchment behavior under CFSR and HAR is the overestimation of precipitation in these datasets ([; [; [). The bed is rocky and vegetation is very sparse in UIB; such conditions strengthen the conclusion that CFSR and HAR precipitation is overestimated. Several researchers (Blacutt et al., 2015; Liu et al., 2018; Silva et al., 2011) have provided evidence for overestimated CFSR precipitation in different parts of the world.
The actual over- and under-estimations identified by the physical diagnosis provide the basis for starting the correction of GPDs. The violations of the water and energy limits by GPDs give an idea of the impacts of meltwater contributions and glacier mass storage in the catchment and thus help in adopting the proper correction technique. hydrological applications can be identified using the Tri-approach framework.
The quantitative outputs of the Tri-approach framework ([; [) are useful for the correction of GPDs and for multi-parameter calibration during glacio-hydrological simulations. The practical simulation, in combination with the statistical and physical diagnoses, helps in investigating the inter- and intra-annual variations based on a rational balance between hydrology and glacier changes ([). The Tri-approach framework helps to avoid the risk of equifinality, as it provides directions for the multi-parameter calibration based on the quantitative diagnosis of water input and output in a glacierized watershed ([). The underestimated precipitation may be compensated by other water balance components, e.g., evapotranspiration, snow, or glacier melt (Ragettli and Pellicciotti, 2012; Schaefli, 2005; Shafeeque et al., 2019). A false calibration parameter set would enhance the simulated meltwater to reduce the bias between simulated and observed runoff (Ragettli and Pellicciotti, 2012; Wang et al., 2018). For example, if the SPHY model forced by the APHRO, PGMFD, and TRMM precipitation datasets ([) were calibrated to enhance the simulated runoff, a more negative mass balance would result. However, keeping the average mass balance close to the observed mass balance data ([) eliminated the risk of equifinality and avoided the glacier compensation effect. At the same time, it was confirmed that the combined contributions of snow and glacier melt during the spring and summer seasons were higher (69% and 53%) for overestimated GPDs (CFSR and HAR, respectively) and lower (-44%, -27%, and -35%) for underestimated GPDs (APHRO, PGMFD, and TRMM, respectively) than those of OBS ([). If an underestimated precipitation dataset generates sufficient runoff in a glacio-hydrological simulation, it is certain that the glaciers are compensating for that amount of runoff. Therefore, the simulated results in such a situation are questionable.
Based on the quantitative results of the current study, it is concluded that GPDs generally cannot reproduce the rational output of glacier changes and hydrology in glacierized catchments. Therefore, it is recommended to correct the GPDs based on the local mass and water balance in glacierized catchments before any hydrological application.

Uncertainties
The application of the Tri-approach in diagnosing GPDs for glacio-hydrological simulations in mountain regions can be influenced by uncertainties. A denser observation station network is required, especially at higher elevations, to reduce the uncertainties in the observed datasets for hydrological simulations (Li et al., 2020; Liu et al., 2019). This is crucial for the southeastern parts of the UIB, which are characterized by low station density (Figure 1). Similarly, the observed runoff might also be affected by random and measurement errors. Inconsistencies in the measurements and overlapping measuring periods for different hydrological stations amplify the overall uncertainty in the data. Besides, runoff and precipitation peak at different times due to multi-sourced precipitation systems (Dahri et al., 2016), with a maximum proportion of precipitation in the winter and spring seasons (Hewitt, 2007) in contrast to maximum runoff in the summer season (Mukhopadhyay and Khan, 2014). The availability of glacier data can also be a limitation and cause a certain amount of uncertainty in the Tri-approach results. The limitations associated with observed datasets are an unavoidable common issue in glacierized regions.
The use of a non-dimensional hydrological representation is suitable and advantageous because physical data such as runoff, evapotranspiration, and precipitation are frequently measured or calculated in any watershed (Andréassian and Perrin, 2012). However, evapotranspiration data are mostly unavailable since more parameters are required to calculate it. The calculated ETp values are slightly lower (on average -11%) than the Global-ET0 data, which are based on the Penman-Monteith method. This is in line with Zhou et al. (2020), who concluded that ETp calculated by the Hargreaves method is lower than that calculated by the Penman-Monteith method. The method applied to calculate ETp may affect the overall representation of the hydrological alternative of the Truc-Budyko plot. It has been argued that the energy limit may also depend to some extent on the chosen ETp formula (Coron et al., 2015); the formula used in the current study ignores most climatic parameters and uses only temperature data. This may explain the slightly negative mass balance for CFSR in the Himalaya sub-region and for average OBS in UIB ([), even though these were in the feasible domain and represent "True" catchments ([). Besides, the estimated ETa ([) might also affect the final changes in glacier storage based on the physical diagnosis.
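To make the temperature-only character of the Hargreaves method concrete, the sketch below implements the widely used Hargreaves-Samani form as given in FAO Irrigation and Drainage Paper 56. Whether the study used exactly this variant is an assumption; the function and argument names are hypothetical, and the extraterrestrial radiation is taken as an input already expressed in mm of equivalent evaporation per day.

```python
import math


def hargreaves_et0(tmax_c, tmin_c, ra_mm_per_day):
    """Daily reference evapotranspiration (mm/day), Hargreaves-Samani form.

    ra_mm_per_day is extraterrestrial radiation expressed as equivalent
    evaporation; it depends only on latitude and day of year.
    """
    if tmax_c < tmin_c:
        raise ValueError("tmax_c must not be lower than tmin_c")
    tmean_c = (tmax_c + tmin_c) / 2.0
    # Only air temperature enters the formula besides the radiation term
    return 0.0023 * ra_mm_per_day * (tmean_c + 17.8) * math.sqrt(tmax_c - tmin_c)


# Hypothetical inputs: Tmax 25 C, Tmin 9 C, Ra equivalent to 10 mm/day
et0_example = hargreaves_et0(25.0, 9.0, 10.0)
```

Since humidity, wind speed, and measured radiation do not appear, any bias in these variables relative to Penman-Monteith shifts the position of the energy limit in the plot.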
In the SPHY model, glaciers are considered as melting surfaces that cover a grid cell partly or entirely. Moreover, a grid cell can contain parts of several different glaciers, and the model treats them as a single unit within that grid cell. Although complex glacier processes cannot be resolved explicitly using the SPHY model, melting surfaces at a reasonable resolution serve the purpose of this study.

Implications of Tri-approach
The Tri-approach is useful for selecting the most suitable GPD for glacio-hydrological applications in any glacierized catchment. In glacierized catchments, it is commonly understood that GPDs need correction before any hydrological application. However, a preliminary evaluation of these GPDs, based not only on climatology but also on water-energy and mass balance, is mandatory before any correction. The Tri-approach can provide basic directions for the correction factors based on climatology, plausible water-energy balance, and glacier changes simultaneously, and thus assist in adopting the proper local correction of GPDs. Several researchers have corrected GPDs in different glacierized catchments based on climatology (Dahri et al., 2016), conceptual water balance (Khan and Koch, 2018), and vertical gradients and mass balance distribution (Wortmann et al., 2018). It is important to note that correction may also induce uncertainty in simulated results, depending on the technique applied. The corrected GPDs can also be verified using the Tri-approach framework. Meanwhile, if there is no option for correcting GPDs, then one must choose the dataset that best represents the water-energy and mass balance in glacierized mountain regions. Besides, the Tri-approach detects the key limitations of GPDs and thus helps to identify the expected uncertainties in the outcomes of glacio-hydrological simulations, for example, under- or over-estimations in simulated hydrology, variations in the intra- and inter-annual distribution of streamflow and mass balance, and deviations from a concurrent rational output of streamflow and glacier change, among others. The analysis of change in glacier storage is critically important because precipitation patterns can also be influenced by changes in glaciers (Ren et al., 2020). The understanding developed using the Tri-approach framework can help data generators and algorithm developers to improve their work, keeping in mind the application demands of real-time scenarios.

Conclusions
The Tri-approach framework evaluates the GPDs statistically, physically, and practically. Application in the UIB confirms that it is plausible in a glacierized watershed where rain gauge data are scarce. The approach can investigate climatology, water-energy balance, change in glacier storage, and the rationality of simulated hydrology and glacier changes in mountain glacierized watersheds.
The statistical validation identifies the potential problems in the temporal distribution of the datasets, e.g., APHRO represents the monthly and seasonal distributions and interannual variations (CC > 0.6) but is underestimated. On the other hand, CFSR and HAR are overestimated and do not represent the inter-annual variations in UIB (CC ≤ 0.3). Reanalysis-based GPDs are generally overestimated (77%−106%), whereas observation- and satellite-based GPDs are underestimated (-41% to -24%). The underestimated GPDs would result in underestimated hydrology when applied in hydrological simulations. The wet and dry spells were generally longer for overestimated and underestimated GPDs, respectively.
The physical diagnosis based on the Truc-Budyko theory identifies that the APHRO, TRMM, and PGMFD datasets make the catchments "Gaining", indicating an additional water term in the water balance due to glacier melting, which results in a negative change in glacier storage (-0.42 to -0.34 m w.e. yr⁻¹). On the other hand, CFSR and HAR make the catchments "Leaky", highlighting a positive change in glacier storage (0.37 and 0.21 m w.e. yr⁻¹, respectively). The actual under- and over-estimation based on the physical diagnosis provides the basic directions for local correction of GPDs in glacierized mountain regions. The "Gaining" catchments (characterized by underestimated precipitation) need more water input (higher precipitation) to sustain the water and mass balance concurrently, whereas the "Leaky" catchments (characterized by overestimated precipitation) need less water input to reproduce a plausible water-energy and mass balance in a glacierized catchment simultaneously.
The glacio-hydrological simulation confirms the findings of the statistical and physical diagnoses that GPDs are generally unable to represent the actual water-energy and mass balance in glacierized catchments. The selected GPDs generally cannot fulfill the requirements of a rational output of streamflow and glacier mass balance concurrently in glacierized catchments. The simulation provides quantitative directions, based on under- and over-estimations in simulated streamflow and glacier mass balance, for the local correction of GPDs for glacio-hydrological simulations in mountain regions.

Table 1. Study area details. Locations of hydrological and meteorological stations are depicted in Figure 1. The adjusted precipitation is presented in [.