Comprehensive Global Assessment of 24 Gridded Precipitation Datasets Across 18 428 Catchments Using Hydrological Modeling

Abbas, Ather; Yang, Yuan; Pan, Ming; Tramblay, Yves; Shen, Chaopeng; Ji, Haoyu; Gebrechorkos, Solomon H.; Pappenberger, Florian; Pyo, JongCheol; Feng, Dapeng; Huffman, George; Nguyen, Phu; Massari, Christian; Brocca, Luca; Tan, Jackson; Beck, Hylke E.

doi:10.5194/hess-30-3399-2026

Articles | Volume 30, issue 11

Research article

03 Jun 2026

Abstract

Numerous gridded precipitation (P) datasets have been developed to address a variety of needs and challenges. However, selecting the most suitable and reliable dataset remains difficult for users. We conducted the most comprehensive global evaluation to date of gridded (sub-)daily P datasets using hydrological modeling. A total of 24 datasets – derived from satellite, (re)analysis, gauge sources, or combinations thereof – were assessed. To evaluate their performance, we calibrated the conceptual hydrological model HBV against observed daily streamflow for 18 428 catchments (each <10 000 km²) worldwide, using each P dataset as input. The Kling-Gupta Efficiency (KGE) was used as the performance metric, with the calibration score serving as a proxy for P dataset performance. Overall, Multi-Source Weighted-Ensemble Precipitation (MSWEP) V2.8 demonstrated the best performance (median KGE of 0.78), highlighting the value of merging P estimates from diverse data sources and applying daily gauge corrections. Among the purely satellite-based P datasets, the soil moisture- and microwave-based Global Precipitation Measurement plus Soil Moisture to RAIN (GPM + SM2RAIN) dataset performed best (median KGE of 0.64). The Global Data Assimilation System (GDAS) analysis ranked highest among the (re)analyses (median KGE of 0.72), slightly outperforming the widely used European Centre for Medium-range Weather Forecasts ReAnalysis 5 (ERA5; median KGE of 0.71). Performance varied across Köppen-Geiger climate zones, with the highest scores in polar (E) regions (median KGE of 0.76 across datasets) and the lowest in arid (B) regions (median KGE of 0.53 across datasets). Spatial correlation analysis between catchment attributes and KGE scores identified aridity index, potential evaporation, and P occurrence as the strongest predictors of performance. Our assessment revealed significant regional differences in dataset performance and error characteristics, emphasizing the importance of careful dataset selection for water resource management, hazard assessment, agricultural planning, and environmental monitoring.

How to cite.

Abbas, A., Yang, Y., Pan, M., Tramblay, Y., Shen, C., Ji, H., Gebrechorkos, S. H., Pappenberger, F., Pyo, J., Feng, D., Huffman, G., Nguyen, P., Massari, C., Brocca, L., Tan, J., and Beck, H. E.: Comprehensive Global Assessment of 24 Gridded Precipitation Datasets Across 18 428 Catchments Using Hydrological Modeling, Hydrol. Earth Syst. Sci., 30, 3399–3423, https://doi.org/10.5194/hess-30-3399-2026, 2026.

Received: 30 Dec 2024 – Discussion started: 20 Jan 2025 – Revised: 18 Feb 2026 – Accepted: 04 Mar 2026 – Published: 03 Jun 2026

1 Introduction

Understanding the spatio-temporal distribution of precipitation (P) is crucial for a wide range of applications, including water resources assessment, flood forecasting, agricultural monitoring, and disease tracking (Dresel et al., 2018; Liang and Gornish, 2019; McKinnon and Deser, 2021; Hinge et al., 2022; Dimitrova et al., 2022). However, P exhibits high variability across space and time, making it difficult to estimate, particularly in regions with complex topography, convection-driven P, or snow-dominated climates (Herold et al., 2016; Prein and Gobiet, 2017; Sharma et al., 2020 b; Li et al., 2020; Tarek et al., 2021). P estimates can be derived from satellites, models, and rain gauges, but each data source is subject to limitations. Satellite retrievals are hindered by surface snow and ice contamination (Cao et al., 2018; Chen et al., 2020), struggle to capture shallow orographic P (Yamamoto et al., 2017; Adhikari and Behrangi, 2022), and face challenges in detecting snowfall (You et al., 2021; Jääskeläinen et al., 2024; Girotto et al., 2024 b). Reanalyses (e.g., European Centre for Medium-range Weather Forecasts ReAnalysis 5 – ERA5; Hersbach et al., 2020) rely on uncertain parameterizations and often lack sufficient spatial resolution to adequately capture orographic effects (Skamarock, 2004; Ménégoz et al., 2013; Liu et al., 2018). Rain gauge networks are sparse and biased towards lower elevations (Schneider et al., 2014; Kidd et al., 2017; Ehsani and Behrangi, 2022) and gauges can severely underestimate snowfall due to wind-induced under-catch (Groisman and Legates, 1994; Sevruk et al., 2009; Rasmussen et al., 2012; Girotto et al., 2024 a).

In recent decades, numerous gridded P datasets have been developed based on these data sources and combinations thereof. Each dataset has different design objectives, spatio-temporal resolution, coverage, algorithms, and latency (see Table 1 for an overview of quasi- and fully-global datasets). A plethora of studies have evaluated these datasets (see, e.g., reviews by Gebremichael, 2010; Maggioni et al., 2016, and Sun et al., 2018). However, the large majority of these studies use rain gauge observations as reference, which has limitations: (i) rain gauge observations are unavailable in many regions (Kidd et al., 2017); (ii) differences in scale between point-based rain gauges and grid-based P datasets (Ensor and Robeson, 2008; Yates et al., 2006) can skew results; (iii) time discrepancies between daily accumulations of gauges and satellite and (re)analysis datasets (Yang et al., 2020; Beck et al., 2019 b) can yield misleading daily evaluation results; (iv) the systematic P underestimation by rain gauges in snow-dominated and mountainous regions (Groisman and Legates, 1994; Sevruk et al., 2009; Rasmussen et al., 2012) can unfairly penalize P datasets in these regions; and (v) using rain gauges already incorporated into the P datasets for validation results in misleading conclusions.

An alternative approach to evaluate P datasets is to use hydrological modeling, wherein streamflow simulations driven by different P datasets are compared to streamflow observations. The degree of correspondence between simulated and observed streamflow serves as a proxy for how accurately the P dataset captures the intensity and timing of P events. This approach avoids the aforementioned limitations by providing a direct, real-world measure of performance that reflects the dataset's ability to capture P dynamics in a hydrological context (Camici et al., 2018). Several studies have successfully employed this approach to evaluate various P datasets (e.g., Voisin et al., 2008; Su et al., 2008; Bitew et al., 2012; Tang et al., 2016; Beck et al., 2017 c; Lussana et al., 2018; Mazzoleni et al., 2019; Pradhan and Indu, 2021; Xiang et al., 2021; Gu et al., 2023; Gebrechorkos et al., 2023). However, many studies are limited in scope by (i) focusing on specific regions or subcontinents, or using streamflow data from relatively few catchments, thus restricting the generalizability of their findings; (ii) analyzing only a small subset of available P datasets, often excluding (re)analysis-based datasets; (iii) focusing on a monthly rather than daily time scale, which can obscure important short-term variability, such as extreme rainfall events or floods. Additionally, several studies failed to re-calibrate the hydrological model for each P dataset, including the recent global assessment by Gebrechorkos et al. (2023), which could result in biased conclusions.

In this study, we present the most comprehensive evaluation to date of gridded (sub-)daily (quasi-)global P datasets, aiming to identify their strengths and limitations across diverse geographical and climatological settings, and to inform their suitability for hydrological applications. We leverage an unparalleled database of streamflow observations from 18 428 catchments worldwide, spanning all climate zones and latitudes, to ensure broad generalizability of our results. Moreover, we evaluate an extensive collection of 24 P datasets, including new datasets like the microwave-based IMERG V7 (Huffman et al., 2019 b), the infrared-based PDIR-Now (Nguyen et al., 2020), and the reanalysis JRA-3Q (Kosaka et al., 2024), none of which has yet been comprehensively assessed at the global scale. To provide a fair and balanced assessment, we re-calibrate the hydrological model for each P dataset.

2 Data and Methods

2.1 Gridded P Datasets

Table 1 lists the 24 gridded P datasets included in our assessment. These datasets were selected based on their global or quasi-global coverage, widespread use in hydrological applications, and availability of daily or sub-daily data. Regional datasets (e.g., Asian Precipitation - Highly-Resolved Observational Data Integration Towards Evaluation – APHRODITE, Yatagai et al., 2012, and North American Land Data Assimilation System – NLDAS, Xia et al., 2012), while valuable, were excluded to maintain consistency across diverse geographic areas. The selected datasets are tailored for specific purposes: some, like IMERG-Early V7 and PDIR-Now, are designed for short-latency applications such as near-real-time monitoring of heavy P events, while others with longer latency, such as CHIRPS V2.0 and IMERG-Final V7, are more suitable for comprehensive, long-term climate and hydrological analyses.

The 24 P datasets are grouped into six categories based on their input data sources (see Table 1 for full dataset names and references): (i) Satellite-only (S): IMERG-Early V7, IMERG-Late V6, IMERG-Late V7, PERSIANN-CCS, PDIR-Now, GSMaP-std V7, GSMaP-std V8, SM2RAIN-ASCAT, SM2RAIN-CCI, GPM + SM2RAIN, CMORPH-CDR, and CMORPH-RT; (ii) Reanalysis- or Analysis-only (R/A): ERA5, GDAS, and JRA-3Q; (iii) Gauge-only (G): CPC Unified and REGEN V1; (iv) Satellite and Gauge (S + G): IMERG-Final V7, GPCP V3.2, and PERSIANN-CCS-CDR; (v) Satellite, Reanalysis, and Gauge (S + R + G): CHIRPS V2.0, MSWEP V2.8; and (vi) Satellite and Reanalysis (S + R): CHIRP, MSWEP-ng V2.8. Version numbers are consistently indicated throughout the manuscript to ensure transparency and reproducibility.

Table 1. Overview of the (sub-)daily (quasi-)global gridded P datasets evaluated in this study. Definition of abbreviations: S: satellite, G: gauge, R: Reanalysis, A: Analysis, and NRT: near real time. References cited in the table: Funk et al. (2015); Xie et al. (2019); Xie et al. (2017); Chen et al. (2008); Hersbach et al. (2020); NCEP (2024); Huffman et al. (2023); Huffman et al. (2019 a); Kubota et al. (2020); Kubota et al. (2024); Kosaka et al. (2024); Beck et al. (2019 b); Hong et al. (2004); Sadeghi et al. (2021); Nguyen et al. (2020); Contractor et al. (2020); Brocca et al. (2019); Ciabatta et al. (2018 a); Massari et al. (2020).

2.2 Streamflow Observations and Catchment Selection

We utilized a comprehensive global database of daily streamflow observations and catchment boundaries compiled from 29 national and international datasets. Appendix A provides a detailed list of the data sources, along with corresponding references or websites. Initially, the database contained 43 627 stations. However, as many stations appeared in multiple data sources, we performed a duplication check and discarded stations where both the station location and the corresponding catchment centroid were within 5 km of those of another station. In case of duplication, regional data sources were prioritized over international ones (e.g., CAMELS datasets were preferred over GRDC). After this process, the number of unique stations was reduced to 35 254.
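
The duplication check described above can be illustrated with a short Python sketch. It is not the code used in the study: the record fields (`gauge`, `centroid`, `regional`), the great-circle distance formula, the inclusive 5 km comparison, and the simple pairwise loop are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_duplicate(a, b, max_km=5.0):
    """True if BOTH the gauge locations and the catchment centroids are close."""
    return (haversine_km(*a["gauge"], *b["gauge"]) <= max_km
            and haversine_km(*a["centroid"], *b["centroid"]) <= max_km)


def deduplicate(stations):
    """Keep one station per duplicate group, preferring regional sources."""
    # Regional sources are visited first, so they are the ones retained.
    ordered = sorted(stations, key=lambda s: 0 if s["regional"] else 1)
    kept = []
    for s in ordered:
        if not any(is_duplicate(s, k) for k in kept):
            kept.append(s)
    return kept


# Hypothetical records: two entries for the same gauge plus one distinct station.
stations = [
    {"id": "GRDC_1", "regional": False, "gauge": (50.00, 8.00), "centroid": (50.10, 8.10)},
    {"id": "CAMELS_1", "regional": True, "gauge": (50.01, 8.01), "centroid": (50.11, 8.11)},
    {"id": "GRDC_2", "regional": False, "gauge": (50.00, 8.00), "centroid": (51.00, 9.00)},
]
unique = deduplicate(stations)
```

The pairwise comparison scales quadratically with the number of stations; for a database of this size a spatial index would likely be preferable.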

To ensure the suitability of the catchments for the present analysis, we applied the following inclusion criteria:

Catchment areas were limited to <10 000 km² to minimize the influence of channel routing, which can become significant at the daily time scale in larger catchments (Gericke and Smithers, 2014). Moreover, since we use catchment-mean P time series to drive the hydrological model, larger catchments are prone to greater spatial averaging, leading to a less realistic representation of P patterns.
The total streamflow record had to be >3 years, not necessarily consecutive. This threshold was chosen due to the short records of GDAS and CMORPH-RT. We realize that such a short record may introduce some random variability in the KGE scores of these datasets, particularly in arid regions where P events are less frequent. However, this random variability will likely be averaged out due to the large number of catchments included in our assessment.
The number of days with appreciable runoff (>5 mm d⁻¹) had to exceed 10, and these days could not be consecutive (i.e., they should not be part of a single continuous event). This ensures that the calibration is based on a sufficient number of distinct runoff events.
The mean annual runoff had to be ≥5 and <5000 mm yr⁻¹, to filter out catchments with erroneous streamflow and/or catchment boundary data.
The reservoir influence (defined as the ratio of total reservoir capacity to mean cumulative annual streamflow) had to be <0.1, as Hydrologiska Byråns Vattenbalansavdelning (HBV), the hydrological model used in this study, does not explicitly simulate reservoirs. To determine the total reservoir capacity, we used the Global Reservoir and Dam (GRanD) dataset (V1.3; Lehner et al., 2011).

After applying these criteria, 18 428 catchments remained. The 2.5th, 10th, 50th, 90th and 97.5th percentiles of the catchment areas are 23, 55, 213, 2688 and 6165 km², respectively (Fig. 1).
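
The inclusion criteria can be expressed as a single filter function. The following is a minimal sketch under stated assumptions, not the study's implementation: the dictionary keys are hypothetical, gaps are marked with `None`, and the runoff-event criterion is read as "more than 10 high-runoff days that belong to more than one event".

```python
def count_runoff_events(runoff_mm_d, threshold=5.0):
    """Count distinct events, i.e. runs of consecutive days above `threshold`."""
    events, in_event = 0, False
    for q in runoff_mm_d:
        above = q is not None and q > threshold
        if above and not in_event:
            events += 1
        in_event = above
    return events


def passes_criteria(c):
    """Apply the five inclusion criteria to one catchment record (a dict)."""
    valid = [q for q in c["runoff_mm_d"] if q is not None]  # None marks a gap
    record_years = len(valid) / 365.25
    high_days = sum(1 for q in valid if q > 5.0)
    events = count_runoff_events(c["runoff_mm_d"])
    area, mar = c["area_km2"], c["mean_annual_runoff_mm"]
    return (
        0 < area < 10_000                      # catchment area
        and record_years > 3                   # record length, gaps allowed
        and high_days > 10 and events > 1      # enough distinct runoff events
        and 5 <= mar < 5000                    # plausible mean annual runoff
        # mm * km2 * 1e3 converts annual runoff depth to a volume in m3.
        and c["reservoir_capacity_m3"] / (mar * area * 1e3) < 0.1
    )
```

Because the conditions are chained with `and`, the reservoir ratio is only evaluated once the area and mean annual runoff are known to be positive, which avoids a division by zero.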

Figure 1. Locations of the 35 254 gauges with daily streamflow data that passed the duplication checks, used to evaluate the gridded P datasets. Each data point represents the centroid of a catchment. The colors indicate the dominant major Köppen-Geiger climate class, based on the 1 km resolution map for 1991–2020 from Beck et al. (2023). For more information on the streamflow data sources, refer to Appendix A.

2.3 Hydrological Modeling

The performance of the gridded P datasets was assessed using hydrological modeling for the 18 428 catchments that passed the suitability checks. For each catchment, the HBV conceptual hydrological model (Bergström, 1992; Seibert and Vis, 2012) was calibrated against daily streamflow observations using time series from each P dataset. The HBV model was selected for its versatility, computational efficiency, and numerous successful applications (see review by Seibert and Bergström, 2022). The model incorporates two groundwater stores, one unsaturated-zone store, and a triangular weighting function to simulate channel routing delays. Table 2 provides the model parameters and their calibration ranges. An additional parameter, PCORR, was introduced to further adjust for systematic P biases, which are generally easier to mitigate and should, therefore, not disproportionately penalize the datasets. Note that PCORR and SFCF are applied simultaneously: SFCF adjusts snowfall for gauge undercatch, while PCORR scales total P. Snowfall is therefore affected by both.
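
To make the joint effect of PCORR and SFCF concrete, the sketch below partitions one day of precipitation into rain and snow. It is illustrative only: the hard temperature threshold `tt_c` and the function name are assumptions, and the actual HBV implementation used in the study may partition rain and snow differently.

```python
def partition_precip(p_mm, t_air_c, pcorr, sfcf, tt_c=0.0):
    """Return (rain, snow) in mm for one day after bias scaling."""
    p_adj = pcorr * p_mm              # PCORR scales total precipitation
    if t_air_c < tt_c:                # assumed hard rain/snow threshold
        return 0.0, sfcf * p_adj      # snowfall is scaled by PCORR and SFCF
    return p_adj, 0.0                 # rainfall is scaled by PCORR only


rain, snow = partition_precip(p_mm=10.0, t_air_c=-2.0, pcorr=1.2, sfcf=1.5)
```

With these example values, the 10 mm of input becomes snowfall scaled by the product of both factors, whereas on a warm day only PCORR would apply.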

Table 2. HBV model parameter descriptions and calibration ranges.

The model requires daily time series of P, potential evaporation, and air temperature as inputs. We used catchment-mean daily P time series from the gridded datasets listed in Table 1. Daily potential evaporation was estimated using the Penman-Monteith equation (Penman, 1948; Monteith, 1965), which requires daily time series of air temperature, downward shortwave and longwave radiation, relative humidity, and wind speed as input. Catchment-mean daily time series of these variables were sourced from the Multi-Source Weather (MSWX) dataset (Beck et al., 2022). MSWX improves on ERA5 by providing bias-adjusted fields at 0.1° resolution. The finer grid better captures mountain gradients that govern snowfall and snowmelt.

2.4 Calibration Procedure

The 15 model parameters were calibrated for each catchment and P dataset over the period where both observed streamflow and P data were available. Model initialization was done by running the model with 10 years of prior P data, if available. If 10 years of prior P data were not available, the model was run multiple times using the available P data until a total of more than 10 years was accumulated. Furthermore, the first 365 d of each simulation were excluded when calculating model performance. We used a (μ+λ) evolutionary algorithm, which is a population-based optimization method that iteratively evolves solutions through selection, crossover, and mutation to maximize the Kling-Gupta Efficiency (KGE) objective. The algorithm was implemented using the Distributed Evolutionary Algorithms in Python (DEAP) library (version 1.4; Ashlock, 2010; Fortin et al., 2012), with a population size (μ) of 20 and an offspring pool size (λ) of 48. Crossover was applied with a probability of 90 %, and mutation was applied with a probability of 10 % using a Gaussian-based mutation operator. To ensure convergence, the optimization process was terminated if the best KGE value did not improve by more than 0.01 for five consecutive generations after a minimum of 25 generations.
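
The structure of such a (μ+λ) search, including the stopping rule, can be sketched in plain Python. This is not the DEAP-based implementation used in the study: the uniform crossover, the mutation scale, truncation selection, the `max_gen` safety cap, and the toy objective standing in for an HBV run are all assumptions, and the stopping rule is one possible reading of the description above.

```python
import math
import random


def mu_plus_lambda(objective, bounds, mu=20, lam=48, cxpb=0.9, mutpb=0.1,
                   min_gen=25, patience=5, tol=0.01, max_gen=200, seed=0):
    """Maximize `objective` over box-bounded parameters with a (mu + lambda) EA."""
    rng = random.Random(seed)

    def mutate(ind):
        # Gaussian mutation; sigma = 10 % of each parameter range (assumed),
        # with results clipped back into the calibration bounds.
        return [min(max(x + rng.gauss(0.0, 0.1 * (hi - lo)), lo), hi)
                for x, (lo, hi) in zip(ind, bounds)]

    def crossover(a, b):
        # Uniform crossover (assumed): each gene is taken from either parent.
        return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(mu)]
    scored = sorted(((objective(p), p) for p in pop),
                    key=lambda s: s[0], reverse=True)
    ref_best, stalled, gen = scored[0][0], 0, 0

    for gen in range(1, max_gen + 1):
        offspring = []
        for _ in range(lam):
            u = rng.random()
            if u < cxpb:                    # crossover of two parents
                (_, a), (_, b) = rng.sample(scored, 2)
                child = crossover(a, b)
            elif u < cxpb + mutpb:          # mutation of one parent
                child = mutate(rng.choice(scored)[1])
            else:                           # plain copy (unused if cxpb + mutpb = 1)
                child = list(rng.choice(scored)[1])
            offspring.append((objective(child), child))

        # (mu + lambda) selection: parents and offspring compete together.
        scored = sorted(scored + offspring, key=lambda s: s[0], reverse=True)[:mu]

        # Stop once the best score has not gained more than `tol` for
        # `patience` consecutive generations, but never before `min_gen`.
        if scored[0][0] - ref_best > tol:
            ref_best, stalled = scored[0][0], 0
        else:
            stalled += 1
        if gen >= min_gen and stalled >= patience:
            break

    best_score, best_params = scored[0]
    return best_params, best_score, gen


# Toy objective with a KGE-like shape (optimum of 1 at `target`); not HBV.
target = [1.3, 1.5, 4.0]
bounds = [(1.0, 2.0), (1.0, 2.0), (0.0, 10.0)]


def toy_objective(params):
    return 1.0 - math.sqrt(sum((p - t) ** 2 for p, t in zip(params, target)))


params, score, generations = mu_plus_lambda(toy_objective, bounds)
```

In a real calibration, `toy_objective` would be replaced by a function that runs the hydrological model with the candidate parameters and returns the KGE between simulated and observed streamflow.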

To assess the influence of systematic P bias correction using the PCORR and SFCF adjustment factors on model performance, we explored four calibration scenarios with varying bounds for the PCORR and SFCF parameters. In the first scenario, PCORR was allowed to vary between 0.0 and 2.0, providing full flexibility to adjust for both under- and overestimation of P, while SFCF was allowed to vary between 1.0 and 2.0. The second scenario limited PCORR to the range 0.5–2.0, while keeping the range of SFCF between 1.0 and 2.0. The third scenario fixed both PCORR and SFCF parameters at 1.0, effectively disabling P bias correction. The fourth scenario constrained both PCORR and SFCF to the range 1.0–2.0, allowing only upward correction. These scenarios enabled us to evaluate the sensitivity of model performance to P bias correction and assess the robustness of P dataset rankings under varying calibration constraints.
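
For reference, the four scenarios can be written as a small configuration table. The bounds are taken from the text above; the dictionary layout and helper names are hypothetical.

```python
# (lower, upper) calibration bounds per scenario; equal bounds mean "fixed".
SCENARIOS = {
    1: {"PCORR": (0.0, 2.0), "SFCF": (1.0, 2.0)},  # full PCORR flexibility
    2: {"PCORR": (0.5, 2.0), "SFCF": (1.0, 2.0)},  # limited downward correction
    3: {"PCORR": (1.0, 1.0), "SFCF": (1.0, 1.0)},  # bias correction disabled
    4: {"PCORR": (1.0, 2.0), "SFCF": (1.0, 2.0)},  # upward correction only
}


def free_parameters(scenario):
    """Names of the bias parameters that are actually calibrated."""
    return [name for name, (lo, hi) in SCENARIOS[scenario].items() if lo < hi]


def within_bounds(scenario, pcorr, sfcf):
    """Check a (PCORR, SFCF) pair against the bounds of one scenario."""
    values = {"PCORR": pcorr, "SFCF": sfcf}
    return all(lo <= values[name] <= hi
               for name, (lo, hi) in SCENARIOS[scenario].items())
```

For example, a PCORR of 0.7 is admissible in the first scenario but not in the fourth, which is what prevents the correction of P overestimation under the upward-only bounds.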

In line with several previous studies (e.g., Beck et al., 2017 c; Tarek et al., 2020; Arsenault et al., 2023), we opted not to split the record into separate calibration and validation periods. Instead, the full period of overlapping streamflow and P data was used to maximize the available information for parameter calibration and evaluation and to yield more reliable scores. This is particularly critical for P datasets with short records (GDAS and CMORPH-RT), where splitting the data would lead to scores based on only one or two years of data, which could cause instability in the performance scores (see Arsenault et al., 2018).

2.5 Performance Metric

To assess the performance of streamflow simulations forced by the different gridded P datasets, we calculated the Kling-Gupta Efficiency (KGE) scores between daily observed and simulated streamflow for each catchment. KGE, introduced by Gupta et al. (2009) and modified by Kling et al. (2012), is an objective performance metric that combines correlation, bias, and variability, and is defined as:

\begin{equation} \mathrm{KGE} = 1 - \sqrt{(r - 1)^{2} + (\gamma - 1)^{2} + (\beta - 1)^{2}}, \tag{1} \end{equation}

where r represents the Pearson correlation coefficient, γ is the ratio of the estimated to observed coefficients of variation, and β is the ratio of estimated to observed means:

\begin{equation} \gamma = \frac{\sigma_{s} / \mu_{s}}{\sigma_{o} / \mu_{o}}, \qquad \beta = \frac{\mu_{s}}{\mu_{o}}, \tag{2} \end{equation}

where μ and σ are the mean and standard deviation, respectively, and the subscripts “s” and “o” refer to the estimated and observed values. Optimal values for KGE, r, β, and γ are all 1. The r term is primarily sensitive to the timing and intensity of P extremes, while β captures systematic over- or underestimation of P. While the PCORR and SFCF parameters, which account for systematic biases, were calibrated, the β component of KGE reflects residual biases that may persist due to limitations in the P dataset's ability to accurately represent the spatial and temporal distribution of P intensities and magnitudes (Sun et al., 2018).
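
Equations (1) and (2) translate directly into code. The sketch below uses only the Python standard library; the function names and the use of `None` to mark missing values are assumptions, and the population standard deviation is used for both series (the choice cancels in γ and in r).

```python
import math
from statistics import fmean, pstdev


def kge_components(sim, obs):
    """Return (r, gamma, beta) for paired simulated and observed series."""
    pairs = [(s, o) for s, o in zip(sim, obs) if s is not None and o is not None]
    s = [p[0] for p in pairs]
    o = [p[1] for p in pairs]
    mu_s, mu_o = fmean(s), fmean(o)
    sd_s, sd_o = pstdev(s), pstdev(o)
    cov = fmean([(a - mu_s) * (b - mu_o) for a, b in pairs])
    r = cov / (sd_s * sd_o)                    # Pearson correlation
    gamma = (sd_s / mu_s) / (sd_o / mu_o)      # ratio of coefficients of variation
    beta = mu_s / mu_o                         # ratio of means
    return r, gamma, beta


def kge(sim, obs):
    """Modified Kling-Gupta Efficiency, Eq. (1)."""
    r, gamma, beta = kge_components(sim, obs)
    return 1.0 - math.sqrt((r - 1) ** 2 + (gamma - 1) ** 2 + (beta - 1) ** 2)


obs = [1.0, 2.0, 3.0, 4.0, 5.0]
perfect = kge(obs, obs)                        # identical series
doubled = kge([2.0 * q for q in obs], obs)     # r = 1, gamma = 1, beta = 2
```

Doubling the simulated flow leaves r and γ at 1 but sets β to 2, so the score drops from 1 to 0; this illustrates why a pure scaling bias is penalized only through β. The function is undefined for a constant series or a zero mean, which a production implementation would need to handle.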

Figure 2. Calibration KGE, correlation (r), long-term bias (β), and variability ratio (γ) scores achieved by the 24 P datasets. For a given catchment, calibration periods were not necessarily consistent across P datasets because their temporal coverage differs. The horizontal black and orange lines represent the mean and median, respectively. The box extends from the 25th to 75th percentiles, while the whiskers represent the 5th and 95th percentiles. The datasets are sorted according to their median KGE values. The colors represent the dataset type: S: Satellite; G: Gauge; R/A: Reanalysis or Analysis; S + R: Satellite and Reanalysis; S + R + G: Satellite, Reanalysis, and Gauge; and S + G: Satellite and Gauge.

3 Results and Discussion

3.1 Overall Model Performance

Figure 2 presents median calibration scores obtained by HBV forced with 24 gridded P datasets across 18 428 catchments. Figure 3 maps the best-performing P dataset in each catchment, restricted to the five datasets with the highest median KGE for clarity. The key findings are as follows:

Among the six main categories of P datasets – satellite, gauge, (re)analysis, satellite + reanalysis, satellite + reanalysis + gauge, and satellite + gauge – the satellite category performed the worst overall. This challenges the common misconception that satellite datasets are inherently superior due to their high spatial resolution and observational nature. However, (re)analyses are also “observation-based”, as they assimilate vast quantities of satellite, surface, radiosonde, and aircraft data. Furthermore, our results indicate that higher spatial resolution does not necessarily guarantee better performance, though this may be because P data are spatially averaged at the catchment scale. Nonetheless, the global underperformance of satellite P datasets should not be interpreted as a lack of value; for instance, they excel in tropical regions, as will be discussed in Sect. 3.2.
The multi-source MSWEP V2.8 dataset (Beck et al., 2019 b) attains the highest overall performance, with a median KGE of 0.78 (the spatial distribution of KGE values is provided in Fig. S1 in the Supplement). This dataset leverages the complementary strengths of gauge, satellite, and (re)analysis data to provide improved P estimates across the globe. Specifically, daily gauge observations enhance performance in regions with dense rain gauge networks, satellite retrievals enhance performance in convection-dominated regions and periods, while (re)analysis outputs improve performance in frontal-dominated regions and periods (Beck et al., 2019 b).
Among the purely satellite-based P datasets (CMORPH-CDR and -RT; IMERG-Early and -Late; GSMaP; PDIR-Now; PERSIANN-CCS; SM2RAIN-ASCAT and -CCI; and GPM + SM2RAIN), the GPM + SM2RAIN dataset (Massari et al., 2020) exhibited the best overall performance (median KGE of 0.64; Fig. 2). This dataset combines satellite soil moisture retrievals from ASCAT H113 H-SAF, SMOS L3 and SMAP L3 with microwave-based P retrievals from IMERG using the so-called optimal linear combination approach (Bishop and Abramowitz, 2013). IMERG-Late V7 (median KGE of 0.55) introduced several improvements over V6, notably a climatological rain gauge adjustment, leading to a slight performance boost compared to V6 (median KGE of 0.54), particularly in the tropical, cold, and polar catchments (Fig. S14). In contrast, GSMaP-std V8 (median KGE of 0.43) performed worse than its predecessor, GSMaP-std V7 (median KGE of 0.50).
Among the purely infrared-based P datasets (PERSIANN-CCS and PDIR-Now), PERSIANN-CCS (Hong et al., 2004; median KGE of 0.46) performed similarly to PDIR-Now (Nguyen et al., 2020; median KGE of 0.45). This is surprising as PDIR-Now features several improvements over PERSIANN-CCS, such as the dynamic adjustment of the relationship between cloud-top brightness temperatures and rain rates based on rainfall climatologies, as well as the use of a higher temperature threshold to enhance the detection of warm rain events (Nguyen et al., 2020). Further analysis revealed that PDIR-Now performs particularly poorly in the UK, Denmark, and Italy (Fig. S27), resulting in its overall poorer performance compared to PERSIANN-CCS.
Among the (re)analyses (ERA5, GDAS, and JRA-3Q), GDAS, based on V16.3 from 2022 of the Global Forecast System (GFS) model (http://www.ncei.noaa.gov/products/weather-climate-models/global-forecast, last access: 5 May 2026), performed best (median KGE of 0.72). The recently released reanalysis JRA-3Q, based on the Japan Meteorological Agency (JMA) operational system as of December 2018 (Kosaka et al., 2024), performed similarly to ERA5 (both yielding a median KGE of 0.71). ERA5 is based on Cycle 41r2 of the Integrated Forecasting System (IFS) model from 2016 (Hersbach et al., 2020). While ERA5 is widely regarded as the most reliable reanalysis overall, these results suggest that JRA-3Q is a viable alternative for hydrological modeling. GDAS has a much shorter record than ERA5 and JRA-3Q (Table 1), which limits its usefulness.
Among the rain gauge-based P datasets (CHIRPS V2.0, CPC Unified, GPCP V3.2, IMERG-Final V7, MSWEP V2.8, REGEN V1, and PERSIANN-CCS-CDR), MSWEP V2.8 (Beck et al., 2019 b) achieved the best overall performance (median KGE of 0.78), underscoring the value of combining P estimates from satellite, reanalysis, and gauge data and applying daily gauge corrections. In contrast, CHIRPS V2.0 (median KGE of 0.66) applies 5 d gauge corrections, while the other datasets apply monthly corrections, which provide fewer benefits at the daily time scale. The main challenge in applying daily gauge corrections is accounting for offsets in daily gauge reporting times, as accumulations rarely align with midnight UTC (Yang et al., 2020). Furthermore, daily correction efforts are often hindered by the sparsity of gauge networks outside North America, Europe, and Australia (Kidd et al., 2017). Because CPC Unified and REGEN V1 rely exclusively on daily gauge observations, their performance is limited in these data-sparse regions, where values are interpolated between distant gauges (Fig. S29).
The marked differences in median KGE values between MSWEP V2.8 and MSWEP-ng V2.8 (median KGE of 0.78 vs. 0.73), between CHIRPS V2.0 and CHIRP (median KGE of 0.66 vs. 0.58), and between IMERG-Final V7 and -Late V7 (median KGE of 0.72 vs. 0.55) emphasize the importance of applying gauge corrections, in line with previous evaluations (Gochis et al., 2009; Beck et al., 2017 c, b; Shen et al., 2018). This highlights the critical role national meteorological agencies play in feeding rain gauge data into global databases such as the Global Historical Climatology Network daily (GHCNd; Menne et al., 2012) and the need to expand gauge coverage and promote open data sharing, particularly in data-scarce regions, to improve the utility of P datasets in those areas.
Our results reaffirm that higher-resolution P datasets do not necessarily yield better streamflow simulations compared to lower-resolution datasets, consistent with previous assessments (e.g., Bador et al., 2020; Huang et al., 2019; Chan et al., 2013). Notably, the 0.04° resolution satellite infrared-based datasets (PERSIANN-CCS and -CCS-CDR, and PDIR-Now; median KGE of 0.46, 0.50, and 0.45, respectively) – the highest resolution datasets included in our assessment – do not consistently perform better, either globally or for any Köppen-Geiger climate zone, although this may reflect the generally poor performance of infrared-based datasets. However, IMERG-Final V7 (0.1° resolution) also does not perform better than GPCP V3.2 (0.5° resolution), which uses IMERG for disaggregation from monthly to daily. This may at least partly be due to the use of catchment-mean P to drive HBV, which omits local variability that high-resolution datasets might otherwise capture. Another potential factor is that coarser datasets may inadvertently improve reliability by averaging out small-scale random errors; however, our catchment-scale assessment cannot confirm this. Conversely, for the (re)analyses, the benefits of a higher resolution are evident in mountainous regions. Here, the 13 km GDAS outperformed the 31 km ERA5, which in turn outperformed the 40 km JRA-3Q (Fig. S57; see also Sect. 3.2). This indicates that higher-resolution NWP models are, as expected, more capable of accurately capturing complex orographic P dynamics.
A comparison of PCORR parameter values obtained after calibration using different P datasets reveals that IMERG-Early and -Late V7 necessitate the highest PCORR values, while PDIR-Now requires the lowest values (Figs. S3–S26). The lower PCORR for PDIR-Now reflects its tendency to overestimate P, as confirmed by the significant positive bias obtained by this dataset (Fig. 2). This may be because the algorithm was calibrated with a focus on heavy rainfall events for near real-time applications (Nguyen et al., 2020). Conversely, the higher PCORR values required for IMERG-Early and -Late V7 reflect their tendency to underestimate P, which is confirmed by their lower bias values (Fig. 2).
The overall ranking of P datasets remained largely consistent across the four PCORR calibration scenarios (Fig. S30; see Sect. 2.4). However, in the scenario where PCORR and SFCF were fixed at 1.0, GPCP V3.2 and ERA5 showed improved relative rankings – not due to higher performance, but because other datasets experienced greater performance drops under this constraint. Most datasets showed little sensitivity to the PCORR bound below 1.0, but a few – namely PDIR-Now, GSMaP-std V7, PERSIANN-CCS-CDR, and IMERG-Late V6 – exhibited notable use of PCORR values below 1.0 (Fig. S31). This suggests that these datasets tend to overestimate P, and that downward rescaling improves their hydrological performance.
The lower performance of PDIR-Now can be partially attributed to the default PCORR range of 1.0–2.0, which precludes the correction of P overestimation. This is confirmed by the lower calibrated PCORR values when allowed to vary below 1.0, leading to a decrease in the median calibrated PCORR from 1.2 to 1.1 and a marked improvement in median KGE from 0.43 to 0.47. Further analysis showed that the largest decrease in median calibrated PCORR (from 1.0 to 0.7) and corresponding improvement in KGE (from 0.15 to 0.37) occurred in CAMELS-GB (Fig. S33). However, across most other P datasets, the improvement in KGE was negligible when PCORR was allowed to drop below 1.0, confirming that the default PCORR range (1.0–2.0) is appropriate for most P datasets (Fig. S32).
We found that several satellite P datasets (notably IMERG-Early and -Late V7, SM2RAIN-ASCAT, SM2RAIN-CCI, GSMaP-std V8, and CMORPH-CDR) exhibit pronounced low-β tails (Fig. 2), indicating significant local P underestimation. This finding is further corroborated by maps of the difference between the mean annual P of each product and the multi-product mean (Figs. S34–S56), revealing extensive regions with negative values.


Figure 3. Precipitation (P) dataset with the highest calibration KGE in each catchment. Points mark catchment centroids (n = 18 428). For clarity, only the five datasets with the highest median KGE are shown. MSWEP-np V2.8 is omitted because it is highly similar to MSWEP V2.8.
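The per-catchment selection underlying such a map can be sketched in Python as below. The dictionary layout, the use of None for datasets without a score, and the function name are assumptions made for illustration.

```python
def best_dataset(kge_by_dataset):
    """Return the name of the dataset with the highest calibration KGE
    for one catchment, ignoring datasets without a score (None)."""
    valid = {name: kge for name, kge in kge_by_dataset.items() if kge is not None}
    if not valid:
        return None
    return max(valid, key=valid.get)


# Placeholder scores for one hypothetical catchment.
print(best_dataset({"dataset_a": 0.70, "dataset_b": 0.80, "dataset_c": None}))
```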

Overall, our findings align with those of Beck et al. (2017c), Gu et al. (2023), and Gebrechorkos et al. (2023), who similarly evaluated multiple gridded P datasets using hydrological modeling in catchments worldwide. However, while Beck et al. (2017c) assessed nine datasets across 9053 catchments, Gu et al. (2023) evaluated two datasets across 10 596 catchments, and Gebrechorkos et al. (2023) analyzed six datasets across 1825 catchments, the present study evaluates 24 datasets across 18 428 catchments. This broader scope significantly enhances the generalizability of our results. Additionally, Beck et al. (2017c) and Gu et al. (2023) primarily assessed outdated versions of P datasets, whereas our analysis includes several new P datasets – such as PDIR-Now, IMERG V7, JRA-3Q, and MSWEP V2.8 – that have not yet been comprehensively evaluated. Furthermore, unlike Gebrechorkos et al. (2023), we recalibrated the hydrological model for each P dataset, reducing the risk of penalizing datasets for systematic biases that calibration can otherwise absorb.

3.2 Regional Performance Differences

Table 3 presents median calibration KGE scores for the 24 gridded P datasets across the five major Köppen-Geiger climate classes (see Fig. S2 for the distribution of KGE values). While satellite P datasets perform the worst overall (see Sect. 3.1), microwave-based satellite datasets such as IMERG and GSMaP generally outperform (re)analyses (ERA5, GDAS, and JRA-3Q) in tropical catchments. This is likely because tropical P events, typically localized and short-lived, can be directly observed by satellites, while numerical weather prediction (NWP) models generally struggle to simulate the complex convective processes driving these events (Yano et al., 2018; Peters et al., 2019; Lin et al., 2022). Conversely, in temperate and, most notably, cold regions, (re)analyses generally outperform satellite-based datasets. This is because the large-scale, long-duration frontal P systems dominant in these regions are reliably simulated by NWP models (Ebert et al., 2007; Beck et al., 2017c, 2019a; Sun et al., 2018). In arid climates, all P datasets tend to perform relatively poorly, with a slight advantage for (re)analyses over satellite-based datasets, consistent with previous evaluations (e.g., Beck et al., 2016, 2017a). The lower arid-region scores mainly reflect (i) poorer forcing quality due to the short-lived, localized nature of storms and sub-cloud evaporation (virga) (Wang et al., 2018); (ii) more threshold-driven runoff generation that amplifies small forcing errors; and (iii) fewer runoff-producing events, which increases sampling uncertainty (Beck et al., 2017a, c; Sun et al., 2018; El Kenawy et al., 2019; Beck et al., 2019a; Williams, 2025). Thus, the lower performance does not necessarily indicate an inability of HBV to represent arid hydrology.

Table 3. Median daily calibration KGE values obtained using HBV driven by the different P datasets for all catchments and the five major Köppen-Geiger climate classes. For the Köppen-Geiger classes, medians are omitted when a dataset has <100 catchments or covers <50 % of the catchments in that class. In each column, the dataset with the best performance is shown in bold. The catchments were classified based on the most dominant class, determined using the 1 km resolution Köppen-Geiger map for 1991–2020 from Beck et al. (2023). See Fig. 1 for a map of the dominant major Köppen-Geiger climate class for the catchments.
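The omission rule stated in the caption can be expressed as a short Python sketch. The function and parameter names are hypothetical, and the sketch assumes that coverage is measured as the fraction of catchments in a class for which the dataset has a KGE score.

```python
from statistics import median


def class_median_kge(kge_values, n_class_catchments, min_count=100, min_coverage=0.5):
    """Median KGE of one dataset in one climate class, or None when the
    dataset has fewer than min_count catchments or covers less than
    min_coverage of the catchments in that class."""
    n = len(kge_values)
    if n < min_count or n < min_coverage * n_class_catchments:
        return None
    return median(kge_values)


# Placeholder inputs: 120 scored catchments out of 300 in the class (40 % coverage).
print(class_median_kge([0.7] * 120, n_class_catchments=300))
```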


Figure 4 shows spatial correlations between static catchment attributes (Appendix B) and calibration KGE, correlation (r), variability ratio (γ), and long-term bias (β) scores across the catchments. We report these correlations for the multi-source MSWEP V2.8 dataset, the ERA5 reanalysis, and the satellite-based IMERG-Late V7 dataset, to assess how well different catchment attributes predict the performance of each dataset. MSWEP V2.8 and ERA5 exhibit similar patterns, likely because ERA5 is a key input to MSWEP V2.8. For MSWEP V2.8, the strongest predictors of high KGE are low Aridity Index, high P Occurrence, and high Mean NDVI – intercorrelated predictors indicative of humid conditions. For ERA5, the strongest predictors of high KGE are high P Occurrence, low Mean PE, and high Absolute Latitude – conditions that tend to favor frontal P generation. For IMERG-Late V7, KGE is generally less predictable, although high KGE is weakly associated with high Mean P, consistent with tropical regions dominated by convective rainfall. For IMERG-Late V7, a strong predictor of a low β (i.e., P underestimation) is high Topographic Slope, reflecting known difficulties in detecting shallow orographic P and snowfall (Sadeghi et al., 2019; Song et al., 2021). Rain Gauge Density (defined as the number of gauges per 100 km², smoothed using an exponential filter; see Table B1) shows a weak positive relationship with MSWEP V2.8 KGE and r, suggesting that higher gauge density contributes to improved performance, as expected.
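For readers less familiar with the score components named above, the following Python sketch computes KGE together with r, β, and γ. It assumes the modified KGE formulation, in which β is the ratio of simulated to observed mean streamflow and γ is the ratio of the simulated to observed coefficients of variation. The exact formulation used in this study is defined in the methods section and may differ in detail.

```python
from math import sqrt
from statistics import mean, pstdev


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def kge_components(sim, obs):
    """Return (KGE, r, beta, gamma) for simulated and observed streamflow."""
    r = pearson_r(sim, obs)
    beta = mean(sim) / mean(obs)                                    # bias ratio
    gamma = (pstdev(sim) / mean(sim)) / (pstdev(obs) / mean(obs))   # variability ratio
    kge = 1.0 - sqrt((r - 1.0) ** 2 + (beta - 1.0) ** 2 + (gamma - 1.0) ** 2)
    return kge, r, beta, gamma


# Placeholder series: the simulation is exactly twice the observations.
obs = [1.0, 2.0, 4.0, 3.0]
sim = [2.0 * q for q in obs]
print(kge_components(sim, obs))
```

In this placeholder case the timing and relative variability are perfect, so the penalty comes entirely from the bias term.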


Figure 4. Spatial Spearman rank correlations between static catchment attributes and calibration KGE, correlation (r), long-term bias (β), and variability ratio (γ) scores across catchments for (a) MSWEP V2.8, (b) ERA5, and (c) IMERG-Late V7. See Appendix B for details on the catchment attributes.
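A Spearman rank correlation between one catchment attribute and one score can be sketched in plain Python as follows. This is a generic textbook implementation (average ranks for ties, then Pearson correlation of the ranks), not the code used to produce Figure 4, and the input values are placeholders.

```python
from math import sqrt


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Placeholder data: KGE decreases monotonically with the attribute.
attribute = [0.2, 0.5, 1.0, 2.0]
kge = [0.8, 0.7, 0.5, 0.1]
print(spearman(attribute, kge))
```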


To better analyze the influence of catchment-mean topographic slope on calibration KGE for each P dataset, we calculated median KGE values for flat catchments (mean slope <1°) and steep ones (mean slope >7°; Fig. S57a), as well as spatial correlations between KGE and catchment-mean slope values (Fig. S57b). The following conclusions can be drawn:

Each gauge-based P dataset tends to show better performance in flat catchments than in steep ones (Fig. S57a; e.g., the CHIRPS V2.0 median KGE is 0.05 higher in flat catchments). In contrast, each non-gauge-based dataset performs worse in flat catchments than in steep ones (e.g., the ERA5 median KGE is 0.06 lower). This pattern is further supported by negative spatial correlations between KGE and mean slope for each gauge-based dataset, while the correlations are positive for non-gauge-based datasets (Fig. S57b). The decline in the performance of gauge-based datasets in mountainous regions reflects the sparse gauge coverage in these less accessible, less populated areas (Kidd et al., 2017).
The tendency for non-gauge-based P datasets to perform better in steep catchments likely arises from the dominance of seasonal, rather than daily, hydrological variability in mountainous regions. These seasonal signals are easier for models to reproduce, resulting in higher KGE values (Beck et al., 2017a). Steep terrain generates high runoff, evaporation is generally low, and streamflow is dominated by slowly releasing snowmelt and groundwater, with limited human modification (Müller Schmied et al., 2014; Beck et al., 2015; Wada et al., 2017).
Another reason for the stronger performance of (re)analysis datasets in mountainous regions is the ability of NWP models to represent large-scale uplift of moist air over terrain, which produces orographic P (e.g., Pontoppidan et al., 2017; Schumacher et al., 2020). GDAS performs particularly well, likely reflecting the high 13 km resolution of GFS V16.3, which allows more detailed representation of topographic gradients and associated atmospheric processes. JRA-3Q performs least well, consistent with the coarser 40 km resolution of the JMA NWP model as of December 2018 (Kosaka et al., 2024). ERA5 sits between the two, being based on the 31 km IFS model from 2016 (Hersbach et al., 2020).
The better hydrological performance of each satellite-based P dataset in mountainous regions conflicts with previous evaluations using rain gauges and radar data (e.g., Beck et al., 2019a; Sharma et al., 2020a; Adhikari and Behrangi, 2022). In these studies, poorer performance is generally attributed to surface snow and ice contamination (Cao et al., 2018; Chen et al., 2020), difficulties in detecting snowfall (You et al., 2021; Jääskeläinen et al., 2024; Girotto et al., 2024b), and shallow orographic P (Yamamoto et al., 2017; Adhikari and Behrangi, 2022). Our results suggest that these limitations may be counterbalanced by the simpler, more predictable seasonal streamflow dynamics in mountainous regions.
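The flat-versus-steep comparison described above can be sketched as follows, using the stated thresholds (mean slope <1° and >7°). The function name and the example values are placeholders for illustration.

```python
from statistics import median


def median_kge_by_slope(kge, slope_deg, flat_max=1.0, steep_min=7.0):
    """Median KGE for flat (slope < flat_max) and steep (slope > steep_min)
    catchments. Returns None for a group that contains no catchments."""
    flat = [k for k, s in zip(kge, slope_deg) if s < flat_max]
    steep = [k for k, s in zip(kge, slope_deg) if s > steep_min]
    flat_median = median(flat) if flat else None
    steep_median = median(steep) if steep else None
    return flat_median, steep_median


# Placeholder scores and catchment-mean slopes (degrees) for five catchments.
kge = [0.8, 0.7, 0.6, 0.5, 0.4]
slope_deg = [0.5, 0.8, 3.0, 8.0, 10.0]
print(median_kge_by_slope(kge, slope_deg))
```

Catchments with intermediate slopes (here, the one at 3°) belong to neither group and do not affect either median.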


Figure 5. Median calibration KGE scores for each P dataset across the different streamflow data sources (see Fig. 1a and Appendix A). White indicates that no catchments met the inclusion criteria (Sect. 2.2).


Figure 5 presents median calibration KGE scores obtained from the different P datasets across the various streamflow data sources (see Fig. 1a and Appendix A). Overall performance is somewhat lower for BOMAustralia, CAMELS-INDIA, South Korea, and especially ADHI. Possible reasons for the lower performance for these data sources are discussed below:

For BOMAustralia (http://www.bom.gov.au/waterdata/, last access: 5 May 2026), the lower performance (Fig. 5) is attributed to arid regions exhibiting consistently low performance (Table 3), with Australian catchments having a particularly high median aridity index of 1.9. Additionally, the presence of numerous small dams used for irrigation, domestic water supply, and flood control likely contributes to reduced performance (Ouyang et al., 2021). Our hydrological model, HBV (Bergström, 1992; Seibert and Vis, 2012), does not explicitly simulate dams, and although we excluded catchments with significant dam influence (see Sect. 2.2), we relied on the GRanD dataset (Lehner et al., 2011), which only includes larger dams. Significant groundwater withdrawals in Australia – also not represented in HBV – may also have contributed to the degraded performance.
For CAMELS-INDIA (Mangukiya et al., 2025), the main data source for India, the lower performance (Fig. 5) is likely due to extensive human activity, particularly significant groundwater withdrawals (Rodell et al., 2009; Dangar et al., 2021). CAMELS-INDIA catchments have the highest median irrigated area (9.5 %) based on the Global Map of Irrigated Areas (GMIA) V5 (Siebert et al., 2005). Additionally, despite excluding catchments with substantial dam influence, CAMELS-INDIA still has the highest median reservoir influence (defined as total reservoir capacity divided by mean cumulative annual streamflow) across all data sources at 0.04. This suggests that dam regulation may have further degraded performance.
Similarly, for South Korea (https://water.nier.go.kr, last access: 5 May 2026), the lower performance (Fig. 5) is likely related to extensive human activity, including numerous dams not captured by the GRanD dataset. These dams mainly support domestic and municipal water supply and agriculture (the catchments have a median irrigated area of 6 % based on GMIA).
For ADHI (Tramblay et al., 2021), the main data source for Africa, arid conditions are likely a primary reason for the low performance (Fig. 5), given a mean aridity index of 1.9 across the catchments (identical to that of the Australian catchments). Another factor is the large number of mostly small dams across the continent that are not included in GRanD. Low streamflow data quality may also contribute, although a global assessment does not fully support this explanation (Crochemore et al., 2020). Additional challenges for rain gauge-based P datasets (CHIRPS V2.0, CPC Unified, REGEN V1, GPCP V3.2, IMERG-Final V7, MSWEP V2.8, and PERSIANN-CCS-CDR) in Africa include sparse rain gauge networks (Kidd et al., 2017), variable data quality, and frequent gaps. For (re)analyses (ERA5, GDAS, and JRA-3Q), limited availability of surface, radiosonde, and aircraft observations for assimilation further reduces performance (https://charts.ecmwf.int/catalogue/packages/monitoring/, last access: 5 May 2026). For ERA5 specifically, spurious P trends in central Africa (see Zsótér et al., 2020) – likely due to changes in the observing system – and intense localized rainfall events (so-called “rain bombs”) in eastern Africa contribute to degraded performance (Hersbach et al., 2020).
The low median calibration KGE scores for PDIR-Now in Poland, Denmark, and CAMELS-GB (Fig. 5) are associated with median bias (β) values of 1.1, 1.3, and 1.3, respectively (Fig. S27), indicating substantial P overestimation in these regions. Likewise, the low median calibration KGE for JRA-3Q in Thailand (Fig. 5) is mainly due to P overestimation, with a median bias of 4.6 (Fig. S28).
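The reservoir influence metric mentioned above (total reservoir capacity divided by mean cumulative annual streamflow) can be illustrated with a minimal Python sketch. It assumes that capacities and annual streamflow volumes are given in the same volumetric unit. The input values are invented and merely chosen to give a ratio of similar magnitude to the one reported for CAMELS-INDIA.

```python
def reservoir_influence(reservoir_capacities, annual_streamflow_volumes):
    """Total reservoir capacity divided by mean cumulative annual streamflow.

    Both inputs must use the same volumetric unit (e.g., cubic metres)."""
    mean_annual_flow = sum(annual_streamflow_volumes) / len(annual_streamflow_volumes)
    return sum(reservoir_capacities) / mean_annual_flow


# Invented example: two reservoirs and two years of cumulative streamflow (m3).
capacities = [2.0e6, 2.0e6]
annual_flows = [9.0e7, 1.1e8]
print(reservoir_influence(capacities, annual_flows))
```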

3.3 Potential Limitations and Future Work

We conducted the most extensive evaluation to date of quasi- and fully global gridded P datasets using hydrological modeling. Nevertheless, several limitations should be considered when interpreting the results:

The calibration process may suppress certain systematic issues inherent in the P datasets, such as consistent under- or overestimation of peaks, long-term biases, or the presence of drizzle, due to the PCORR and SFCF parameters of HBV. As a result, these issues might not be fully reflected in our calibration scores. However, this should not necessarily be viewed as a limitation. Systematic biases, once identified, are relatively straightforward to correct through post-processing or bias-adjustment techniques. Consequently, penalizing datasets for such deficiencies may be unwarranted.
Although HBV has been widely and successfully applied across diverse climates and geographic settings (Seibert and Bergström, 2022), it is a parsimonious conceptual model with a fixed structure and simplified process representations. It does not represent spatio-temporal variability in land cover and land use, or spatial heterogeneity in soils and other catchment properties, and it is driven by catchment-mean meteorological forcings. More complex semi-distributed or fully distributed (gridded) models may yield improved streamflow simulations (Gu et al., 2023); however, we do not expect such models to yield materially different P dataset rankings or alter our main conclusions.
HBV does not explicitly represent human influences such as dam operations or groundwater withdrawals, both of which can substantially alter streamflow. Accounting for these processes is challenging because consistent, detailed data on water use and management is generally unavailable. For example, many large dams – and most smaller ones – are missing from global compilations (Zhang and Gu, 2023), and global sectoral water-use estimates are highly uncertain, especially at sub-national scales (e.g., Huang et al., 2018; Puy et al., 2022).
We compiled an unparalleled global observed streamflow dataset comprising 35 254 catchments (excluding duplicates) covering all climate zones and latitudes (Fig. 1). Yet, many highly populated and vulnerable regions, particularly in West Asia and parts of Central and Eastern Africa, remain underrepresented. This underscores the continued need to improve access to local and regional streamflow data (Krabbenhoft et al., 2022).
Since the global distribution of streamflow gauging stations closely aligns with that of meteorological monitoring networks (see Krabbenhoft et al., 2022, and Kidd et al., 2017), our assessment may slightly overestimate the relative performance of gauge-based P datasets and (re)analyses – which assimilate in situ observations from these networks – compared to satellite-only datasets.
Some P datasets (GDAS and CMORPH-RT) have relatively short record lengths (Table 1), which can yield less stable KGE scores and may slightly overestimate performance, particularly in arid regions where P events are infrequent. Their limited temporal coverage also prevented the use of a single, uniform calibration period across all datasets. As a result, part of the variation in calibration performance may reflect differences in calibration periods rather than dataset quality. Nevertheless, given the large number of catchments analyzed, the impact on the aggregated results and main conclusions is expected to be small.
Our assessment was carried out on a daily time scale, which obscures critical sub-daily dynamics, particularly in small catchments and arid regions prone to flash floods. Future research may expand our analysis to sub-daily time scales, which would enable a more rigorous evaluation of the timing and intensity of P estimates. Such a sub-daily assessment would likely improve scores for satellite-based P datasets due to their ability to directly observe events, unlike (re)analyses that rely on approximating when such events occur.

4 Conclusions

The availability of a wide range of gridded P datasets, each with unique technical specifications, strengths, and weaknesses, can make choosing the best dataset for a particular application a complex task. To assist users in making better informed decisions, we conducted the most comprehensive assessment to date of (sub-)daily (quasi-)global gridded P datasets using hydrological modeling. We evaluated 24 P datasets across 18 428 catchments worldwide. For each catchment, we calibrated the HBV hydrological model using daily streamflow observations, driven by each P dataset as input. Our main findings can be summarized as follows:

Among all P datasets, MSWEP V2.8 consistently achieved the highest overall performance, owing to its inclusion of both satellite and (re)analysis data combined with daily gauge corrections. The best predictors of high KGE for MSWEP V2.8 are low Aridity Index and high P Occurrence. Satellite datasets performed worst overall. GPM + SM2RAIN performed best among the satellite-based datasets, due to its integration of satellite soil moisture and P retrievals. IMERG-Late V7 shows a modest improvement over V6, with gains most evident in arid and cold regions. Among the (re)analyses, GDAS performed marginally better than both ERA5 and JRA-3Q, which exhibited comparable performance. MSWEP V2.8 led among the gauge-corrected datasets, benefiting from its daily gauge corrections, unlike others with 5 d or monthly gauge corrections. Infrared-based satellite datasets showed lower overall scores, with PERSIANN-CCS outperforming PDIR-Now.
Regional performance of P datasets varied significantly across climates and data sources, influenced by local P characteristics, topography, data quality, and human activities. Tropical regions favor microwave-based satellite datasets like IMERG due to their ability to capture localized, convective rainfall, while all datasets perform poorly in arid regions, with a slight advantage for (re)analyses. In temperate and cold regions, (re)analyses such as JRA-3Q excel due to their ability to simulate large-scale, frontal P systems. Each gauge-based P dataset shows better performance in flat catchments than in steep ones, whereas each non-gauge-based dataset performs worse in flat catchments than in steep ones. Factors such as aridity, dam presence, and water use likely reduced dataset performance in regions like Australia, India, and Africa. The limited availability of in situ meteorological data, combined with potential streamflow data quality issues, may have further degraded performance in Africa.
Despite the comprehensiveness of our assessment, several limitations should be noted. Systematic P biases may have been partially masked during calibration, though these biases can often be easily mitigated through post-processing. Additionally, we employed a relatively simple conceptual hydrological model with catchment-average inputs, although this is unlikely to have affected the results significantly. The overlap in the global distribution of streamflow and meteorological networks may have slightly favored gauge-based datasets and (re)analyses over satellite-based datasets. Lastly, the use of a daily time scale may obscure important sub-daily dynamics, highlighting the need for future sub-daily assessments.

In conclusion, our findings indicate that datasets like MSWEP V2.8 are well suited for a broad range of uses, whereas satellite datasets generally perform worse overall. Nevertheless, selecting the most appropriate P dataset ultimately depends on the study region and the specific needs of the application. For example, long-record datasets such as JRA-3Q may be suitable for climate analysis, while IMERG-Early V7 provides a reliable near real-time solution. The continued development of P datasets that balance long-term homogeneity, latency, and spatial-temporal coverage will be essential to meet the varied requirements of users for applications in water resource management, hazard assessment, agriculture, and environmental monitoring.

Appendix A: Streamflow Data Sources

We compiled an unparalleled database with daily streamflow observations and catchment boundaries for 35 254 catchments worldwide, drawing from the 29 data sources listed in Table A1. These sources are divided into two categories. The first category comprises published datasets, including ADHI, HYSETS, CAMELS, LamaHCE, LamaHIce, Germany, and CCAM. For the remaining sources, except GRDC, daily observed streamflow data were obtained from the websites of the respective countries' hydrological or meteorological agencies. Data from GRDC were acquired by submitting an application form on their website and receiving the data via email. For the second set of sources, we used streamflow observations exclusively from stations with available catchment boundaries, allowing us to calculate time series of meteorological forcings for these catchments, including P, temperature, radiation, and humidity. Catchment boundaries for USGS data were sourced from HYSETS, while those for Italy, Spain, France, Poland, Czech Republic, Sweden, Ireland, Denmark, and Finland came from EStreams (do Nascimento et al., 2024). For BOM Australia, Thailand, and Japan, boundaries were obtained from GSHA (Yin et al., 2023). The catchment boundaries for South Korea were acquired from the Environmental Geographic Information Service (EGIS) of South Korea (https://egis.me.go.kr/, last access: 5 May 2026).

Sources cited in Table A1: Tramblay et al. (2021); Coxon et al. (2020); Mangukiya et al. (2025); Alvarez-Garreton et al. (2018); Chagas et al. (2020); Höge et al. (2023); Hao et al. (2021); Loritz et al. (2024); Arsenault et al. (2020); Klingler et al. (2021); Helgason and Nijssen (2024).

Table A1. Daily observed streamflow data sources, number of catchments, and references/URLs. The number of catchments is the count after duplicate checks but before suitability checks.


Appendix B: Static Catchment Attributes

Table B1 presents the static catchment attributes used for assessing performance predictability. Here, “static” refers to attributes that do not vary over time. The attributes were calculated for each catchment as described in the table.

Sources cited in Table B1: Menne et al. (2012); Bontemps et al. (2011); Beck et al. (2020); Hengl et al. (2017); Danielson and Gesch (2011); Legates and Bogart (2009); Fick and Hijmans (2017); Hersbach et al. (2020); FAO (2000); Huscroft et al. (2018); Lehner and Döll (2004); Maisongrande et al. (2004); Fuster et al. (2020); Zomer et al. (2008); Brown et al. (1997); Lehner et al. (2011); Wang et al. (2021); Siebert et al. (2013).

Table B1. Description and sources of static catchment attributes.


Code availability

The Python implementation of the HBV hydrological model used in this work is available at https://github.com/AtrCheema/rain2flow (AtrCheema, 2026). The AquaFetch Python library (https://github.com/hyex-research/AquaFetchTS30, last access: 17 July 2025; https://doi.org/10.21105/joss.08051, Abbas et al., 2025) was used to access and harmonize open-source streamflow data. The Python code used to generate the results of this study is available from the corresponding author upon request.

Data availability

Most of the streamflow observations are freely available, and their sources are listed in Table A1. Please contact the authors regarding access to the portion of the streamflow data that can be shared. All P datasets are freely accessible for non-commercial research. CPC Unified is available on the NOAA Physical Sciences Laboratory (PSL) website (https://psl.noaa.gov/data/gridded/data.cpc.globalprecip.html, last access: 5 May 2026). IMERG can be accessed from the NASA Global Precipitation Measurement (GPM) website (https://gpm.nasa.gov/data, last access: 5 May 2026). JRA-3Q is available via the National Center for Atmospheric Research (NCAR) Research Data Archive (RDA; https://gdex.ucar.edu/datasets/d728009/filelist/, last access: 5 May 2026). GPCP is accessible via the NOAA PSL website (https://psl.noaa.gov/data/gridded/data.gpcp.html, last access: 5 May 2026). SM2RAIN-ASCAT, SM2RAIN-CCI, and GPM + SM2RAIN are hosted on Zenodo (https://doi.org/10.5281/zenodo.10376109, Brocca et al., 2023; https://doi.org/10.5281/zenodo.1305021, Ciabatta et al., 2018b; and https://doi.org/10.5281/zenodo.3854817, Massari, 2020, respectively). ERA5 data can be obtained from the Copernicus Climate Data Store (CDS; https://doi.org/10.24381/cds.adbb2d47, Copernicus Climate Change Service, Climate Data Store, 2023). CHIRP and CHIRPS are available via the University of California Climate Hazards Center (CHC) website (https://www.chc.ucsb.edu/data/chirps/, last access: 5 May 2026). MSWEP can be accessed via the GloH2O website (https://www.gloh2o.org/mswep/, last access: 5 May 2026). PERSIANN-CCS-CDR and PDIR-Now are accessible via the Center for Hydrometeorology and Remote Sensing (CHRS) website (https://chrsdata.eng.uci.edu/, last access: 5 May 2026).

Supplement

The supplement related to this article is available online at https://doi.org/10.5194/hess-30-3399-2026-supplement.

Author contributions

AA: modeling, analysis, visualization, and writing. HB: initial idea, conceptualization, writing, and project administration. All coauthors contributed to writing, revising, and refining the manuscript.

Competing interests

At least one of the (co-)authors is a member of the editorial board of Hydrology and Earth System Sciences. The peer-review process was guided by an independent editor, and the authors also have no other competing interests to declare.

Disclaimer

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. The authors bear the ultimate responsibility for providing appropriate place names. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Acknowledgements

We thank the developers of CMORPH, IMERG, GSMaP, PERSIANN-CCS, PERSIANN-CCS-CDR, PDIR-Now, SM2RAIN, JRA-3Q, GDAS, ERA5, CPC Unified, REGEN, GPCP, and CHIRPS for their efforts in creating and sharing these valuable resources. Our gratitude also extends to the streamflow data providers listed in Table A1, including the Global Runoff Data Centre (GRDC; Koblenz, Germany), the French National Research Institute for Sustainable Development (IRD), and the Korean National Institute of Environmental Research (NIER). We further thank the developers of the datasets used for the static catchment attributes listed in Table B1. Special thanks are due to Takuji Kubota and Munehisa K. Yamamoto for their valuable insights regarding the performance of GSMaP. For computer time, this research used the resources of the Supercomputing Core Laboratory at King Abdullah University of Science and Technology (KAUST) in Thuwal, Saudi Arabia.

Review statement

This paper was edited by Elena Toth and reviewed by three anonymous referees.

References

Abbas, A., Iftikhar, S., and Beck, H. E.: AquaFetch: A Unified Python Interface for Water Resource Dataset Acquisition and Harmonization. J. Open Sour. Softw., 10, 8051, https://doi.org/10.21105/joss.08051, 2025. a

Adhikari, A. and Behrangi, A.: Assessment of satellite precipitation products in relation with orographic enhancement over the western United States, Earth Space Sci., 9, e2021EA001906, https://doi.org/10.1029/2021EA001906, 2022. a, b, c

Alvarez-Garreton, C., Mendoza, P. A., Boisier, J. P., Addor, N., Galleguillos, M., Zambrano-Bigiarini, M., Lara, A., Puelma, C., Cortes, G., Garreaud, R., McPhee, J., and Ayala, A.: The CAMELS-CL dataset: catchment attributes and meteorology for large sample studies – Chile dataset, Hydrol. Earth Syst. Sci., 22, 5817–5846, https://doi.org/10.5194/hess-22-5817-2018, 2018. a

Arsenault, R., Brissette, F., and Martel, J.-L.: The hazards of split-sample validation in hydrological model calibration, J. Hydrol., 566, 346–362, https://doi.org/10.1016/j.jhydrol.2018.09.027, 2018. a

Arsenault, R., Brissette, F., Martel, J.-L., Troin, M., Lévesque, G., Davidson-Chaput, J., Gonzalez, M. C., Ameli, A., and Poulin, A.: A comprehensive, multisource database for hydrometeorological modeling of 14,425 North American watersheds, Sci. Data, 7, 243, https://doi.org/10.1038/s41597-020-00583-2, 2020. a

Arsenault, R., Martel, J.-L., Brunet, F., Brissette, F., and Mai, J.: Continuous streamflow prediction in ungauged basins: long short-term memory neural networks clearly outperform traditional hydrological models, Hydrol. Earth Syst. Sci., 27, 139–157, https://doi.org/10.5194/hess-27-139-2023, 2023. a

Ashlock, D.: Evolutionary computation for modeling and optimization, Springer Publishing Company, https://doi.org/10.1007/0-387-31909-3, 2010. a

AtrCheema: rain2flow, GitHub [code], https://github.com/AtrCheema/rain2flow (last access: 5 May 2026), 2026. a

Bador, M., Boé, J., Terray, L., Alexander, L. V., Baker, A., Bellucci, A., Haarsma, R., Koenigk, T., Moine, M.-P., Lohmann, K., Putrasahan, D. A., Roberts, C., Roberts, M., Scoccimarro, E., Schiemann, R., Seddon, J., Senan, R., Valcke, S., and Vanniere, B.: Impact of higher spatial atmospheric resolution on precipitation extremes over land in global climate models, J. Geophys. Res.-Atmos., 125, e2019JD032184, https://doi.org/10.1029/2019JD032184, 2020. a

Beck, H. E., van Dijk, A. I. J. M., and de Roo, A.: Global maps of streamflow characteristics based on observations from several thousand catchments, J. Hydrometeorol., 16, 1478–1501, https://doi.org/10.1175/JHM-D-14-0155.1, 2015. a

Beck, H. E., van Dijk, A. I. J. M., de Roo, A., Miralles, D. G., McVicar, T. R., Schellekens, J., and Bruijnzeel, L. A.: Global-scale regionalization of hydrologic model parameters, Water Resour. Res., 52, 3599–3622, https://doi.org/10.1002/2015WR018247, 2016. a

Beck, H. E., van Dijk, A. I. J. M., de Roo, A., Dutra, E., Fink, G., Orth, R., and Schellekens, J.: Global evaluation of runoff from 10 state-of-the-art hydrological models, Hydrol. Earth Syst. Sci., 21, 2881–2903, https://doi.org/10.5194/hess-21-2881-2017, 2017a. a, b, c

Beck, H. E., van Dijk, A. I. J. M., Levizzani, V., Schellekens, J., Miralles, D. G., Martens, B., and de Roo, A.: MSWEP: 3-hourly 0.25° global gridded precipitation (1979–2015) by merging gauge, satellite, and reanalysis data, Hydrol. Earth Syst. Sci., 21, 589–615, https://doi.org/10.5194/hess-21-589-2017, 2017b. a

Beck, H. E., Vergopolan, N., Pan, M., Levizzani, V., van Dijk, A. I. J. M., Weedon, G. P., Brocca, L., Pappenberger, F., Huffman, G. J., and Wood, E. F.: Global-scale evaluation of 22 precipitation datasets using gauge observations and hydrological modeling, Hydrol. Earth Syst. Sci., 21, 6201–6217, https://doi.org/10.5194/hess-21-6201-2017, 2017c. a, b, c, d, e, f, g, h

Beck, H. E., Pan, M., Roy, T., Weedon, G. P., Pappenberger, F., van Dijk, A. I. J. M., Huffman, G. J., Adler, R. F., and Wood, E. F.: Daily evaluation of 26 precipitation datasets using Stage-IV gauge-radar data for the CONUS, Hydrol. Earth Syst. Sci., 23, 207–224, https://doi.org/10.5194/hess-23-207-2019, 2019a. a, b, c

Beck, H. E., Wood, E. F., Pan, M., Fisher, C. K., Miralles, D. G., van Dijk, A. I. J. M., McVicar, T. R., and Adler, R. F.: MSWEP V2 global 3-hourly 0.1° precipitation: methodology and quantitative assessment, B. Am. Meteorol. Soc., 100, 473–500, https://doi.org/10.1175/BAMS-D-17-0138.1, 2019b. a, b, c, d, e, f

Beck, H. E., Westra, S., Tan, J., Pappenberger, F., Huffman, G. J., McVicar, T. R., Gründemann, G. J., Vergopolan, N., Fowler, H. J., Lewis, E., Verbist, K., and Wood, E. F.: PPDIST, global 0.1° daily and 3-hourly precipitation probability distribution climatologies for 1979–2018, Sci. Data, 7, https://doi.org/10.1038/s41597-020-00631-x, 2020. a, b

Beck, H. E., van Dijk, A. I. J. M., Larraondo, P. R., McVicar, T. R., Pan, M., Dutra, E., and Miralles, D. G.: MSWX: global 3-hourly 0.1° bias-corrected meteorological data including near-real-time updates and forecast ensembles, B. Am. Meteorol. Soc., 103, E710–E732, https://doi.org/10.1175/BAMS-D-21-0145.1, 2022. a

Beck, H. E., McVicar, T. R., Vergopolan, N., Berg, A., Lutsko, N. J., Dufour, A., Zeng, Z., Jiang, X., van Dijk, A. I. J. M., and Miralles, D. G.: High-resolution (1 km) Köppen-Geiger maps for 1901–2099 based on constrained CMIP6 projections, Sci. Data, 10, https://doi.org/10.1038/s41597-023-02549-6, 2023. a, b

Bergström, S.: The HBV model – its structure and applications, SMHI Reports RH 4, Swedish Meteorological and Hydrological Institute (SMHI), Norrköping, Sweden, 1992. a, b

Bishop, C. H. and Abramowitz, G.: Climate model dependence and the replicate Earth paradigm, Clim. Dynam., 41, 885–900, https://doi.org/10.1007/s00382-012-1610-y, 2013. a

Bitew, M. M., Gebremichael, M., Ghebremichael, L. T., and Bayissa, Y. A.: Evaluation of high-resolution satellite rainfall products through streamflow simulation in a hydrological modeling of a small mountainous watershed in Ethiopia, J. Hydrometeorol., 13, 338–350, https://doi.org/10.1175/2011JHM1292.1, 2012. a

Bontemps, S., Defourny, P., and van Bogaert, E.: GlobCover 2009, products description and validation report, Tech. rep., ESA GlobCover project, https://due.esrin.esa.int/files/GLOBCOVER2009_Validation_Report_2.2.pdf (last access: 5 May 2026), 2011. a

Brocca, L., Filippucci, P., Hahn, S., Ciabatta, L., Massari, C., Camici, S., Schüller, L., Bojkov, B., and Wagner, W.: SM2RAIN–ASCAT (2007–2018): global daily satellite rainfall data from ASCAT soil moisture observations, Earth Syst. Sci. Data, 11, 1583–1601, https://doi.org/10.5194/essd-11-1583-2019, 2019. a

Brown, J., Ferrians, O. J., Heginbottom, J. A., and Melnikov, E. S.: Circum-Arctic Map of Permafrost and Ground-Ice Conditions. Version 2, Tech. rep., National Snow and Ice Data Center, Boulder, Colorado USA, 1997. a

Brocca, L., Filippucci, P., Hahn, S., Ciabatta, L., Massari, C., Camici, S., Schüller, L., Bojkov, B., and Wagner, W.: SM2RAIN-ASCAT (2007–2022): global daily satellite rainfall from ASCAT soil moisture (2.1.2n), Zenodo [data set], https://doi.org/10.5281/zenodo.10376109, 2023. a

Camici, S., Ciabatta, L., Massari, C., and Brocca, L.: How reliable are satellite precipitation estimates for driving hydrological models: A verification study over the Mediterranean area, J. Hydrol., 563, 950–961, https://doi.org/10.1016/j.jhydrol.2018.06.067, 2018. a

Cao, Q., Painter, T. H., Currier, W. R., Lundquist, J. D., and Lettenmaier, D. P.: Estimation of Precipitation over the OLYMPEX Domain during Winter 2015/16, J. Hydrometeorol., 19, 143–160, https://doi.org/10.1175/JHM-D-17-0076.1, 2018. a, b

Chagas, V. B. P., Chaffe, P. L. B., Addor, N., Fan, F. M., Fleischmann, A. S., Paiva, R. C. D., and Siqueira, V. A.: CAMELS-BR: hydrometeorological time series and landscape attributes for 897 catchments in Brazil, Earth Syst. Sci. Data, 12, 2075–2096, https://doi.org/10.5194/essd-12-2075-2020, 2020. a

Chan, S. C., Kendon, E. J., Fowler, H. J., Blenkinsop, S., Ferro, C. A., and Stephenson, D. B.: Does increasing the spatial resolution of a regional climate model improve the simulated daily precipitation?, Clim. Dynam., 41, 1475–1495, https://doi.org/10.1007/s00382-012-1568-9, 2013. a

Chen, H., Yong, B., Qi, W., Wu, H., Ren, L., and Hong, Y.: Investigating the evaluation uncertainty for satellite precipitation estimates based on two different ground precipitation observation products, J. Hydrometeorol., 21, 2595–2606, https://doi.org/10.1175/JHM-D-20-0103.1, 2020. a, b

Chen, M., Shi, W., Xie, P., Silva, V. B. S., Kousky, V. E., Wayne Higgins, R., and Janowiak, J. E.: Assessing objective techniques for gauge-based analyses of global daily precipitation, J. Geophys. Res.-Atmos., 113, https://doi.org/10.1029/2007JD009132, 2008. a

Ciabatta, L., Massari, C., Brocca, L., Gruber, A., Reimer, C., Hahn, S., Paulik, C., Dorigo, W., Kidd, R., and Wagner, W.: SM2RAIN-CCI: a new global long-term rainfall data set derived from ESA CCI soil moisture, Earth Syst. Sci. Data, 10, 267–280, https://doi.org/10.5194/essd-10-267-2018, 2018a. a

Ciabatta, L., Massari, C., Brocca, L., Gruber, A., Reimer, C., Hahn, S., Paulik, C., Dorigo, W., Kidd, R., and Wagner, W.: SM2RAIN-CCI (1 Jan 1998–31 December 2015) global daily rainfall dataset (Version 2), Zenodo [data set], https://doi.org/10.5281/zenodo.1305021, 2018b. a

Contractor, S., Donat, M. G., Alexander, L. V., Ziese, M., Meyer-Christoffer, A., Schneider, U., Rustemeier, E., Becker, A., Durre, I., and Vose, R. S.: Rainfall Estimates on a Gridded Network (REGEN) – a global land-based gridded dataset of daily precipitation from 1950 to 2016, Hydrol. Earth Syst. Sci., 24, 919–943, https://doi.org/10.5194/hess-24-919-2020, 2020. a

Copernicus Climate Change Service, Climate Data Store: ERA5 hourly data on single levels from 1940 to present, Copernicus Climate Change Service (C3S) Climate Data Store (CDS) [data set], https://doi.org/10.24381/cds.adbb2d47, 2023. a

Coxon, G., Addor, N., Bloomfield, J. P., Freer, J., Fry, M., Hannaford, J., Howden, N. J. K., Lane, R., Lewis, M., Robinson, E. L., Wagener, T., and Woods, R.: CAMELS-GB: hydrometeorological time series and landscape attributes for 671 catchments in Great Britain, Earth Syst. Sci. Data, 12, 2459–2483, https://doi.org/10.5194/essd-12-2459-2020, 2020. a

Crochemore, L., Isberg, K., Pimentel, R., Pineda, L., Hasan, A., and Arheimer, B.: Lessons learnt from checking the quality of openly accessible river flow data worldwide, Hydrol. Sci. J., 65, 699–711, https://doi.org/10.1080/02626667.2019.1659509, 2020. a

Dangar, S., Asoka, A., and Mishra, V.: Causes and implications of groundwater depletion in India: A review, J. Hydrol., 596, 126103, https://doi.org/10.1016/j.jhydrol.2021.126103, 2021. a

Danielson, J. J. and Gesch, D. B.: Global Multi-resolution Terrain Elevation Data 2010 (GMTED2010), Open-File Report 2011–1073, United States Geological Survey (USGS), Reston, Virginia, 2011. a

Dimitrova, A., McElroy, S., Levy, M., Gershunov, A., and Benmarhnia, T.: Precipitation variability and risk of infectious disease in children under 5 years for 32 countries: a global analysis using Demographic and Health Survey data, The Lancet Planetary Health, 6, e147–e155, https://doi.org/10.1016/S2542-5196(21)00325-9, 2022. a

do Nascimento, T. V., Rudlang, J., Höge, M., van der Ent, R., Chappon, M., Seibert, J., Hrachowitz, M., and Fenicia, F.: EStreams: An integrated dataset and catalogue of streamflow, hydro-climatic and landscape variables for Europe, Sci. Data, 11, 879, https://doi.org/10.1038/s41597-024-03706-1, 2024. a

Dresel, P. E., Dean, J. F., Perveen, F., Webb, J. A., Hekmeijer, P., Adelana, S. M., and Daly, E.: Effect of Eucalyptus plantations, geology, and precipitation variability on water resources in upland intermittent catchments, J. Hydrol., 564, 723–739, https://doi.org/10.1016/j.jhydrol.2018.07.019, 2018. a

Ebert, E. E., Janowiak, J. E., and Kidd, C.: Comparison of near-real-time precipitation estimates from satellite observations and numerical models, B. Am. Meteorol. Soc., 88, 47–64, https://doi.org/10.1175/BAMS-88-1-47, 2007. a

Ehsani, M. R. and Behrangi, A.: A comparison of correction factors for the systematic gauge-measurement errors to improve the global land precipitation estimate, J. Hydrol., 610, 127884, https://doi.org/10.1016/j.jhydrol.2022.127884, 2022. a

El Kenawy, A. M., McCabe, M. F., Lopez-Moreno, J. I., Hathal, Y., Robaa, S. M., Al Budeiri, A. L., Jadoon, K. Z., Abouelmagd, A., Eddenjal, A., Domínguez-Castro, F., Trigo, R. M., and Vicente-Serrano, S. M.: Spatial assessment of the performance of multiple high-resolution satellite-based precipitation data sets over the Middle East, Int. J. Climatol., 39, 2522–2543, https://doi.org/10.1002/joc.5968, 2019. a

Ensor, L. A. and Robeson, S. M.: Statistical characteristics of daily precipitation: comparisons of gridded and point datasets, J. Appl. Meteorol. Climatol., 47, 2468–2476, https://doi.org/10.1175/2008JAMC1757.1, 2008. a

FAO: Forest Resource Assessment (FRA) forest cover, https://www.fao.org/forest-resources-assessment/fra-2000/en (last access: 5 May 2026), 2000. a

Fick, S. E. and Hijmans, R. J.: WorldClim 2: new 1-km spatial resolution climate surfaces for global land areas, Int. J. Climatol., 37, 4302–4315, https://doi.org/10.1002/joc.5086, 2017. a, b, c, d, e

Fortin, F., De Rainville, F., Gardner, M., Parizeau, M., and Gagné, C.: DEAP: evolutionary algorithms made easy, J. Mach. Learn. Res., 13, 2171–2175, https://doi.org/10.5555/2503308.2503311, 2012. a

Funk, C., Peterson, P., Landsfeld, M., Pedreros, D., Verdin, J., Shukla, S., Husak, G., Rowland, J., Harrison, L., Hoell, A., and Michaelsen, J.: The climate hazards infrared precipitation with stations – a new environmental record for monitoring extremes, Sci. Data, 2, 150066, https://doi.org/10.1038/sdata.2015.66, 2015. a, b

Fuster, B., Sánchez-Zapero, J., Camacho, F., García-Santos, V., Verger, A., Lacaze, R., Weiss, M., Baret, F., and Smets, B.: Quality assessment of PROBA-V LAI, fAPAR and fCOVER collection 300 m products of copernicus global land service, Remote Sens., 12, 1017, https://doi.org/10.3390/rs12061017, 2020. a

Gebrechorkos, S. H., Leyland, J., Dadson, S. J., Cohen, S., Slater, L., Wortmann, M., Ashworth, P. J., Bennett, G. L., Boothroyd, R., Cloke, H., Delorme, P., Griffith, H., Hardy, R., Hawker, L., McLelland, S., Neal, J., Nicholas, A., Tatem, A. J., Vahidi, E., Liu, Y., Sheffield, J., Parsons, D. R., and Darby, S. E.: Global-scale evaluation of precipitation datasets for hydrological modelling, Hydrol. Earth Syst. Sci., 28, 3099–3118, https://doi.org/10.5194/hess-28-3099-2024, 2024. a, b, c, d, e

Gebremichael, M.: Framework for satellite rainfall product evaluation, in: Rainfall: State of the Science, edited by Testik, F. Y. and Gebremichael, M., Geophysical Monograph Series, American Geophysical Union, Washington, D. C., https://doi.org/10.1029/2010GM000974, 2010. a

Gericke, O. J. and Smithers, J. C.: Review of methods used to estimate catchment response time for the purpose of peak discharge estimation, Hydrol. Sci. J., 59, 1935–1971, https://doi.org/10.1080/02626667.2013.866712, 2014. a

Girotto, M., Formetta, G., Azimi, S., Bachand, C., Cowherd, M., De Lannoy, G., Lievens, H., Modanesi, S., Raleigh, M. S., Rigon, R., and Massari, C.: Identifying snowfall elevation patterns by assimilating satellite-based snow depth retrievals, Sci. Total Environ., 906, 167312, https://doi.org/10.1016/j.scitotenv.2023.167312, 2024a. a

Gochis, D. J., Nesbitt, S. W., Yu, W., and Williams, S. F.: Comparison of gauge-corrected versus non-gauge corrected satellite-based quantitative precipitation estimates during the 2004 NAME enhanced observing period, Atmósfera, 22, 69–98, 2009. a

Groisman, P. Y. and Legates, D. R.: The accuracy of United States precipitation data, B. Am. Meteorol. Soc., 75, 215–227, 1994. a, b

Gu, L., Yin, J., Wang, S., Chen, J., Qin, H., Yan, X., He, S., and Zhao, T.: How well do the multi-satellite and atmospheric reanalysis products perform in hydrological modelling, J. Hydrol., 617, 128920, https://doi.org/10.1016/j.jhydrol.2022.128920, 2023. a, b, c, d, e

Gupta, H. V., Kling, H., Yilmaz, K. K., and Martinez, G. F.: Decomposition of the mean squared error and NSE performance criteria: Implications for improving hydrological modelling, J. Hydrol., 377, 80–91, https://doi.org/10.1016/j.jhydrol.2009.08.003, 2009. a

Hao, Z., Jin, J., Xia, R., Tian, S., Yang, W., Liu, Q., Zhu, M., Ma, T., Jing, C., and Zhang, Y.: CCAM: China Catchment Attributes and Meteorology dataset, Earth Syst. Sci. Data, 13, 5591–5616, https://doi.org/10.5194/essd-13-5591-2021, 2021. a

Helgason, H. B. and Nijssen, B.: LamaH-Ice: LArge-SaMple DAta for Hydrology and Environmental Sciences for Iceland, Earth Syst. Sci. Data, 16, 2741–2771, https://doi.org/10.5194/essd-16-2741-2024, 2024. a

Hengl, T., Mendes de Jesus, J., Heuvelink, G. B., Ruiperez Gonzalez, M., Kilibarda, M., Blagotić, A., Shangguan, W., Wright, M. N., Geng, X., Bauer-Marschallinger, B., Guevara, M. A., Vargas, R., MacMillan, R. A., Batjes, N. H., Leenaars, J. G. B., Ribeiro, E., Wheeler, I., Mantel, S., and Kempen, B.: SoilGrids250m: Global gridded soil information based on machine learning, PLoS ONE, 12, e0169748, https://doi.org/10.1371/journal.pone.0169748, 2017. a

Herold, N., Alexander, L. V., Donat, M. G., Contractor, S., and Becker, A.: How much does it rain over land?, Geophys. Res. Lett., 43, 341–348, https://doi.org/10.1002/2015GL066615, 2016. a

Hersbach, H., Bell, B., Berrisford, P., Hirahara, S., Horanyi, A., Muñoz-Sabater, J., Nicolas, J., Peubey, C., Radu, R., Schepers, D., Simmons, A., Soci, C., Abdalla, S., Abellan, X., Balsamo, G., Bechtold, P., Biavati, G., Bidlot, J., Bonavita, M., Chiara, G. D., Dahlgren, P., Dee, D., Diamantakis, M., Dragani, R., Flemming, J., Forbes, R., Fuentes, M., Geer, A., Haimberger, L., Healy, S., Hogan, R. J., Holm, E., Janiskova, M., Keeley, S., Laloyaux, P., Lopez, P., Radnoti, G., de Rosnay, P., Rozum, I., Vamborg, F., Villaume, S., and Thépaut, J.-N.: The ERA5 global reanalysis, Q. J. Roy. Meteor. Soc., 146, 1999–2049, https://doi.org/10.1002/qj.3803, 2020. a, b, c, d, e, f

Hinge, G., Hamouda, M. A., Long, D., and Mohamed, M. M.: Hydrologic utility of satellite precipitation products in flood prediction: A meta-data analysis and lessons learnt, J. Hydrol., 612, 128103, https://doi.org/10.1016/j.jhydrol.2022.128103, 2022. a

Höge, M., Kauzlaric, M., Siber, R., Schönenberger, U., Horton, P., Schwanbeck, J., Floriancic, M. G., Viviroli, D., Wilhelm, S., Sikorska-Senoner, A. E., Addor, N., Brunner, M., Pool, S., Zappa, M., and Fenicia, F.: CAMELS-CH: hydro-meteorological time series and landscape attributes for 331 catchments in hydrologic Switzerland, Earth Syst. Sci. Data, 15, 5755–5784, https://doi.org/10.5194/essd-15-5755-2023, 2023. a

Hong, Y., Hsu, K.-L., Sorooshian, S., and Gao, X.: Precipitation Estimation from Remotely Sensed Imagery Using an Artificial Neural Network Cloud Classification System, J. Appl. Meteorol. Climatol., 43, 1834–1853, https://doi.org/10.1175/JAM2173.1, 2004. a, b

Huang, Y., Bárdossy, A., and Zhang, K.: Sensitivity of hydrological models to temporal and spatial resolutions of rainfall data, Hydrol. Earth Syst. Sci., 23, 2647–2663, https://doi.org/10.5194/hess-23-2647-2019, 2019. a

Huang, Z., Hejazi, M., Li, X., Tang, Q., Vernon, C., Leng, G., Liu, Y., Döll, P., Eisner, S., Gerten, D., Hanasaki, N., and Wada, Y.: Reconstruction of global gridded monthly sectoral water withdrawals for 1971–2010 and analysis of their spatiotemporal patterns, Hydrol. Earth Syst. Sci., 22, 2117–2133, https://doi.org/10.5194/hess-22-2117-2018, 2018. a

Huffman, G. J., Stocker, E. F., Bolvin, D. T., Nelkin, E. J., and Tan, J.: GPM IMERG final precipitation L3 half hourly 0.1 degree x 0.1 degree V06, Goddard Earth Sciences Data and Information Services Center (GES DISC): Greenbelt, MD, USA, https://doi.org/10.5067/GPM/IMERG/3B-HH/07, 2019a. a, b, c, d

Huffman, G. J., Adler, R. F., Behrangi, A., Bolvin, D. T., Nelkin, E. J., Gu, G., and Ehsani, M. R.: The new version 3.2 global precipitation climatology project (GPCP) monthly and daily precipitation products, J. Climate, 36, 7635–7655, https://doi.org/10.1175/JCLI-D-23-0123.1, 2023. a

Huscroft, J., Gleeson, T., Hartmann, J., and Börker, J.: Compiling and mapping global permeability of the unconsolidated and consolidated Earth: GLobal HYdrogeology MaPS 2.0 (GLHYMPS 2.0), Geophys. Res. Lett., 45, 1897–1904, https://doi.org/10.1002/2017GL075860, 2018. a

Jääskeläinen, E., Kouki, K., and Riihelä, A.: Detecting snowfall events over the Arctic using optical and microwave satellite measurements, Hydrol. Earth Syst. Sci., 28, 3855–3870, https://doi.org/10.5194/hess-28-3855-2024, 2024. a, b

Kidd, C., Becker, A., Huffman, G. J., Muller, C. L., Joe, P., Skofronick-Jackson, G., and Kirschbaum, D. B.: So, how much of the Earth's surface is covered by rain gauges?, B. Am. Meteorol. Soc., 98, 69–78, https://doi.org/10.1175/BAMS-D-14-00283.1, 2017. a, b, c, d, e, f

Kling, H., Fuchs, M., and Paulin, M.: Runoff conditions in the upper Danube basin under an ensemble of climate change scenarios, J. Hydrol., 424–425, 264–277, https://doi.org/10.1016/j.jhydrol.2012.01.011, 2012. a

Klingler, C., Schulz, K., and Herrnegger, M.: LamaH-CE: LArge-SaMple DAta for Hydrology and Environmental Sciences for Central Europe, Earth Syst. Sci. Data, 13, 4529–4565, https://doi.org/10.5194/essd-13-4529-2021, 2021. a

Kosaka, Y., Kobayashi, S., Harada, Y., Kobayashi, C., Naoe, H., Koichi, Y., Harada, M., Goto, N., Chiba, J., Miyaoka, K., Sekiguchi, R., Deushi, M., Kamahori, H., Nakaegawa, T., Tanaka, T., Tokuhiro, T., Sato, Y., Matsushia, Y., and Onogi, K.: The JRA-3Q Reanalysis, J. Meteorol. Soc. Jpn. Ser. II, 102, https://doi.org/10.2151/jmsj.2024-004, 2024. a, b, c, d

Krabbenhoft, C. A., Allen, G. H., Lin, P., Godsey, S. E., Allen, D. C., Burrows, R. M., DelVecchia, A. G., Fritz, K. M., Shanafield, M., Burgin, A. J., Zimmer, M. A., Datry, T., Dodds, W. K., Jones, C. N., Mims, M. C., Franklin, C., Hammond, J. C., Zipper, S., Ward, A. S., Costigan, K. H., Beck, H. E., and Olden, J. D.: Assessing placement bias of the global river gauge network, Nature Sustainability, 5, 586–592, https://doi.org/10.1038/s41893-022-00873-0, 2022. a, b

Kubota, T., Aonashi, K., Ushio, T., Shige, S., Takayabu, Y. N., Kachi, M., Arai, Y., Tashima, T., Masaki, T., and Kawamoto, N.: Global Satellite Mapping of Precipitation (GSMaP) products in the GPM era, Satellite Precipitation Measurement: vol. 1, Springer, 355–373, https://doi.org/10.1007/978-3-030-24568-9_20, 2020. a

Kubota, T., Yamamoto, M. K., Ito, M., Tashima, T., Hirose, H., Ushio, T., Aonashi, K., Shige, S., Hamada, A., Yamaji, M., Yoshida, N., and Kachi, M.: Construction of a longer-term and more homogeneous GSMaP precipitation dataset, 355–373, Springer, https://doi.org/10.1007/978-3-030-24568-9_20, 2024. a

Legates, D. R. and Bogart, T. A.: Estimating the proportion of monthly precipitation that falls in solid form, J. Hydrometeorol., 10, 1299–1306, https://doi.org/10.1175/2009JHM1086.1, 2009. a

Lehner, B. and Döll, P.: Development and validation of a global database of lakes, reservoirs and wetlands, J. Hydrol., 296, 1–22, https://doi.org/10.1016/j.jhydrol.2004.03.028, 2004. a

Lehner, B., Reidy Liermann, C., Revenga, C., Vörösmarty, C., Fekete, B., Crouzet, P., Döll, P., Endejan, M., Frenken, K., Magome, J., Nilsson, C., Robertson, J. C., Rödel, R., Sindorf, N., and Wisser, D.: High resolution mapping of the world's reservoirs and dams for sustainable river flow management, Front. Ecol. Environ., 9, 494–502, https://doi.org/10.1890/100125, 2011. a, b, c

Li, Z., Chen, M., Gao, S., Hong, Z., Tang, G., Wen, Y., Gourley, J. J., and Hong, Y.: Cross-examination of similarity, difference and deficiency of gauge, radar and satellite precipitation measuring uncertainties for extreme events using conventional metrics and multiplicative triple collocation, Remote Sens., 12, 1258, https://doi.org/10.3390/rs12081258, 2020. a

Liang, M. and Gornish, E. S.: Rainfall regulation of grazed grasslands, P. Natl. Acad. Sci. USA, 116, 23887–23888, https://doi.org/10.1073/pnas.1915399116, 2019. a

Lin, J., Qian, T., Bechtold, P., Grell, G., Zhang, G. J., Zhu, P., Freitas, S. R., Barnes, H., and Han, J.: Atmospheric convection, Atmosphere-Ocean, 60, 422–476, https://doi.org/10.1080/07055900.2022.2082915, 2022. a

Liu, Z., Liu, Y., Wang, S., Yang, X., Wang, L., Baig, M. H. A., Chi, W., and Wang, Z.: Evaluation of spatial and temporal performances of ERA-Interim precipitation and temperature in Mainland China, J. Climate, 31, 4347–4365, https://doi.org/10.1175/JCLI-D-17-0212.1, 2018. a

Loritz, R., Dolich, A., Acuña Espinoza, E., Ebeling, P., Guse, B., Götte, J., Hassler, S. K., Hauffe, C., Heidbüchel, I., Kiesel, J., Mälicke, M., Müller-Thomy, H., Stölzle, M., and Tarasova, L.: CAMELS-DE: hydro-meteorological time series and attributes for 1582 catchments in Germany, Earth Syst. Sci. Data, 16, 5625–5642, https://doi.org/10.5194/essd-16-5625-2024, 2024. a

Lussana, C., Saloranta, T., Skaugen, T., Magnusson, J., Tveito, O. E., and Andersen, J.: seNorge2 daily precipitation, an observational gridded dataset over Norway from 1957 to the present day, Earth Syst. Sci. Data, 10, 235–249, https://doi.org/10.5194/essd-10-235-2018, 2018. a

Maggioni, V., Meyers, P. C., and Robinson, M. D.: A review of merged high resolution satellite precipitation product accuracy during the Tropical Rainfall Measuring Mission (TRMM)-era, J. Hydrometeorol., 17, 1101–1117, https://doi.org/10.1175/JHM-D-15-0190.1, 2016. a

Maisongrande, P., Duchemin, B., and Dedieu, G.: VEGETATION/SPOT: an operational mission for the Earth monitoring; presentation of new standard products, Int. J. Remote Sens., 25, 9–14, https://doi.org/10.1080/0143116031000115265, 2004. a

Mangukiya, N. K., Kumar, K. B., Dey, P., Sharma, S., Bejagam, V., Mujumdar, P. P., and Sharma, A.: CAMELS-IND: hydrometeorological time series and catchment attributes for 228 catchments in Peninsular India, Earth Syst. Sci. Data, 17, 461–491, https://doi.org/10.5194/essd-17-461-2025, 2025. a, b

Massari, C.: GPM + SM2RAIN (2007-2018): quasi-global 25km/daily rainfall product from the integration of GPM and SM2RAIN-based rainfall products (0.1.0), Zenodo [data set], https://doi.org/10.5281/zenodo.3854817, 2020. a

Massari, C., Brocca, L., Pellarin, T., Abramowitz, G., Filippucci, P., Ciabatta, L., Maggioni, V., Kerr, Y., and Fernandez Prieto, D.: A daily 25 km short-latency rainfall product for data-scarce regions based on the integration of the Global Precipitation Measurement mission rainfall and multiple-satellite soil moisture products, Hydrol. Earth Syst. Sci., 24, 2687–2710, https://doi.org/10.5194/hess-24-2687-2020, 2020. a, b

Mazzoleni, M., Brandimarte, L., and Amaranto, A.: Evaluating precipitation datasets for large-scale distributed hydrological modelling, J. Hydrol., 578, 124076, https://doi.org/10.1016/j.jhydrol.2019.124076, 2019. a

McKinnon, K. A. and Deser, C.: The inherent uncertainty of precipitation variability, trends, and extremes due to internal variability, with implications for Western US water resources, J. Climate, 34, 9605–9622, https://doi.org/10.1175/JCLI-D-21-0251.1, 2021. a

Ménégoz, M., Gallée, H., and Jacobi, H. W.: Precipitation and snow cover in the Himalaya: from reanalysis to regional climate simulations, Hydrol. Earth Syst. Sci., 17, 3921–3936, https://doi.org/10.5194/hess-17-3921-2013, 2013. a

Menne, M. J., Durre, I., Vose, R. S., Gleason, B. E., and Houston, T. G.: An overview of the global historical climatology network-daily database, J. Atmos. Ocean. Tech., 29, 897–910, https://doi.org/10.1175/JTECH-D-11-00103.1, 2012. a, b

Monteith, J. L.: Evaporation and environment, in: Symposia of the society for experimental biology, Cambridge University Press (CUP) Cambridge, 19, 205–234, 1965. a

Müller Schmied, H., Eisner, S., Franz, D., Wattenbach, M., Portmann, F. T., Flörke, M., and Döll, P.: Sensitivity of simulated global-scale freshwater fluxes and storages to input data, hydrological model structure, human water use and calibration, Hydrol. Earth Syst. Sci., 18, 3511–3538, https://doi.org/10.5194/hess-18-3511-2014, 2014. a

NCEP: Global Data Assimilation System (GDAS), https://www.ncei.noaa.gov/products/weather-climate-models/global-data-assimilation (last access: November 2024), 2024. a

Nguyen, P., Ombadi, M., Gorooh, V. A., Shearer, E. J., Sadeghi, M., Sorooshian, S., Hsu, K., Bolvin, D., and Ralph, M. F.: PERSIANN Dynamic Infrared–Rain Rate (PDIR-Now): A Near-Real-Time, Quasi-Global Satellite Precipitation Dataset, J. Hydrometeorol., 21, 2893–2906, https://doi.org/10.1175/JHM-D-20-0177.1, 2020. a, b, c, d, e

Ouyang, W., Lawson, K., Feng, D., Ye, L., Zhang, C., and Shen, C.: Continental-scale streamflow modeling of basins with reservoirs: Towards a coherent deep-learning-based strategy, J. Hydrol., 599, 126455, https://doi.org/10.1016/j.jhydrol.2021.126455, 2021. a

Penman, H. L.: Natural evaporation from open water, bare soil and grass, P. Roy. Soc. A, 193, 120–146, https://doi.org/10.1098/rspa.1948.0037, 1948. a

Peters, K., Hohenegger, C., and Klocke, D.: Different representation of mesoscale convective systems in convection-permitting and convection-parameterizing NWP models and its implications for large-scale forecast evolution, Atmosphere, 10, 503, https://doi.org/10.3390/atmos10090503, 2019. a

Pontoppidan, M., Reuder, J., Mayer, S., and Kolstad, E. W.: Downscaling an intense precipitation event in complex terrain: the importance of high grid resolution, Tellus, 69, 1271561, https://doi.org/10.1080/16000870.2016.1271561, 2017. a

Pradhan, A. and Indu, J.: Assessment of SM2RAIN derived and IMERG based precipitation products for hydrological simulation, J. Hydrol., 603, 127191, https://doi.org/10.1016/j.jhydrol.2021.127191, 2021. a

Prein, A. F. and Gobiet, A.: Impacts of uncertainties in European gridded precipitation observations on regional climate analysis, Int. J. Climatol., 37, 305–327, https://doi.org/10.1002/joc.4706, 2017. a

Puy, A., Lankford, B., Meier, J., Van Der Kooij, S., and Saltelli, A.: Large variations in global irrigation withdrawals caused by uncertain irrigation efficiencies, Environ. Res. Lett., 17, 044014, https://doi.org/10.1088/1748-9326/ac5768, 2022. a

Rasmussen, R. M., Baker, B., Kochendorfer, J., Meyers, T., Landolt, S., Fischer, A. P., Black, J., Thériault, J. M., Kucera, P., Gochis, D., Smith, C., Nitu, R., Hall, M., Ikeda, K., and Gutmann, E.: How well are we measuring snow: The NOAA/FAA/NCAR winter precipitation test bed, B. Am. Meteorol. Soc., 93, 811–829, https://doi.org/10.1175/BAMS-D-11-00052.1, 2012. a, b

Rodell, M., Velicogna, I., and Famiglietti, J. S.: Satellite-based estimates of groundwater depletion in India, Nature, 460, 999–1002, https://doi.org/10.1038/nature08238, 2009. a

Sadeghi, L., Saghafian, B., and Moazami, S.: Evaluation of IMERG and MRMS remotely sensed snowfall products, Int. J. Remote Sens., 40, 4175–4192, https://doi.org/10.1080/01431161.2018.1562259, 2019. a

Sadeghi, M., Nguyen, P., Naeini, M. R., Hsu, K., Braithwaite, D., and Sorooshian, S.: PERSIANN-CCS-CDR, a 3-hourly 0.04° global precipitation climate data record for heavy precipitation studies, Sci. Data, 8, 157, https://doi.org/10.1038/s41597-021-00940-9, 2021. a

Schneider, U., Becker, A., Finger, P., Meyer-Christoffer, A., Ziese, M., and Rudolf, B.: GPCC's new land surface precipitation climatology based on quality-controlled in situ data and its role in quantifying the global water cycle, Theor. Appl. Climatol., 115, 15–40, https://doi.org/10.1007/s00704-013-0860-x, 2014. a

Schumacher, V., Fernández, A., Justino, F., and Comin, A.: WRF High Resolution Dynamical Downscaling of Precipitation for the Central Andes of Chile and Argentina, Front. Earth Sci., 8, https://doi.org/10.3389/feart.2020.00328, 2020. a

Seibert, J. and Bergström, S.: A retrospective on hydrological catchment modelling based on half a century with the HBV model, Hydrol. Earth Syst. Sci., 26, 1371–1388, https://doi.org/10.5194/hess-26-1371-2022, 2022. a, b

Seibert, J. and Vis, M. J. P.: Teaching hydrological modeling with a user-friendly catchment-runoff-model software package, Hydrol. Earth Syst. Sci., 16, 3315–3325, https://doi.org/10.5194/hess-16-3315-2012, 2012. a, b

Sevruk, B., Ondrás, M., and Chvíla, B.: The WMO precipitation measurement intercomparisons, Atmos. Res., 92, 376–380, https://doi.org/10.1016/j.atmosres.2009.01.016, 2009. a, b

Sharma, S., Chen, Y., Zhou, X., Yang, K., Li, X., Niu, X., Hu, X., and Khadka, N.: Evaluation of GPM-Era Satellite Precipitation Products on the Southern Slopes of the Central Himalayas Against Rain Gauge Data, Remote Sens., 12, 1836, https://doi.org/10.3390/rs12111836, 2020a. a

Sharma, S., Khadka, N., Hamal, K., Shrestha, D., Talchabhadel, R., and Chen, Y.: How accurately can satellite products (TMPA and IMERG) detect precipitation patterns, extremities, and drought across the Nepalese Himalaya?, Earth Space Sci., 7, e2020EA001315, https://doi.org/10.1029/2020EA001315, 2020b. a

Shen, Y., Hong, Z., Pan, Y., Yu, J., and Maguire, L.: China’s 1 km Merged Gauge, Radar and Satellite Experimental Precipitation Dataset, Remote Sens., 10, 264, https://doi.org/10.3390/rs10020264, 2018. a

Siebert, S., Döll, P., Hoogeveen, J., Faures, J.-M., Frenken, K., and Feick, S.: Development and validation of the global map of irrigation areas, Hydrol. Earth Syst. Sci., 9, 535–547, https://doi.org/10.5194/hess-9-535-2005, 2005. a

Siebert, S., Henrich, V., Frenken, K., and Burke, J.: Update of the digital global map of irrigation areas to version 5, Rheinische Friedrich-Wilhelms-Universität, Bonn, Germany and Food and Agriculture Organization of the United Nations, Rome, Italy, https://doi.org/10.13140/2.1.2660.6728, 2013. a

Skamarock, W. C.: Evaluating mesoscale NWP models using kinetic energy spectra, Mon. Weather Rev., 132, 3019–3032, https://doi.org/10.1175/MWR2830.1, 2004. a

Song, Y., Broxton, P. D., Ehsani, M. R., and Behrangi, A.: Assessment of snowfall accumulation from satellite and reanalysis products using SNOTEL observations in Alaska, Remote Sens., 13, 2922, https://doi.org/10.3390/rs13152922, 2021. a

Su, F., Hong, Y., and Lettenmaier, D. P.: Evaluation of TRMM Multisatellite Precipitation Analysis (TMPA) and its utility in hydrologic prediction in the La Plata Basin, J. Hydrometeorol., 9, 622–640, https://doi.org/10.1175/2007JHM944.1, 2008. a

Sun, Q., Miao, C., Duan, Q., Ashouri, H., Sorooshian, S., and Hsu, K.-L.: A review of global precipitation datasets: data sources, estimation, and intercomparisons, Rev. Geophys., 56, 79–107, https://doi.org/10.1002/2017RG000574, 2018. a, b, c, d

Tang, G., Ma, Y., Long, D., Zhong, L., and Hong, Y.: Evaluation of GPM Day-1 IMERG and TMPA Version-7 legacy products over Mainland China at multiple spatiotemporal scales, J. Hydrol., 533, 152–167, https://doi.org/10.1016/j.jhydrol.2015.12.008, 2016. a

Tarek, M., Brissette, F. P., and Arsenault, R.: Evaluation of the ERA5 reanalysis as a potential reference dataset for hydrological modelling over North America, Hydrol. Earth Syst. Sci., 24, 2527–2544, https://doi.org/10.5194/hess-24-2527-2020, 2020. a

Tarek, M., Brissette, F., and Arsenault, R.: Uncertainty of gridded precipitation and temperature reference datasets in climate change impact studies, Hydrol. Earth Syst. Sci., 25, 3331–3350, https://doi.org/10.5194/hess-25-3331-2021, 2021. a

Tramblay, Y., Rouché, N., Paturel, J.-E., Mahé, G., Boyer, J.-F., Amoussou, E., Bodian, A., Dacosta, H., Dakhlaoui, H., Dezetter, A., Hughes, D., Hanich, L., Peugeot, C., Tshimanga, R., and Lachassagne, P.: ADHI: the African Database of Hydrometric Indices (1950–2018), Earth Syst. Sci. Data, 13, 1547–1560, https://doi.org/10.5194/essd-13-1547-2021, 2021. a, b

Voisin, N., Wood, A. W., and Lettenmaier, D. P.: Evaluation of precipitation products for global hydrological prediction, J. Hydrometeorol., 9, 388–407, https://doi.org/10.1175/2007JHM938.1, 2008.

Wada, Y., Bierkens, M. F. P., de Roo, A., Dirmeyer, P. A., Famiglietti, J. S., Hanasaki, N., Konar, M., Liu, J., Müller Schmied, H., Oki, T., Pokhrel, Y., Sivapalan, M., Troy, T. J., van Dijk, A. I. J. M., van Emmerik, T., Van Huijgevoort, M. H. J., Van Lanen, H. A. J., Vörösmarty, C. J., Wanders, N., and Wheater, H.: Human–water interface in hydrological modelling: current status and future directions, Hydrol. Earth Syst. Sci., 21, 4169–4193, https://doi.org/10.5194/hess-21-4169-2017, 2017.

Wang, J., Walter, B. A., Yao, F., Song, C., Ding, M., Maroof, A. S., Zhu, J., Fan, C., McAlister, J. M., Sikder, S., Sheng, Y., Allen, G. H., Crétaux, J.-F., and Wada, Y.: GeoDAR: georeferenced global dams and reservoirs dataset for bridging attributes and geolocations, Earth Syst. Sci. Data, 14, 1869–1899, https://doi.org/10.5194/essd-14-1869-2022, 2022.

Wang, Y., You, Y., and Kulie, M.: Global Virga Precipitation Distribution Derived From Three Spaceborne Radars and Its Contribution to the False Radiometer Precipitation Detection, Geophys. Res. Lett., 45, 4446–4455, https://doi.org/10.1029/2018GL077891, 2018.

Williams, G. P.: Friends don't let friends use Nash-Sutcliffe Efficiency (NSE) or KGE for hydrologic model accuracy evaluation: A rant with data and suggestions for better practice, Environ. Model. Softw., 194, 106665, https://doi.org/10.1016/j.envsoft.2025.106665, 2025.

Xia, Y., Mitchell, K., Ek, M., Sheffield, J., Cosgrove, B., Wood, E., Luo, L., Alonge, C., Wei, H., Meng, J., Livneh, B., Lettenmaier, D., Koren, V., Duan, Q., Mo, K., Fan, Y., and Mocko, D.: Continental-scale water and energy flux analysis and validation for the North American Land Data Assimilation System project phase 2 (NLDAS-2): 1. Intercomparison and application of model products, J. Geophys. Res.-Atmos., 117, https://doi.org/10.1029/2011JD016048, 2012.

Xiang, Y., Chen, J., Li, L., Peng, T., and Yin, Z.: Evaluation of eight global precipitation datasets in hydrological modeling, Remote Sens., 13, 2831, https://doi.org/10.3390/rs13142831, 2021.

Xie, P., Joyce, R., Wu, S., Yoo, S.-H., Yarosh, Y., Sun, F., and Lin, R.: Reprocessed, Bias-Corrected CMORPH Global High-Resolution Precipitation Estimates from 1998, J. Hydrometeorol., 18, 1617–1641, https://doi.org/10.1175/JHM-D-16-0168.1, 2017.

Xie, P., Joyce, R., Wu, S., Yoo, S.-H., Yarosh, Y., Sun, F., Lin, R., and NOAA CDR Program: NOAA Climate Data Record (CDR) of CPC Morphing Technique (CMORPH) High Resolution Global Precipitation Estimates, Version 1, NOAA National Centers for Environmental Information [data set], https://doi.org/10.25921/w9va-q159 (last access: 1 July 2025), 2019.

Yamamoto, M. K., Shige, S., Yu, C.-K., and Cheng, L.-W.: Further improvement of the heavy orographic rainfall retrievals in the GSMaP algorithm for microwave radiometers, J. Appl. Meteorol. Climatol., 56, 2607–2619, https://doi.org/10.1175/JAMC-D-16-0332.1, 2017.

Yang, S., Jones, P. D., Jiang, H., and Zhou, Z.: Development of a near-real-time global in situ daily precipitation dataset for 0000–0000 UTC, Int. J. Climatol., 40, 2795–2810, https://doi.org/10.1002/joc.6367, 2020.

Yano, J.-I., Ziemiański, M. Z., Cullen, M., Termonia, P., Onvlee, J., Bengtsson, L., Carrassi, A., Davy, R., Deluca, A., Gray, S. L., Homar, V., Koehler, M., Krichak, S., Michaelides, S., Phillips, V. T. J., Soares, P. M. M., and Wyszogrodzki, A. A.: Scientific challenges of convective-scale numerical weather prediction, B. Am. Meteorol. Soc., 99, 699–710, https://doi.org/10.1175/BAMS-D-17-0125.1, 2018.

Yatagai, A., Kamiguchi, K., Arakawa, O., Hamada, A., Yasutomi, N., and Kitoh, A.: APHRODITE: constructing a long-term daily gridded precipitation dataset for Asia based on a dense network of rain gauges, B. Am. Meteorol. Soc., 93, 1401–1415, https://doi.org/10.1175/BAMS-D-11-00122.1, 2012.

Yates, E., Anquetin, S., Ducrocq, V., Creutin, J.-D., Ricard, D., and Chancibault, K.: Point and areal validation of forecast precipitation fields, Meteorol. Appl., 13, 1–20, https://doi.org/10.1017/S1350482705001921, 2006.

Yin, Z., Lin, P., Riggs, R., Allen, G. H., Lei, X., Zheng, Z., and Cai, S.: A synthesis of Global Streamflow Characteristics, Hydrometeorology, and Catchment Attributes (GSHA) for large sample river-centric studies, Earth Syst. Sci. Data, 16, 1559–1587, https://doi.org/10.5194/essd-16-1559-2024, 2024.

You, Y., Peters-Lidard, C., Ringerud, S., and Haynes, J. M.: Evaluation of rainfall-snowfall separation performance in remote sensing datasets, Geophys. Res. Lett., 48, e2021GL094180, https://doi.org/10.1029/2021GL094180, 2021.

Zhang, A. T. and Gu, V. X.: Global Dam Tracker: A database of more than 35,000 dams with location, catchment, and attribute information, Sci. Data, 10, 111, https://doi.org/10.1038/s41597-023-02008-2, 2023.

Zomer, R. J., Trabucco, A., Bossio, D. A., and Verchot, L. V.: Climate change mitigation: A spatial analysis of global land suitability for clean development mechanism afforestation and reforestation, Agr. Ecosyst. Environ., 126, 67–80, https://doi.org/10.1016/j.agee.2008.01.014, 2008.

Zsótér, E., Cloke, H., Prudhomme, C., Harrigan, S., de Rosnay, P., Muñoz-Sabater, J., and Stephens, E.: Trends in the GloFAS-ERA5 river discharge reanalysis, ECMWF Technical Memoranda, https://doi.org/10.21957/p9jrh0xp, 2020.

Short summary

Our study evaluated 24 precipitation datasets using a hydrological model at global scale to assess their suitability and accuracy. We found that MSWEP (Multi-Source Weighted-Ensemble Precipitation) V2.8 performed best overall, owing to its merging of data from multiple sources, while others, such as IMERG (Integrated Multi-satellitE Retrievals for GPM, the Global Precipitation Measurement mission) and GDAS (Global Data Assimilation System), demonstrated strong regional performance. This research assists in selecting an appropriate dataset for applications in water resource management, hazard assessment, agriculture, and environmental monitoring.