Satellite-driven downscaling of global reanalysis precipitation products for hydrological applications



Introduction
Flooding is one of the costliest natural hazards, occurring repeatedly around the globe (e.g., Sampson et al., 2014; Hagen and Lu, 2011). Flood vulnerability analysis provides essential information to support decisions for policy and preparedness against catastrophic flood consequences and for quantifying risk for coping with this hazard (Sampson et al., 2014). However, flood frequency maps are not available for most regions around the world (Hagen and Lu, 2011) due to limited economic resources to support long-term observations; this results in lack of knowledge and data (e.g., ground-based rain gauge measurements). Developing global-scale flood maps (Porter and Demeritt, 2012) is of increasing interest in the scientific community with great applicability in the (re)insurance industry. Global gridded precipitation data sets from satellites and reanalysis data sets derived from data assimilation systems are two main sources for deriving global flood hazard maps (Cloke et al., 2013; Kappes et al., 2012).
Characterizing the uncertainty in existing global gridded precipitation products is vital for hydrological applications. Syed et al. (2004) showed that rainfall is responsible for nearly 70-80 % of the variability in the land surface hydrology. Therefore, precipitation uncertainty would critically affect the predicted variability in hydrologic simulations. Several validation studies have investigated uncertainties related to satellite rainfall remote sensing over diverse geographic and hydroclimatic regimes (Adler et al., 2001; AghaKouchak et al., 2009; Brown, 2006; Dinku et al., 2007; Ebert et al., 2007; Krajewski et al., 2000; McCollum et al., 2002; Seyyedi et al., 2014a; Stampoulis et al., 2013; Su et al., 2008; Tang et al., 2010). These studies have shown that the precision of satellite rainfall products depends on precipitation type (e.g., deep convection vs. shallow convection), as well as terrain and climatological factors (AghaKouchak et al., 2011; Demaria et al., 2011; Turk and Miller, 2005; Seyyedi et al., 2014a). Gottschalck et al. (2005) evaluated precipitation products from global models, satellite and radar data against ground-based gauge measurements over the continental United States (CONUS) for a period of 14 months. They demonstrated that some of the reanalysis precipitation products (ECMWF, GEOS, and GDAS) can generally perform better than satellite precipitation data sets (TRMM3B42RT and PERSIANN). Peña-Arancibia et al. (2013) assessed daily detection and accuracy metrics for reanalysis and satellite precipitation data sets against gauge data. They argued that no single product demonstrated superior performance over the others; e.g., ERA-Interim is better in southern and northern Australia, JRA-25 performs better in southern and eastern Asia, and TRMM3B42 and CMORPH are better during monsoon periods. Therefore, combined use of different data sets (including satellite and reanalysis) is expected to perform better than any single product, especially for hydrological applications.
Substantial efforts have been devoted to assessing the feasibility of utilizing global-scale precipitation data sets derived from satellite or models on land surface hydrological modeling (Behrangi et al., 2011; Beighley et al., 2011; Hong et al., 2006, 2007; Hossain and Anagnostou, 2004, 2005; Nijssen and Lettenmaier, 2004; Su et al., 2008; Bitew and Gebremichael, 2011; Wu et al., 2014). Some of these studies have highlighted the effect of product resolution (Gourley et al., 2011) and catchment size (Vergara et al., 2013) on the precipitation error propagation in hydrological simulations. Seyyedi et al. (2014b) recently utilized gridded precipitation data sets from the TRMM3B42V7 (25 km, 3 h) and GLDAS reanalysis (100 km, 3 h) to conduct a more in-depth assessment of the effect of resolution and data type (satellite vs. reanalysis product) on streamflow simulations at subdaily scale. The study was based on a multiyear (2002-2011) and multiscale approach considering 1006 sub-basins (36 to 71 000 km²) of the Susquehanna River basin in the northeastern United States. They demonstrated that statistical scores in both rainfall and runoff simulations improve with increasing basin size. However, the satellite data set (TRMM3B42V7) was shown to perform significantly better than the reanalysis (GLDAS) in the simulated runoff values. The mean relative error in runoff simulations based on GLDAS was up to 7 times higher than that of TRMM3B42V7, which was attributed to the product resolution and associated underestimation of heavy precipitation. Results from that study suggest the use of downscaling and error correction for the GLDAS reanalysis precipitation data set before implementing it for runoff simulations. Bastola and Misra (2014) also evaluated two reanalysis precipitation data sets (ERA-40 and NCEP-R2) for hydrologic simulations over 28 small to midsize basins in the southeastern United States. Their results demonstrated that ERA-40 tends to underestimate while NCEP-R2 tends to overestimate relative to the reference data. They also concluded that downscaling the reanalysis precipitation products would significantly increase their performance in terms of runoff simulations.
The critical role of high-resolution gridded rainfall data sets for hydrological simulations has led to the development of several rainfall disaggregation algorithms (e.g., Brussolo et al., 2008; Ferraris et al., 2003; Fowler et al., 2007; Frei et al., 2006; Maraun et al., 2010; Ning et al., 2011; Park, 2013; Rahman et al., 2009; Ramírez et al., 2006; Tao and Barros, 2010). The main assumption for some recently developed downscaling methods for satellite-based products is the relationship between spatial variability of rainfall and environmental factors such as topography and land surface conditions. Immerzeel et al. (2009)
The reason for using satellite data sets is that a great deal of effort has been devoted to improving the accuracy and resolution of satellite retrievals, which is paired with the recent advent of satellite missions on precipitation (Hou et al., 2014). Moreover, satellite products are globally available, which leads to a globally consistent downscaling scheme for reanalysis products that can be particularly useful over areas lacking long-term ground-based observations. This study is motivated by the challenges relating to precipitation applications due to the nonlinear error propagation from rainfall to hydrological simulations and the vital need for high-resolution and long-term gridded rainfall data for deriving flood frequency statistics and corresponding flood hazard maps. Specifically, we examine the hydrologic impact of using the higher resolution and accuracy quasi-global satellite precipitation product from TRMM3B42V7 to derive finer scale and error-corrected precipitation maps from the GLDAS reanalysis product. The methodology developed for the satellite-driven error correction and downscaling of GLDAS rainfall data is based on a stochastic error model which was originally developed for modeling the satellite retrieval uncertainty and its error propagation in hydrological applications (Hossain and Anagnostou, 2004; Maggioni et al., 2012, 2013). The methodology is independent of ground-based measurements, which makes it applicable over data-
This paper is organized into six sections. After the introduction, the study area and data sets are described, including the model used for hydrological simulations. The third section introduces the downscaling and error correction scheme, including the experiment setup and parameter calibration. The fourth section presents the error analysis methodology. The fifth section describes the results of the error analysis in rainfall and simulated runoff values. Finally, the conclusions section discusses the main findings of this research and provides recommendations for future studies.

Study area, data sets, and models
The study area is the Susquehanna River basin (39 to 43° N and 75 to 79° W, Fig. 1), which is the largest basin in the eastern United States. The highest peak (949 m above sea level) is in the northwestern corner and the lowest point (22 m below sea level) is in the southeastern corner with a general elevation gradient from north to southeast. The total area of the Susquehanna River basin is 71 000 km², of which 76 % is in Pennsylvania, 23 % in New York, and 1 % in Maryland. The Susquehanna River basin is subject to major floods occurring once every 14 years with an average annual flood damage on the order of USD 150 million (Susquehanna River Basin Commission, http://www.srbc.net/). Cumulating the drainage areas along the river network at the outlet of each individual catchment provides 373 unique watersheds with drainage areas ranging from 315 to 71 000 km². The identified sub-basins were divided into five basin size categories (see Table 1) to study the effect of basin scale on the precipitation and runoff simulation error.
The study focuses on 437 flood-inducing rainfall events that occurred between 2002 and 2011. To investigate the effect of seasonality, the events were grouped by season. The number of events per season is reported in Table 2. Sixty percent of the events in each season were used for the downscaling model calibration, and the remaining 40 % were kept for determining error statistics (results presented in this study).
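The 60/40 calibration and validation split within each season can be sketched as follows. This is only an illustration under stated assumptions: the function name `split_events_by_season`, the `(event_id, season)` tuple layout, and the fixed random seed are hypothetical, and the study does not describe how its random selection was implemented.

```python
import random
from collections import defaultdict


def split_events_by_season(events, calib_fraction=0.6, seed=0):
    """Randomly split events into calibration and validation sets per season.

    `events` is a list of (event_id, season) tuples (hypothetical layout).
    Returns two dicts mapping season -> sorted list of event ids.
    """
    rng = random.Random(seed)
    by_season = defaultdict(list)
    for event_id, season in events:
        by_season[season].append(event_id)

    calibration, validation = {}, {}
    for season, ids in by_season.items():
        shuffled = ids[:]
        rng.shuffle(shuffled)
        # Number of events assigned to calibration in this season.
        n_calib = round(calib_fraction * len(shuffled))
        calibration[season] = sorted(shuffled[:n_calib])
        validation[season] = sorted(shuffled[n_calib:])
    return calibration, validation


# Synthetic event list: 10 made-up events per season.
seasons = ("spring", "summer", "fall", "winter")
events = [(f"{season}-{i}", season) for season in seasons for i in range(10)]
calibration, validation = split_events_by_season(events)
```

Splitting inside each season, rather than over the pooled events, keeps the calibration share at 60 % for every season even when the seasons contain different numbers of events.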
Figure 2 shows the cumulative probabilities (CDF) of the events randomly selected for inclusion in the calibration and validation data sets per season. The probability distributions of calibration and validation rainfall rates are very close to each other, which indicates that the calibration and validation periods have similar statistical properties in terms of rainfall rates. It is noted that the study is based on time series of catchment-average precipitation values from each data set. The catchment average is the weighted average of all the data set's pixel values contained within a catchment's boundary, where the weights are based on the fraction of the catchment covered by each pixel.
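The catchment-average definition above amounts to an area-weighted mean. A minimal sketch, assuming the coverage fractions have already been computed from the catchment boundary and the product grid (the function name and sample numbers are illustrative only):

```python
import numpy as np


def catchment_average(pixel_values, coverage_fractions):
    """Weighted average of pixel values within a catchment.

    `coverage_fractions[i]` is the fraction of the catchment area covered by
    pixel i. The weights are renormalized, so they need not sum exactly to 1.
    """
    values = np.asarray(pixel_values, dtype=float)
    weights = np.asarray(coverage_fractions, dtype=float)
    if values.shape != weights.shape:
        raise ValueError("values and weights must have the same shape")
    total = weights.sum()
    if total <= 0:
        raise ValueError("coverage fractions must sum to a positive value")
    return float(np.sum(values * weights) / total)


# Three pixels (mm/h) covering 50 %, 30 %, and 20 % of a made-up catchment.
average = catchment_average([2.0, 4.0, 6.0], [0.5, 0.3, 0.2])  # about 3.4 mm/h
```

For a coarse product such as the 100 km GLDAS grid, a small catchment may fall inside a single pixel, in which case the catchment average is simply that pixel's value.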

Stage IV radar data
The radar-based NCEP stage IV precipitation data (Lopez, 2011)

TRMM3B42V7
TRMM3B42V7 is a combined microwave-infrared precipitation product (Huffman et al., 2007) (Kidd et al., 2011). The TRMM3B42V7 combination scheme is based on the Goddard profiling (GPROF) algorithm (Kidd et al., 2011; Kummerow et al., 2001; Kummerow et al., 1996; Olson et al., 1999; Wang et al., 2009; Gopalan et al., 2010) for rainfall estimation from passive microwave (PMW) imagers (TMI, SSM/I, and AMSR). The PMW-calibrated infrared (IR) precipitation products (Janowiak et al., 2001) from Geosynchronous Earth Orbit (GEO) satellites are used to fill in the PMW gaps. Specifically, the algorithm takes the value of the PMW-calibrated IR precipitation products when the PMW is not available in a 3-hourly time step. The algorithm uses monthly ground precipitation gauge data extending between 50° N and 50° S for bias removal and calibration.
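The gap-filling rule described above (use the PMW estimate where it exists, otherwise the PMW-calibrated IR estimate) can be illustrated with a few lines of NumPy. This is a conceptual sketch, not the operational TRMM3B42V7 code; representing missing PMW data as NaN and the function name `fill_pmw_gaps` are assumptions made for the example.

```python
import numpy as np


def fill_pmw_gaps(pmw_rain, ir_rain):
    """Combine PMW and IR rain estimates for one 3-hourly time step.

    Missing PMW pixels are assumed to be marked as NaN; those pixels take
    the PMW-calibrated IR value, all other pixels keep the PMW value.
    """
    pmw = np.asarray(pmw_rain, dtype=float)
    ir = np.asarray(ir_rain, dtype=float)
    return np.where(np.isnan(pmw), ir, pmw)


# Made-up rain rates (mm/h) for four pixels; two have no PMW overpass.
pmw = [1.2, np.nan, 0.0, np.nan]
ir = [0.9, 2.5, 0.4, 0.0]
combined = fill_pmw_gaps(pmw, ir)
```

Note that a PMW value of zero is a valid no-rain observation and is kept; only missing values are replaced.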

GLDAS
The reanalysis precipitation data set is from GLDAS and has 100 km spatial and 3-hourly temporal resolution. The reasons for selecting GLDAS are its global coverage, relatively high temporal resolution, and long data record (since 1979). The data are "observation based", coming from a combination of reanalysis data from the Global Data Assimilation System (GDAS) from the National Centers for Environmental Prediction (NCEP), NOAA Climate Prediction Center's CMAP (CPC Merged Analysis of Precipitation) precipitation (Xie and Arkin, 1997), and radiation data sets from the US Air Force's AGRicultural METeorological modeling system (AGRMET) (Rodell et al., 2004). GDAS assimilates global meteorological observations. CMAP consists of merged satellite-based IR and MW observations with rain gauge analysis. AGRMET radiation fields are satellite observation based. GLDAS therefore represents merged, spatially and temporally interpolated fields of GDAS, CMAP, and AGRMET fields.

Hydrologic model simulations
Hillslope River Routing (HRR) (Beighley et al., 2009, 2011) is the modeling framework used in this study. HRR integrates a water balance model for the vertical fluxes and a routing model for the horizontal fluxes of the surface and subsurface runoff and streamflow. For each model unit, the landscape is approximated as an open book with two planes draining laterally to a main channel. Water and energy balance is used to simulate the vertical fluxes and storages of water in and through the soil layers on each plane. Flow routing is then performed using variants of the kinematic wave method for both the surface and subsurface runoff from hillslopes and diffusion wave methodologies (i.e., Muskingum-Cunge) for channels. Seyyedi et al. (2014b) provided details about the model implementation in the Susquehanna River basin and reported model specifications, parameter calibration, and performance results. In addition to the base model parameters (e.g., vertical hydraulic conductivity, suction head, and soil depth), three parameters were calibrated in Seyyedi et al. (2014b) based on soil and land cover data: horizontal conductivity, Kh, for the subsurface routing; overland flow roughness, N, for surface routing; and Manning's roughness, n, for channel routing. These parameters are scale dependent in that they capture both the hydraulic features (river reach and hillslope lengths) defined for a given model unit as well as all sub-model unit features not represented at the defined model scale (e.g., all tributaries not explicitly represented in the defined river network). The calibration was performed by systematically adjusting the three parameters (Kh, N, n) to achieve zero mean error (ME, m³ s⁻¹) for the annual maximum peak discharges at nine streamflow gauging stations shown in Fig. 1. As reported in Seyyedi et al. (2014b), model performance after calibration includes zero mean error for the entire basin, while mean relative errors for individual gauges ranged between −16 and 23 %, and errors for individual events ranged between −62 and 224 %. The largest error is from the gauge draining one of the smallest basin areas (1155 km²) during Tropical Storm Lee in 2011, which caused significant flooding, especially in the northern Susquehanna River basin. Overall, 86 % of the errors are within ±50 %, and approximately half are within ±25 %.

Error correction and downscaling scheme
The stochastic space-time error model of Hossain and Anagnostou (2006), originally developed for satellite rainfall error modeling (hereafter named SREM2D), was adapted in this study to disaggregate and error-correct GLDAS precipitation data sets using reference data from the TRMM3B42V7 satellite precipitation product. Specifically, SREM2D was applied on the coarse (100 km) grid resolution GLDAS precipitation fields to generate 20-member ensembles of error-adjusted precipitation fields at 25 km grid resolution. Figure 3 illustrates the framework for the stochastic downscaling and error correction. First, SREM2D parameters were determined for each season using TRMM3B42V7 and GLDAS data from the calibration data sets of each season. Then SREM2D was applied to the GLDAS data during the validation period and evaluated against the reference stage IV gauge-adjusted radar-rainfall fields. Details about the SREM2D model are provided in Hossain and Anagnostou (2006), while below we describe the model calibration results for the different seasons.
SREM2D parameters calibrated in this study are (1) probability of rain detection (PODrain) (see Fig. 4e); (2) mean of the log-transformed multiplicative error, where the error is the multiplicative factor e = Rsensor/Rreference (this parameter is represented in 2-D spatial fields for each season in Fig. 4a-d); (3) missed mean rain rate; (4) probability of no-rain detection (PODno-rain); (5) correlation length for the retrieval error (CLrain-downscale); (6) correlation length for the successful delineation of rain (CLrain det); and (7) correlation length for the successful delineation of no rain (CLno-rain det). The calculated values for parameters 3 to 7 are presented in Table 3 for the selected calibration events in each season.
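To make the non-spatial parameters in this list concrete, the sketch below estimates them from paired sensor and reference rain rates. It is a simplified illustration and not the SREM2D calibration code: it returns a single PODrain value rather than a function of rain rate, omits the three correlation lengths, and assumes a natural logarithm and a zero rain/no-rain threshold. The function name and sample values are hypothetical.

```python
import numpy as np


def error_model_statistics(sensor, reference, rain_threshold=0.0):
    """Estimate detection and multiplicative-error statistics.

    In this study's setting the "sensor" would be GLDAS and the "reference"
    TRMM3B42V7. Assumes the sample contains hits, misses, and reference
    no-rain cases; otherwise the ratios below are undefined.
    """
    sensor = np.asarray(sensor, dtype=float)
    reference = np.asarray(reference, dtype=float)
    sensor_rain = sensor > rain_threshold
    ref_rain = reference > rain_threshold

    hits = sensor_rain & ref_rain               # both detect rain
    misses = ~sensor_rain & ref_rain            # reference rain missed by sensor
    correct_no_rain = ~sensor_rain & ~ref_rain  # both report no rain

    # Log of the multiplicative error e = Rsensor / Rreference (hits only).
    log_error = np.log(sensor[hits] / reference[hits])
    return {
        "pod_rain": hits.sum() / ref_rain.sum(),
        "pod_no_rain": correct_no_rain.sum() / (~ref_rain).sum(),
        "mean_log_error": float(log_error.mean()),
        "std_log_error": float(log_error.std()),
        "missed_mean_rain_rate": float(reference[misses].mean()),
    }


# Made-up rain rates (mm/h): the sensor halves the reference where it detects rain.
sensor = [0.5, 0.0, 2.0, 0.0, 1.0, 0.0]
reference = [1.0, 0.8, 4.0, 0.0, 2.0, 0.0]
stats = error_model_statistics(sensor, reference)
```

A negative `mean_log_error` means the sensor underestimates relative to the reference, which is how the sign of the 2-D fields in Fig. 4a-d is interpreted in the text.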
In terms of spatial patterns, the correlation lengths of rain detection, no-rain detection, and downscaled rain for all seasons are less than 83 km. The lower correlation length indicates lower dependence between variables in space. Regarding the random error, the range of standard deviation of logarithmic multiplicative errors is between 1.2 (fall) and 1.65 (winter). The values represent higher magnitude of variability of error between reference and sensor data in winter relative to the other seasons. The PODno-rain takes its maximum value during the summer season (0.98), while it drops to 0.85 for the winter season. The maximum mean rain rate of non-detected values is 0.82 for the summer; the corresponding value is 0.39 for the winter. Summer events are associated with higher rain rates, which results in higher non-detected rain rates from GLDAS.
The mean of the log-transformed multiplicative error for each season is presented in 2-D spatial fields (Fig. 4a-d). The negative mean logarithmic error indicates that GLDAS is underestimating relative to the TRMM3B42V7. As we see in Fig. 4a-d, GLDAS underestimates almost everywhere and for all seasons. The magnitude of underestimation in the summer is relatively higher than in the other seasons. The probability of rain detection is presented as a function of GLDAS rain rate (Fig. 4e). The summer events exhibit the highest values, whereas fall and spring have lower POD values.
Figure 5 presents the accumulated values based on all validation events for the different precipitation products and the 20-member SREM2D-generated ensembles of GLDAS downscaled precipitation, depicted by the shaded area in the plot. GLDAS accumulated rainfall is significantly lower than that of the other two precipitation data sets, especially in spring, fall, and winter seasons, while the SREM2D-generated ensemble envelopes encapsulate the TRMM3B42V7 and, in most cases, the ground-based reference accumulated rainfall well. This indicates that the disaggregated GLDAS precipitation data are in agreement with the TRMM3B42V7 and the corresponding ground-based radar rainfall data.
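One way to quantify the statement that the ensemble envelope encapsulates a reference accumulation is to compute the envelope of cumulative rainfall across members and the fraction of time steps at which the reference falls inside it. The sketch below assumes rain depths per time step (mm) and uses the member-wise minimum and maximum as the envelope; both the helper names and the sample numbers are illustrative, and the study itself presents this comparison graphically.

```python
import numpy as np


def accumulation_envelope(ensemble):
    """Lower and upper bounds of cumulative rainfall across ensemble members.

    `ensemble` has shape (members, time_steps) and holds rain depths (mm).
    """
    cumulative = np.cumsum(np.asarray(ensemble, dtype=float), axis=1)
    return cumulative.min(axis=0), cumulative.max(axis=0)


def fraction_inside(reference, lower, upper):
    """Fraction of time steps where the reference accumulation lies in the envelope."""
    ref_cumulative = np.cumsum(np.asarray(reference, dtype=float))
    inside = (ref_cumulative >= lower) & (ref_cumulative <= upper)
    return float(inside.mean())


# Made-up 3-member ensemble over three time steps (the study uses 20 members).
ensemble = [[1.0, 2.0, 1.0],
            [2.0, 2.0, 2.0],
            [0.0, 1.0, 3.0]]
reference = [1.0, 1.0, 3.0]
lower, upper = accumulation_envelope(ensemble)
coverage = fraction_inside(reference, lower, upper)
```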

Error analysis methodology
The error analysis devised in this study, aimed to demonstrate the degree of improvement due to downscaling, consists of three main hydrologic components (Fig. 6): reference simulation, observation simulation, and downscaled and error-corrected simulation. Reference simulation is based on generating runoff values through forcing HRR with the reference radar rainfall data. Observation simulation refers to forcing HRR with GLDAS or TRMM3B42V7 at the product resolution. Downscaled and error-corrected simulation refers to forcing HRR with the ensemble mean of the SREM2D-downscaled GLDAS precipitation fields. There are two error analysis steps associated with the three main components: the rainfall error analysis and simulated surface runoff error analysis. Each error analysis component consists of three statistical metrics: quantile-quantile (Q-Q) plots, mean scale quantile relative error (QRE), and the quantile root mean square of error relative to the mean of reference (QRMSE).
The Q-Q plots are used to compare basin-average quantile rainfall and runoff values from the various data sources (GLDAS at 100 km, TRMM3B42 at 25 km, mean GLDAS downscaling ensemble at 25 km) against the reference data source (radar at 4 km). The QRE is defined as the ratio of the sum of differences between sensor and reference values (precipitation or runoff) to the sum of reference values determined over the sub-basins for each quantile range:

$$\mathrm{QRE}_j = \frac{\sum_{i=1}^{n}\left(P^{s}_{\mathrm{Sensor},i} - P^{s}_{\mathrm{ref},i}\right)}{\sum_{i=1}^{n} P^{s}_{\mathrm{ref},i}}, \qquad (1)$$

where $P^{s}_{\mathrm{Sensor}}$ is the sensor "basin-averaged" precipitation/runoff value, $P^{s}_{\mathrm{ref}}$ is the reference "basin-averaged" precipitation/runoff value over the sub-basin, $t$ is the threshold value, based on the reference data quantiles, that delimits the quantile range, $j$ is the quantile index, and $n$ is the total number of values in a particular scale and quantile range; the index $i$ runs over those $n$ values. The perfect value for this metric is zero, which means there is no difference between reference and the sensor values. A negative QRE value means the sensor is underestimating, and a positive value means it is overestimating.
QRMSE is the root mean square of the differences between reference and sensor; it is normalized to the mean of reference values:

$$\mathrm{QRMSE}_j = \frac{\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(P^{s}_{\mathrm{Sensor},i} - P^{s}_{\mathrm{ref},i}\right)^{2}}}{\frac{1}{n}\sum_{i=1}^{n} P^{s}_{\mathrm{ref},i}}. \qquad (2)$$

QRMSE quantifies the spread between sensor and reference data points.
To determine the dependence of the error metrics on storm severity, QRE and QRMSE statistics are categorized into two groups according to the quantile values of rainfall and runoff, namely, values between the 75th and 90th percentile and values greater than the 90th percentile, which represent moderate and extreme events, respectively. To investigate the effect of seasonality and basin-scale statistics, Q-Q plots are presented for the four seasons and different basin scales.
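The two metrics and the severity grouping can be sketched in a few lines of NumPy, following the verbal definitions of QRE and QRMSE given above. The sign convention (sensor minus reference) is chosen so that negative QRE means underestimation, as stated in the text. The function names, the use of `numpy.percentile` with its default interpolation, and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def qre(sensor, reference):
    """Quantile relative error: sum of (sensor - reference) over sum of reference."""
    return float(np.sum(sensor - reference) / np.sum(reference))


def qrmse(sensor, reference):
    """Root mean square difference normalized by the mean of the reference."""
    return float(np.sqrt(np.mean((sensor - reference) ** 2)) / np.mean(reference))


def metrics_by_severity(sensor, reference):
    """Compute QRE and QRMSE for moderate and extreme reference values."""
    sensor = np.asarray(sensor, dtype=float)
    reference = np.asarray(reference, dtype=float)
    # Thresholds are taken from the reference data quantiles.
    p75, p90 = np.percentile(reference, [75, 90])
    groups = {
        "moderate": (reference > p75) & (reference <= p90),
        "extreme": reference > p90,
    }
    return {
        name: {"QRE": qre(sensor[mask], reference[mask]),
               "QRMSE": qrmse(sensor[mask], reference[mask])}
        for name, mask in groups.items()
    }


# Synthetic example: a sensor that reports half of every reference value.
reference = np.arange(1.0, 21.0)
sensor = 0.5 * reference
results = metrics_by_severity(sensor, reference)
```

With this synthetic sensor, QRE is −0.5 in both groups (a 50 % underestimation), while QRMSE is slightly above 0.5 because it also reflects the spread of the values within each group.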

Rainfall error analysis
As mentioned above, the rainfall error analysis is divided into two categories: frequency distribution and quantitative statistics. The frequency distribution uses the Q-Q plots, and the quantitative statistics include the QRE and QRMSE error metrics. These are discussed next.

Frequency distribution
To assess the correspondence between sensor and reference rainfall data, we plotted the quantile values from TRMM3B42V7, GLDAS, and downscaled ensemble-mean GLDAS (sensor) against the corresponding quantile values of the reference radar rainfall (Fig. 7). In each Q-Q plot the x axis represents sensor values and the y axis represents radar values in millimeters per hour. The Q-Q plots change significantly across basin scales (small to large) and seasons.
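The quantile pairs behind such a Q-Q plot can be computed as sketched below. The probability levels, function name, and synthetic gamma-distributed rain rates are assumptions for illustration; the study does not state which quantile levels were plotted.

```python
import numpy as np


def qq_pairs(sensor, reference, probabilities=None):
    """Return (sensor quantiles, reference quantiles) at matching probabilities.

    Plotting the first array on the x axis and the second on the y axis gives
    a Q-Q plot with the axis convention described in the text.
    """
    if probabilities is None:
        probabilities = np.linspace(0.05, 0.95, 19)
    sensor_q = np.quantile(np.asarray(sensor, dtype=float), probabilities)
    reference_q = np.quantile(np.asarray(reference, dtype=float), probabilities)
    return sensor_q, reference_q


# Synthetic rain rates (mm/h): the sensor reports 60 % of the reference.
rng = np.random.default_rng(0)
reference = rng.gamma(shape=2.0, scale=1.0, size=500)
sensor = 0.6 * reference
sensor_q, reference_q = qq_pairs(sensor, reference)
```

With sensor values on the x axis, points lying above the 1:1 line (reference quantile larger than sensor quantile) indicate sensor underestimation, which is the pattern described for GLDAS.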
GLDAS shows a systematic underestimation in all seasons and at all basin scales. The underestimation is most severe at the smallest basin scales (top panels). During the summer convective rainfall season, the underestimation reduces significantly for medium to large basin scales, and it changes to slight overestimation for the small quantile values (< 1 mm h⁻¹). On the other hand, the GLDAS downscaled ensemble-mean data exhibit much better agreement with the reference radar rainfall data. The best agreement is observed during the fall and summer seasons, while good agreement is also depicted during the spring season. In winter, the low quantile values are strongly underestimated and the high quantile values are strongly overestimated. Overall, the downscaled GLDAS precipitation data set exhibits similar performance to the TRMM3B42V7 product in the fall, summer, and spring seasons, while in the winter, the downscaled GLDAS shows stronger underestimation than TRMM3B42V7 for the low quantile values.

Quantitative statistics
The seasonal variation of the mean relative error and relative root-mean-square error statistics versus basin scale for GLDAS, TRMM3B42V7, and the downscaled ensemble-mean GLDAS are presented in Figs. 8 and 9, respectively. These statistics are based on precipitation values that exceed the 90th percentile. The main point to note is that no data sets show significant changes with basin scale. In spring, GLDAS shows significant underestimation, while TRMM3B42V7 is almost unbiased, and the downscaled GLDAS shows slight overestimation for all basin size categories. In summer all data sets exhibit underestimation. The magnitude of underestimation in GLDAS is significantly higher than that of TRMM3B42V7 or the downscaled ensemble-mean GLDAS. In fall, GLDAS exhibits significant underestimation, while the downscaled ensemble-mean GLDAS is almost unbiased, in contrast to TRMM3B42V7, which exhibits slight overestimation. In winter, GLDAS shows underestimation, while TRMM3B42V7 and downscaled ensemble-mean GLDAS show overestimation. The magnitude of overestimation in the downscaled ensemble-mean GLDAS is lower than the underestimation in GLDAS. For the random component of precipitation error (relative RMSE), the three precipitation data sets perform similarly, with scores very close in the summer and fall seasons (scores ranging between 0.9 and 1.05). Overall, GLDAS exhibits lower relative RMSE values than the other two precipitation data sets, with this difference becoming more significant (range between 0.8 and 1.4) during winter and spring seasons.

Simulated runoff error analysis
Time series of the simulated runoff for the entire basin derived from forcing the HRR model with GLDAS (observation simulation), TRMM3B42V7 (product simulation), downscaled ensemble-mean GLDAS (downscaled and error-corrected simulation), and radar-rainfall data (reference simulation) for the validation data sample of each season are presented in Fig. 10. As shown in the time series plot, GLDAS systematically underestimates runoff relative to the other data sets, particularly during the major hurricane events in the fall. The downscaled ensemble-mean GLDAS performs significantly better and is shown to be able to capture the events and the overall flow patterns. In the case of the high-flow fall events (associated with two hurricanes), the downscaled ensemble-mean GLDAS simulated runoff seems to be between TRMM3B42V7 and reference data. Below we discuss Q-Q plots and QRE and QRMSE error metrics for the runoff simulations.

Frequency distribution
The Q-Q plots of the simulated runoff values from the three data sets (i.e., TRMM3B42V7, GLDAS, and downscaled ensemble-mean GLDAS) against the reference simulations are presented in Fig. 11. Similar to Fig. 7, GLDAS exhibits a strong underestimation of runoff in all seasons and at all basin scales. The underestimation is shown to be more significant in the fall, spring, and winter seasons, while it reduces significantly during the summer events. The ensemble-mean downscaled GLDAS, on the other hand, exhibits very good agreement with the reference values, particularly during fall and spring seasons. This agreement is very similar to the one exhibited for the TRMM3B42V7 data set, indicating that downscaling causes GLDAS to perform similarly to the corresponding TRMM3B42V7 data set, which was used in the calibration of the stochastic model parameters.

Quantitative statistics
Figures 12 and 13 show the two error metrics (QRE and QRMSE) determined for the validation sample reference runoff simulation values exceeding the 90th percentile value for the different seasons. As shown in the QRE plots of Fig. 12, GLDAS significantly underestimates in all seasons. The magnitude of underestimation is the strongest in summer and fall seasons and the lowest in spring. Winter season underestimation reduces with increasing basin scale. The ensemble-mean downscaled GLDAS QRE values exhibit significant bias reduction in runoff simulations, particularly in the fall and winter seasons. In spring, the downscaled GLDAS exhibits overestimation, which is still lower in absolute magnitude than the underestimation of the original GLDAS runoff simulations. The QRE values of the TRMM3B42V7 product are consistently low, showing a positive bias of < 10 %.
For the random error component, downscaling consistently improves the QRMSE statistic at all basin scales and for all seasons. The greatest reduction in QRMSE is in the summer and winter seasons, while spring exhibits the least effect. The satellite product (TRMM3B42V7) shows consistently lower QRMSE values than both GLDAS and downscaled GLDAS products for all basin scales and seasons. The greatest difference is in the summer and fall seasons, which are associated with more organized convective systems and less snow/mixed-phase precipitation. The spring season also exhibits a slight basin-scale dependence of QRMSE for the downscaled GLDAS and TRMM3B42V7 product-driven runoff simulations; no significant basin-scale dependence is present for the other seasons or products.
The above findings are in contrast with the increased random error component shown in the downscaled GLDAS precipitation product (Fig. 9). To understand this aspect, we present in Table 4 the QRMSE ratios between runoff and precipitation (error propagation) for the two products, seasons, and basin scales. The downscaled GLDAS exhibits dampening of the random error component from precipitation to runoff simulations; this dampening seems to be less dependent on basin scale and more related to season. For example, winter and spring seasons exhibit the strongest dampening of random error (ratios around 0.5), while in the summer the ratio is around 1 (i.e., no change), and in the fall the ratio is around 0.8 with a slight basin-scale dependence (i.e., ranging from 0.86 for basins below 1000 km² to 0.79 for basins greater than 10 000 km²). On the other hand, the original GLDAS product shows either an increase in the random error component from precipitation to runoff simulations during summer and fall seasons or a weaker (about half) dampening, compared to the downscaled product, in winter and spring seasons. These differences in precipitation to runoff error propagation convert the slightly increased random error of the downscaled GLDAS product in precipitation to a significantly lower random error in runoff simulations, which is consistent with our aim of improving the hydrologic use of GLDAS products in flood modeling.
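The error propagation ratio discussed above is simply the runoff QRMSE divided by the precipitation QRMSE for the same product, season, and basin scale. A minimal sketch follows; the tolerance used to label a ratio as "no change" and the input values are hypothetical and are not taken from Table 4.

```python
def error_propagation_ratio(qrmse_runoff, qrmse_precip):
    """Ratio of runoff QRMSE to precipitation QRMSE."""
    if qrmse_precip <= 0:
        raise ValueError("precipitation QRMSE must be positive")
    return qrmse_runoff / qrmse_precip


def describe_propagation(ratio, tolerance=0.05):
    """Label a ratio; `tolerance` is an illustrative choice, not from the study."""
    if ratio < 1 - tolerance:
        return "dampening"
    if ratio > 1 + tolerance:
        return "amplification"
    return "no change"


# Hypothetical QRMSE values: random error halves from precipitation to runoff.
ratio = error_propagation_ratio(qrmse_runoff=0.6, qrmse_precip=1.2)
label = describe_propagation(ratio)
```

Ratios below 1 correspond to the dampening reported for the downscaled GLDAS product, and ratios above 1 to the amplification reported for the original GLDAS in summer and fall.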

Conclusions
The aim of the study was to evaluate a stochastic downscaling and error correction approach for improving the use of a global reanalysis precipitation data set (GLDAS) in flood simulations. GLDAS is available over a relatively long time period (since 1979), which provides a good source of precipitation data for hydrological analyses and global flood hazard mapping. However, it has been shown in Seyyedi et al. (2014b) that the resolution and biases of this product introduce significant runoff simulation errors, which limit its applicability for flood modeling. In this study we present the implementation of a two-dimensional stochastic error model (SREM2D) to downscale and adjust GLDAS precipitation data using the higher resolution and accuracy TRMM3B42V7 satellite precipitation product as reference.
The study focused on a large basin (Susquehanna River basin) in the northeastern United States subjected to 437 significant rainfall events over a 10-year period (2002-2011), which were grouped into four seasons. The hydrologic simulations were performed with the HRR model, which was calibrated using radar-rainfall and observations from nine USGS streamflow gauges.
The improvements from downscaling and adjusting the GLDAS precipitation were evaluated in terms of both rainfall and runoff simulations using frequency distributions and quantitative error metrics. The effects of basin scale and seasonality were considered in this analysis. For the precipitation error analysis, the quantile-quantile (Q-Q) plots indicated that GLDAS markedly underestimates precipitation in all seasons and at all basin scales, while the satellite-driven downscaled GLDAS ensembles significantly reduced that bias, reaching a performance similar to that of the TRMM3B42V7 precipitation product. This was confirmed by the mean relative error statistic, for which downscaled GLDAS shows a substantial reduction of the strong underestimation exhibited by the original GLDAS product. The error analysis of simulated runoff values gave bias patterns similar to those in the precipitation products: the downscaled ensemble-mean GLDAS product has considerably lower bias than the original GLDAS product. There is a slight basin-scale effect on the evaluated statistics, with slightly better runoff results for larger basins. The random error in the simulated runoff values is also lower for the downscaled ensemble-mean GLDAS product than for the original GLDAS. This is explained by how random error propagates from precipitation to runoff simulations. The random error in the original GLDAS either increases (summer and fall) or slightly decreases (winter and spring) from precipitation to runoff. The downscaled GLDAS product, on the other hand, showed a remarkable dampening (0.5-0.8) of the random error from precipitation to runoff simulations. This can be attributed to hydrologic processes (infiltration and runoff generation) that can average out the random precipitation error component of high-resolution products (e.g., downscaled GLDAS) but make discharge errors worse for the strongly underestimated GLDAS rainfall rates within the basin.
Overall, the results presented in this study indicate that the proposed satellite-precipitation-based downscaling and error correction method has the potential to improve the hydrological use of GLDAS precipitation reanalysis data sets. The main advantage of this approach is that it uses high-resolution global precipitation products from multi-sensor satellite observations, which makes it flexible to implement over areas with limited ground-based measurements. Furthermore, the downscaling scheme is modular in design and can be applied to any gridded data set.
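The conditional error metrics and the precipitation-to-runoff error ratio summarized above can be illustrated with a minimal Python sketch. The definitions used here are assumptions made for illustration only: QRE is taken as the summed error divided by the summed reference, and QRMSE as the root-mean-square error normalized by the mean reference, both restricted to samples whose reference value exceeds its 90th percentile. The function names (`percentile`, `conditional_pairs`, `qre`, `qrmse`, `propagation_ratio`) are hypothetical, and the paper's exact formulations may differ.

```python
import math


def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty sequence."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def conditional_pairs(est, ref, q=90.0):
    """Keep (estimate, reference) pairs whose reference exceeds its q-th percentile."""
    threshold = percentile(ref, q)
    return [(e, r) for e, r in zip(est, ref) if r > threshold]


def qre(est, ref, q=90.0):
    """Assumed definition: sum(est - ref) / sum(ref) over the conditional sample.

    Negative values indicate underestimation. Raises ZeroDivisionError if no
    reference value exceeds the threshold.
    """
    pairs = conditional_pairs(est, ref, q)
    return sum(e - r for e, r in pairs) / sum(r for _, r in pairs)


def qrmse(est, ref, q=90.0):
    """Assumed definition: RMSE divided by the mean reference, conditional sample."""
    pairs = conditional_pairs(est, ref, q)
    mse = sum((e - r) ** 2 for e, r in pairs) / len(pairs)
    mean_ref = sum(r for _, r in pairs) / len(pairs)
    return math.sqrt(mse) / mean_ref


def propagation_ratio(qrmse_runoff, qrmse_precip):
    """Ratio below 1 means the random error is dampened from rainfall to runoff."""
    return qrmse_runoff / qrmse_precip


# Synthetic illustration (not data from the study): a product that reports
# exactly half of the reference rain rate.
reference = [float(v) for v in range(1, 21)]
estimate = [0.5 * v for v in reference]
bias = qre(estimate, reference)          # relative bias above the 90th percentile
random_err = qrmse(estimate, reference)  # normalized RMSE above the 90th percentile
```

Under these assumed definitions, applying `propagation_ratio` to the runoff and precipitation QRMSE values of the same product gives the kind of ratio discussed above, where values between 0.5 and 0.8 correspond to dampening of the random error.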
The proposed scheme was demonstrated over the northeastern United States, which is a data-rich area. As stated in the study area section, the TRMM3B42V7 technique uses regional ground-based precipitation measurements from rain gauges to adjust the precipitation retrieval. Although this approach is consistently applied globally, many areas of the world do not have a gauge density as high as that of the US network. As argued in studies cited in this paper, rain gauge adjustments in data-poor areas may worsen the accuracy of the TRMM3B42V7 product. Therefore, future research should evaluate this scheme on the basis of other satellite products that do not use rain-gauge-based adjustments, to represent the conditions of data-poor areas more accurately. Another extension of this research is to apply the SREM2D downscaling scheme to the entire (35-year) record of GLDAS precipitation data to derive multiyear downscaled GLDAS reanalysis ensembles, and to use them with the hydrologic model of this study to derive flood return periods for the Susquehanna River basin. Finally, extending the downscaling methodology to GLDAS as well as other reanalysis products, such as ERA-40 and ERA-Interim, at the global scale in conjunction with multiyear (1998-2014) high-resolution precipitation products from satellite-only techniques (e.g., CMORPH, PERSIANN) would allow for derivation of a high-accuracy global satellite-driven water resources reanalysis independent of ground measurements. Such products could be used in many engineering and scientific applications, such as flood and drought frequency analyses, design of hydraulic structures, or reservoir design and operation optimization.

Figure 1. Study area (left panel) and precipitation product grids (right panel) over the Susquehanna River basin. The red grid indicates GLDAS (100 km), yellow indicates TRMM3B42V7 (25 km), and black indicates the stage IV radar rainfall product (4 km).

Figure 2. Cumulative probability of radar-rainfall rates in the calibration and validation events selected for each season.

Figure 3. Stochastic downscaling framework. It consists of two main parts: the left side indicates the required SREM2D parameters, and the right side shows GLDAS ensemble generation and quality assessment with the absolute reference data (stage IV radar data).

Figure 4. SREM2D parameters: 2-D spatial mean of the logarithmic error "e" for each season, (a) spring, (b) summer, (c) fall, and (d) winter; (e) probability of rain detection as a function of GLDAS rain rate.

Figure 5. Cumulative precipitation values of the validation events in the four seasons; the shaded area indicates the 20 ensemble members of downscaled and error-corrected GLDAS data.

Figure 6. Flow diagram for the error analysis methodology.

Figure 8. QRE error metric determined conditional on reference precipitation values exceeding their 90th percentile. The horizontal axis indicates basin-scale categories presented in Table 2. Results are presented for spring (upper left panel), summer (upper right), fall (lower left), and winter (lower right) seasons.

Figure 9. QRMSE error metric determined conditional on reference precipitation values exceeding their 90th percentile. The horizontal axis indicates basin-scale categories presented in Table 2. Results are presented for spring (upper left panel), summer (upper right), fall (lower left), and winter (lower right) seasons.

Figure 10. Runoff time series driven by the different precipitation products over the basin indicated in Fig. 1 for the selected validation events of each season. GLDAS shows underestimation in all cases; the downscaled, error-corrected GLDAS shows meaningful improvement.

Figure 12. QRE error metric determined conditional on reference runoff values exceeding their 90th percentile. The horizontal axis indicates basin-scale categories presented in Table 2. Results are presented for spring (upper left panel), summer (upper right), fall (lower left), and winter (lower right) seasons.

Figure 13. QRMSE error metric determined conditional on reference runoff values exceeding their 90th percentile. The horizontal axis indicates basin-scale categories presented in Table 2. Results are presented for spring (upper left panel), summer (upper right), fall (lower left), and winter (lower right) seasons.

Table 1. Number of basins for each basin-scale category.

Table 2. Number of flood events selected for each season.

Table 3. SREM2D parameters determined for GLDAS downscaling for the four seasons using the calibration events.

Table 4. The ratio of QRMSE in runoff to QRMSE in precipitation for GLDAS and ensemble-mean downscaled GLDAS data.