Articles | Volume 22, issue 2
Research article
07 Feb 2018

Near-real-time adjusted reanalysis forcing data for hydrology

Peter Berg, Chantal Donnelly, and David Gustafsson

Extending climatological forcing data to current and real-time forcing is a necessary task for hydrological forecasting. While such data are often readily available nationally, it is harder to find fit-for-purpose global data sets that span long climatological periods through to near-real time. Hydrological simulations are generally sensitive to bias in the meteorological forcing data, especially relative to the data used for the calibration of the model. The lack of high-quality daily resolution data on a global scale has previously been solved by adjusting reanalysis data with global gridded observations. However, existing data sets of this type have been produced for a fixed past time period determined by the main global observational data sets. Long delays between updates of these data sets leave a data gap between the present day and the end of the data set. Further, hydrological forecasts require initializations of the current state of the snow, soil and lake (and sometimes river) storage. This is normally achieved by forcing the model with observed meteorological conditions for an extended spin-up period, typically at a daily time step, to calculate the initial state. Here, we present and evaluate a method named HydroGFD (Hydrological Global Forcing Data) to combine different data sets in order to produce near-real-time updated hydrological forcing data of temperature and precipitation that are compatible with the products covering the climatological period. HydroGFD resembles the already established WFDEI (WATCH Forcing Data–ERA-Interim) method (Weedon et al., 2014) closely but uses updated climatological observations, and for the near-real time it uses interim products that apply similar methods. This allows HydroGFD to produce updated forcing data including the previous calendar month around the 10th of each month.
We present the HydroGFD method and the data sets produced with it, which are evaluated against global data sets, as well as with hydrological simulations with the HYPE (Hydrological Predictions for the Environment) model over Europe and the Arctic regions. We show that HydroGFD performs similarly to WFDEI and that the updated period significantly reduces the bias of the reanalysis data. For real-time updates until the current day, which extend HydroGFD with operational meteorological forecasts, a large drift is present in the hydrological simulations due to the bias of the meteorological forecasting model.

1 Introduction

Large-scale hydrological models on global or continental scales typically require meteorological forcing data at daily time resolution. There is a lack of data with high quality and consistency between variables on such scales; however, data at coarser monthly resolution are more widely available. Reanalysis data fulfill the spatial and temporal consistency but suffer from bias that limits their use for hydrological simulations. Current data sets that merge reanalysis and coarser observations bridge the data gap but are mostly only episodically updated (Sheffield et al., 2006; Weedon et al., 2011, 2014; Beck et al., 2017).

The degree to which the skill of a hydrological forecast is sensitive to the initial hydrological conditions, on the one hand, and the meteorological forcing in the forecast period, on the other hand, depends on factors such as the hydrometeorological regime of the catchment and the memory of the hydrological system. The hydrological skill sensitivity to the initial state and/or the meteorological forecast varies as a function of the season, which has been shown for both seasonal and short-term forecasts (Li et al., 2009; Shukla and Lettenmaier, 2011; Paiva et al., 2012; Demirel et al., 2013; Pechlivanidis et al., 2014). In most cases, however, hydrological forecast models are initialized by hindcast simulations covering some period before the forecast issue date, for which appropriate meteorological forcing data are needed.

Climatological and hydrological simulations require consistent forcing data for a long period, which can be problematic with gauge-based data sets if the gauge location and the network density are very different between the observed variables. Observational data sets with global coverage are sparse regarding data with at least daily resolution, but there are exceptions such as the Climate Prediction Center's (CPC) products for temperature (CPCtemp, 2017) and precipitation (Chen et al., 2008). There are also several promising satellite-based products, such as the Tropical Rainfall Measuring Mission (TRMM) (Huffman et al., 2009b) and the Global Precipitation Measurement (GPM) mission, although satellite data require adjustments to ground truth observations. The negative aspects of the above data sets are problems with spatial coverage, because of non-sampled (polar) regions for the satellite data and lack of gauges in parts of the world for gauge-based data. The gauge density becomes even more important for gridding precipitation at the daily timescale.

Operational models working on a global scale have found ways to work with sparse observations. The Global Flood Awareness System (GloFAS) uses the ERA-Interim (EI) reanalysis (Dee et al., 2011), with precipitation adjusted using data from the Global Precipitation Climatology Project (GPCP; Huffman et al., 2009a) on a monthly timescale (Alfieri et al., 2013; Hirpa et al., 2016). Another global-scale model system is the Global Flood Forecasting Information System (GLOFFIS), where the meteorological forcing data are derived from several sources, such as gauge measurements, CPC unified gridded precipitation (Chen et al., 2008) and the ECMWF control forecast (Emerton et al., 2016).

Earlier methods (Sheffield et al., 2006; Weedon et al., 2011, 2014; Beck et al., 2017) have merged information from a reanalysis with temporally coarser observational data to produce new data sets that inherit the temporal resolution of the reanalysis with the average properties of the observations. With these methods, long periods of internally consistent daily or sub-daily resolution and global coverage become available for, e.g., large-scale hydrological simulations. The various methods have applied different reanalysis data sets and observational records and therefore differ in their final result. The simplest method is that of Weedon et al. (2014), where mainly single data sets are applied globally for the adjustment of each variable. Although this leaves the method highly dependent on the quality and availability of a few data sets, it makes the method less affected by temporal and spatial inconsistencies between periods and regions. An issue with relying on gridded observational data sets is that such data are often updated only episodically and with delays of several months or even years. This can be an issue for global or continental hydrological forecasting where up-to-date information is important, thus requiring a continuous updating of the forcing data while retaining a consistent climatology.

Here, we present the HydroGFD (Hydrological Global Forcing Data) method for producing adjusted meteorological forcing data sets for a near-global domain. The novelty in the production of the data sets is the combination of reanalysis and operational global model input, as well as the combination of various observational data sources to fill the gap between the present and the end of the climatological products. We evaluate the updating procedure of the climatological data by direct comparison of the meteorological data, as well as by employing a hydrological model to evaluate the data sets. The main motivation for creating the data set is to update climatological simulations, but also to improve the initialization for hydrological forecasting at large scales or in data-sparse regions where dense observational data are not available for initialization. We present an evaluation of two such applications for the European and Arctic setups of the hydrological model HYPE, E-HYPE (European HYPE) and Arctic-HYPE.

2 Methods and data

The HydroGFD method is currently intended to be a substitute for and extension of precipitation and temperature from the WFDEI (WATCH Forcing Data–ERA-Interim) method (Weedon et al., 2011), which is currently used in many hydrological simulations with HYPE (Lindström et al., 2010) and other hydrological models.

We therefore mimic the WFDEI setup closely, with some necessary differences due to updates of the meteorological observations since the first appearance of WFDEI. The HydroGFD data set is currently limited to precipitation and temperature at 3- and 6-hourly intervals, whereas WFDEI produces several additional variables (Weedon et al., 2011). The basic method is to construct monthly mean adjustment factors per calendar month for each variable and to adjust every time step during the month with that factor. For temperature, the adjustment factor is produced by subtracting the monthly mean reanalysis from the observations and adding this difference to every time step of the reanalysis. For precipitation, a first step of adjusting the number of wet days is performed. The underlying assumption is that the reanalysis model produces excessive light rainfall (drizzle). Days with the least amount of rainfall that are in excess of the observed rainy days are set to zero. In a second step, the ratio between the monthly mean observations and the reanalysis data is calculated and used to scale the reanalysis data.
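
The two adjustment steps described above can be sketched for a single grid cell and a single calendar month as follows. This is a minimal illustration and not the HydroGFD implementation: the function names, the use of daily totals instead of 3- or 6-hourly steps, the definition of a model wet day as any day with non-zero precipitation, and the choice to compute the scaling ratio after the wet-day step are all assumptions made for the example.

```python
import numpy as np


def adjust_temperature(t_model, t_obs_monthly_mean):
    """Additive correction: shift every time step by the monthly mean difference."""
    t_model = np.asarray(t_model, dtype=float)
    offset = t_obs_monthly_mean - t_model.mean()  # observations minus reanalysis
    return t_model + offset


def adjust_precipitation(p_model_daily, n_wet_obs, p_obs_monthly_mean):
    """Wet-day correction followed by multiplicative scaling (illustrative only)."""
    p = np.asarray(p_model_daily, dtype=float).copy()

    # Step 1: remove excess wet days, starting with the lightest rainfall.
    wet_idx = np.flatnonzero(p > 0.0)  # assumption: "wet" means non-zero
    excess = wet_idx.size - int(n_wet_obs)
    if excess > 0:
        order = np.argsort(p[wet_idx], kind="stable")
        p[wet_idx[order[:excess]]] = 0.0

    # Step 2: scale so that the monthly mean matches the observations.
    model_mean = p.mean()
    if model_mean == 0.0:
        return p  # a completely dry model month cannot be scaled
    return p * (p_obs_monthly_mean / model_mean)
```

With this ordering the adjusted month reproduces the observed monthly mean, except when the model month is completely dry, in which case no ratio can be formed and the values are left unchanged.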

Table 1. Meteorological forcing data used in the analyses and hydrological simulations, listing the atmospheric model and the data sources used to adjust the model for each variable (precipitation, number of wet days, and temperature). The data sets are described in Table 2.


The HydroGFD system has been applied to produce the main climatological data set called GFDCL, which is a methodological equivalent to the WFDEI (Weedon et al., 2011) data set except for updated climatological observations (see Table 1) and differences in the implementation. GFDCL, like WFDEI, is based on the ERA-Interim reanalysis but is coded so that EI can be interchanged with other reanalyses. Precipitation is corrected for wet-day bias compared to wet-day information from the CRUts3.22 data set from the Climate Research Unit (CRU; Harris and Jones, 2014) and scaled with monthly precipitation from GPCC7 (see Table 2). Temperature was corrected additively with CRU monthly mean temperature. The GFDCL data set is restricted to the time period 1979–2013, due to the start of the EI reanalysis period, and by the end of the GPCC7 (Schneider et al., 2014) observational data set. The main difference between GFDCL and WFDEI arises from the treatment of undercatch, i.e., the rainfall likely not captured by the rain gauges due to turbulence around the gauge. WFDEI applied the Adam and Lettenmaier (2003) undercatch correction to the GPCC5 and GPCC6 data sets. With GPCC7, the undercatch correction is already included in the data set and does not need to be applied in the HydroGFD methodology. However, for GPCC7, the undercatch correction was based on Legates and Willmott (1990) but reduced by 15 % to better fit with their own estimates (Schneider et al., 2014). Adam and Lettenmaier (2003) compared their method to that of Legates and Willmott (1990) and found the latter to lead to a too-low precipitation amount by about 5–30 % and differences in the annual cycle of the correction factors. There is clearly a large controversy on this topic. We therefore expect differences between GFDCL and WFDEI in both annual totals and in the annual cycle.

The main issue tackled here is how to implement the WFDEI methodology forward in time as GPCC7 becomes unavailable, or when EI becomes unavailable. We propose two flavors of HydroGFD to extend the period past year 2013 (see Table 1 for data sets and references):

  1. GFDEI consists of the EI data set with precipitation scaled by the GPCC monitoring data set and wet-day adjusted according to the GPCC first guess daily product. Temperature is adjusted with the GHCN-CAMS data set.

  2. GFDOD consists of the ECMWF deterministic forecast (OD), which differs from EI mainly in the model version and the assimilated data. Precipitation is scaled by the GPCC first guess monthly data set and wet-day adjusted according to the GPCC first guess daily product. Temperature is adjusted with GHCN-CAMS data.

GFDEI fills the gap between the end of GFDCL in 2013 and the latest available EI data, i.e., lagging about 3 months behind real time. For the last 2 months, GFDOD is used to fill the gap. The necessary data sets are all available for download around the 10th of each month. Figure 1 shows a schematic for how the forcing data are used to update hydrological models to today's date. For example, to update a model to 9 May, the model is forced with GFDEI until 31 January, GFDOD until 31 March and then OD until 9 May. This gives a period of 40 days with unadjusted OD data. However, to update the model to 10 May, because the GPCC monitoring product becomes available on the 10th of the month (at the latest), all data shift 1 calendar month and require a shorter period of unadjusted OD data. In a hydrological forecasting context, the simulations are updated from the GFDEI data, which is the continuous extension of GFDCL, and the GFDOD and OD parts are rerun after each update to determine the new initial conditions.

Figure 1. Schematic of the updating procedure. The HydroGFD data are continuously updated with GFDEI as long as EI data are available. The intermediary data set GFDOD fills up the time series as long as GPCC data are available and then continues with uncorrected OD data. Because the previous month becomes updated on the 10th of each month, the 9th is the day with the longest period of OD driving data. The next month, GFDEI is extended 1 month, and the GFDOD data are updated for the new month.
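
The calendar logic of the updating procedure can be illustrated with a short sketch that returns the last day covered by each forcing product for a given target date. The function and argument names are hypothetical, and the fixed release day (the 10th) and the 2-month GFDOD window are taken from the description above as simplifying assumptions; actual data availability may vary.

```python
import calendar
from datetime import date


def month_end(year, month):
    """Last calendar day of the given month."""
    return date(year, month, calendar.monthrange(year, month)[1])


def shift_month(year, month, delta):
    """Move (year, month) by delta months, handling year boundaries."""
    idx = year * 12 + (month - 1) + delta
    return idx // 12, idx % 12 + 1


def forcing_periods(target, release_day=10, gfdod_months=2):
    """End dates of the GFDEI, GFDOD and OD segments for a run ending at target."""
    # Before the release day the previous month is not yet adjusted,
    # so the last adjusted month lies two months back instead of one.
    lag = 1 if target.day >= release_day else 2
    y, m = shift_month(target.year, target.month, -lag)
    gfdod_end = month_end(y, m)
    y, m = shift_month(y, m, -gfdod_months)
    gfdei_end = month_end(y, m)
    return {"GFDEI": gfdei_end, "GFDOD": gfdod_end, "OD": target}
```

For a target date of 9 May this gives GFDEI until 31 January and GFDOD until 31 March, and for 10 May all segments shift forward by 1 calendar month, as in the example in the text.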


Because the observational data sets only provide information over land areas, the HydroGFD system only produces adjustments where data are available and retains the original reanalysis, or deterministic forecast, when no data are available. One notable exception is Antarctica, which is not covered by the observational data sets and is therefore not adjusted at any step of the updating procedure.

HYPE model

The HYPE (Hydrological Predictions for the Environment) model is a process-based hydrological model developed for high-resolution multi-basin applications, which has been applied on various spatial scales (from tens to millions of square kilometers) and hydroclimatological conditions (Lindström et al., 2010; Strömqvist et al., 2012; Arheimer et al., 2012; Andersson et al., 2015; Gelfan et al., 2017). The model is based on a semi-distributed approach where the hydrological system is represented by a network of sub-basins, which are further divided into classes that can be selected to represent combinations of soil type and land cover or elevation zones. The water balance and runoff from each subclass are calculated taking into account processes such as snow and glacier accumulation and melt, infiltration, evapotranspiration, surface runoff, tile drainage, and groundwater recharge and runoff. The runoff from the land classes is further routed through the network of lakes and rivers represented by the sub-basin delineation. The model is used for research and operational purposes to provide information for, e.g., flood and hydropower reservoir inflow forecasting, river discharge and nutrient loads to the ocean, as well as assessment of the climate change impact on hydrological systems.

To evaluate the practical usefulness of the HydroGFD data in continental (and by extension global) hydrological forecasting, the HydroGFD data were tested in two continental-scale applications of HYPE. For Europe, the E-HYPE v3.2 (Hundecha et al., 2016) hydrological model was calibrated with GFDCL and employed to evaluate the updating versions of HydroGFD. The simulation domain ranges from wet Arctic, wet maritime to dry Mediterranean climatic conditions. The E-HYPE model has been shown to reproduce well the spatial and temporal variability in hydrological processes across Europe (Donnelly et al., 2016; Hundecha et al., 2016) and has been identified as a useful model for continental-scale forecasting (Emerton et al., 2016). E-HYPE takes daily mean precipitation and temperature as input. Potential evapotranspiration is estimated from daily mean temperature and extraterrestrial radiation estimated separately for each sub-basin location and day of the year using the modified Jensen–Haise and McGuinness model following Oudin et al. (2005). For each sub-basin, air temperature and precipitation are taken from the nearest grid point. Temperature is further corrected with a constant lapse rate (0.65 °C per 100 m) for the difference between the mean sub-basin elevation and the corresponding elevation of the grid point. The elevation correction of precipitation is also possible in the HYPE model, but it is not used in E-HYPE.
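
The elevation correction of temperature can be written as a one-line adjustment. The sketch below is illustrative only: the function and variable names are hypothetical, and the sign convention (temperature decreasing with height) is the usual one and is assumed here rather than taken from the HYPE code.

```python
LAPSE_RATE_C_PER_M = 0.65 / 100.0  # 0.65 degrees Celsius per 100 m, as stated in the text


def lapse_correct(t_grid_c, z_grid_m, z_basin_m, lapse=LAPSE_RATE_C_PER_M):
    """Shift grid-point temperature to the mean sub-basin elevation.

    Assumes temperature decreases with height, so a sub-basin that lies
    above the grid-point elevation becomes colder.
    """
    return t_grid_c - lapse * (z_basin_m - z_grid_m)
```

For example, a grid-point temperature of 10 °C at 200 m corresponds to about 6.75 °C for a sub-basin with a mean elevation of 700 m.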

For the Arctic, we use the Arctic-HYPE model v3.0 (Andersson et al., 2015; Gelfan et al., 2017) that covers the land area draining into the Arctic Ocean (excluding Greenland). The model domain is 23 million km² divided into 32 599 sub-basins with an average size of 715 km². The Arctic region is characterized by numerous lakes of various sizes (5 % areal fraction) and glaciers (about 50 % of the glaciated area outside the Greenland and Antarctica ice sheets, mainly on islands in the Canadian Arctic archipelago, Svalbard and Russian Arctic islands) (Dyurgerov and Meier, 1997; Meier and Bahr, 1996). To take into account the long turnover times of larger lakes in the domain (for instance Lake Baikal) and the ongoing decline in glacier volume, the Arctic-HYPE model was initialized with a spin-up simulation for the period 1961–2010 using the WFD data (Weedon et al., 2011) with a simplified correction of precipitation versus GPCC7 on a monthly basis, to be consistent with the GFDCL data, and extended using GFDCL for the period 1979–2013. As for E-HYPE, Arctic-HYPE is forced by daily mean precipitation and temperature, but, in contrast to E-HYPE, potential evapotranspiration is calculated using the Priestley–Taylor equation, which is assumed to be more representative for the wide range of climatic conditions in the Arctic-HYPE domain. The Priestley–Taylor equation requires solar radiation and relative humidity, which were estimated using the minimum and maximum daily temperatures as additional input variables, following the procedures recommended by Allen et al. (1998).

Both the E-HYPE and Arctic-HYPE models have been parametrized and calibrated with similar step-wise approaches. First, sub-basins were delineated based on globally available digital elevation data (USGS HydroSHEDS and Hydro1K). Second, classification into selected land-use and soil type classes was based on land cover and soil data such as the ESA CCI (European Space Agency Climate Change Initiative) land cover or CORINE (COordination of INformation on the Environment) and HWSD (Harmonized World Soil Database). Third, model parameters governing water balance processes in ice/snow, soil, lakes and rivers were calibrated in an iterative procedure using river discharge data from the Global Runoff Data Centre (GRDC), as well as data on internal water balance components such as snow (ESA GlobSnow and former Soviet Union snow course data), glaciers (glacier area and mass balance data from ESA CCI glaciers and the World Glacier Monitoring Service) and evapotranspiration (flux-tower data from FluxNet and MODIS evaporation products).

For the evaluation simulations with HydroGFD products, the models are run once per month from 9 May 2010 to 9 December 2013 to recreate a 130-day initialization simulation for each run, ending on the given date. This is the longest possible initialization step, as the meteorological forcing data are updated on the 10th, for which the initializations would advance 1 calendar month (Fig. 1). The first simulation starts from a saved state of the GFDCL simulation in January 2010, and each subsequent run is initialized from a starting state saved from the GFDEI portion of the previous simulation, making the GFDEI simulation continuous in time. A total of 44 simulations are made with each hydrological model. The simulations are then compared with a climatology simulated using GFDCL forcing for each region for the same period 2010–2013 to evaluate the change in simulated hydrology as a result of the changing forcing data products.
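
The number of evaluation runs follows directly from the monthly schedule. As a quick check, the sketch below enumerates one run end date per calendar month between the first and last dates given above; the helper name is hypothetical.

```python
from datetime import date


def monthly_run_dates(first, last):
    """One run end date per calendar month, on the same day of the month as first."""
    dates = []
    y, m = first.year, first.month
    while (y, m) <= (last.year, last.month):
        dates.append(date(y, m, first.day))
        y, m = (y + 1, 1) if m == 12 else (y, m + 1)
    return dates


runs = monthly_run_dates(date(2010, 5, 9), date(2013, 12, 9))
print(len(runs))  # 44, matching the number of simulations stated in the text
```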

Figure 2. Climatology of (a) precipitation from GPCC7 and (e) temperature from CRU. Relative percentage difference in climatological precipitation from GPCC7 for (b) EI, (c) GFDCL and (d) WFDEI. Absolute difference in climatological temperature from CRU for (f) EI, (g) GFDCL and (h) WFDEI.


3 Results

We begin with evaluating the GFDCL data set, as well as comparing differences between the various HydroGFD versions. Thereafter, we present the analysis of hydrological simulations for Europe and the Arctic.

3.1 Meteorological evaluation

3.1.1 Climatology 1979–2013

GFDCL is directly comparable to the WFDEI data set due to the very similar method, but will differ due to different underlying data and the handling of precipitation undercatch. Because WFDEI was on several occasions evaluated against flux-tower measurements across the globe (Weedon et al., 2011, 2014; Beck et al., 2017), we do not repeat such an evaluation for GFDCL here and compare instead to the WFDEI and other data sets.

Dee et al. (2011); Harris and Jones (2014); Schneider et al. (2015b); Schneider et al. (2015a); Ziese et al. (2011); Schamm et al. (2013); Fan and Van den Dool (2008); Weedon et al. (2011)

Table 2. Model and data sources used in the analyses.


Figure 3. Comparison of the number of wet days provided by (a) the CRU data set, compared to those derived from (b) GPCC-FG, and (c) the difference between the two for the period 2010–2013.


The baseline reanalysis data set EI has both wetter and drier regions compared to GPCC7, with biases towards ±100 % over large regions (Fig. 2b). Overall, the wetter regions are predominant. Here, we note especially the wet bias throughout the Arctic (excluding Greenland) and a mainly slight wet bias in continental Europe. Correction with GFDCL reproduces GPCC7 well (Fig. 2c), as expected from the definition of the method. There are some isolated patches with underestimated precipitation, mainly in the dry regions of the Sahara desert and southern Arabian Peninsula, which appear because no scaling is possible for single months with a complete lack of precipitation in EI at these locations. In contrast to GFDCL, WFDEI has a general wet bias when compared to GPCC7 (Fig. 2d). The wet bias is explained mainly by stronger undercatch corrections included in WFDEI, as explained in Sect. 2.

Temperature bias in EI ranges mainly between ±1 °C for most land areas (Fig. 2f), but there are regions with considerable bias. The Arctic regions show a mostly warm bias, in parts of several degrees Celsius. Europe has a low bias, except for Scandinavia, which shows a warm bias. Both GFDCL (Fig. 2g) and WFDEI (Fig. 2h) correct the bias by definition and are indistinguishable at the 0.2 °C accuracy of the color legend, even though different versions of CRU were employed (GFDCL: CRUts3.22; and WFDEI: CRUts3.1 for 1979–2009, CRUts3.21 for 2010–2012 and CRUts3.23 for 2013).

In summary, GFDCL is methodologically similar to WFDEI and differences in the results are mainly due to the different precipitation source used.

3.1.2 Evaluation of the updating method (2010–2013)

To evaluate the updating method of the GFDEI and GFDOD data sets, we investigate differences in bias for the period 2010–2013 when all data sources are available (see Table 2). The only methodological difference between GFDEI and GFDOD compared to GFDCL is the calculation of the number of wet days in a month. Whereas the latter uses gridded station measurements of the number of wet days from CRU, the former data sets have the number of wet days calculated from the GPCC-FG daily product as the number of days in a month with precipitation larger than or equal to 1 mm day−1. Figure 3 presents the period average number of wet days in a month for CRU and GPCC-FG. The two methods to calculate wet days differ significantly for Europe and especially the Arctic part of Scandinavia and western Russia, where the updating method overestimates the number of wet days. The updating method also produces underestimations in Africa, Latin America and the Andes. An interesting difference is markedly confined within the political borders of India, which implies a difference in the observations entering either CRU or GPCC-FG, and could be an artifact of a higher station density in that region compared to surrounding regions or a different threshold used for the wet-day definition.
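
The wet-day count used for the updating data sets amounts to a simple threshold count along the time axis. The sketch below assumes a daily precipitation array with time as the first dimension and values in mm day−1; the function name and array layout are assumptions for illustration.

```python
import numpy as np


def wet_days_in_month(p_daily, threshold=1.0):
    """Number of days with precipitation >= threshold (mm/day) per grid cell.

    p_daily has time as its first axis; any remaining axes (e.g., lat, lon)
    are preserved in the result.
    """
    p_daily = np.asarray(p_daily, dtype=float)
    return np.count_nonzero(p_daily >= threshold, axis=0)
```

Because the count depends directly on the threshold, a different wet-day definition in the underlying observations would shift the result, which is one of the possible explanations mentioned above for the differences between CRU and GPCC-FG.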

Figure 4. Relative difference of mean monthly precipitation between different data sources and (a–d) GPCC7, (e–g) GPCC-monitor, (h–i) GPCC-FG and (j) EI.


Figure 4 shows the bias between the different data sets used here, such that the data set given at the top of the plot is compared with that named to the left of each row. In the first row (Fig. 4a–d), all data sets are compared to GPCC7. Clearly, GPCC-monitor and GPCC-FG both underestimate precipitation for most parts of the globe compared to GPCC7. This is partly due to the lack of undercatch correction, but differences may also result from lower station density, as not all stations are available in real time. The latter effect can be seen in the different bias patterns for GPCC-monitor and GPCC-FG (Fig. 4a and b, respectively) and also in the difference between GPCC-monitor and GPCC-FG (Fig. 4e). The extension of the GFDCL data set is mainly through the GFDEI product, which is adjusted by GPCC-monitor, and the GFDOD product is mainly used as an interim measure to bridge the data gap for initializations of forecasts. GFDEI has a similar spatial structure to GPCC7, with some marked regional differences, but a general reduction of a few percent in total precipitation is seen. EI has a similar bias as for the climatological period (compare Fig. 4c and Fig. 2b). The bias of GPCC-monitor is small compared to that of EI, which means that the extension of GFDCL with GFDEI is indeed relevant when extending the climatological data set for hydrological applications, for example.

OD has a similar bias to EI when compared to GPCC7 (Fig. 4d); however, clear differences, although of lower magnitude, also appear in a direct comparison of OD and EI (Fig. 4j). The main differences are confined to the tropical regions; however, the bias of OD is much more prevalent than that of GPCC-FG, which indicates value in the interim GFDOD product. GFDEI and GFDOD retain the average bias of the GPCC-monitor and GPCC-FG products, by definition (not shown).

Temperatures are compared between the data sets GHCN-CAMS, EI and OD toward CRU (not shown). The main differences are in the Arctic, especially for Greenland, and for various mountain ranges and coastal areas, with magnitudes of several degrees Celsius. EI and OD have a similar bias for most of the globe, although OD has a larger warm bias in the Arctic and northern Europe.

3.2 Hydrological evaluation

The effect of the interim products on simulated hydrology in Europe and the Arctic is evaluated using the E-HYPE and Arctic-HYPE continental hydrological models. The resulting bias at the end of the OD simulation is indicative of the potential bias in initial conditions for a hydrological forecast made using the HydroGFD procedure. First, a climatological simulation driven by GFDCL is carried out for the years 2010–2013, starting from a saved model state on 10 January 2010. Second, a set of simulations separated by 1 calendar month was carried out for the period 10 May 2010 until 10 November 2013. Each of the simulations starts from GFDEI for the first month, continues with GFDOD for 2 months and then OD for 1 month and 10 days (see Fig. 1). The model state of the last day of the GFDEI simulation is saved and used for the initial state of the next month's GFDEI simulation. Unless otherwise stated, the evaluation is performed from day 1, the first day of the GFDEI period, until the last day of the simulation, which is approximately day 130. In the figures, the different forcing data periods are marked with the same colors as in Fig. 1 and approximated by 30-day months to indicate which data set was used.

Figure 5. Upstream precipitation, evapotranspiration and specific runoff averaged over all catchments and shown for all forecast times as well as per season for (a) E-HYPE and (b) Arctic-HYPE. All runs are presented as absolute deviations (Abs Dev) from the GFDCL forced simulation.


The impacts of the differences in the GFDEI, GFDOD and OD data sets compared to the reference GFDCL simulation are shown as an average across the respective simulation domains in Fig. 5. The specific runoff shows lower values for GFDEI and GFDOD compared to GFDCL for both domains. Clearly, the main determining factor for the differences arises from the differences in upstream precipitation from the first 30 days with GFDEI. Even though GFDOD has less of a precipitation offset from GFDCL, and for the Arctic even a positive difference, the GFDEI offset causes a slow drift in runoff toward the new conditions of GFDOD and therefore a remaining negative offset for about the first 90–100 days. Upstream evapotranspiration shows a low offset from GFDCL for GFDEI, which shows that the GHCN-CAMS and CRUts3.22 data sets are similar for these two domains. However, although the same data set is used for GFDOD, there is a larger offset for this period. The difference in upstream evapotranspiration offsets between the two model domains is most likely a result of the larger (and positive) offset in upstream precipitation for the GFDOD and OD periods in the Arctic-HYPE domain, rather than the smaller differences in temperature. OD has a strong wet precipitation bias (particularly in the northern hemisphere; results for the tropics and southern hemisphere may be different) (Fig. 4d), which is of a much greater magnitude than that of GFDEI. The bias causes the slow drift of the specific runoff to accelerate around day 90–100, as the model adjusts to the new precipitation average. The case is similar for both domains. Another striking feature from Fig. 5 is the larger variability for GFDOD and OD, compared to GFDEI, which is due to differences between EI and OD. This affects the day-to-day variations of the simulations, but not the total water balance.

Figure 5 also shows results per season. For both Europe and the Arctic, precipitation and runoff biases are largest for the OD forced period in DJF and MAM and relatively minor in JJA and SON. Seen as a continental mean, there is little variation in the biases between individual years, meaning that the results are robust in time (not shown).

Figure 6 shows a spatial view of the average upstream runoff difference from the GFDCL simulation for each domain. In the resolution of the color scales, there are only small differences between GFDEI and GFDOD. The offsets from GFDCL are mainly within ±20 % for Europe, but much stronger local offsets are seen in the Arctic domain. The Arctic is a more sensitive region to differences in the station density behind the gridded observational data sets, as there are fewer stations to begin with. This fact plays a large role in shaping the offsets seen here. The OD period is, as expected, wetter for most of the domains, but more clearly so for the Arctic domain.

Figure 6. Relative percentage difference of the specific runoff in the upstream area from GFDCL for each catchment of (a) E-HYPE and (b) Arctic-HYPE, with the different data sets (right to left panels) GFDEI, GFDOD and OD.


Figure 7. River discharge model performance measures: bias (relative volume error in %), Nash–Sutcliffe efficiency (NSE), Pearson correlation (r), and ratio of simulated and observed variance for a selection of grid points in (a–d) Europe and (e–h) the Arctic. The performance of GFDEI, GFDOD and OD (y axis) is compared to GFDCL (x axis) in scatter diagrams.


A selection of in situ observations from gauging stations with available data from at least 2 of the 4 simulated years was used to analyze how the model performance against observed discharge varies using the climatological forcing and different interim data sets. Performance criteria of the models for each of the gauges are presented for each data set in comparison to GFDCL in Fig. 7. Since GFDCL is always the reference, the results for each gauge line up vertically in the figure. The two domains show similar results, and we therefore describe the results in a general sense. The bias follows the patterns described above, with lower values for GFDEI and GFDOD, while OD has higher values. Whether there is a positive or negative bias is determined by the initial bias of the GFDCL simulation. The Nash–Sutcliffe efficiency (NSE) and Pearson correlation (r) do not show any clear structure, but remain reasonable for most of the simulations. The variance is consistently higher for the OD simulation, as also noted above.
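The four performance measures can be illustrated with a minimal Python sketch using their standard textbook definitions. It is not the evaluation code used for the study; the function name `performance_measures` and the choice to express the variance ratio as simulated over observed variance are assumptions made for illustration.

```python
import math


def performance_measures(sim, obs):
    """Return (bias_percent, nse, pearson_r, variance_ratio) for paired series.

    sim and obs are equally long sequences of simulated and observed
    discharge; the definitions are the standard ones and are assumed here.
    """
    n = len(obs)
    if n != len(sim) or n < 2:
        raise ValueError("need two equally long series with at least 2 values")

    mean_s = sum(sim) / n
    mean_o = sum(obs) / n

    # Relative volume error in percent: simulated vs. observed total volume.
    bias_percent = 100.0 * (sum(sim) - sum(obs)) / sum(obs)

    # Sums of squared deviations from the respective means.
    ss_s = sum((s - mean_s) ** 2 for s in sim)
    ss_o = sum((o - mean_o) ** 2 for o in obs)

    # Nash-Sutcliffe efficiency: 1 minus the squared error relative to the
    # spread of the observations around their mean.
    sq_err = sum((s - o) ** 2 for s, o in zip(sim, obs))
    nse = 1.0 - sq_err / ss_o

    # Pearson correlation between simulated and observed series.
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(sim, obs))
    pearson_r = cov / math.sqrt(ss_s * ss_o)

    # Ratio of simulated to observed variance (the 1/n factors cancel).
    variance_ratio = ss_s / ss_o

    return bias_percent, nse, pearson_r, variance_ratio
```

A simulation that doubles every observed value illustrates why the measures need to be read together: the correlation stays at 1, while the bias is +100 %, the variance ratio is 4 and the NSE becomes negative.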

In summary, the domain-average deviations from GFDCL show that the updating procedure adds value to the simulations by keeping the precipitation and temperature climate closer to the GFDCL data set when compared to the alternative of using uncorrected data (e.g., OD). The extension of GFDCL with GFDEI has only minor effects on the long-term hydrology. However, for forecast initializations, the inevitable switch to OD data when approaching the current date will cause a strong drift due to the wet bias of OD in the northern hemisphere regions. The drift continues throughout the OD period, which means that the initial drift a forecast is subjected to is dependent on the day of the forecast. The drift is largest for forecasts issued just before the 10th and lowest just after. This warrants future development to look for a method to adjust the deterministic forecast data (OD). In highly seasonal regions with little interannual variability, OD could be adjusted with the monthly climatological mean precipitation and temperature; however, it should be investigated whether this worsens simulations in regions with high interannual variability. Such a correction could also be used within the forecasting period; however, it is reserved as the subject of future study.
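The climatological adjustment of OD suggested above is left for future study, so the following Python sketch only illustrates what such a correction might look like. It assumes a multiplicative scaling for precipitation and an additive shift for temperature, both per calendar month; the function names and the dictionary layout (calendar month mapped to a climatological mean) are hypothetical.

```python
def monthly_precip_factors(ref_clim, od_clim):
    """Per-month scaling factors: reference mean divided by OD mean.

    ref_clim and od_clim map calendar month (1-12) to the climatological
    mean precipitation of the reference and OD data (hypothetical layout).
    A factor of 1.0 is used where the OD mean is zero.
    """
    return {
        month: (ref_clim[month] / od_clim[month] if od_clim[month] > 0 else 1.0)
        for month in ref_clim
    }


def adjust_precip(daily_precip, month, factors):
    """Scale daily OD precipitation of one calendar month (assumed approach)."""
    return [p * factors[month] for p in daily_precip]


def adjust_temp(daily_temp, month, ref_clim, od_clim):
    """Shift daily OD temperature by the climatological mean offset."""
    offset = ref_clim[month] - od_clim[month]
    return [t + offset for t in daily_temp]
```

A scaling of this kind preserves dry days and day-to-day variability while moving the monthly mean toward the reference climatology. It cannot account for year-to-year anomalies, which is why its effect in regions with high interannual variability would need to be checked.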

4 Conclusions

We present and evaluate a new data set called HydroGFD, which consists of several interim products to fill the gap between available climatological and forecasted data. The main product, GFDCL, is the methodological equivalent to the already well established WFDEI (Weedon et al., 2014), although with updated observational data sets. To extend the data set beyond year 2013, when, for example, the GPCC7 data set ends, adjustments are performed with regularly updated data sets. This is performed with the GFDEI product until the latest update of EI, which has a delay of about 3 months. For near-real-time updates, GFDOD makes use of the ECMWF deterministic model with similar data sets for adjustments as for GFDEI. GFDOD is available until the end of the previous month from around the 10th of the current month.
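The sequence of products along the time axis can be sketched in Python as a simple lookup. This is an illustration under stated assumptions, not the operational logic: the end of GFDCL is set to 31 December 2013, the monthly update is assumed to happen exactly on the 10th, EI is assumed to end exactly 3 calendar months before the issue month, and all names (`GFDCL_END`, `forcing_product`, `od_period_days`) are hypothetical.

```python
from datetime import date, timedelta

GFDCL_END = date(2013, 12, 31)  # assumed end of the climatological product


def month_end_before(d, months_back):
    """Last day of the month lying months_back (>= 1) months before d's month."""
    # Month index (counted from year 0) of the month after the target month.
    idx = d.year * 12 + (d.month - 1) - months_back + 1
    first_of_following = date(idx // 12, idx % 12 + 1, 1)
    return first_of_following - timedelta(days=1)


def gfdod_end(issue_date, update_day=10):
    """Last day covered by GFDOD, assuming the update runs on update_day."""
    lag = 1 if issue_date.day >= update_day else 2
    return month_end_before(issue_date, lag)


def forcing_product(day, issue_date, update_day=10, ei_delay_months=3):
    """Name of the product assumed to force the model on a given day."""
    if day > issue_date:
        raise ValueError("day lies after the issue date")
    if day <= GFDCL_END:
        return "GFDCL"
    if day <= month_end_before(issue_date, ei_delay_months):
        return "GFDEI"
    if day <= gfdod_end(issue_date, update_day):
        return "GFDOD"
    return "OD"


def od_period_days(issue_date, update_day=10):
    """Number of days up to the issue date that are forced with uncorrected OD."""
    return (issue_date - gfdod_end(issue_date, update_day)).days
```

With these assumptions, a forecast issued on the 9th of a month following a 31-day month is preceded by 40 days of uncorrected OD forcing, whereas one issued on the 10th is preceded by only 10 days.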

GFDCL is found to be a very similar product to WFDEI but with a more consistent data set. The introduced undercatch corrections in the precipitation data set GPCC7 differ from those assumed in WFDEI, which leads to generally lower amounts in GFDCL. Temperature is very similar.

The updates in GFDEI beyond 2013 are evaluated for an overlapping period (2010–2013). GFDEI is found to have slightly lower precipitation amounts and spatially somewhat different temperatures. However, these differences from GFDCL are small compared to the bias of EI, which is often an order of magnitude higher.

When EI is not available, the OD model is employed and the precipitation data source changes from GPCC-monitor to GPCC-FG. The change in data source has the largest impact, with several geographical differences that affect the GFDOD product. As an interim product until the next update, GFDOD reduces the bias of OD (which is similar to that of EI) to levels similar to GFDEI.

Initializations of hydrological simulations for forecasting purposes are investigated for GFDOD and extended by the non-corrected OD until the day before the next update of GFDOD. It is found that the strong bias of OD, especially for precipitation, causes a severe drift of the hydrological model away from the GFDOD climatology. The results are similar for both of the domains investigated, i.e., Europe and the Arctic region. Some measure to reduce the induced drift due to bias of OD would be necessary for reliable forecasts. Further, as HydroGFD data are updated, it is necessary to rerun the hydrological model from the last update of EI, i.e., for the last 3 months. The effect of the updating procedure will be that the forecast just after the update will not be consistent with the one from the day before due to the change in the last few months and the initial state at the time of the forecast. Analysis of the forecasts was not part of the current study.

HydroGFD is currently applied for forecasts with HYPE models in the Niger River basin, which is evaluated in Andersson et al. (2017), in the Arctic, and for seasonal forecasts in a concept study for the Copernicus Climate Change Service, available from the sectoral information services website.

The HydroGFD data sets are planned for public release via a web interface. An updated version of HydroGFD using the new reanalysis system ERA-5 and introducing further observational data sets is foreseen during 2018.

Data availability

The HydroGFD method relies mainly on open data sets, as referenced within the article. The ECMWF reanalysis can be accessed via the ECMWF web portal. The forecasts from ECMWF (here referred to as “OD”) are restricted to member institutes (or other special circumstances) and are therefore not available for public download. However, HydroGFD will shortly appear online. Hydrological simulations were performed with the open source model HYPE, which is available online.

Competing interests

The authors declare that they have no conflict of interest.


Acknowledgements

We acknowledge data from ERA-Interim, CRU, GPCC and WFDEI as referenced within the paper, as well as GHCN-CAMS (National Center for Atmospheric Research Staff (Eds.). Last modified 8 May 2014. “The Climate Data Guide: GHCN (Global Historical Climatology Network) Related Gridded Products”) and the ECMWF deterministic forecast system. Further, we acknowledge the initial work of implementing the HydroGFD system at SMHI by Lisa Bengtsson, Magnus Lindskog and Heiner Körnich and the work on operationalization by Fredrik Almén.

Edited by: Nadia Ursino
Reviewed by: Graham Weedon and one anonymous referee


References

Adam, J. C. and Lettenmaier, D. P.: Adjustment of global gridded precipitation for systematic bias, J. Geophys. Res.-Atmos., 108, 4257, 2003.

Alfieri, L., Burek, P., Dutra, E., Krzeminski, B., Muraro, D., Thielen, J., and Pappenberger, F.: GloFAS – global ensemble streamflow forecasting and flood early warning, Hydrol. Earth Syst. Sci., 17, 1161–1175, 2013.

Allen, R. G., Pereira, L. S., Raes, D., and Smith, M.: Crop evapotranspiration – Guidelines for computing crop water requirements, FAO Irrigation and drainage paper 56, FAO, Rome, 1998.

Andersson, J., Pechlivanidis, I., Gustafsson, D., Donnelly, C., and Arheimer, B.: Key factors for improving large-scale hydrological model performance, European Water, 49, 77–88, 2015.

Andersson, J. C., Ali, A., Arheimer, B., Gustafsson, D., and Minoungou, B.: Providing peak river flow statistics and forecasting in the Niger River basin, Phys. Chem. Earth Pt. A/B/C, 100, 3–12, 2017.

Arheimer, B., Dahné, J., Donnelly, C., Lindström, G., and Strömqvist, J.: Water and nutrient simulations using the HYPE model for Sweden vs. the Baltic Sea basin – influence of input-data quality and scale, Hydrol. Res., 43, 315–329, 2012.

Beck, H. E., van Dijk, A. I. J. M., Levizzani, V., Schellekens, J., Miralles, D. G., Martens, B., and de Roo, A.: MSWEP: 3-hourly 0.25° global gridded precipitation (1979–2015) by merging gauge, satellite, and reanalysis data, Hydrol. Earth Syst. Sci., 21, 589–615, 2017.

Chen, M., Shi, W., Xie, P., Silva, V. B. S., Kousky, V. E., Wayne Higgins, R., and Janowiak, J. E.: Assessing objective techniques for gauge-based analyses of global daily precipitation, J. Geophys. Res.-Atmos., 113, D04110, 2008.

CPCtemp:, last access: 18 December 2017.

Dee, D. P., Uppala, S. M., Simmons, A. J., Berrisford, P., Poli, P., Kobayashi, S., Andrae, U., Balmaseda, M. A., Balsamo, G., Bauer, P., Bechtold, P., Beljaars, A. C. M., van de Berg, L., Bidlot, J., Bormann, N., Delsol, C., Dragani, R., Fuentes, M., Geer, A. J., Haimberger, L., Healy, S. B., Hersbach, H., Hólm, E. V., Isaksen, L., Kållberg, P., Köhler, M., Matricardi, M., McNally, A. P., Monge-Sanz, B. M., Morcrette, J.-J., Park, B.-K., Peubey, C., de Rosnay, P., Tavolato, C., Thépaut, J.-N., and Vitart, F.: The ERA-Interim reanalysis: configuration and performance of the data assimilation system, Q. J. Roy. Meteorol. Soc., 137, 553–597, 2011.

Demirel, M. C., Booij, M. J., and Hoekstra, A. Y.: Effect of different uncertainty sources on the skill of 10 day ensemble low flow forecasts for two hydrological models, Water Resour. Res., 49, 4035–4053, 2013.

Donnelly, C., Andersson, J., and Arheimer, B.: Using flow signatures and catchment similarities to evaluate a multi-basin model (E-HYPE) across Europe, Hydrolog. Sci. J., 61, 255–273, 2016.

Dyurgerov, M. and Meier, M.: Year-to-year fluctuations of global mass balance of small glaciers and their contribution to sea-level changes, Arct. Alp. Res., 29, 392–402, 1997.

Emerton, R. E., Stephens, E. M., Pappenberger, F., Pagano, T. C., Weerts, A. H., Wood, A. W., Salamon, P., Brown, J. D., Hjerdt, N., Donnelly, C., and Baugh, C. A.: Continental and global scale flood forecasting systems, WIREs Water, 3, 391–418, 2016.

Fan, Y. and Van den Dool, H.: A global monthly land surface air temperature analysis for 1948–present, J. Geophys. Res.-Atmos., 113, D01103, 2008.

Gelfan, A., Gustafsson, D., Motovilov, Y., Kalugin, A., Krylenko, I., and Lavrenov, A.: Climate change impact on the water regime of two great Arctic rivers: modeling and uncertainty issues, Climatic Change, 141, 1–17, 2017.

Harris, I. and Jones, P.: CRU TS3.22: Climatic Research Unit (CRU) Time-Series (TS) Version 3.22 of High Resolution Gridded Data of Month-by-month Variation in Climate (Jan. 1901–Dec. 2013), NCAS British Atmospheric Data Centre, 24 September 2014, 2014.

Hirpa, F. A., Salamon, P., Alfieri, L., del Pozo, J. T., Zsoter, E., and Pappenberger, F.: The Effect of Reference Climatology on Global Flood Forecasting, J. Hydrometeorol., 17, 1131–1145, 2016.

Huffman, G. J., Adler, R. F., Bolvin, D. T., and Gu, G.: Improving the global precipitation record: GPCP Version 2.1, Geophys. Res. Lett., 36, L17808, 2009a.

Huffman, G. J., Adler, R. F., Bolvin, D. T., and Nelkin, E. J.: The TRMM Multi-Satellite Precipitation Analysis (TMPA), in: Satellite Rainfall Applications for Surface Hydrology, Springer Netherlands, Dordrecht, 3–22, 2009b.

Hundecha, Y., Arheimer, B., Donnelly, C., and Pechlivanidis, I.: A regional parameter estimation scheme for a pan-European multi-basin model, J. Hydrol. Reg. Stud., 6, 90–111, 2016.

Legates, D. and Willmott, C.: Mean seasonal and spatial variability in gauge-corrected, global precipitation, Int. J. Climatol., 10, 111–127, 1990.

Li, H., Luo, L., Wood, E. F., and Schaake, J.: The role of initial conditions and forcing uncertainties in seasonal hydrologic forecasting, J. Geophys. Res., 114, D04114, 2009.

Lindström, G., Pers, C., Rosberg, R., Strömqvist, J., and Arheimer, B.: Development and test of the HYPE (Hydrological Predictions for the Environment) model – A water quality model for different spatial scales, Hydrol. Res., 41.3-4, 295–319, 2010.

Meier, M. F. and Bahr, D. B.: Counting glaciers: Use of scaling methods to estimate the number and size distribution of the glaciers of the world, in: Glaciers, Ice Sheets and Volcanoes: A Tribute to Mark F. Meier, vol. 96, DTIC Document, CRREL Special Report, CRREL, Hanover, USA, 89–94, 1996.

Oudin, L., Hervieu, F., Michel, C., Perrin, C., Andreassian, V., Anctil, F., and Loumagne, C.: Which potential evapotranspiration input for a lumped rainfall-runoff model: Part 2 – Towards a simple and efficient potential evapotranspiration model for rainfall-runoff modelling, J. Hydrol., 303, 290–306, 2005.

Paiva, R. C. D., Collischonn, W., Bonnet, M. P., and De Goncalves, L. G. G.: On the sources of hydrological prediction uncertainty in the Amazon, Hydrol. Earth Syst. Sci., 16, 3127–3137, 2012.

Pechlivanidis, I. G., Bosshard, T., Spångmyr, H. L. G., Gustafsson, D., and Arheimer, B.: Uncertainty in the Swedish operational hydrological forecasting systems, in: Vulnerability, Uncertainty, and Risk: Quantification, Mitigation and Management, edited by: Beer, M., Au, S. K., and Hall, J. M., CDRM, Liverpool, UK, 253–262, 2014.

Schamm, K., Ziese, M., Becker, A., Finger, P., Meyer-Christoffer, A., Rudolf, B., and Schneider, U.: GPCC First Guess Daily Product at 1.0°: Near Real-Time First Guess daily Land-Surface Precipitation from Rain-Gauges based on SYNOP Data, Global Precipitation Climatology Centre, GPCC, Deutscher Wetterdienst, Offenbach, Germany, 2013.

Schneider, U., Becker, A., Finger, P., Meyer-Christoffer, A., Ziese, M., and Rudolf, B.: GPCC's new land surface precipitation climatology based on quality-controlled in situ data and its role in quantifying the global water cycle, Theor. Appl. Climatol., 115, 15–40, 2014.

Schneider, U., Becker, A., Finger, P., Meyer-Christoffer, A., Rudolf, B., and Ziese, M.: GPCC Monitoring Product: Near Real-Time Monthly Land-Surface Precipitation from Rain-Gauges based on SYNOP and CLIMAT data, Global Precipitation Climatology Centre, GPCC, Deutscher Wetterdienst, Offenbach, Germany, 2015a.

Schneider, U., Becker, A., Finger, P., Meyer-Christoffer, A., Rudolf, B., and Ziese, M.: GPCC Full Data Reanalysis Version 7.0 at 0.5°: Monthly Land-Surface Precipitation from Rain-Gauges built on GTS-based and Historic Data, Global Precipitation Climatology Centre, GPCC, Deutscher Wetterdienst, Offenbach, Germany, 2015b.

Sheffield, J., Goteti, G., and Wood, E. F.: Development of a 50-yr high-resolution global dataset of meteorological forcings for land surface modeling, J. Climate, 19, 3088–3111, 2006.

Shukla, S. and Lettenmaier, D. P.: Seasonal hydrologic prediction in the United States: Understanding the role of initial hydrologic conditions and seasonal climate forecast skill, Hydrol. Earth Syst. Sci., 15, 3529–3538, 2011.

Strömqvist, J., Arheimer, B., Dahné, J., Donnelly, C., and Lindström, G.: Water and nutrient predictions in ungauged basins – Set-up and evaluation of a model at the national scale, Hydrolog. Sci. J., 57, 229–247, 2012.

Weedon, G. P., Gomes, S., Viterbo, P., Shuttleworth, W., Blyth, E., Österle, H., Adam, C., Bellouin, N., Boucher, O., and Best, M.: Creation of the watch forcing data and its use to assess global and regional reference crop evaporation over land during the twentieth century, J. Hydrometeorol., 12, 823–848, 2011.

Weedon, G. P., Balsamo, G., Bellouin, N., Gomes, S., Best, M. J., and Viterbo, P.: The WFDEI meteorological forcing data set: WATCH Forcing Data methodology applied to ERA-Interim reanalysis data, Water Resour. Res., 50, 7505–7514, 2014.

Ziese, M., Becker, A., Finger, P., Meyer-Christoffer, A., Rudolf, B., and Schneider, U.: GPCC First Guess Product at 1.0°: Near Real-Time First Guess monthly Land-Surface Precipitation from Rain-Gauges based on SYNOP Data, Global Precipitation Climatology Centre, GPCC, Deutscher Wetterdienst, Offenbach, Germany, 2011.

Short summary
A new product (Global Forcing Data, GFD) that provides bias-adjusted meteorological forcing data for impact models, such as hydrological models, is presented. The main novelty of the product is the near-real-time updating of the data, which allows more up-to-date impact modeling. This is performed by combining climatological data sets with climate monitoring data sets. The potential of using the data to initialize hydrological forecasts is further investigated.