Articles | Volume 23, issue 3
Research article
14 Mar 2019
Research article |  | 14 Mar 2019

Contribution of potential evaporation forecasts to 10-day streamflow forecast skill for the Rhine River

Bart van Osnabrugge, Remko Uijlenhoet, and Albrecht Weerts

Medium-term hydrologic forecast uncertainty is strongly dependent on the forecast quality of meteorological variables. Of these variables, the influence of precipitation has been studied most widely, while temperature, radiative forcing and their derived product potential evapotranspiration (PET) have received little attention from the perspective of hydrological forecasting. This study aims to fill this gap by assessing the usability of potential evaporation forecasts for 10-day-ahead streamflow forecasting in the Rhine basin, Europe. In addition, the forecasts of the meteorological variables are compared with observations.

Streamflow reforecasts were performed with the daily wflow_hbv model used in previous studies of the Rhine using the ECMWF 20-year meteorological reforecast dataset. Meteorological forecasts were compared with observed rainfall, temperature, global radiation and potential evaporation for 148 subbasins. Secondly, the effect of using PET climatology versus using observation-based estimates of PET was assessed for hydrological state and for streamflow forecast skill.

We find that (1) there is considerable skill in the ECMWF reforecasts to predict PET for all seasons, and (2) using dynamical PET forcing based on observed temperature and satellite global radiation estimates results in lower evaporation and wetter initial states, but (3) the effect on forecasted 10-day streamflow is limited. Implications of this finding are that it is reasonable to use meteorological forecasts to forecast potential evaporation and use this is in medium-range streamflow forecasts. However, it can be concluded that an approach using PET climatology is also sufficient, most probably not only for the application shown here, but also for most models similar to the HBV concept and for moderate climate zones.

As a by-product, this research resulted in gridded datasets for temperature, radiation and potential evaporation based on the Makkink equation for the Rhine basin. The datasets have a spatial resolution of 1.2×1.2 km and an hourly time step for the period from July 1996 through 2015. This dataset complements an earlier precipitation dataset for the same area, period and resolution.

1 Introduction

Hydrologic forecasting has the aim of predicting the future state of important hydrologic fluxes, most notably streamflow. Throughout the process of forecasting, from model setup via initial state estimation to forecast run, meteorological forcing is a key component. Precipitation is known to be the main driver of hydrological processes, and most of the forecast uncertainty is attributed to inaccurate precipitation forcing (Cuo et al.2011; Pappenberger et al.2005). As a consequence, most attention has been given to the accuracy of precipitation forecasts. See for example the review of Cloke and Pappenberger (2009).

Evaporation is a result of the interaction between meteorological forcing and physical and physiological processes at the land surface. Meteorological forcing provides the potential energy (potential evaporation or PET) for evaporative processes to take place. There are many formulas to estimate the potential energy available for evaporation, which can be divided in three types of formulas based on their data requirements (Xystrakis and Matzarakis2011; Xu and Singh2002): temperature-based (e.g., Hargreave's equation, Hammon's equation), radiation-based, and combined methods (e.g., Hansen's equation, Turc's equation, Makkink's equation). From an operational viewpoint the different types of formulas result in different demands on data availability.

Constraints on data availability have led to additional approximations for potential evaporation. A common approximation is the calculation of a monthly potential evaporation climatology or PET demand curves (Bowman et al.2016). This climatology is then used as a driver for both historic potential evaporation and future potential evaporation.

Hydrological models have proven to be insensitive to the difference between variable potential evaporation forcing and climatological monthly potential evaporation forcing with respect to the model's potential to estimate streamflow after calibration (Andréassian et al.2004; Oudin et al.2005a, b). However, in forecasting, different choices in the handling of forcing data can be made between the historic update step and the forecast step, while the hydrological model, as a rule, remains the same. It therefore remains relevant to understand how a single model reacts to potential evaporation forcing. Insensitivity to the type of potential evaporation during the process of calibration does not mean that a model is insensitive to the form of potential evaporation input.

As mentioned above, there has been little attention to the forecast skill of the secondary forcing variables temperature and radiation in the hydrological context of potential evaporation. Furthermore, there is an easy and often used practice of avoiding potential evaporation forecasts by using a potential evaporation climatology. Therefore, the objective of this study is to assess to what extent potential evaporation forecasts can contribute to streamflow forecast skill.

This question is evaluated for the Rhine basin in Europe (Fig. 1). The Rhine is one of the basins currently employed as a case study for the IMproving PRedictions of EXtremes (IMPREX) project, which aims to improve predictions and management of hydrological extremes through climate services (van den Hurk et al.2016).

Several studies already directly addressed some aspects of operational ensemble flow forecasts in the Rhine. Renner et al. (2009) showed that at the time meaningful hydrological ensemble forecasts could be produced up to a 9-day lead time for the Rhine River based on ECMWF ensemble meteorological forecasts. Reggiani et al. (2009) used a Bayesian ensemble uncertainty processor to improve the assessment of uncertainty in the ensemble forecast. Terink et al. (2010) applied downscaling techniques to ERA15 ECMWF reanalysis data to develop a downscaling strategy for regional climate models (RCMs). Verkade et al. (2013) developed post-processing techniques to improve the precipitation and temperature ECMWF forecasts for the hydrological model. Photiadou et al. (2011) compared two historical precipitation datasets and assessed the influence of precipitation datasets on model results. Recently, van Osnabrugge et al. (2017) developed a high-resolution hourly precipitation dataset for use with gridded hydrologic models.

To answer the research question, model experiments are performed, but first the data and hydrological model are presented (Sect. 2). Second, the model experiments are described, which also partitions the main question into three subquestions (Sect. 3) which are subsequently answered (Sect. 4). The paper concludes with a discussion on the results in the wider context of evaporation modeling in hydrologic forecasting and the conclusions (Sect. 5).

Figure 1Map of the Rhine basin, Europe. Black lines delineate 148 subbasins used in the analysis of the meteorological forecast skill. Square markers show the locations used for forecast skill analysis.


2 Data and model

Observational data have been preprocessed for use with a grid-based hydrological model. The data were processed with hourly time resolution, on a 1.2×1.2 km grid spatial resolution, and for the period mid 1996 through 2015. All source data to derive the gridded estimates come from sources that supply their data in near real time, making the datasets suitable for operational forecasting. For this study all data were aggregated to a daily time step. The hourly datasets are downloadable through the 4TU data center (van Osnabrugge2017, 2018).

2.1 Precipitation

For this study the precipitation dataset from van Osnabrugge (2017) is used. The precipitation dataset has been derived using the genRE interpolation method based on ground measurements and the HYRAS (Rauthe et al.2013) climatological precipitation dataset (van Osnabrugge et al.2017).

2.2 Temperature

Temperature observations (1996–2016) are interpolated on the same 1.2×1.2 km grid as the precipitation data. Temperature is derived from interpolation of ground measurements with correction for elevation using the SRTM digital elevation model (Farr et al.2007) and standard lapse rate as follows.

To calculate temperature Tx at a given grid cell x from a number of n surrounding stations, determine a set of weights based on inverse-distance squared weighting between all stations (typically the n closest stations) and the grid cell. This step can have a threshold for maximum distance. di,x is the distance between station i and cell x:

(1) w i , x = 1 d i , x 2 i = 1 n 1 d i , x 2 .

Second, interpolate the measured temperature Tm,i with the weights as with standard inverse-distance squared interpolation:

(2) T m , x = i = 1 n T m , i w i , x .

Third, calculate the temperature lapse correction term Tγ,x as the weighted difference between the height of the grid cell Hx and the height of the considered stations Hi multiplied by the lapse rate γ.

(3) T γ , x = γ i = 1 n H i - H x w i , x

Note that Tγ,x is static for a fixed configuration of the measurement network if γ is taken to be a constant. In this study the configuration of the measurement network changed based on the number of reporting stations at each time step. A constant lapse rate was assumed: γ=0.0066 (C m−1).

The final temperature estimate for grid cell x is obtained by adding Tγ,x and Tm,x:

(4) T x = T γ , x + T m , x .

2.3 Downward shortwave surface radiation flux

The availability of solar radiation measurements at the surface has proven to be spatially and temporally inadequate for many applications, with remotely sensed solar radiation products having the largest potential to remedy this (Journée and Bertrand2010). Remotely sensed solar radiation estimates from the Land Surface Analysis Satellite Application Facility (LSA-SAF) were found to be in closer agreement with ground observations than reanalysis datasets such as the Système d'Analyze Fournissant des Renseignements Atmosphériques à la Neige (SAFRAN) reanalysis (Carrer et al.2012) and ERA-Interim (Jedrzej et al.2014).

For this study, downward shortwave radiation is resampled and merged from the EUMETSAT Surface Incoming Solar Radiation (SIS) (Mueller et al.2009) and Downward Surface Shortwave Flux (DSSF) (Trigo et al.2011) products from the Climate Monitoring Satellite Application Facility (CM-SAF) and LSA-SAF, respectively. Gaps in the satellite data are filled with the ERA5 surface solar radiation downwards (ssrd) parameter from the 4d-var reanalysis (Copernicus Climate Change Service2018). ERA5 was found to have comparable mean bias with satellite-derived products for inland stations (Urraca et al.2018).

In earlier research it has been shown that LSA-SAF (2005–current) and CM-SAF (1983–2005) can consistently be merged into one time series (Jedrzej et al.2014). The products of the different SAFs are comparable in terms of bias and standard deviation (Ineichen et al.2009).

2.4 Makkink potential evaporation

There are different approaches in making use of remotely sensed data to calculate evapo(transpi)ration. One branch of research aims to calculate actual evapotranspiration directly from satellite imagery (Su2002). Applications range from estimating the global evaporation flux (Mu et al.2011), water resources management (Bastiaanssen et al.2005) and constraining model parameters for a gridded model (Immerzeel and Droogers2008).

For operational use, PET estimates can be derived from satellite data only, or from a combination of satellite imagery and ground measurements. Bowman et al. (2017) explored the use of MODIS to provide a daily PET, both as dynamic PET (Spies et al.2015) and PET climatology (Bowman et al.2016) for a gridded and lumped version of the Sacramento Soil Moisture Accounting (SAC-SMA) model. The model was recalibrated for each PET input. No configuration with MODIS-derived PET showed consistent improvements across all basins in their case study. Still, it was concluded that the combination of dynamic PET in combination with a gridded model had the best overall results (Bowman et al.2017).

A disadvantage of using satellites such as MODIS is their temporal coverage which is restricted to a single overpass at a set time each day giving one instantaneous value. This can be resolved by assuming a sinusoidal development of PET over the day (Kim and Hogue2008), but the limitation is clear. This disadvantage is resolved by using geostationary satellites. For example, Jacobs et al. (2009) used solar radiation from the NOAA GOES geostationary satellite in combination with ground observations to calculate daily PET with the Penman–Monteith equation.

Here, potential evaporation is calculated from geostationary satellite radiation estimates and ground observations of temperature with the method proposed by Makkink (1957), which is applicable with remotely sensed radiation estimates (de Bruin et al.2016). PET calculated with Makkink's equation is a reference crop evapotranspiration, which means that crop factors apply determined by the hydrological model. In the setup of our hydrological model the crop factor was determined by land use. A crop factor of 1.15 is applied to the forested areas, and 1.0 to all others.

The reasons for choosing the Makkink equation are that (1) it only needs radiation and temperature, for which gridded estimations are available, and that (2) the Makkink equation is used by the Royal Netherlands Meteorological Institute (KNMI), so that the work presented here is compatible with ongoing local research (Hiemstra and Sluiter2011).

The potential evaporation is calculated based on air temperature T (C) and downward shortwave radiation Rg (W m−2) for accumulation period t (s) (Hiemstra and Sluiter2011):

(5) PET = 1000 0.65 Δ Δ + ψ t R g λ ρ w ( mm ) ,

with ψ the psychrometric constant, λ the latent heat of water, Δ the slope of the saturation vapor pressure curve and ρw the density of water calculated by


The Makkink potential evaporation calculated for each time step is called “near real time” (PETNRT). The potential evaporation climatology (PETClim) was calculated by averaging over the full time period (20 years) for each day (Fig. 2).

Figure 2Difference between climatology and near-real-time potential evaporation. Shown for the year 2004 for grid cell x:200, y:200.


2.5 ECMWF reforecast

The European Center for Medium-Range Weather Forecasts (ECMWF) issues hindcasts produced with the current model cycle for certain days for the last 20 years. The reforecast obtained for this study was produced with model cycle 43r1 (Buizza et al.2017). The first forecast is on 10 March 1996 and the last forecast on 29 December 2015, with reforecasts alternating every 3 or 4 days.

Forecasted Makkink potential evaporation (PETFcast) is calculated based on the t2m (T) and ssrd (Rg) variables using Eqs. (5)–(9). Temperature was first downscaled to the model resolution using the standard lapse rate as used in the interpolation of the temperature observations as follows:

(10) T x = T + h - h x γ ,

with T the temperature given by the ECMWF forecast on the ECMWF resolution, h the average height of the DEM corresponding to the footprint of the ECMWF grid cell, hx the height of cell x in the model, and γ the lapse rate.

2.6 Hydrological model

wflow is a modular hydrological modeling framework that allows for easy implementation and prototyping of regular grid hydrological model concepts in python-pcraster (Schellekens et al.2017). The hydrological model concept used is the HBV (Hydrologiska Byråns Vattenbalansavdelning) model concept (Lindström et al.1997) applied on a grid basis. The generated runoff is routed through the river network with a kinematic wave approach (Schellekens et al.2017). In the following this model is referred to as wflow_hbv. The setup of the hydrological model is the same as used in assessing the validity of the genRE precipitation dataset (van Osnabrugge et al.2017). The model was parameterized through calibration with a generalized likelihood uncertainty estimation (GLUE) like procedure (Beven and Binley1992), using HYRAS precipitation as forcing data (Winsemius et al.2013a, b). The model is taken “as is” and is not recalibrated for each PET forcing, the effect of which has been studied extensively elsewhere (e.g., Bowman et al.2017; Oudin et al.2005a).

3 Experimental setup

The analysis consists of a meteorological part and a hydrological part (Fig. 3).

Figure 3Flow chart of the model experiment. Blue boxes represent data products. Green boxes depict modeling activities. Arrows represent the flow of data for historical runs (blue lines) and forecast runs (black). The red boxes indicate the areas for analysis of the results, each box targeting a research subquestion.


3.1 Analysis of meteorological forecast skill

In this analysis we aim to answer the following question. What is the forecast skill of temperature, radiation and potential evaporation compared to precipitation?

For this purpose the observations and forecasts are spatially averaged over 148 subbasins (Fig. 1). Time series of observations and forecasts are then used to calculate the mean continuous ranked probability skill score (CRPSS) for each basin and each season (MAM, JJA, SON, DJF).

The mean continuous ranked probability score (CRPS) is an overall measure of forecast quality and is calculated by

(11) CRPS = 1 n i = 1 n - F y ( y ) - H ( y x ) d y ,

in which Fy(y) is the cumulative distribution function of the forecast variable and H(yx) the Heaviside step function that assumes probability 1 for values greater than or equal to the observation and 0 otherwise (Brown et al.2010). Interpretation of the mean CRPS is similar to interpretation of a root mean square error. Both scores have no fixed upper bound, their magnitude is determined by the variable, and lower scores are better, with 0 the perfect score.

The limits of the mean CRPS vary depending on the basin and season, and it is therefore difficult to compare between basins and season. For this reason the CRPS is translated into the continuous ranked probability skill score, which measures the performance of a forecasting system relative to a reference forecast. The reference forecast here is seasonal climatology. As such the CRPSS equals 1 for a perfect forecast and 0 when the forecast ensemble does not score a better CRPS than the CRPS calculated for the climatological distribution.


Additionally, the relative mean error (RME) is calculated for the mean of the forecasts Yi to detect relative biases in the mean:

(13) RME = i = 1 n Y i - x i i = 1 n x i ,

in which Yi is the mean of the ensemble for forecast i and xi the corresponding observation.

The above scores are calculated with the Ensemble Verification System (EVS), a software package to calculate ensemble verification metrics (Brown et al.2010).

3.2 Analysis of the effect of PET forecasts on streamflow predictions

In this second part of the analysis we aim to answer the following questions.

  1. To what extent are initial states affected by the use of climatological versus near-real-time potential evaporation?

  2. To what extent can potential evaporation forecasts contribute to streamflow forecast skill?

To answer the first question, the wflow_hbv model is consecutively forced with PETClim and PETNRT. Four states and two fluxes are exported for analysis: (1) upper soil reservoir, (2) lower soil reservoir, (3) interception storage, (4) soil moisture store and fluxes, (5) discharge and (6) actual evaporation. For the different states and fluxes the mean difference (MD) is calculated for each grid cell. This is done for each season to investigate seasonality of differences. The MD is calculated as

(14) MD = i = 1 n STATE NRT , i - STATE CLIM , i n .

To answer the second question, two hindcast runs are performed with PETFcast and PETClim as PET forcing, respectively. To avoid effects caused by the initial state, all forecasts start from the initial states derived from the PETNRT simulation. Forecast skill scores are calculated as for the meteorological variables for 20 discharge gauges and for each season. Different from the meteorological verification exercise, the metrics are calculated for the forecasts with reference to the model output, and are not compared with observations. The reason for this was that differences between observation and forecast stem from many different sources, including errors in the initial state. Subsequently, a forecast that is “too wet” might compensate in the 10-day forecast for initial states that were “too dry”. For this reason the effect of the meteorological forecast was isolated by calculating the verification metrics against modeled streamflow. This also avoids issues of perceptive bias due to the model being calibrated on another PET forcing; one of the PET types might simply perform better because it is more like the original PET used in calibration.

Streamflow gauges for analysis were selected such that

  1. only gauges were chosen for which the model was deemed behavioral as expressed by a KGE score threshold of 0.5;

  2. only one gauge was selected for each stream in the basin, except for the Rhine River itself, for which two additional gauges were chosen. If multiple gauges in the same stream were present, the gauge most downstream was chosen. The gauge “most downstream” was selected by sorting on mean yearly discharge and picking the highest; and

  3. from the then remaining list, the largest 20 streams were selected for analysis.

The streamflow locations are shown in Fig. 1 as black squares including the name of the river.

4 Results

4.1 Analysis of meteorological forecast skill

The forecast skill is assessed for all catchments and for each season. Seasons are Northern Hemisphere seasons spring (MAM), summer (JJA), autumn (SON), and winter (DJF). Figure 4 shows the mean CRPSS calculated for subsamples of all forecast–observation pairs for different levels of exceedance, P(Xx), for each variable. Simply put, the CRPSS value at P(Xx)=0.1 is calculated for the top 10 % of observations and the CRPSS value at P(Xx)=0.7 is calculated for the highest 70 % of the observations. P(Xx) is calculated over all observations from all seasons. This means that for some seasons, for example temperature in winter, there is a lower limit in P(Xx), because the highest temperatures do not occur during winter. On the other hand, the response of the CRPSS curve is flat for high P(Xx) for temperature during summer, as all summer temperatures fall in the highest 60 % of temperatures of the whole year. The same is shown for the CRPS, Fig. 5, and the RME, Fig. 6.

Figure 4Continuous ranked probability skill score (CRPSS) for the four forcing variables benchmarked against sample climatology for the 148 HBV subbasins. CRPSSs are aggregated into mean (solid), 10th and 90th percentiles (dashed). Note that the CRPSS at P(Xx)=0.1 or 0.7 is calculated over, respectively, the 10 % and 70 % highest observation–forecast pairs, conditioned on the observations.


Figure 5Continuous ranked probability score (CRPS) for the four forcing variables for the 148 HBV subbasins for the whole year. CRPS is aggregated into mean (solid), 10th and 90th percentiles (dashed). Note that the CRPS at P(Xx)=0.1 or 0.7 is calculated over, respectively, the 10 % and 70 % highest observation–forecast pairs, conditioned on the observations.


There is no skill in the ECMWF forecast beyond 10 days for daily precipitation. This is consistent with the 9-day lead time in streamflow forecasts found by Renner et al. (2009). The skill is best in winter and worst in summer, which is expected based on the dominating meteorological processes (frontal systems in winter and convective events in summer). The total amount of precipitation is underestimated after a 1-day lead time (Fig. 6).

There is more skill in the forecast for the variables temperature and incoming shortwave radiation. Likewise, there is considerable skill remaining in the potential evaporation forecast. For temperature the 1-day forecast is close to perfect for autumn and spring. The skill in temperature forecast is similar for spring, summer and autumn but worse during winter. The spread, the difference in skill between basins, is also largest during winter and spring. The RME shows that there is a small negative bias in the temperature forecasts. The RME for winter is largest; however, it should be noted that the RME is the mean difference weighed by the mean of the observations (Eq. 13). As the mean temperature in winter is closer to zero, this results in larger RME. Still, also when expressed in absolute values, the error for temperature during winter is larger than for other seasons (Fig. 5).

Figure 6Relative mean error (RME) for the four forcing variables for the 148 HBV subbasins for the whole year. RME is aggregated into mean (solid), 10th and 90th percentiles (dashed). Note that the CRPSS at P(Xx)=0.1 or 0.7 is calculated over, respectively, the 10 % and 70 % highest observation–forecast pairs, conditioned on the observations.


For radiation there is already quite a considerable loss in skill after 1 day, but then the CRPSS remains quite stable for longer forecasts, notably during spring and autumn. There is a larger decline in skill for summer and for extreme low radiation values in winter. In absolute terms, the CRPS is related to the magnitude of the average radiation for each season, with the smallest absolute errors for winter and the largest during summer (Fig. 5). In terms of bias, we see that the relative mean error increases with lower P(Xx) (Fig. 6, row 3). This indicates that low values are slightly overestimated while high values are slightly underestimated, making the radiation forecasts slightly less extreme than the observations. This is further demonstrated in Fig. S1 in the Supplement, which plots the RME for different levels of non-exceedance (P(Xx)), as opposed to exceedance in Fig. 6.

The skill of the potential evaporation forecast is closely tied to the skill in radiation forecast, both because Makkink potential evaporation is directly proportional to radiation and because of the larger uncertainty in the radiation forecast. The forecast skill has the same properties as those found for the radiation forecast. A small difference is that part of the forecast skill in temperature is found back in a slightly improved forecast skill after 10 days for PET compared to radiation in summer.

Overall, there is relatively little spread in skill between basins, with the 10th and 90th percentiles close to the mean and following the same trajectory. The difference in skill between the different seasons is larger than the spread between basins, especially for the variables temperature, radiation and potential evaporation. This difference in skill between seasons is partly misleading. For example, the forecast skill for radiation in winter (Fig. 4, purple line) appears to be an outlier. However, the whole range of occurrences of extreme high and low radiative forcing is compressed in a limited part of P(Xx). Although the forecast over the whole range of winter radiative forcing is lower than that for the other seasons, the top 10 % of winter radiative forcings are actually among the best predicted.

Likewise, high temperatures receive higher skill scores than low season temperatures. This is even more distinct in the radiation forecasts. This does, however, not mean that the forecasts of such rare events are more accurate: both RME (Fig. 6) and CRPS (Fig. 5) are larger for high extremes, meaning larger errors for those forecasts. Still, taking into account the rarity of the event by calculating the CRPSS, which is the skill of the forecast relative to the skill of a random draw from the climatology, temperature, radiation and potential evaporation forecasts are found to add most information for extreme high values, even though the error of those forecasts is larger than for more “average” values (values with a higher probability of occurrence).

4.2 Influence of dynamic PET on initial states

Dynamic potential evaporation leads to lower actual evaporation (AET). The difference is largest for summer and spring (Fig. 7). Part of this lower evaporation is from a reduction in interception as the interception storage is more filled on average under dynamical forcing. This can be explained by the correlation between precipitation events and low potential evaporation. On rainy days the dynamic potential evaporation is generally lower, which decreases the amount of interception evaporation. Under climatological forcing the energy available is not reduced, and thus more water evaporates from the interception store. The latter is sometimes taken into account in hydrological models by adding a potential evaporation reduction function dependent on the intensity of precipitation to correct the PET climatology. For example, the HBV model has this option (Schellekens et al.2017).

Figure 7Seasonal mean difference in calculated actual evaporation (AET) for each season. Actual evaporation includes evaporation from interception.


The lower evaporation with dynamic PET forcing cascades through the different model storages, accumulating in a mostly wetter lower zone (LZ) storage under dynamic forcing. Finally, the lower evaporation results in higher discharge throughout the Rhine basin (see Figs. S2–S7). Exceptions are the high Rhine during spring and to a lesser extent during autumn, and several areas during winter when there is very little effect overall. The wetter conditions also result in higher peak discharges. As these higher discharges are a result of the temporal dynamics of the potential evaporation input, we expect to find a similar effect on forecasted discharges. As will be shown later (Fig. 9), this is indeed the case, albeit very limited.

4.3 Influence of PET forecast on streamflow forecast

The CRPSS for streamflow forecast is hardly influenced by potential evaporation forcing type. At first sight, the skill scores obtained with dynamic or climatological PET are identical. Small differences only become visible when taking a close-up of the differences by subtracting one from the other (Fig. 8). However, the small difference in skill grows with lead time. The influence of PET forcing type becomes more intuitive when looking at the RME. An increasing drift with lead time between PET forcing types is visible (Fig. 9). Interestingly, this drift in RME is almost uniform over all subsets of predicted discharge. The drift is positive, which means that forecasted PET leads to slightly higher forecasted discharges, as expected based on the results of the influence of variable PET on the initial states.

Figure 8CRPSS for forecast runs (forecasted PET, climatological PET) and their difference benchmarked against model output for the 20 largest sub-catchments in the Rhine basin. CRPSSs are aggregated into mean (solid), 10th and 90th percentiles (dashed). Note that the CRPSS at P(Xx)=0.1 or 0.7 is calculated over, respectively, the 10 % and 70 % highest observation–forecast pairs, conditioned on the observations.


Figure 9RME for forecast runs (forecasted PET, climatological PET) and their difference for the 20 largest streams in the Rhine basin. RME scores are aggregated into mean (solid), 10th and 90th percentiles (dashed). Note that the CRPSS at P(Xx)=0.1 or 0.7 is calculated over, respectively, the 10 % and 70 % highest observation–forecast pairs, conditioned on the observations.


Analyzed for each season separately, there is a little more to discover about the role of potential evaporation forecasts and the sensitivity of forecast skill to the meteorological forecast in general. The contribution of the meteorological forecast to streamflow forecast uncertainty is largest for summer, as shown by the largest decrease in CRPSS for the 10-day forecast in summer compared to the other seasons. The CRPSS especially “dips” for the most extreme discharges, which is not as strong for spring and autumn, and especially compared to the flat response of the CRPSS for the highest 30 % of discharges in winter.

In terms of the effect of potential evaporation climatology versus forecasted potential evaporation, the influence is largest (but still quite small) for summer and spring. This is tied to the potential evaporation being of larger magnitude; there is hardly a response for winter, where there is the lowest potential evaporation.

The influence of PET forecasts on low flow prediction is further examined by calculating the scores for different levels of non-exceedance P(Xx), instead of exceedance, so that the score value at P(Xx)=0.1 is calculated for the 10 % smallest observations and the score value at P(Xx)=0.7 is calculated for the lowest 70 % of the observations. Not only has the choice of PET forcing for the forecast hardly any effect on the forecasted streamflow (Fig. 10, bottom row), but the forecast skill of low discharge is also affected only slightly by the skill of the meteorological forecast in general. The meteorological forecast skill declines with lead time (e.g., Fig. 4), but the forecast skill of low percentile discharge remains almost perfect (very close to 1) compared to the model under perfect forcing.

Figure 10CRPSS for forecast runs (forecasted PET, climatological PET) and their difference benchmarked against model output for the 20 largest streams in the Rhine basin. CRPSSs are aggregated into mean (solid), 10th and 90th percentiles (dashed). Note that the CRPSS at P(Xx)=0.1 or 0.7 is calculated over, respectively, the 10 % and 70 % lowest observation–forecast pairs, conditioned on the observations.


5 Conclusions

This paper presented a simple and straightforward investigation with an operational forecasting practice perspective. First, observation data were preprocessed for use in the gridded wflow_hbv model. Second, the wflow_hbv model was subjected to dynamical and climatological PET forcing. Three aspects were analyzed: (1) the skill in meteorological forecast, (2) the effect of PET forcing on initial states and (3) the effect of PET forcing on forecast skill.

Nine to 10 days is the upper limit on forecast lead time for daily precipitation for the ECMWF forecast in the Rhine basin, with only very little skill remaining compared to climatology. There is considerable skill in daily temperature, radiation and potential evaporation forecasts, also after 10 days. Variable PET forcing resulted in lower evaporation and in wetter initial states and higher modeled discharges.

The main result of this study is that potential evaporation forecasts improved streamflow forecasts only slightly. This confirms earlier results that the influence of random errors on estimated streamflow was generally not measurable when comparing model runs directly, needing a 20% systematic bias in PET to influence model outcomes significantly (Parmele1972). Likewise, Fowler (2002) concluded that climatological PET estimates produced a soil water regime very similar to that derived with actual daily PET values, including extreme periods, for a site in Auckland, New Zealand.

There is a wider discussion on evaporation modeling in hydrological models (Andréassian et al.2004; Oudin et al.2005a, b) to which the results here might add a new perspective: that of evaporation as a process relevant for medium-term forecasts. This is directly also a limitation of this research; only the influence on forecasts up to 10 days was investigated. The influence on seasonal forecasting might be more substantial, considering that the modeling of evaporation strongly influences the partitioning between runoff and evaporation in the longer-term water balance (Bai et al.2016).

Further limitations are that only one model was tested (wflow_hbv) and for one climate zone (moderate temperate). The model was calibrated originally on a different PET climatology than studied here and was not recalibrated. The latter is not seen as a limitation. Deliberately not recalibrating the model enabled us to focus on the changes in modeled processes instead of comparably vague assessments based on model performance expressed in efficiencies, with the effects brought forward by the PET forcing hidden somewhere in the parameter space.

In the analysis, forecasting metrics were calculated over subsets of observation–forecast pairs conditioned on the observations. Alternatively, the subsets could have been conditioned on the mean of the forecasts. This would present more intuitive information for a forecaster at the time of a forecast when the observation is by definition not yet known (Lerch et al.2015).

The idea to look at potential evaporation forecast was instigated as part of a program to improve forecasts of low flows. Indeed, it is a recurring hypothesis that potential evaporation forecasts should aid especially in making low flow predictions. The uniform response of several skill scores for different subsets of observed discharge does not support this idea; there is no special gain for low flows.

Instead, from our model results it follows that the correct prediction of a drought is firstly dependent on a correct forecast of no rain. Low flow recession is subsequently determined, in the absence of further feedback mechanisms, solely by the storage–discharge relationship of, in this case, the lower zone representing the saturated zone as well as the routing of surface water.

The follow-up question then is the following. Is this true in reality, or is this a model deficiency? Should we rethink hydrological modeling to incorporate more feedbacks on evaporation? Certainly there are models with more complex representation of evaporative processes. These are valid and important questions, especially in the light of hydrologic response to change in climate drivers. However, from the results presented here, it should not be expected that a better understanding of evaporative processes and feedbacks will result directly in a significant increase in 10-day predictive skill for streamflow.

Data availability

Gridded precipitation, temperature, radiation and potential evaporation used in this study are available through the 4TU data center; see van Osnabrugge (2017) and van Osnabrugge (2018).


The supplement related to this article is available online at:

Author contributions

BvO prepared the data, performed the analyses and wrote the article. RU contributed to improving the article and experimental setup. AW supplied part of the scripts used in the analysis and contributed in preprocessing steps of the forecasts as well as contributing to the design of the model experiments and final article text.

Competing interests

The authors declare that they have no conflict of interest.


This work is partly supported by the IMPREX project funded by the European Commission under the Horizon 2020 framework program (grant 641811) and partly by the Dutch Ministry of Infrastructure and the Environment. Part of the work was conducted as an in-kind contribution to Netherlands Organisation for Scientific Reasearch project “SWM-EVAP: Smart Water Management in a complex environment: improving the monitoring and forecasting of surface evaporation” (ALWTW.2016.049). Meteorological data for this research have been gratefully received from the Deutscher Wetterdienst Climate Data Center; KNMI Data Centrum; Météo France; Federal Office of Meteorology and Climatology MeteoSwiss; Administration de la gestion de l'eau du Grand-Duché de Luxembourg; and Service Publique de Wallonie Département des Etudes et de l'Appui à la Gestion. Discharge data have been gratefully received from SCHAPI (Service Central d'Hydrométéorologie et d'Appui àla Prévision des Inondations) through Banque HYDRO; Bundesamt für Umwelt BAFU; Bundesanstalt für Gewässerkunde (BfG); Administration de la gestion de l'eau du Grand-Duché de Luxembourg; Bavarian Environment Agency,; Landesanstalt für Umwelt, Messungen und Naturschutz Baden-Württemberg, LUBW; Landesamtes für Umwelt, Wasserwirtschaft und Gewerbeaufsicht Rheinland-Pfalz; Landesamt für Umwelt- und Arbeitsschutz Saarland; and Landesamt für Natur, Umwelt und Verbraucherschutz Nordrhein-Westfalen. We thank David Lavers from ECMWF for providing access to the reforecast dataset.

Edited by: Alexander Gelfan
Reviewed by: two anonymous referees


Andréassian, V., Perrin, C., and Michel, C.: Impact of imperfect potential evapotranspiration knowledge on the efficiency and parameters of watershed models, J. Hydrol., 286, 19–35,, 2004. a, b

Bai, P., Liu, X., Yang, T., Li, F., Liang, K., Hu, S., Liu, C., Bai, P., Liu, X., Yang, T., Li, F., Liang, K., Hu, S., and Liu, C.: Assessment of the Influences of Different Potential Evapotranspiration Inputs on the Performance of Monthly Hydrological Models under Different Climatic Conditions, J. Hydrometeorol., 17, 2259–2274,, 2016. a

Bastiaanssen, W. G. M., Noordman, E. J. M., Pelgrum, H., Davids, G., Thoreson, B. P., and Allen, R. G.: SEBAL Model with Remotely Sensed Data to Improve Water-Resources Management under Actual Field Conditions, J. Irrig. Drain. Eng., 131, 85–93,, 2005. a

Beven, K. and Binley, A.: The future of distributed models: Model calibration and uncertainty prediction, Hydrol. Process., 6, 279–298,, 1992. a

Bowman, A. L., Franz, K. J., Hogue, T. S., and Kinoshita, A. M.: MODIS-Based Potential Evapotranspiration Demand Curves for the Sacramento Soil Moisture Accounting Model, J. Hydrol. Eng., 21, 04015055,, 2016. a, b

Bowman, A. L., Franz, K. J., and Hogue, T. S.: Case Studies of a MODIS-Based Potential Evapotranspiration Input to the Sacramento Soil Moisture Accounting Model, J. Hydrometeorol., 18, 151–158,, 2017. a, b, c

Brown, J. D., Demargne, J., Seo, D. J., and Liu, Y.: The Ensemble Verification System (EVS): A software tool for verifying ensemble forecasts of hydrometeorological and hydrologic variables at discrete locations, Environ. Model. Softw., 25, 854–872,, 2010. a, b

Buizza, R., Bidlot, J.-R., Janousek, M., Keeley, S., Mogensen, K., and Richardson, D.: New IFS cycle brings sea-ice coupling and higher ocean resolution, ECMWF Newsletter No. 150, 14–17,, 2017. a

Carrer, D., Lafont, S., Roujean, J.-L., Calvet, J.-C., Meurey, C., Le Moigne, P., and Trigo, I. F.: Incoming Solar and Infrared Radiation Derived from METEOSAT: Impact on the Modeled Land Water and Energy Budget over France, J. Hydrometeorol., 13, 504–520,, 2012. a

Cloke, H. and Pappenberger, F.: Ensemble flood forecasting: A review, J. Hydrol., 375, 613–626,, 2009. a

Copernicus Climate Change Service (C3S): ERA5: Fifth generation of ECMWF atmospheric reanalyses of the global climate, Copernicus Climate Change Service Climate Data Store (CDS), available at:!/home, last access: 1 July 2018. a

Cuo, L., Pagano, T. C., and Wang, Q. J.: A Review of Quantitative Precipitation Forecasts and Their Use in Short- to Medium-Range Streamflow Forecasting, J. Hydrometeorol., 12, 713–728,, 2011. a

de Bruin, H. A. R., Trigo, I. F., Bosveld, F. C., and Meirink, J. F.: A Thermodynamically Based Model for Actual Evapotranspiration of an Extensive Grass Field Close to FAO Reference, Suitable for Remote Sensing Application, J. Hydrometeorol., 17, 1373–1382,, 2016. a

Farr, T. G., Rosen, P. A., Caro, E., Crippen, R., Duren, R., Hensley, S., Kobrick, M., Paller, M., Rodriguez, E., Roth, L., Seal, D., Shaffer, S., Shimada, J., Umland, J., Werner, M., Oskin, M., Burbank, D., and Alsdorf, D. E.: The shuttle radar topography mission, Rev. Geophys., 45, RG2004,, 2007. a

Fowler, A.: Assessment of the validity of using mean potential evaporation in computations of the long-term soil water balance, J. Hydrol., 256, 248–263,, 2002. a

Hiemstra, P. and Sluiter, R.: Interpolation of Makkink evaporation in the Netherlands, Tech. rep., KNMI, De Bilt, available at: (last access: 5 March 2018), 2011. a, b

Immerzeel, W. W. and Droogers, P.: Calibration of a distributed hydrological model based on satellite evapotranspiration, J. Hydrol., 349, 411–424,, 2008. a

Ineichen, P., Barroso, C. S., Geiger, B., Hollmann, R., Marsouin, A., and Mueller, R.: Satellite Application Facilities irradiance products: Hourly time step comparison and validation over Europe, Int. J. Remote Sens., 30, 5549–5571,, 2009. a

Jacobs, J. M., Lowry, B., Choi, M., and Bolster, C. H.: GOES Solar Radiation for Evapotranspiration Estimation and Streamflow Prediction, J. Hydrol. Eng., 14, 293–300,, 2009. a

Jedrzej, J., Bojanowski, S., Vrieling, A., and Skidmore, A. K.: A comparison of data sources for creating a long-term time series of daily gridded solar radiation for Europe, Solar Energy, 99, 152–171,, 2014. a, b

Journée, M. and Bertrand, C.: Improving the spatio-temporal distribution of surface solar radiation data by merging ground and satellite measurements, Remote Sens. Environ., 114, 2692–2704,, 2010. a

Kim, J. and Hogue, T. S.: Evaluation of a MODIS-Based Potential Evapotranspiration Product at the Point Scale, J. Hydrometeorol., 9, 444–460,, 2008. a

Lerch, S., Thorarinsdottir, T. L., Ravazzolo, F., and Gneiting, T.: Forecaster's Dilemma: Extreme Events and Forecast Evaluation, Statist. Sci., 32, 106–127,, 2015. a

Lindström, G., Johansson, B., Persson, M., Gardelin, M., and Bergström, S.: Development and test of the distributed HBV-96 hydrological model, J. Hydrol., 201, 272–288,, 1997. a

Makkink, G. F.: Ekzameno de la formulo de Penman, Neth. J. Agri. Sci., 5, 290–305, 1957. a

Mu, Q., Zhao, M., and Running, S. W.: Improvements to a MODIS global terrestrial evapotranspiration algorithm, Remote Sens. Environ., 115, 1781–1800,, 2011. a

Mueller, R., Matsoukas, C., Gratzki, A., Behr, H., and Hollmann, R.: The CM-SAF operational scheme for the satellite based retrieval of solar surface irradiance – A LUT based eigenvector hybrid approach, Remote Sens. Environ., 113, 1012–1024,, 2009. a

Oudin, L., Hervieu, F., Michel, C., Perrin, C., Andréassian, V., Anctil, F., and Loumagne, C.: Which potential evapotranspiration input for a lumped rainfall-runoff model? Part 2 – Towards a simple and efficient potential evapotranspiration model for rainfall-runoff modelling, J. Hydrol., 303, 290–306,, 2005a. a, b, c

Oudin, L., Michel, C., and Anctil, F.: Which potential evapotranspiration input for a lumped rainfall-runoff model? Part 1 – Can rainfall-runoff models effectively handle detailed potential evapotranspiration inputs?, J. Hydrol., 303, 275–289,, 2005b. a, b

Pappenberger, F., Beven, K. J., Hunter, N. M., Bates, P. D., Gouweleeuw, B. T., Thielen, J., and de Roo, A. P. J.: Cascading model uncertainty from medium range weather forecasts (10 days) through a rainfall-runoff model to flood inundation predictions within the European Flood Forecasting System (EFFS), Hydrol. Earth Syst. Sci., 9, 381–393,, 2005. a

Parmele, L. H.: Errors in output of hydrologic models due to errors in input potential evapotranspiration, Water Resour. Res., 8, 348–359,, 1972. a

Photiadou, C. S., Weerts, A. H., and van den Hurk, B. J. J. M.: Evaluation of two precipitation data sets for the Rhine River using streamflow simulations, Hydrol. Earth Syst. Sci., 15, 3355–3366,, 2011. a

Rauthe, M., Steiner, H., Riediger, U., Mazurkiewicz, A., and Gratzki, A.: A Central European precipitation climatology – Part I: Generation and validation of a high-resolution gridded daily data set (HYRAS), Meteorol. Z., 22, 235–256,, 2013. a

Reggiani, P., Renner, M., Weerts, A. H., and van Gelder, P. A. H. J. M.: Uncertainty assessment via Bayesian revision of ensemble streamflow predictions in the operational river Rhine forecasting system, Water Resour. Res., 45, W02428,, 2009. a

Renner, M., Werner, M. G. F., Rademacher, S., and Sprokkereef, E.: Verification of ensemble flow forecasts for the River Rhine, J. Hydrol., 376, 463–475,, 2009. a, b

Schellekens, J., van Verseveld, W., de Boer-Euser, T., Winsemius, H., Thiange, C., Bouaziz, L., Tollenaar, D., de Vries, S., and Weerts, A.: Openstreams wflow documentation, available at: (last access: 1 March 2018), 2017. a, b, c

Spies, R. R., Franz, K. J., Hogue, T. S., and Bowman, A. L.: Distributed Hydrologic Modeling Using Satellite-Derived Potential Evapotranspiration, J. Hydrometeorol., 16, 129–146,, 2015. a

Su, Z.: The Surface Energy Balance System (SEBS) for estimation of turbulent heat fluxes, Hydrol. Earth Syst. Sci., 6, 85–100,, 2002. a

Terink, W., Hurkmans, R. T. W. L., Torfs, P. J. J. F., and Uijlenhoet, R.: Evaluation of a bias correction method applied to downscaled precipitation and temperature reanalysis data for the Rhine basin, Hydrol. Earth Syst. Sci., 14, 687–703,, 2010. a

Trigo, I. F., Dacamara, C. C., Viterbo, P., Roujean, J.-L., Olesen, F., Barroso, C., Camacho-de Coca, F., Carrer, D., Freitas, S. C., García-Haro, J., Geiger, B., Gellens-Meulenberghs, F., Ghilain, N., Meliá, J., Pessanha, L., Siljamo, N., and Arboleda, A.: The Satellite Application Facility for Land Surface Analysis, Int. J. Remote Sens., 32, 2725–2744,, 2011. a

Urraca, R., Huld, T., Gracia-Amillo, A., Martinez-de Pison, F. J., Kaspar, F., and Sanz-Garcia, A.: Evaluation of global horizontal irradiance estimates from ERA5 and COSMO-REA6 reanalyses using ground and satellite-based data, Solar Energy, 164, 339–354,, 2018. a

van den Hurk, B. J., Bouwer, L. M., Buontempo, C., Döscher, R., Ercin, E., Hananel, C., Hunink, J. E., Kjellström, E., Klein, B., Manez, M., Pappenberger, F., Pouget, L., Ramos, M.-H., Ward, P. J., Weerts, A. H., and Wijngaard, J. B.: Improving predictions and management of hydrological extremes through climate services:, Climate Services, 1, 6–11,, 2016. a

van Osnabrugge, B.: Gridded precipitation dataset for the Rhine basin made with the genRE interpolation method, 4TU. Centre for Research Data,, 2017. a, b, c

van Osnabrugge, B.: Gridded Hourly Temperature, Radiation and Makkink Potential Evaporation forcing for hydrological modelling in the Rhine basin, Wageningen University & Research, Dataset,, 2018. a, b

van Osnabrugge, B., Weerts, A. H., and Uijlenhoet, R.: genRE: A Method to Extend Gridded Precipitation Climatology Data Sets in Near Real-Time for Hydrological Forecasting Purposes, Water Resour. Res., 53, 9284–9303,, 2017. a, b, c

Verkade, J., Brown, J., Reggiani, P., and Weerts, A.: Post-processing ECMWF precipitation and temperature ensemble reforecasts for operational hydrologic forecasting at various spatial scales, J. Hydrol., 501, 73–91,, 2013. a

Winsemius, H. M., van Verseveld, W., Weerts, A. H., and Hegnauer, M.: Generalised Likelihood Uncertainty Estimation for the daily HBV model in the Rhine Basin, Part B: Switzerland, Tech. rep., Deltares, Delft, 2013a. a

Winsemius, H. M., van Verseveld, W., Weerts, A. H., and Hegnauer, M.: Generalised Likelihood Uncertainty Estimation for the daily HBV model in the Rhine Basin, Part A: Germany, Tech. rep., Deltares, Delft, 2013b. a

Xu, C.-Y. and Singh, V. P.: Cross Comparison of Empirical Equations for Calculating Potential Evapotranspiration with Data from Switzerland, Water Resour. Manage., 16, 197–219,, 2002. a

Xystrakis, F. and Matzarakis, A.: Evaluation of 13 Empirical Reference Potential Evapotranspiration Equations on the Island of Crete in Southern Greece, J. Irrig. Drain. Eng., 137, 211–222,, 2011. a

Short summary
A correct estimate of the amount of future precipitation is the most important factor in making a good streamflow forecast, but evaporation is also an important component that determines the discharge of a river. However, in this study for the Rhine River we found that evaporation forecasts only give an almost negligible improvement compared to methods that use statistical information on climatology for a 10-day streamflow forecast. This is important to guide research on low flow forecasts.