Model output statistics (MOS) methods can be used to empirically relate an environmental variable of interest to predictions from earth system models (ESMs). This variable often belongs to a spatial scale not resolved by the ESM. Here, using the linear model fitted by least squares, we regress monthly mean streamflow of the Rhine River at Lobith and Basel against seasonal predictions of precipitation, surface air temperature, and runoff from the European Centre for Medium-Range Weather Forecasts. To address potential effects of a scale mismatch between the ESM's horizontal grid resolution and the hydrological application, the MOS method is further tested with an experiment conducted at the subcatchment scale. This experiment applies the MOS method to 133 additional gauging stations located within the Rhine basin and combines the forecasts from the subcatchments to predict streamflow at Lobith and Basel. In doing so, the MOS method is tested for catchment areas covering 4 orders of magnitude. Using data from the period 1981–2011, the results show that skill, with respect to climatology, is restricted on average to the first month ahead. This result holds for both the predictor combination that mimics the initial conditions and the predictor combinations that additionally include the dynamical seasonal predictions. The latter, however, reduce the mean absolute error of the former by 5 to 12 %, which is consistently reproduced at the subcatchment scale. An additional experiment conducted for 5-day mean streamflow indicates that the dynamical predictions help to reduce uncertainties up to about 20 days ahead, but it also reveals some shortcomings of the present MOS method.
Environmental forecasting at the subseasonal to seasonal timescale promises
a basis for planning in e.g. energy production, agriculture, shipping, or
water resources management. While the uncertainties of these forecasts are
inherently large, they can be reduced when the quantity of interest is
controlled by slowly varying and predictable phenomena. For example, the El
Niño–Southern Oscillation plays an important role in predicting the
atmosphere, and snow accumulation and melting often form the backbone in
predicting hydrological variables of the land surface
In the case of streamflow forecasting, the ESP-revESP experiment proposed by
The framework allows for the estimation of the time range at which the initial conditions control the generation of streamflow: when the prediction error of the ESP simulation exceeds that of the revESP simulation, the meteorological forcings start to dominate the streamflow generation. Similarly, when the prediction error of the ESP simulation approaches the prediction error of the climatology (i.e. average streamflow used as naive prediction strategy), the initial conditions no longer control the streamflow generation.
In both cases this time range depends on the interplay between climatological
features (e.g. transitions between wet and dry or cold and warm seasons) and
catchment-specific hydrological storages (e.g. surface water bodies, soils,
aquifers, and snow) and can vary from 0 up to several months
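The crossover logic of the ESP-revESP framework can be illustrated with a short sketch. It assumes that prediction errors per lead time are already available as plain lists; the function name `first_exceedance`, the tolerance used to decide when one error "approaches" another, and all error values are hypothetical and not taken from any study.

```python
from typing import Optional, Sequence


def first_exceedance(err_a: Sequence[float], err_b: Sequence[float],
                     rel_tol: float = 0.0) -> Optional[int]:
    """Return the first lead-time index at which err_a > (1 - rel_tol) * err_b.

    With rel_tol = 0 this is a strict exceedance; a small positive rel_tol
    flags the point where err_a comes close to err_b. Returns None if the
    condition is never met.
    """
    for lead, (a, b) in enumerate(zip(err_a, err_b)):
        if a > (1.0 - rel_tol) * b:
            return lead
    return None


# Hypothetical errors per lead time (index 0 = shortest lead), arbitrary units.
esp_error = [10.0, 18.0, 27.0, 33.0, 35.0]      # known initial conditions, climatological forcing
revesp_error = [30.0, 24.0, 20.0, 12.0, 8.0]    # climatological initial conditions, known forcing
clim_error = [35.0, 35.0, 35.0, 35.0, 35.0]     # climatology as naive prediction

# Lead time from which the meteorological forcing dominates.
forcing_dominates = first_exceedance(esp_error, revesp_error)
# Lead time from which the initial conditions no longer control streamflow
# (ESP error within 5 % of the climatological error; the 5 % is an assumption).
ic_exhausted = first_exceedance(esp_error, clim_error, rel_tol=0.05)
```

With these illustrative numbers the forcing starts to dominate at the third lead time, while the influence of the initial conditions fades at the fifth.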
An emerging option for streamflow forecasting is the integration of seasonal
predictions from earth system models (ESMs), i.e. coupled
atmosphere–ocean–land general circulation models. Three approaches can be distinguished: (1) forcing a hydrological model with the predicted evolution of the atmosphere; (2) employing runoff simulated by the land surface model; and (3) using the predicted states of the atmosphere, ocean, or land surface in a perfect prognosis or model output statistics
context with the streamflow as the predictand.
The first approach requires a calibrated hydrological model for the region of
interest. In order to correct a potential bias and to match the spatial and
temporal resolution of the hydrological model, it further involves a
postprocessing of the atmospheric fields. A postprocessing might also be
applied to the streamflow forecasts to account for deficiencies of the
hydrological model. See e.g.
In the second approach the land surface model takes the hydrological model's
place with the difference that the atmosphere and land surface are fully
coupled. Since the land surface component of ESMs often represents
groundwater dynamics and the river routing in a simplified way
The third approach deals with developing an empirical prediction rule for
streamflow. If the model-building procedure is based on observations only,
the approach is commonly referred to as perfect prognosis (PP). On the other
hand, the model might be built using the hindcast archive of a particular ESM
(model output statistics, MOS). In both cases the final prediction rule is
applied to the actual ESM outcome to forecast the quantity of interest.
Therefore, MOS methods require the presence of a hindcast archive of the ESM
involved, but can take systematic errors of the ESM into account
Studies that map ESM output to streamflow with PP or MOS methods include
multiple linear regression
The present study aims to take up this scale bridging and to test a MOS-based approach for monthly mean streamflow forecasting and a range of catchment areas. To analyse the limits of predictability and to aid interpretation, we first define predictor combinations motivated by the ESP-revESP framework. Next, seasonal predictions of precipitation, surface air temperature, and runoff from the European Centre for Medium-Range Weather Forecasts (ECMWF) are entered into the regression equation and the resulting forecast skill is estimated with respect to the ESP-inspired regression model.
The variation of the catchment area is borrowed from the concept of the “working
scale”
This experiment is conducted for the Rhine River at Lobith and Basel in
western Europe. Studies using subseasonal or seasonal climate predictions
indicate moderate skill for several parts of the Rhine basin beyond the lead
time of traditional weather forecasts. These studies apply the model chain as
outlined above in approach number one: concerning catchments of the Alpine
and High Rhine,
As a compromise between skillful lead time and temporal resolution, we decide to focus on monthly mean streamflow at lead times of 0, 1, and 2 months. In order to resolve the monthly timescale and to test the MOS method at shorter time intervals, an experiment is further conducted for 5-day mean streamflow. Here, 0 lead time refers to forecasting one time interval ahead, while e.g. a 1-month lead time denotes a temporal gap of 1 month between the release of a forecast and its time of validity.
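The lead-time convention for the monthly forecasts can be made concrete with a small helper. This is a sketch only: the function name `target_month` is hypothetical, and it assumes that a forecast is released at the start of a calendar month, so that 0 lead time targets the mean of that same month.

```python
from typing import Tuple


def target_month(release_year: int, release_month: int, lead_months: int) -> Tuple[int, int]:
    """Return (year, month) of the monthly mean targeted by a forecast.

    Assumes release at the start of `release_month`; a lead of 0 targets
    that month, a lead of 1 leaves a gap of one month, and so on.
    """
    index = release_year * 12 + (release_month - 1) + lead_months
    return index // 12, index % 12 + 1


# A forecast released at the start of November 1981:
lead0 = target_month(1981, 11, 0)   # November 1981
lead1 = target_month(1981, 11, 1)   # December 1981
lead2 = target_month(1981, 11, 2)   # January 1982
```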
Strictly speaking, the present study deals with hindcasts or retrospective
forecasts. However, for the sake of readability we use the terms forecast,
hindcast, and prediction interchangeably. Below, Sect.
The Rhine River is situated in western Europe and discharges into the North
Sea; in the south its basin is defined by the Alps. About 58 million people
use the Rhine water for the purpose of navigation, hydropower, industry,
agriculture, drinking water supply, and leisure
Table
Concerning the climatology of the period 1981–2011 (Fig.
Geography of the Rhine River at Basel and Lobith according to
Monthly area averages of streamflow, precipitation, and surface air temperature for the Rhine at Lobith and
Basel with respect to the period 1981–2011
Observations of river streamflow and gridded precipitation, surface air temperature, and runoff for the period 1981–2011 at daily resolution constitute the data set. Throughout the study, gridded quantities are aggregated to (sub)catchment area averages.
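One possible way to aggregate gridded values to a catchment area average is a weighted mean over the grid cells that overlap the catchment. The sketch below is an assumption about how such an aggregation could look, not a description of the procedure used here: the function name `catchment_average`, the overlap fractions, and the cosine-of-latitude area weighting for a regular latitude–longitude grid are all illustrative choices.

```python
import math
from typing import Sequence


def catchment_average(values: Sequence[float],
                      lats_deg: Sequence[float],
                      fractions: Sequence[float]) -> float:
    """Weighted mean of grid-cell values.

    Each weight is the fraction of the cell lying inside the catchment
    multiplied by cos(latitude), which is proportional to the cell area
    on a regular latitude-longitude grid.
    """
    weights = [f * math.cos(math.radians(lat)) for f, lat in zip(fractions, lats_deg)]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no grid cell overlaps the catchment")
    return sum(w * v for w, v in zip(weights, values)) / total


# Hypothetical daily precipitation (mm) in three cells partly covering a catchment.
precip = [4.0, 6.0, 10.0]
lats = [47.0, 47.0, 47.0]
inside = [1.0, 1.0, 0.0]          # the third cell lies outside the catchment
area_mean = catchment_average(precip, lats, inside)
```

Cells with zero overlap drop out of the average, so in this example the result is the mean of the first two cells.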
The streamflow observations consist of a set of 135 time series in
The ENSEMBLES gridded observational data set in Europe (E-OBS, version 16.0)
provides precipitation and surface air temperature on a 0.25
Precipitation, surface air temperature, and runoff from ECMWF's seasonal
forecast system 4 (S4) archive are on a 0.75
The atmospheric model (IFS cycle 36r4) consists of 91 vertical levels with
the top level at 0.01
The H-TESSEL land surface model implements four soil layers with an
additional snow layer on the top. Interception, infiltration, surface runoff,
and evapotranspiration are dealt with by dynamically separating a grid cell
into fractions of bare ground, low and high vegetation, intercepted water,
and shaded and exposed snow. In contrast, the soil properties of a particular
layer are uniformly distributed within one grid cell. Vertical water movement
in the soil follows Richards's equation with an additional sink term to allow
for water uptake by plants. Runoff per grid cell equals the sum of surface
runoff and open drainage at the soil bottom
The following subsections outline the experiment, which is individually
conducted for both the Rhine at Lobith and Basel. Section
The predictand
The set of predictors consists of variables that either precede or succeed
the date of prediction
The S4* combinations constitute the MOS method and consider the seasonal predictions from the S4 hindcast archive, where we use the asterisk as wildcard to refer to any of the S4P, S4T, S4PT, and S4Q models. The S4P and S4T models are used to separate the forecast quality with respect to precipitation and temperature. The S4Q model is tested as H-TESSEL does not implement groundwater dynamics and preceding precipitation and temperature might tap this source of predictability.
Predictor combinations consisting of (with respect to the date of prediction) preceding and subsequent
precipitation (
For a particular
The ordinary least squares hyperplane is then used for prediction without any
transformation, basis expansion, or interaction. However, model variance can
be an issue: specifically for the preMet model from Table
Each year with a buffer of 2 years (i.e. the two preceding and subsequent
years) is left out and the regression outlined in Sect.
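The buffered leave-one-year-out scheme can be sketched as follows. For brevity the example fits a single-predictor least-squares line, whereas the predictor combinations in this study involve several variables; the function names and the synthetic data are hypothetical, and one sample per year (for a fixed date of prediction) is assumed.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def training_years(years, target, buffer=2):
    """Years kept for fitting: all except the target year and its +/- buffer neighbours."""
    return [yr for yr in years if abs(yr - target) > buffer]


def buffered_cross_validation(years, x, y, buffer=2):
    """Return {year: prediction}, each from a fit that excludes the buffered window."""
    predictions = {}
    for target, x_target in zip(years, x):
        keep = set(training_years(years, target, buffer))
        xs = [xi for yr, xi in zip(years, x) if yr in keep]
        ys = [yi for yr, yi in zip(years, y) if yr in keep]
        intercept, slope = fit_line(xs, ys)
        predictions[target] = intercept + slope * x_target
    return predictions


# Synthetic data: one value per year, exactly linear so the fit is easy to verify.
years = list(range(1981, 2012))
x = [50.0 + 3.0 * ((7 * i) % 11) for i in range(len(years))]   # hypothetical predictor
y = [400.0 + 12.0 * xi for xi in x]                             # hypothetical predictand
cv_predictions = buffered_cross_validation(years, x, y)
```

With 31 years of data, an interior year is predicted from 26 training years, and a year at the edge of the record from 28.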
Lead time is introduced by integrating the predicted
Contrasting the forecast quality of a given model for catchments separated in space inevitably involves a large number of factors, e.g. the geographic location (and thus the grid points of the ESM involved), the orography, or the degree to which streamflow is regulated. To hold these factors fixed while screening through a range of catchment areas, we propose to vary the working scale within a particular target catchment.
Following this line of argumentation we apply the model-building procedure
from Sect.
For these subcatchments we have streamflow observations from the
entire upstream area but not the actual subcatchment area itself. To arrive
at an estimate of the water volume generated by the subcatchment, we equate
the predictand
This procedure implies that we ignore the water travel time: first, when taking the differences of outflows and inflows and second, when summing up the subcatchment forecasts. While the former increases the observational noise, the latter does not affect the regression itself, but it adds a noise term to the final forecast at Lobith and Basel. As the statistical properties of the noise introduced by the water travel time are unknown, we can only argue that, given this methodological constraint, the results provide a lower bound of the forecast quality.
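The differencing and summation steps can be illustrated with a toy gauge network. The sketch assumes that each gauge lists its directly upstream gauges and, like the procedure above, ignores the water travel time; the gauge names, the network, and the flow values are hypothetical.

```python
def subcatchment_flows(outflow, upstream):
    """Water generated within each subcatchment.

    outflow:  dict gauge -> streamflow observed at that gauge
    upstream: dict gauge -> list of directly upstream gauges (inflows)
    Each subcatchment value is the outflow minus the sum of the inflows,
    taken at the same time step (no travel time).
    """
    return {
        gauge: q - sum(outflow[u] for u in upstream.get(gauge, []))
        for gauge, q in outflow.items()
    }


# Hypothetical network: C drains into A; A and B drain into the outlet.
outflow = {"outlet": 2200.0, "A": 1000.0, "B": 700.0, "C": 400.0}
upstream = {"outlet": ["A", "B"], "A": ["C"]}

local = subcatchment_flows(outflow, upstream)
# Summing the subcatchment contributions recovers the flow at the outlet.
outlet_from_sum = sum(local.values())
```

Because the differences telescope, the sum over all subcatchments equals the outlet flow exactly when inflows and outflows refer to the same time step; with real data the travel time breaks this equality and shows up as noise.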
Subcatchment division of the Rhine at Lobith and Basel. The median area covers 4 orders of magnitude.
The forecast quality of the regression models is analysed using the pairs of
cross-validated monthly mean streamflow forecasts and observations
The first validation steps focus on the forecasts at Lobith and Basel and
thus consider the sum of the subcatchment forecasts
Climatology and runoff simulated by H-TESSEL serve as benchmarks. The
climatology is estimated using the arithmetic mean from the daily streamflow
observations. After averaging in time, runoff from H-TESSEL is
post-calibrated via linear regression against the streamflow observations per
spatial level. For both benchmarks the cross-validation scheme from
Sect.
Taylor diagrams
In the case of the monthly analysis it turns out that the paired differences of
absolute errors for a given lead time, spatial level, and reference model
To evaluate whether a particular model
To help in the interpretation of the forecast quality of the MOS method
regarding the spatial levels at Lobith and Basel, we plot, in a qualitative
manner, the MAE skill score (Eq.
The terrain roughness is included since the atmospheric flow in complex
terrain is challenging to simulate and atmospheric general circulation models
need to filter the topography according to their spatial resolution
In order to predict 5-day mean streamflow, Eq. (
The experiment spans several dimensions (i.e. Lobith versus Basel, dates of prediction, lead times, predictor combinations, spatial levels), so we frequently need to collapse one or several dimensions. The Supplement aims to complete the results as presented below.
Figure
The benchmark climatology is outperformed at 0 lead time by all models. At longer lead times the subMet model stands out alongside the refRun model, while the remaining models approach climatology. For the refRun model we note a correlation of about 0.9 independently of the lead time, while the variability of the observations is generally underestimated.
For Lobith and 0 lead time we observe an elongated cluster, which comprises all models except the climatology and the refRun model. Some models score a higher correlation – a closer look would reveal that these are the S4P, S4PT, and S4Q models with H-TESSEL standing at the forefront.
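The quantities displayed in a Taylor diagram can be computed directly from paired forecasts and observations. The following sketch uses population standard deviations and hypothetical streamflow values; the function name `taylor_statistics` is an assumption made for this illustration.

```python
import math
from statistics import mean


def taylor_statistics(forecast, observed):
    """Correlation, standard deviations, and centred RMS difference."""
    n = len(observed)
    mf, mo = mean(forecast), mean(observed)
    sf = math.sqrt(sum((f - mf) ** 2 for f in forecast) / n)
    so = math.sqrt(sum((o - mo) ** 2 for o in observed) / n)
    cov = sum((f - mf) * (o - mo) for f, o in zip(forecast, observed)) / n
    crmsd = math.sqrt(
        sum(((f - mf) - (o - mo)) ** 2 for f, o in zip(forecast, observed)) / n
    )
    return {
        "correlation": cov / (sf * so),
        "std_forecast": sf,
        "std_observed": so,
        "std_ratio": sf / so,      # < 1 means the variability is underestimated
        "crmsd": crmsd,
    }


# Hypothetical monthly mean streamflow (arbitrary units).
observed = [2100.0, 2600.0, 1800.0, 2300.0, 1500.0, 2900.0]
forecast = [2200.0, 2400.0, 2000.0, 2250.0, 1900.0, 2500.0]
stats = taylor_statistics(forecast, observed)
```

The three statistics are linked by the law of cosines, crmsd² = σ_f² + σ_o² − 2 σ_f σ_o R, which is what allows them to be shown in a single diagram. A model with high correlation but a standard deviation ratio below 1, as in this example, correlates well with the observations while underestimating their variability.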
Taylor diagrams for the benchmarks climatology and H-TESSEL and the predictor combinations from
Table
Figure
In general, the patterns repeat more or less along the spatial levels and the
S4PT model only beats the reference models in the denominator of Eq. (
While significant differences between the S4PT and the preMet models are rare, the subMet model starts to outperform the S4PT model already at a lead time of 1 month. The comparison against the bias-corrected H-TESSEL runoff shows that the S4PT model might provide more accurate predictions for early summer, but not otherwise.
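For orientation, the MAE skill score underlying these comparisons can be sketched as below. The sketch assumes the conventional form, one minus the ratio of the model MAE to the reference MAE, which is consistent with the reference model appearing in the denominator; the function names and the numbers are hypothetical.

```python
def mae(forecast, observed):
    """Mean absolute error of paired forecasts and observations."""
    return sum(abs(f - o) for f, o in zip(forecast, observed)) / len(observed)


def mae_skill_score(forecast, reference, observed):
    """1 - MAE(forecast) / MAE(reference).

    Positive values: the forecast beats the reference.
    Zero: both are equally accurate. Negative: the reference is better.
    """
    return 1.0 - mae(forecast, observed) / mae(reference, observed)


# Hypothetical example with climatology (the observed mean) as reference.
observed = [10.0, 12.0, 8.0, 11.0]
model = [11.0, 12.0, 9.0, 10.0]
climatology = [sum(observed) / len(observed)] * len(observed)

score = mae_skill_score(model, climatology, observed)   # 1 - 0.75 / 1.25
```

In this toy case the model reduces the mean absolute error of the reference by 40 %, i.e. the skill score is 0.4.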
MAE skill score of the S4PT model with respect to the climatology, the preMet and subMet models, and
bias-corrected H-TESSEL runoff. The ordinate depicts the target calendar month and the abscissa the monthly lead time. Crosses
indicate
In order to conclude the analysis of the monthly predictions at Lobith and
Basel, Table
Focusing on the MOS method, Table
Mean absolute error at 0 lead time of the benchmarks climatology and H-TESSEL and the predictor
combinations from Table
MAE skill score of the S4* models relative to the preMet model (Eq.
Figure
MAE skill score of the S4PT model with respect to the preMet model for each subcatchment and 0 lead time.
Subcatchments are only coloured when the
The same skill scores from Fig.
While the first two attributes concern the geography of the subcatchments, the third attribute indicates the relevance of the initial conditions for the subsequent generation of streamflow. The fourth attribute shows how well the S4PT model performs relative to the climatology as benchmark, when it has access to the best available input data.
The resulting patterns suggest that positive skill does not depend on the subcatchment area. On the other hand, a low terrain roughness and a weak relevance of the initial conditions seem to favour positive skill. The last row finally indicates that positive skill is restricted to subcatchments where the refRun model outperforms climatology. Roughly, a hypothetical relationship appears to strengthen from the top to the bottom plots.
MAE skill score of the S4PT model with respect to the preMet model for each subcatchment and 0
lead time, plotted against subcatchment attributes (see Sect.
Figure
In addition, we see that the bias-corrected H-TESSEL runoff starts rather cautiously, but it seems to slightly outperform the S4* models at longer lead times. While the S4T model is hardly distinguishable from the preMet model, the S4P, S4PT, and S4Q models appear to outperform the preMet model within the first 20 days (Lobith) and 15 days (Basel).
For the full range of lead times, the spatial levels introduce some clear
differences (Fig.
Correlation coefficient of 5-day mean streamflow observations and
predictions for lead times up to 45 days;
Correlation coefficient of 5-day mean streamflow observations and
predictions for lead times up to 175 days;
In the case of the monthly streamflow, the refRun model ends up with a
correlation of about 0.9 for all lead times, spatial levels, and both Lobith
and Basel (Fig.
For the 5-day mean streamflow the performance of the refRun model degrades. At short
lead times the correlation amounts to about 0.8, while for longer lead times
the correlation exhibits a decreasing trend. Either the present model
formulation is less valid (especially for small values of
The spatial levels can affect the forecast quality in two ways:
by neglecting the water travel time in the subcatchment procedure, or by an aggregation of the E-OBS and S4 fields at the catchment scale that does not correspond to the appropriate spatial
resolution (e.g. large-scale grid averages cancel any spatial variability, and for catchment areas below
the grid scale a grid point does not necessarily contain information valid at the local scale).
However, clear differences between the spatial levels can only be observed
for the 5-day streamflow predictions, where at spatial levels 2 and 3 the
forecast quality is improved. Using local information of precipitation,
surface air temperature, or runoff appears to compensate for neglecting
the water travel time.
In
The analysis of the 5-day mean streamflow forecasts
(Sect.
In the case of the monthly mean streamflow forecasts at 0 lead time, the MOS
method based on precipitation or runoff provides a smaller mean absolute
error than the preMet model (Table
Figure
Within ECMWF's seasonal forecasting system S4, H-TESSEL aims to provide a
lower boundary condition for the simulation of the atmosphere and
consequently neither implements streamflow routing nor groundwater
storage
The S4Q model, which has access to the same input data and in addition
conditions on preceding precipitation and temperature, scores a lower
forecast accuracy than H-TESSEL in the case of Lobith (Table
The present study tests a model output statistics (MOS) method for monthly and 5-day mean streamflow forecasts in the Rhine basin. The method relies on the linear regression model fitted by least squares and uses predictions of precipitation and surface air temperature from the seasonal forecast system S4 of the European Centre for Medium-Range Weather Forecasts. Observations of precipitation and surface air temperature prior to the date of prediction are employed as a surrogate for the initial conditions. In addition, runoff simulated by the S4 land surface component, the H-TESSEL land surface model, is evaluated for its predictive power.
MOS methods often bridge the grid resolution of the dynamical model and the spatial scale of the actual predictand. In order to estimate how the forecast quality depends on the catchment area, a hindcast experiment for the period 1981–2011 is conducted that varies the working scale within the Rhine basin at Lobith and Basel. This variation is implemented by applying the MOS method to subcatchments and combining the resulting forecasts to predict streamflow at the main outlets at Lobith and Basel.
On average, the monthly mean streamflow forecasts based on the initial
conditions are skillful with respect to the climatology at 0 lead time for
both the Rhine at Lobith and Basel. The MOS method, which in addition has
access to the dynamical seasonal predictions, further reduces the mean
absolute error by about 5 to 12 %.
We conclude that the present model formulation – in particular the
assumption of linearity – is valid for the monthly timescale, catchments
with areas up to 160 000
The regression approach from Sect.
The authors declare that they have no conflict of interest.
This article is part of the special issue “Sub-seasonal to seasonal hydrological forecasting”. It is not associated with a conference.
Streamflow series and catchment boundaries are provided by the following public authorities: the State Institute
for the Environment, Measurements and Conservation Baden Wuerttemberg; the Bavarian Environmental Agency;
the State of Vorarlberg; the Austrian Federal Ministry of Agriculture, Forestry, Environment and Water; and the Swiss
Federal Office for the Environment. Further we acknowledge the E-OBS data set from the EU-FP6 project
ENSEMBLES (