Operational river discharge forecasting in poorly gauged basins: the Kavango River Basin case study

Operational probabilistic forecasts of river discharge are essential for effective water resources management. Many studies have addressed this topic using different approaches ranging from purely statistical black-box approaches to physically-based and distributed modelling schemes employing data assimilation techniques. However, few studies have attempted to develop operational probabilistic forecasting approaches for large and poorly gauged river basins. This study is funded by the European Space Agency under the TIGER-NET project. The objective of TIGER-NET is to develop open-source software tools to support integrated water resources management in Africa and to facilitate the use of satellite earth observation data in water management. We present an operational probabilistic forecasting approach which uses public-domain climate forcing data and a hydrologic–hydrodynamic model which is entirely based on open-source software. Data assimilation techniques are used to inform the forecasts with the latest available observations. Forecasts are produced in real time for lead times of 0 to 7 days. The operational probabilistic forecasts are evaluated using a selection of performance statistics and indicators. The forecasting system delivers competitive forecasts for the Kavango River, which are reliable and sharp. Results indicate that the value of the forecasts is greatest for intermediate lead times between 4 and 7 days.


Introduction
Operational probabilistic hydrological modelling and river discharge forecasting is an active research topic in water resources engineering and applied hydrology (Pagano et al., 2014). Sharp and reliable forecasts of river discharge are required over a range of forecasting horizons for flood and drought management. A state-of-the-art river discharge forecasting system consists of a weather forecast or an ensemble of weather forecasts (Cloke and Pappenberger, 2009), a hydrologic-hydrodynamic modelling system [...] states of rainfall-runoff models (e.g. Clark et al., 2008; Pauwels and De Lannoy, 2009), while other approaches focus the updating on the hydrodynamic parts of the model (Biancamaria et al., 2011; Neal et al., 2009). Probably the most popular algorithm used in hydrologic data assimilation is the ensemble Kalman filter (e.g. Clark et al., 2008). Alternatively, the particle filter (Moradkhani et al., 2005) can be used, which does not require the assumption of Gaussian model errors. Some studies use filtering approaches where the gain is determined heuristically from offline simulations and then used operationally in forecasting mode (Madsen and Skotner, 2005). As pointed out by Liu et al. (2012), despite the large body of literature on hydrologic data assimilation, few studies evaluate the benefit of data assimilation for actual forecasting, and practical application of data assimilation by operational agencies is rare.
In many river basins the performance of operational hydrological modelling and forecasting is limited because in-situ observations of precipitation and river discharge are scarce or unavailable. This is also the case for many of Africa's large river basins, which are poorly gauged (e.g. Zambezi, Volta, Congo). Consistent, long-term and spatially [...] techniques have the potential to fill critical data gaps in the observation of the global hydrological cycle. All major components of the water balance, except river discharge, can now be estimated based on various types of remote sensing data. However, the available techniques are still limited by coarse spatial and temporal resolution as well as large and/or poorly understood error characteristics (Tang et al., 2009). From a management perspective, one of the most important components of the hydrological cycle is river discharge. Extremely high flows in rivers cause flooding, which can have severe consequences in terms of fatalities and economic damage. Low flows cause conflicts in the allocation of scarce water resources between economic sectors and/or the environment. Therefore, in many river basins there is a need for hydrological models to provide operational estimates of river discharge based on remotely sensed observations and the limited in-situ measurements that are available.
The TIGER-NET project addresses the demand for free, up-to-date and spatially resolved water information for the African continent. The project is funded by the European Space Agency (ESA) and aims to support integrated water resources management in Africa by (i) providing access to ESA Earth observation (EO) data, (ii) developing an open-source Water Observation and Information System (WOIS) and (iii) implementing capacity building actions in collaboration with African partner institutions (Guzinski et al., 2014).
The WOIS includes a hydrological modelling component, which supports long-term scenario analysis (e.g. impact of climate change, deforestation etc.) as well as operational probabilistic forecasting. The specific objective for the operational modelling capability is to provide reliable and sharp probabilistic forecasts of river discharge over time horizons of up to one week. In addition to hydrological modelling, WOIS includes functionality for operational flood monitoring, basin characterization at high (∼ 30 m) and medium (∼ 1 km) spatial resolutions and derivation of other products requiring EO data processing and analysis (Guzinski et al., 2014). It was designed for use in African organizations, where budgetary and technical constraints often limit the use of EO data for integrated water resources management. Therefore, WOIS is based purely on free, open-source software components and was created as an easy-to-use tool for both capacity building and operational use. Among the partner institutions engaged in the TIGER-NET project is the Namibian Ministry of Agriculture, Water and Forestry. The Ministry has an interest in forecasting the discharge of the Kavango River. Based on these requirements, this study has four specific objectives:
1. Development of a robust and simple probabilistic river discharge forecasting system for poorly gauged river basins, based solely on open-source software and public-domain data.
2. Informing the forecasting system with in-situ discharge observations in real time.
3. Operational demonstration of the system for the Kavango River case study.
4. Comprehensive evaluation of the operational probabilistic forecasts using a selection of performance statistics and indicators.

Study area
The Kavango River originates in the highlands of central Angola and flows south to the border between Angola and Namibia. The Cuito River joins the Kavango River just before the river enters Namibia's Caprivi Strip. The river terminates in the Okavango Delta, a large wetland system in northern Botswana (Milzow et al., 2009). An overview of the basin is provided in Fig. 1. The basin is located on the southern fringes of the intertropical convergence zone. A strong south-to-north precipitation gradient is observed. The climate is highly seasonal and large inter-annual variations are typical, which are controlled by a number of climate time scales (McCarthy et al., 2000; Wolski et al., 2014). The Kavango River is an important resource for all riparian countries and forms the basis of many people's livelihoods (Kgathi et al., 2006). While water allocation between economic sectors and the environment has been in focus for some time, flood risk has recently become a major concern in Namibia because the northern part of the country has experienced increased magnitude and frequency of flooding events since 2008 (Wolski et al., 2014). Water managers need accurate and reliable forecasting tools to deal with both floods and droughts.

Three hydrological modelling efforts have been reported in the literature for the Kavango River Basin. Folwell and Farquharson (2006) used the Global Water Availability Assessment (GWAVA) model to assess climate change impacts in the basin. Hughes et al. (2006, 2011) calibrated a Pitman model for the basin and were able to reproduce in-situ observations satisfactorily. Milzow et al. (2011) developed a SWAT (Soil and Water Assessment Tool) model of the Kavango Basin and calibrated the model with water levels from radar altimetry, soil moisture from Envisat-ASAR and total water storage change from GRACE.

Hydrologic and hydrodynamic modelling
The modelling approach implemented in this study consists of a hydrologic (rainfall-runoff) model which is coupled to a simple hydrodynamic model for channel flow. A one-way coupling between the two model compartments is implemented, i.e. once runoff has entered the river channel, the water cannot move back into the land phase of the hydrological cycle.
We use the well-known SWAT hydrological model, version 2009 (Gassman et al., 2005; Neitsch et al., 2011) for rainfall-runoff modelling. SWAT is a semi-distributed, physically based hydrological model which operates at a daily time step. The river basin is divided into a number of sub-basins. Each sub-basin is in turn divided into hydrological response units (HRU), which are defined as portions of the sub-basin with similar terrain slope, land use and soil type. The Kavango SWAT model consists of 12 sub-basins with outlets located at the confluences of major tributaries as well as at in-situ discharge station locations (Fig. 1).
The hydrodynamic model used in this study is a simple Muskingum routing scheme. The river is divided into 12 primary individual river reaches. The primary reaches are further sub-divided if required to meet the numerical stability criteria of the Muskingum routing scheme (Chow et al., 1988). The hydrodynamic model state vector consists of the simulated discharges in each individual reach. In the Muskingum routing scheme, the model operator propagating the discharge forward in time is linear, i.e. the simulated discharges at time step t + 1 are a linear function of the simulated discharges at time step t and the runoff forcings at time steps t and t + 1:
\[ q^{t+1} = A\,q^{t} + B\,r^{t} + C\,r^{t+1} \tag{1} \]
In this equation, q is the vector of simulated discharges and r is the vector of runoff forcings, A, B and C are linear operators which depend on the configuration of the river channels and network connectivity and the superscripts indicate time steps. For details on the implementation of the Muskingum routing scheme the reader is referred to Chow et al. (1988) and Michailovsky et al. (2013).
SWAT requires the following input datasets: elevation, land cover, soil type and climate forcings. The elevation dataset is used for automatic watershed and river network delineation as well as for the determination of terrain slope. We use the ACE2 (Altimeter Corrected Elevation, version 2, Berry et al., 2010) global elevation dataset at a resolution of 30 arcseconds. The parameterization of vegetation processes in the SWAT model is based on the land cover input dataset. We use the USGS Global Land Cover Characterization (GLCC) dataset, version 2.0 with a spatial resolution of 1 km (USGS, 2008). The soil dataset forms the basis for parameterizing soil hydraulic processes in SWAT. We use the FAO/UNESCO digital soil map of the world and derived soil properties, revision 1 with a spatial resolution of 5 arcmin (FAO-Unesco, 1974).
The model is forced with daily precipitation and daily minimum and maximum temperature from the National Oceanic and Atmospheric Administration's Global Forecast System (NOAA-GFS), which provides up to seven days of forecast at a six-hourly temporal resolution and 0.5° spatial resolution (NOAA, 2014). For historical simulation periods and model calibration, forcing time series consisting of the 1 day ahead forecasts are used. In operational mode, long-term forecasts are successively replaced with short-term forecasts as time proceeds. In order to assess the performance of the NOAA-GFS precipitation forecast for the Kavango region, the 1 day ahead forecasts were compared to FEWS-RFE rainfall estimates (Herman et al., 1997). FEWS-RFE was previously found to be one of the most accurate remote sensing precipitation products for Africa (Milzow et al., 2011; Stisen and Sandholt, 2010).
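The linear form of the routing step can be illustrated for a single reach. The following is a minimal sketch of the textbook Muskingum update, not the implementation used in this study; the function names, the storage constant `k`, the weighting factor `x` and the inflow values are all hypothetical. The outflow at the next time step is a weighted sum of the inflows at both time steps and the current outflow, which is the single-reach analogue of the linear model operator described above.

```python
def muskingum_coefficients(k, x, dt):
    """Return (c0, c1, c2) for one reach; k and dt must share a time unit."""
    denom = 2.0 * k * (1.0 - x) + dt
    c0 = (dt - 2.0 * k * x) / denom          # weight of inflow at t + 1
    c1 = (dt + 2.0 * k * x) / denom          # weight of inflow at t
    c2 = (2.0 * k * (1.0 - x) - dt) / denom  # weight of outflow at t
    return c0, c1, c2


def route_reach(inflow, k, x, dt, q0):
    """Route an inflow series through one reach, starting from outflow q0."""
    c0, c1, c2 = muskingum_coefficients(k, x, dt)
    outflow = [q0]
    for t in range(len(inflow) - 1):
        # Linear update: next outflow from both inflows and current outflow
        q_next = c0 * inflow[t + 1] + c1 * inflow[t] + c2 * outflow[t]
        outflow.append(q_next)
    return outflow


# Hypothetical flood wave (m3/s) routed with k = 2 days, x = 0.2, dt = 1 day
inflow = [100.0, 150.0, 300.0, 250.0, 150.0, 100.0, 100.0]
outflow = route_reach(inflow, k=2.0, x=0.2, dt=1.0, q0=100.0)
```

The three coefficients sum to one, so a steady inflow is passed through unchanged. Requiring all three to be non-negative is one common way of expressing the stability condition, which may be why long reaches need to be sub-divided.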

Calibration and validation of the hydrologic-hydrodynamic model
Calibration and validation of the hydrologic-hydrodynamic model were performed against observed in-situ river discharge using a split-sample approach. [...] were available for calibration/validation. The objective function minimized in the calibration was formulated as a combination of NSE and ME (Eq. 2), where NSE is the Nash-Sutcliffe model efficiency (Nash and Sutcliffe, 1970) and ME is the water balance error (mean error). This formulation ensured a reasonable trade-off between fitting the observed hydrographs and matching the observed water balance of the catchment. A sequential calibration strategy was implemented: first, the subcatchments upstream of Rundu were calibrated using Rundu observations, and subsequently the subcatchments between Rundu and Mohembo were calibrated using Mohembo observations. Calibration was performed using the model-independent parameter estimation programme PEST (Doherty et al., 2014). Because of the strongly non-linear response of the SWAT rainfall-runoff model, global search strategies are the preferred option for calibration of SWAT models (Arnold et al., 2012). We use the shuffled complex evolution (SCE) algorithm (Duan et al., 1992), which performs a global search over the entire allowed parameter space. The SCE algorithm is included in the PEST package (SCEUA_P).
The selection of calibration parameters was the result of an iterative procedure including extensive sensitivity analysis and repeated trial model runs. The final selection was based on the following principles: (i) Spatial variation of vegetation and soil parameters is determined by the input datasets and should be left unchanged during calibration. The corresponding SWAT parameters were either not changed at all or multiplied with a global factor. (ii) The water balance of the rainfall-runoff model should be maintained. Therefore the fraction of the recharge entering the deep aquifer was set to zero. (iii) SWAT groundwater parameters are highly uncertain a priori but at the same time very sensitive. Enough spatial variation in groundwater parameters must be allowed in order to reproduce the various recession time scales in the observed hydrographs. (iv) SWAT has two threshold values of the shallow groundwater storage, one controlling the onset of baseflow and one controlling the onset of phreatic evapotranspiration. The absolute magnitudes of the two threshold values are less important because they mainly control the length of the required model warm-up period. However, the difference between these two threshold values has significant control over the water balance of the catchment: if the baseflow threshold is below the phreatic ET threshold, more water will leave the catchment as baseflow and less as actual ET and vice versa. In order to reduce parameter correlation and non-uniqueness, the baseflow threshold was generally fixed at 100 mm in the Kavango SWAT model. Table 1 provides an overview of the calibration parameters and their allowed ranges. For the groundwater parameters, spatial variation was allowed between the Rundu and Mohembo regions, the upstream and downstream catchments within each region and the high slope and low slope portions of the land surface. This resulted in a total number of 19 calibration parameters for the Rundu region and 20 calibration parameters for the Mohembo region. We chose 8 complexes in the SCE calibration run and the number of complexes remained the same throughout the run. Both the number of parameter sets in each complex and the number of evolution steps before complex shuffling were set to 39 and 41 for the Rundu and Mohembo regions respectively, i.e. 2n + 1 for n calibration parameters. The convergence criterion was set to a relative improvement of the best objective function of 1 % over 10 shuffling loops. A total of 50 000 model runs were allowed; however, the calibration converged after 14 711 and 18 373 model runs for the Rundu and Mohembo regions respectively. After completion of the SCE run, the evolution of the parameter values over the course of the shuffling loops was evaluated. All parameter values converged to a stable solution away from the a priori parameter bounds.
HESSD 11, 2014, Operational river discharge forecasting in poorly gauged basins: the Kavango River case study, P. Bauer-Gottwein et al.

Assimilation strategy
The objective of data assimilation is to combine, at each point in time, the model-based estimate of the state of the system with the most recent observations of the state, to produce the best possible estimate of the current and future states, taking into account the respective uncertainties of simulated states and observations. The assimilation strategy chosen in this study consists of updating the simulated discharge in the Muskingum routing model only, because the objective was to generate probabilistic river discharge forecasts with lead times of up to 7 days. Updates of the rainfall-runoff model states would probably improve long-term forecasts significantly but may have limited effect on forecasts with short lead times in large basins such as the Kavango Basin. Moreover, updating the rainfall-runoff model would require ensemble-based assimilation approaches. For the intended user group of the TIGER-NET products, simplicity and efficiency are key criteria.
Observed in-situ discharge at the station Rundu is assimilated into the model in the operational runs. Because both the Muskingum routing operator and the measurement operator are linear, we can use the standard Kalman filter for state updating. The Kalman filter is the optimal sequential assimilation method for linear dynamics (Kalman, 1960). If, instead of river discharge, water level measurements from spaceborne or ground-based instruments are assimilated, the measurement operator becomes non-linear and the extended Kalman filter can be used (Michailovsky et al., 2013). The reader is referred to the literature (e.g. Jazwinski, 1970) for a detailed discussion of the Kalman filter equations.
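To make the update step concrete, the sketch below shows a scalar Kalman filter, i.e. a single discharge state observed directly. It is an illustration under simplifying assumptions and not the multi-reach filter used in this study; the function names and all numbers are hypothetical. The observation variance is built from a relative error of 10 %, chosen here purely for illustration.

```python
def kalman_predict(x, p, a, q):
    """Propagate mean x and variance p through x_next = a * x + noise(q)."""
    return a * x, a * p * a + q


def kalman_update(x, p, z, r, h=1.0):
    """Update the forecast (x, p) with an observation z of variance r."""
    s = h * p * h + r            # innovation variance
    k = p * h / s                # Kalman gain
    x_new = x + k * (z - h * x)  # corrected state estimate
    p_new = (1.0 - k * h) * p    # reduced state variance
    return x_new, p_new


# Hypothetical numbers: simulated discharge 300 m3/s with an SD of 60 m3/s,
# observed discharge 250 m3/s with a 10 % relative error
x_f, p_f = 300.0, 60.0 ** 2
z = 250.0
r = (0.10 * z) ** 2
x_a, p_a = kalman_update(x_f, p_f, z, r)
```

Because the observation is more certain than the simulation in this example, the updated estimate is pulled most of the way towards the observation, and its variance is smaller than both the forecast and the observation variance.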

Description of the model error
Runoff is assumed to be the dominant source of error in the routing model. While the routing model parameters, which depend on reach geometries and Manning's friction factors, are uncertain, runoff uncertainty can be expected to be much more significant due to the error in the NOAA-GFS rainfall forcing as well as structural deficiencies and/or parameterization errors in the SWAT model. In order to find a reasonable representation of the model error, the magnitude, auto-correlation and spatial cross-correlation of the runoff error had to be assessed. No direct measurements of runoff are available within the river basin. To derive an operational error model, we assume that magnitude and autocorrelation of the relative runoff error are the same as magnitude and autocorrelation of the relative model residuals at the available in-situ discharge stations. The relative model residual w_t (-) is computed (Eq. 3) from Q_sim,t, the modelled discharge at the in-situ discharge station at time step t, and Q_obs,t, the in-situ discharge at time step t. The autocorrelation of the residuals was assumed to be represented by a first order autoregressive (AR1) model:
\[ w_{t} = \delta\, w_{t-1} + \varepsilon_{t} \tag{4} \]
where δ is the AR1 parameter and ε is a sequence of white Gaussian noise with a spatial covariance Q. Due to the correlated meteorological inputs, the runoff forcing error was assumed to be spatially correlated between the various subcatchments of the model. We assume that the spatial correlation of the runoff forcing error is equivalent to the spatial correlation of the runoff forcing itself. The correlation matrix of the runoff inputs was computed and Q was set to:
\[ Q = \sigma(\varepsilon)^{2}\, C \tag{5} \]
where C is the runoff correlation matrix and σ(ε)^2 is the variance of the white noise component of the AR1 model.
The auto-correlated runoff error state was integrated in the Kalman filter updating scheme by augmenting the model state vector with the correlated noise term (Jazwinski, 1970; Michailovsky et al., 2013). This ensures persistence of assimilation benefits in time.
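The error model described above can be sketched in a few lines. The example below is illustrative only: it estimates the AR1 parameter and the white-noise SD from a residual series by least squares without an intercept (an assumption, since the estimation method is not stated in the text) and then scales a correlation matrix by the noise variance. The function names, the synthetic residual series and the 2 x 2 correlation matrix are hypothetical.

```python
import random


def fit_ar1(w):
    """Estimate delta and the white-noise SD from a residual series w."""
    num = sum(w[t] * w[t - 1] for t in range(1, len(w)))
    den = sum(w[t - 1] ** 2 for t in range(1, len(w)))
    delta = num / den  # least-squares slope through the origin
    eps = [w[t] - delta * w[t - 1] for t in range(1, len(w))]
    sigma = (sum(e * e for e in eps) / len(eps)) ** 0.5
    return delta, sigma


def noise_covariance(sigma, corr):
    """Scale a correlation matrix by the noise variance: Q = sigma^2 * C."""
    return [[sigma ** 2 * c for c in row] for row in corr]


# Synthetic residuals generated from a known AR1 process (hypothetical values)
random.seed(0)
true_delta, true_sigma = 0.9, 0.05
w = [0.0]
for _ in range(5000):
    w.append(true_delta * w[-1] + random.gauss(0.0, true_sigma))

delta_hat, sigma_hat = fit_ar1(w)
Q = noise_covariance(sigma_hat, [[1.0, 0.8], [0.8, 1.0]])
```

With a long enough series the fitted values should land close to the parameters used to generate it, which gives a quick sanity check before the same fit is applied to real residuals.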
The major source of error in in-situ discharge observations is the rating curve, which is used to transform readings of river stage into river discharge. Rating curves are particularly unreliable for extreme flow rates and, depending on the channel characteristics, the rating curve changes over time and requires frequent updating. In the absence of detailed information on the in-situ measurement procedure, we assumed the measurement error to be uncorrelated in time and proportional to the discharge. The relative error was assumed to be 10 %, which is a typical value for in-situ discharge derived from rating curves (Di Baldassarre, 2009) and comparable to other hydrologic data assimilation studies (e.g. Clark et al., 2008).

Operational forecasting and performance evaluation
Operational forecasts have been issued on a daily basis for the validation period and supplied to Namibia's Ministry of Agriculture, Water and Forestry for web-based dissemination. A set of criteria were used to assess the performance of the probabilistic river [...] of the central model forecast, as well as the reliability and sharpness of the probabilistic forecasts. The following criteria were used to assess the performance of the central model forecast: Nash-Sutcliffe model efficiency (NSE), root-mean-square error (RMSE), mean error (ME) and persistence index. The persistence index (PI, Bennett et al., 2013) is defined analogously to the NSE:
\[ \mathrm{PI} = 1 - \frac{\sum_{i=1}^{n} \left(Q_{i} - Q_{\mathrm{obs},i}\right)^{2}}{\sum_{i=1}^{n} \left(Q_{\mathrm{last},i} - Q_{\mathrm{obs},i}\right)^{2}} \tag{6} \]
where n is the number of forecasted observations, Q are the forecasts, Q_obs are the observations and Q_last is the latest available observation before the forecasted observation. While the NSE uses the average of the observations as the benchmark (i.e. a forecast that performs as well as the long-term average of the available observations scores an NSE of 0), the PI uses the last available observation as the benchmark (i.e. a forecast that performs as well as the latest available observation scores a PI of 0).
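The criteria for the central forecast can be written compactly. The sketch below is illustrative; the function names and discharge values are hypothetical, and the sign convention for the mean error (simulated minus observed) is an assumption. The NSE and the PI differ only in the benchmark used in the denominator.

```python
def nse(sim, obs):
    """Nash-Sutcliffe efficiency: benchmark is the mean of the observations."""
    mean_obs = sum(obs) / len(obs)
    num = sum((s - o) ** 2 for s, o in zip(sim, obs))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def persistence_index(sim, obs, last):
    """Persistence index: benchmark is the last observation before each forecast."""
    num = sum((s - o) ** 2 for s, o in zip(sim, obs))
    den = sum((l - o) ** 2 for l, o in zip(last, obs))
    return 1.0 - num / den


def rmse(sim, obs):
    """Root-mean-square error of the central forecast."""
    return (sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(obs)) ** 0.5


def mean_error(sim, obs):
    """Mean error, here taken as simulated minus observed (assumed sign)."""
    return sum(s - o for s, o in zip(sim, obs)) / len(obs)


# Hypothetical discharges (m3/s)
obs = [100.0, 120.0, 150.0, 130.0]       # observations being forecasted
last = [90.0, 100.0, 120.0, 150.0]       # latest observation before each forecast
forecast = [105.0, 115.0, 140.0, 135.0]  # central forecasts

scores = {
    "NSE": nse(forecast, obs),
    "PI": persistence_index(forecast, obs, last),
    "RMSE": rmse(forecast, obs),
    "ME": mean_error(forecast, obs),
}
```

A forecast equal to the benchmark scores exactly zero on the corresponding index, so positive values indicate that the forecast adds skill relative to that benchmark.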
Reliability and sharpness of the probabilistic forecasts were assessed with the coverage of the 95 % confidence interval (i.e. percentage of observations that fall within the predicted nominal 95 % confidence interval), the sharpness of the 95 % confidence interval (width of predicted 95 % confidence interval), the Interval Skill Score (ISS) of the 95 % confidence interval as well as the continuous ranked probability score (CRPS). The ISS is defined according to Gneiting and Raftery (2007) as:
\[ \mathrm{ISS}_{\alpha} = (u - l) + \frac{2}{\alpha}\,(l - Q_{\mathrm{obs}})\,\mathbf{1}\{Q_{\mathrm{obs}} < l\} + \frac{2}{\alpha}\,(Q_{\mathrm{obs}} - u)\,\mathbf{1}\{Q_{\mathrm{obs}} > u\} \tag{7} \]
where α is the level of the confidence interval (0.05 in our case), l is the lower and u the upper bound of the confidence interval, and 1{·} is the indicator function.
The CRPS is a verification tool for probabilistic forecasts and can be interpreted as the area between the cumulative distribution function of the forecast and the cumulative distribution function of the observation, which is a Heaviside step function. The CRPS thus compares the full distribution function of the forecast with the observation and not only selected confidence intervals. For normally distributed forecasts, a closed-form expression for the CRPS exists (Gneiting et al., 2004):
\[ \mathrm{CRPS} = \sigma \left[ z\,\bigl(2\Phi(z) - 1\bigr) + 2\varphi(z) - \frac{1}{\sqrt{\pi}} \right], \qquad z = \frac{Q_{\mathrm{obs}} - Q}{\sigma} \tag{8} \]
where σ is the SD of the probabilistic forecast, Q is the central forecast, Φ is the cumulative distribution function and φ the probability density function of the standard normal distribution.
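Both probabilistic scores are straightforward to evaluate for a single forecast. The following sketch implements the interval score of Gneiting and Raftery (2007) and the closed-form CRPS for a normal forecast distribution; the function names and the example numbers are hypothetical, and averaging over many forecasts is left to the caller. Lower values are better for both scores.

```python
import math


def interval_score(lower, upper, obs, alpha=0.05):
    """Interval score for a central (1 - alpha) prediction interval."""
    width = upper - lower
    penalty = 0.0
    if obs < lower:
        penalty = 2.0 / alpha * (lower - obs)   # observation below the interval
    elif obs > upper:
        penalty = 2.0 / alpha * (obs - upper)   # observation above the interval
    return width + penalty


def crps_normal(mu, sigma, obs):
    """Closed-form CRPS for a normal forecast with mean mu and SD sigma."""
    z = (obs - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))


# Hypothetical forecast: 95 % interval of 90 to 110 m3/s
inside = interval_score(90.0, 110.0, 100.0)   # only the interval width counts
outside = interval_score(90.0, 110.0, 115.0)  # width plus a penalty for the miss
crps = crps_normal(mu=100.0, sigma=5.0, obs=104.0)
```

The interval score rewards narrow intervals but penalizes each miss by 2/α times the distance to the interval, which is how a loss of reliability can outweigh a gain in sharpness.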

Comparison of precipitation products
Comparison of the FEWS-RFE and NOAA-GFS precipitation products showed large deviations between the two products. Figure 2 shows a double mass plot for the average precipitation over the entire Kavango River catchment for the period 2005-2012. There is a significant bias, and the timing of precipitation events is inconsistent too, as evidenced by the wiggles in the double mass curve. [...] (Stisen and Sandholt, 2010). We therefore assume that the FEWS-RFE product is closer to the unknown true precipitation than NOAA-GFS and bias-correct the NOAA-GFS data so that the long-term average precipitation of the two products matches. A spatially and temporally constant precipitation correction factor of 0.67 was therefore used throughout the study. Clearly, the quality of the precipitation forcing is a critical issue, which has significant control over the performance of the forecasting system. Within the TIGER-NET framework, we are dependent on public-domain datasets and NOAA-GFS was the only free source of operational weather forecasts for the African continent available to the project. Potentially, model performance could be improved if NOAA-GFS data were corrected dynamically, for instance by continuously benchmarking it against real-time or near real-time precipitation products such as FEWS-RFE or TRMM-3B42 (Huffman et al., 2007) for the recent past and estimating a time-variable bias correction. An even better solution would be to merge NOAA-GFS data with in-situ precipitation data. However, no operational dataset of in-situ precipitation observations is available for this part of Africa.
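A constant correction factor of this kind is simply the ratio of the long-term precipitation totals of the two products. The sketch below illustrates the idea with made-up numbers; the function names and precipitation values are hypothetical and were chosen only so that the resulting factor is of a similar magnitude to the one reported above.

```python
def bias_correction_factor(reference, forecast):
    """Ratio of long-term totals, so corrected forecasts match the reference mean."""
    return sum(reference) / sum(forecast)


def apply_correction(forecast, factor):
    """Scale every forecast value by the same constant factor."""
    return [factor * p for p in forecast]


# Hypothetical basin-average daily precipitation (mm)
reference = [2.0, 0.0, 4.0, 2.0]   # stand-in for the reference product
forecast = [3.0, 0.0, 6.0, 3.0]    # stand-in for the forecast product

factor = bias_correction_factor(reference, forecast)
corrected = apply_correction(forecast, factor)
```

A single factor matches the long-term mean but leaves timing errors untouched, which is why a time-variable correction may be worth exploring.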

Performance of the calibrated model
Calibration of the hydrologic-hydrodynamic modelling system with the SCE algorithm was successful. [...] region. While the years 2005-2008 were relatively dry, the following years were exceptionally wet in the region (Wolski et al., 2014). The calibrated SWAT model cannot match these inter-annual dynamics and ends up over-predicting flow in the dry years and under-predicting flow in the wet years. Figures 3 and 4 show comparisons of simulated and observed hydrographs for both stations and both simulation periods. Table 1 provides an overview of the calibrated parameter values. All parameter values are physically reasonable and calibrated parameter values do not stick to the bounds of the a priori parameter intervals.
Model residuals were analysed and tested for normality and autocorrelation. Figure 5 summarizes the results of the model error analysis. Figure 5a plots the relative error of the hydrologic-hydrodynamic model vs. the observed discharge. The relative error is clearly not independent of discharge; it is higher for low discharge than for high discharge. The Q-Q plot in Fig. 5b shows that the empirical distribution of model errors significantly deviates from a normal distribution. The empirical distribution of the model errors is narrower than the normal distribution and a larger portion of the data is clustered around the mean. The correlogram in Fig. 5c shows highly significant auto-correlation of the model errors. Figure 5d shows the residual model errors (ε) after application of the AR1 model (Eq. 4), plotted against the observed discharge. This distribution looks more even than the distribution of the primary model residuals in Fig. 5a. A test for normality using the Q-Q plot shows significant deviations and again a narrower distribution than the normal distribution (Fig. 5e). Temporal correlations have been effectively removed from the model errors and no significant correlations remain, as shown in Fig. 5f. We conclude from this analysis that the relative error of the hydrologic-hydrodynamic model can be reasonably represented with an AR1 model. The time correlation of the AR1 model is δ = 0.9917 on the daily time step. The random error contribution is ε = 0.0438. As explained in the methods section, we assume that the same AR1 model parameters can represent the relative error of the runoff forcing and we use this result to parameterize the model error in the Kalman filter assimilation scheme.
Table 3 reports the performance statistics for the probabilistic model runs. We report results for the open-loop run without assimilation, the assimilation run ("now-casting") as well as the 1-7 day ahead forecasts. We only assimilate data from the station Rundu, because (i) no real-time observations are available for Mohembo and (ii) this enables us to assess the effect of upstream assimilation on a downstream station. The indicators are reported for both in-situ stations and for the calibration and the validation period.

Discharge forecasting and data assimilation
We are well aware that the observations in the calibration period have already been used for model calibration and are now used again for assimilation. Still, we feel that it is useful to present the statistics for information. Figure 6 shows the open-loop and assimilation run for the station Rundu during calibration and validation periods. [...] Fig. 6. This results in a relatively high ISS score. The assimilation run is much sharper for all stations and periods and we do not observe a significant loss of reliability, except for Mohembo during the validation period. This can again be explained by the low number of observations at Mohembo during the validation period as well as relative over-sampling of the high-flow period. ISS scores are consequently much lower than for the open-loop run, which indicates massive improvement. The 1-7 day ahead forecast runs show degrading performance for increasing lead times. However, even the 7 day ahead forecast generally has a lower ISS than the open-loop run, except for Rundu during the validation period. Figure 8 graphically summarizes the performance indicators. Clearly, the central forecast is better for all lead times than the central run in the open-loop simulation. All three indicators (NSE, RMSE and ME) show significant improvement. Coverage decreases rapidly with increasing lead time for the station Rundu but is more or less independent of lead time for the station Mohembo. This can be explained by the routing time lag between the two stations. Improvements due to assimilation of Rundu data travel down to Mohembo and are still visible at this station after many days. For the station Rundu, increased sharpness is over-compensated by loss of reliability, which leads to increasing ISS scores with increasing lead time. For the validation period, only the 0-3 day ahead forecasts are better than the open-loop run, if evaluated with the ISS score.
We generally observe weaker performance of the forecasting system after longer periods without in-situ observations. If no in-situ data is available for some time, the model error increases. After observations become available again, large updates are applied to the model states by the Kalman filter, which then travel downstream in the river and can cause erratic response. Table 4 shows performance indicators of the forecasting system for a portion of the validation period. In this dataset, the first few observations that come in after extended periods without observations are removed. In total, about 15 % of the observations were discarded when computing the performance indicators. For this reduced dataset, the ISS score for all forecasting horizons remains well below the score of the open-loop simulation. For this dataset, we also computed the persistence index and the CRPS score, which are graphically displayed in Fig. 9. According to the CRPS score, the forecasts are far superior to the open-loop run for all forecasting horizons. The persistence index only evaluates the performance of the central forecast and compares it to the performance of a deterministic forecast equal to the last available observation. The results indicate that, for short lead times, the last observation outperforms the central forecast, while for longer lead times, the forecasting system performs better than the last observation. The break-even point occurs somewhere around a lead time of 4 days. However, it is important to note that the PI does not assess the quality of probabilistic forecasts in terms of sharpness and reliability but only takes the central forecast into account.

Discussion
The presented approach for the generation of probabilistic river discharge forecasts is simple and robust and designed to work in data-sparse and poorly gauged basins.
A key factor for the performance of the system is the rainfall forcing. While the NOAA-GFS rainfall can produce reasonably reliable and sharp forecasts for the Kavango River, the product should be further compared against other operational precipitation products. A promising avenue for future research may be dynamic bias correction using other precipitation or soil moisture products. From Fig. 9, we conclude that extending the forecast lead time beyond 7 days could add value to the system, because CRPS scores are still well below the open-loop score at 7 day lead time and the persistence index indicates break-even at around 4 days. NOAA-GFS in fact provides forecasts up to 16 days into the future. However, the spatial resolution is reduced by a factor of 2 for forecasting horizons beyond one week. It may nevertheless be valuable to explore the use of longer-range weather forecasts. To further improve the reliability and sharpness of the forecasts, an ensemble of weather forecasts should be used to drive the forecasting system (Cloke and Pappenberger, 2009). However, such systems are presently not available for Southern Africa.
As in other hydrologic data assimilation studies (e.g. Clark et al., 2008), parameterization of the model error is a fundamental issue for the performance of the assimilation scheme. Generally, model error terms can be added to the forcings, the states, and the parameters of a model. Here, we assign all model error to the runoff forcing and quantify the magnitude and auto-correlation of the error based on the comparison of simulated and observed river discharge. Unlike other authors, we do not apply error terms to the states and parameters of the routing model, because we assume that these error contributions are minor compared to the runoff error. While this approach is robust and ...
As is common for studies dealing with probabilistic river discharge forecasting, we find that our probabilistic forecasts are over-reliable during low-flow periods and under-reliable during high-flow periods. This issue can be addressed by separating the total runoff forcing generated by the SWAT model into its components, i.e. overland flow, interflow and baseflow, and developing separate error representations for the various runoff components. However, given the sparse availability of in-situ observations in the basins, it may be difficult to find robust parameters for these error representations.
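The error representations discussed in this section are characterised by a magnitude and an auto-correlation. As an illustration only, a first-order autoregressive, AR(1), process is one common way to express these two properties. The sketch below estimates the lag-1 auto-correlation and standard deviation from a residual series and generates synthetic errors with the same statistics. The AR(1) form, the function names `fit_ar1` and `simulate_ar1`, and the use of a generic residual series are assumptions made for this example; the sketch does not reproduce how the runoff error was derived from discharge residuals in this study.

```python
import numpy as np


def fit_ar1(residuals):
    """Estimate lag-1 auto-correlation and standard deviation of a residual series."""
    r = np.asarray(residuals, dtype=float)
    r = r - r.mean()
    rho = float(np.sum(r[1:] * r[:-1]) / np.sum(r * r))
    sigma = float(r.std())
    return rho, sigma


def simulate_ar1(rho, sigma, n, seed=0):
    """Generate n samples of a stationary AR(1) process with marginal std sigma."""
    rng = np.random.default_rng(seed)
    # Innovation std chosen so that the marginal std of the process is sigma.
    innovation_sd = sigma * np.sqrt(1.0 - rho ** 2)
    e = np.empty(n)
    e[0] = rng.normal(0.0, sigma)
    for t in range(1, n):
        e[t] = rho * e[t - 1] + rng.normal(0.0, innovation_sd)
    return e


# Round trip on synthetic data: the fitted values should be close to the inputs.
synthetic = simulate_ar1(rho=0.8, sigma=2.0, n=20000, seed=1)
print(fit_ar1(synthetic))
```

Separate error representations for overland flow, interflow and baseflow would amount to fitting one such parameter pair per component, which is where the sparse in-situ record becomes the limiting factor.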
In this study, the focus has been on the final output of the modelling chain, i.e. river discharge. However, SWAT simulates a multitude of intermediate states and fluxes in the land phase of the hydrological cycle, which could be analysed and compared to observations, if such observations were available. There is an obvious opportunity to inform the modelling system with other types of in-situ and remote sensing observations such as radar altimetry, soil moisture and total water storage from time-variable gravity (Milzow et al., 2011). If such data were to be formally assimilated into the modelling system, an ensemble approach would have to be chosen because of the highly non-linear responses inherent in the SWAT model. Many studies have addressed ensemble-based streamflow forecasting with lumped-conceptual or distributed hydrological models. Common issues in these studies are high computational demand, time lags between the rainfall-runoff model states and streamflow response, and model error parameterization (e.g. Clark et al., 2008).

Conclusions
We have presented an operational probabilistic river discharge forecasting system for poorly gauged basins which relies exclusively on public-domain, open-source software and data. The forecasting system is specifically adapted to the conditions prevailing in many African basins, such as weak in-situ monitoring infrastructure, budget constraints for operational monitoring and management as well as weak institutional capacity. We demonstrated the performance of the forecasting system for the Kavango River and obtained encouraging results. Zero to 7 day ahead probabilistic forecasts produced by the system are sharp and reliable. The results indicate that forecasting horizons could be extended to more than seven days, if suitable weather forecasting products can be made available. The system may also benefit from ingestion of other types of in-situ or remotely sensed observations such as radar altimetry and soil moisture.