Articles | Volume 18, issue 8
Research article
05 Aug 2014

Benchmarking hydrological models for low-flow simulation and forecasting on French catchments

P. Nicolle, R. Pushpalatha, C. Perrin, D. François, D. Thiéry, T. Mathevet, M. Le Lay, F. Besson, J.-M. Soubeyroux, C. Viel, F. Regimbeau, V. Andréassian, P. Maugis, B. Augeard, and E. Morice

Abstract. Low-flow simulation and forecasting remain difficult for hydrological modellers, and intercomparisons can be extremely instructive for assessing existing low-flow prediction models and for developing more efficient operational tools. This research presents the results of a collaborative experiment conducted to compare low-flow simulation and forecasting models on 21 unregulated catchments in France. Five hydrological models (four lumped storage-type models, namely Gardenia, GR6J, Mordor and Presages, and one distributed physically oriented model, SIM) were applied within a common evaluation framework and assessed using a common set of criteria. Two simple benchmarks describing the average streamflow variability were used to set minimum levels of acceptability for model performance in simulation and forecasting modes. Results showed that, in both simulation and forecasting modes, all hydrological models performed almost systematically better than the benchmarks. Although no single model outperformed all the others for all catchments and criteria, a few models appeared to be more satisfactory than the others on average. In simulation mode, all attempts to relate model efficiency to catchment or streamflow characteristics remained inconclusive. In forecasting mode, we defined maximum useful forecasting lead times beyond which a model no longer provides useful information relative to the benchmark. As expected, this maximum useful lead time varies between catchments, but it also depends on the model used. Simple multi-model approaches that combine the outputs of the five hydrological models were tested to improve simulation and forecasting efficiency. We found that the multi-model approach was more robust and could provide better performance than individual models on average.
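
The ideas of benchmarking, maximum useful lead time and multi-model combination summarised above can be illustrated with a minimal Python sketch. It is not the evaluation code used in the study: the MSE-based skill score, the "skill greater than zero" threshold, the equal-weight average and all function names are assumptions chosen for illustration, and the criteria actually used are those defined in the full paper.

```python
def mse(sim, obs):
    """Mean squared error between a simulated and an observed series."""
    return sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(obs)


def skill_vs_benchmark(model, benchmark, obs):
    """Assumed skill score: 1 - MSE(model) / MSE(benchmark).

    Positive values mean the model beats the benchmark,
    zero or negative values mean it adds nothing over it.
    """
    return 1.0 - mse(model, obs) / mse(benchmark, obs)


def max_useful_lead_time(skill_by_lead):
    """Longest lead time reached before skill first drops to zero or below.

    skill_by_lead[0] is the skill at lead time 1, skill_by_lead[1] at
    lead time 2, and so on. Returns 0 if the model never beats the benchmark.
    """
    useful = 0
    for lead, skill in enumerate(skill_by_lead, start=1):
        if skill <= 0:
            break
        useful = lead
    return useful


def multi_model_mean(simulations):
    """Equal-weight average of several model outputs at each time step."""
    return [sum(values) / len(values) for values in zip(*simulations)]
```

For instance, with hypothetical skill values of 0.5, 0.2, -0.1 and 0.3 at lead times 1 to 4, `max_useful_lead_time` returns 2, because the forecast stops improving on the benchmark from lead time 3 onwards. An equal-weight mean is only one of several possible combination schemes.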