Articles | Volume 14, issue 12
Research article
13 Dec 2010
Research article |  | 13 Dec 2010

Why hydrological predictions should be evaluated using information theory

S. V. Weijs, G. Schoups, and N. van de Giesen

Abstract. Probabilistic predictions are becoming increasingly popular in hydrology. Equally important are methods to test such predictions, given the topical debate on uncertainty analysis in hydrology. Also in the special case of hydrological forecasting, there is still discussion about which scores to use for their evaluation. In this paper, we propose to use information theory as the central framework to evaluate predictions. From this perspective, we hope to shed some light on what verification scores measure and should measure. We start from the ''divergence score'', a relative entropy measure that was recently found to be an appropriate measure for forecast quality. An interpretation of a decomposition of this measure provides insight in additive relations between climatological uncertainty, correct information, wrong information and remaining uncertainty. When the score is applied to deterministic forecasts, it follows that these increase uncertainty to infinity. In practice, however, deterministic forecasts tend to be judged far more mildly and are widely used. We resolve this paradoxical result by proposing that deterministic forecasts either are implicitly probabilistic or are implicitly evaluated with an underlying decision problem or utility in mind. We further propose that calibration of models representing a hydrological system should be the based on information-theoretical scores, because this allows extracting all information from the observations and avoids learning from information that is not there. Calibration based on maximizing utility for society trains an implicit decision model rather than the forecasting system itself. This inevitably results in a loss or distortion of information in the data and more risk of overfitting, possibly leading to less valuable and informative forecasts. We also show this in an example. The final conclusion is that models should preferably be explicitly probabilistic and calibrated to maximize the information they provide.