Looking beyond general metrics for model comparison – lessons from an international model intercomparison study
Tanja de Boer-Euser, Laurène Bouaziz, Jan De Niel, Claudia Brauer, Benjamin Dewals, Gilles Drogue, Fabrizio Fenicia, Benjamin Grelier, Jiri Nossent, Fernando Pereira, Hubert Savenije, Guillaume Thirel, and Patrick Willems
Received: 08 Jul 2016 – Discussion started: 20 Jul 2016 – Revised: 29 Nov 2016 – Accepted: 16 Dec 2016 – Published: 25 Jan 2017
Abstract. International collaboration between research institutes and universities is a promising way to reach consensus on hydrological model development. Although model comparison studies are very valuable for international cooperation, they often do not lead to clear new insights regarding the relevance of the modelled processes. We hypothesise that this is partly caused by model complexity and by the comparison methods used, which focus too much on good overall performance rather than on a variety of specific events. In this study, we use an approach that focuses on the evaluation of specific events and characteristics. Eight international research groups calibrated their hourly model on the Ourthe catchment in Belgium and carried out a validation in time for the Ourthe catchment and a validation in space for nested and neighbouring catchments. The same protocol was followed for each model and an ensemble of best-performing parameter sets was selected. Although the models showed similar performances based on general metrics (i.e. the Nash–Sutcliffe efficiency), clear differences could be observed for specific events. We analysed the hydrographs of these specific events and conducted three types of statistical analyses on the entire time series: cumulative discharges, empirical extreme value distribution of the peak flows and flow duration curves for low flows. The results illustrate the relevance, for the studied catchments, of including a very quick flow reservoir preceding the root zone storage to model peaks during low flows and of including a slow reservoir in parallel with the fast reservoir to model the recession. This intercomparison enhanced the understanding of the hydrological functioning of the catchment, in particular for low flows, and made it possible to identify present knowledge gaps for other parts of the hydrograph. Above all, it helped to evaluate each model against a set of alternative models.
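To make the contrast between a general metric and a characteristic-specific analysis concrete, the following is a minimal sketch, not the code used in the study. It computes the Nash–Sutcliffe efficiency over a whole series and an empirical flow duration curve that can be inspected at its low-flow end. The function names, the Weibull plotting position and the discharge values are illustrative assumptions and do not come from the study.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((obs - sim)^2) / sum((obs - mean_obs)^2)."""
    if not observed or len(observed) != len(simulated):
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    obs_variance = sum((o - mean_obs) ** 2 for o in observed)
    if obs_variance == 0:
        raise ValueError("observed series is constant; NSE is undefined")
    return 1.0 - squared_error / obs_variance


def flow_duration_curve(discharge):
    """Return (exceedance probability, discharge) pairs, highest flow first.

    Assumption: Weibull plotting position rank / (n + 1); other plotting
    positions are equally common.
    """
    ranked = sorted(discharge, reverse=True)
    n = len(ranked)
    return [(rank / (n + 1), q) for rank, q in enumerate(ranked, start=1)]


if __name__ == "__main__":
    # Hypothetical discharges (arbitrary units), purely for illustration.
    observed = [1.0, 1.2, 8.0, 20.0, 9.0, 3.0, 1.5, 1.0]
    model_a = [1.6, 1.8, 8.0, 20.0, 9.0, 3.0, 2.1, 1.6]  # errors only in low flows
    model_b = [1.0, 1.2, 8.0, 18.8, 9.0, 3.0, 1.5, 1.0]  # error only in the peak

    for name, simulated in (("A", model_a), ("B", model_b)):
        print(name, "NSE:", round(nash_sutcliffe(observed, simulated), 4))
        # The last pairs of the curve hold the lowest flows.
        print(name, "low-flow end of FDC:", flow_duration_curve(simulated)[-3:])
```

Because the squared errors in the efficiency are dominated by high flows, two simulations can score almost identically while differing clearly at the low-flow end of the flow duration curve, which is the kind of difference the event-based evaluation is designed to expose.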
In this study, the rainfall–runoff models of eight international research groups were compared for a set of subcatchments of the Meuse basin to investigate the influence of certain model components on the modelled discharge. Although the models showed similar performances based on general metrics, clear differences could be observed for specific events. The differences during drier conditions could indeed be linked to differences in model structures.