Legitimising data-driven models: exemplification of a new data-driven mechanistic modelling framework
Abstract. This paper addresses the difficult problem of how to legitimise data-driven hydrological models, using a simple artificial neural network modelling problem as an example. Many data-driven models in hydrology have been criticised for their black-box characteristics, which prevent adequate understanding of their mechanistic behaviour and restrict their wider heuristic value. In response, we present a new, generic data-driven mechanistic modelling framework. The framework is significant because it incorporates an evaluation of the legitimacy of a data-driven model's internal modelling mechanism as a core element of the modelling process. The framework's value is demonstrated using two simple artificial neural network river forecasting scenarios. We develop a novel adaptation of first-order partial derivative relative sensitivity analysis to enable each model's mechanistic legitimacy to be evaluated within the framework. The results demonstrate the limitations of standard goodness-of-fit validation procedures by highlighting how the internal mechanisms of the complex models that produce the best fit scores can have lower mechanistic legitimacy than those of simpler counterparts whose scores are only slightly inferior. Thus, our study directly tackles one of the key debates in data-driven hydrological modelling: is it acceptable for our ends (i.e. model fit) to justify our means (i.e. the numerical basis by which that fit is achieved)?
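To make the idea of first-order partial derivative relative sensitivity concrete, the following is a minimal sketch and not the adaptation developed in this paper. It assumes a hypothetical network with one hidden layer of tanh units and a single linear output, and it uses the common definition of relative sensitivity, RS_i = (dy/dx_i)(x_i/y). The function names, weight shapes and activation choice are all illustrative assumptions.

```python
import numpy as np

def forward(x, W1, b1, w2, b2):
    """Hypothetical one-hidden-layer network: tanh hidden units, linear output."""
    h = np.tanh(W1 @ x + b1)          # hidden activations, shape (H,)
    return float(w2 @ h + b2)         # scalar forecast

def partial_derivatives(x, W1, b1, w2, b2):
    """Analytic first-order partials dy/dx_i, obtained with the chain rule."""
    h = np.tanh(W1 @ x + b1)
    # d tanh(a)/da = 1 - tanh(a)^2, so dy/dx = (w2 * (1 - h^2)) @ W1
    return (w2 * (1.0 - h**2)) @ W1   # shape (n_inputs,)

def relative_sensitivity(x, W1, b1, w2, b2):
    """Relative sensitivity RS_i = (dy/dx_i) * (x_i / y); assumes y != 0."""
    y = forward(x, W1, b1, w2, b2)
    return partial_derivatives(x, W1, b1, w2, b2) * x / y

# Illustrative weights only; these are not taken from any fitted model.
W1 = np.array([[0.5, -0.2], [0.1, 0.3]])
b1 = np.array([0.0, 0.1])
w2 = np.array([1.0, -0.5])
b2 = 2.0
x = np.array([1.0, 2.0])

rs = relative_sensitivity(x, W1, b1, w2, b2)
```

Evaluating `relative_sensitivity` across the range of observed inputs would show how strongly, and in which direction, the forecast responds to each input. That response pattern is the kind of evidence against which a model's internal mechanism could be compared with hydrological expectation.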