Articles | Volume 22, issue 5
Research article
22 May 2018

Framework for developing hybrid process-driven, artificial neural network and regression models for salinity prediction in river systems

Jason M. Hunter, Holger R. Maier, Matthew S. Gibbs, Eloise R. Foale, Naomi A. Grosvenor, Nathan P. Harders, and Tahali C. Kikuchi-Miller

Abstract. Salinity modelling in river systems is complicated by a number of processes, including in-stream salt transport and various mechanisms of saline accession that vary dynamically as a function of water level and flow, often at different temporal scales. Traditionally, salinity models in rivers have been either process- or data-driven. The primary problem with process-driven models is that, in many instances, not all of the underlying processes are fully understood or able to be represented mathematically. There are also often insufficient historical data to support model development. By comparison, the major limitation of data-driven models, such as artificial neural networks (ANNs), is that they provide limited system understanding and generally cannot be used to inform management decisions targeting specific processes, as different processes are generally modelled implicitly. In order to overcome these limitations, a generic framework for developing hybrid process- and data-driven models of salinity in river systems is introduced and applied in this paper. As part of the approach, the most suitable sub-model is developed for each sub-process affecting salinity at the location of interest, based on consideration of model purpose, the degree of process understanding and data availability; these sub-models are then combined to form the hybrid model. The approach is applied to a 46 km reach of the Murray River in South Australia, which is affected by high levels of salinity. In this reach, the major processes affecting salinity include in-stream salt transport, accession of saline groundwater along the length of the reach and the flushing of three waterbodies in the floodplain during overbank flows of various magnitudes.
Based on trade-offs between the degree of process understanding and data availability, a process-driven model is developed for in-stream salt transport, an ANN model is used to model saline groundwater accession and three linear regression models are used to account for the flushing of the different floodplain storages. The resulting hybrid model performs very well on approximately 3 years of daily validation data, with a Nash–Sutcliffe efficiency (NSE) of 0.89 and a root mean squared error (RMSE) of 12.62 mg L⁻¹ (over a range from approximately 50 to 250 mg L⁻¹). Each component of the hybrid model results in a noticeable improvement in model performance over the range of flows for which it is developed. The predictive performance of the hybrid model is significantly better than that of a benchmark process-driven model (NSE = −0.14, RMSE = 41.10 mg L⁻¹, Gbench index = 0.90) and slightly better than that of a benchmark data-driven (ANN) model (NSE = 0.83, RMSE = 15.93 mg L⁻¹, Gbench index = 0.36). Apart from improved predictive performance, the hybrid model also has advantages over the ANN benchmark model in terms of increased capacity for improving system understanding and greater ability to support management decisions.
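The three performance measures quoted above can be illustrated with a minimal Python sketch. The abstract does not define the Gbench index, so the form used here (one minus the ratio of the model's squared error to the benchmark's squared error) is an assumption based on the commonly used benchmark efficiency. The function names and the sample salinity values are hypothetical and are not taken from the study.

```python
import math


def sum_squared_error(obs, sim):
    """Sum of squared differences between observed and simulated values."""
    return sum((o - s) ** 2 for o, s in zip(obs, sim))


def rmse(obs, sim):
    """Root mean squared error, in the units of the observations."""
    return math.sqrt(sum_squared_error(obs, sim) / len(obs))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    total_variation = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sum_squared_error(obs, sim) / total_variation


def g_bench(obs, sim, bench):
    """Benchmark index (assumed form): positive when `sim` beats `bench`."""
    return 1.0 - sum_squared_error(obs, sim) / sum_squared_error(obs, bench)


# Hypothetical daily salinity values in mg/L (illustration only).
obs = [100.0, 150.0, 200.0, 250.0]    # observations
sim = [110.0, 140.0, 210.0, 240.0]    # model under evaluation
bench = [120.0, 130.0, 220.0, 230.0]  # benchmark model

print(f"RMSE   = {rmse(obs, sim):.2f} mg/L")
print(f"NSE    = {nse(obs, sim):.3f}")
print(f"Gbench = {g_bench(obs, sim, bench):.2f}")
```

Under this assumed definition, a Gbench value close to 1 means the evaluated model removes most of the benchmark's squared error, while a value near 0 means the two models perform about equally. That reading is consistent with the larger index reported against the process-driven benchmark than against the ANN benchmark.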

Short summary
This research proposes a generalised hybrid model development framework and applies it to a case study of salinity prediction in a reach of the Murray River. The hybrid model combines five sub-models, each describing one process of salt entry and each developed according to the amount of system knowledge and data available for that process. The model outperforms two benchmark models and has implications for future model development processes.