Stochastic models in hydrology are widely used tools for making reliable probabilistic predictions. However, such models yield accurate predictions only if the model parameters are first calibrated to measured data in a consistent framework, such as the Bayesian one, in which knowledge about model parameters is described through probability distributions. Unfortunately, Bayesian parameter calibration, a.k.a. inference, with stochastic models is often computationally intractable with traditional inference algorithms, such as the Metropolis algorithm, due to the expensive likelihood function. The prohibitive computational cost is therefore often circumvented by employing over-simplified error models, which leads to biased parameter estimates and unreliable predictions. However, thanks to recent advancements in algorithms and computing power, fully fledged Bayesian inference with stochastic models is no longer off-limits for hydrological applications. Our goal in this work is to demonstrate that a computationally efficient Hamiltonian Monte Carlo algorithm with a timescale separation makes Bayesian parameter inference with stochastic models feasible. Hydrology can potentially take great advantage of this powerful data-driven inference method, as a sound calibration of model parameters is essential for making robust probabilistic predictions, which can in turn be useful in planning and policy-making. We demonstrate the Hamiltonian Monte Carlo approach by detailing a case study from urban hydrology. Discussing specific hydrological models or systems is outside the scope of our present work and will be the focus of further studies.

A fundamental and highly non-trivial question in many applied sciences is how to make reliable predictions about the dynamics of a complex system. In hydrological modelling in particular, the ability to predict extreme events like floods is obviously of paramount importance. Conceptual rainfall-runoff models that incorporate only a few state variables and a few system parameters often represent a very practical and efficient solution for making probabilistic predictions. The basic idea is to describe slow processes occurring at our observation scale by phenomenological differential equations and include all other processes as noise. Incorporating the noise in the model, where it arises, naturally leads to stochastic differential equation (SDE) models. Model parameters then need to be calibrated on observed data, usually provided in the form of noisy time series. The goal of the calibration process is to determine the parameters that allow the model to reproduce the observed data and to quantify their uncertainties, expressed as probability distributions. For this purpose, Bayesian statistics provides a consistent framework in which our knowledge about model parameters is described by probability distributions and learning is framed as a data-driven updating of prior beliefs. Bayesian inference methods bear the great advantage over traditional optimization algorithms of providing an uncertainty estimate for the calibrated parameters in the form of a probability distribution. The knowledge of such uncertainty is important for making probabilistic predictions, which can in turn be a useful tool for decision-makers. Hydrology could potentially take great advantage of more realistic stochastic models and a fast and reliable method for their calibration. However, Bayesian inference turns out to be computationally very expensive for non-trivial stochastic models.

Uncertainty in rainfall-runoff hydrological modelling arises mostly from input errors associated with an inaccurate estimation of the integrated rainfall over a catchment

In

Here we combine the HMC and SIP methods to perform Bayesian inference with a stochastic input model. We show that the HMC method can be extended from the toy model and the smooth synthetic data used in

The HMC method bears valuable advantages with regard to both generality and efficiency. Indeed, it is by no means limited to an OU process, unlike the original SIP approach of

The SIP method of

The inference process allows us to learn from noisy rainfall and runoff time series,

For this purpose, we subdivide each interval between consecutive rain observations into

List of model parameters that are assumed to be known, with their values and units.

The inference goal is to sample both parameter combinations (

The stochastic process realization

On the other hand, when the inference target is only the posterior distribution for the model parameters (

To tackle this problem, we apply a Hamiltonian Monte Carlo (HMC) algorithm

The HMC algorithm allows us to sample simultaneously from the posterior of Eq. (

The method described here requires a stochastic input process (i.e. a rainfall model), a hydrological rainfall-runoff model to describe the observed discharges

The rainfall potential is described by a normal and linear Ornstein–Uhlenbeck (OU) process with mean zero and standard deviation unity, which can be written in the form of a Langevin equation as
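As an illustration, a zero-mean, unit-variance OU process of this kind admits the standard Langevin parameterization dξ = −(ξ/τ) dt + √(2/τ) dW, whose stationary law is N(0, 1). The Python sketch below simulates it with a simple Euler–Maruyama scheme; the correlation time `tau` is an illustrative placeholder, and this is not the implementation used in this work.

```python
import numpy as np

def simulate_ou(n_steps, dt, tau, rng):
    """Euler-Maruyama simulation of a zero-mean, unit-variance OU process:

        d xi = -(xi / tau) dt + sqrt(2 / tau) dW,

    whose stationary distribution is the standard normal N(0, 1).
    """
    xi = np.empty(n_steps)
    xi[0] = rng.standard_normal()  # start in the stationary law
    for n in range(n_steps - 1):
        drift = -xi[n] / tau
        diffusion = np.sqrt(2.0 / tau) * np.sqrt(dt) * rng.standard_normal()
        xi[n + 1] = xi[n] + drift * dt + diffusion
    return xi

rng = np.random.default_rng(0)
path = simulate_ou(n_steps=200_000, dt=0.01, tau=1.0, rng=rng)
print(path.mean(), path.std())  # close to 0 and 1 for small dt
```

For small time steps the sample mean and standard deviation of a long trajectory approach 0 and 1, as required for the rainfall potential.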

The rainfall potential

Prior probability densities for the inferred parameters, with their mean values (

The storm water runoff is modelled by a linear reservoir supplied by rainfall precipitation,
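A linear reservoir obeys dS/dt = r(t) − S/K with outflow Q = S/K, where S is the storage, r the rain input and K the retention time; the symbols here are illustrative and the paper's exact notation and forcing terms may differ. A minimal explicit-Euler discretization reads:

```python
import numpy as np

def linear_reservoir(rain, dt, K, S0=0.0):
    """Explicit-Euler discretization of a linear reservoir:

        dS/dt = r(t) - S / K,   Q(t) = S / K,

    with storage S, rain input r and retention time K
    (illustrative symbols, not the paper's exact notation).
    """
    S = S0
    Q = np.empty(len(rain))
    for n, r in enumerate(rain):
        S += dt * (r - S / K)
        Q[n] = S / K
    return Q

# Under constant rain the outflow relaxes towards the input rate.
rain = np.full(5000, 2.0)  # mm / h, hypothetical constant input
Q = linear_reservoir(rain, dt=0.01, K=1.5)
print(Q[-1])  # approaches the input rate 2.0
```

The steady state S = rK, i.e. Q = r, is an exact fixed point of the Euler update, so the simulated outflow converges to the constant input rate.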

The discretized form of Eq. (

The predicted discharge

The probability distribution for the observed discharges

The observation error model for the rainfall, given the rainfall potential

At time points where

At time points where

At this point, the only elements of Eq. (

Our prior knowledge of the rainfall potential

Before setting off to implement the HMC algorithm, we take one further convenient step: we apply the transformation from the coordinates

The action

The HMC algorithm interprets the negative logarithm of the posterior density as a potential energy driving the dynamics of a fictitious statistical mechanics system whose configurations, namely, the system's degrees of freedom, are described by combinations of both parameters (

The masses

The potential energy, proportional to

The HMC algorithm iterates the following steps. First, vectors of momenta
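The iteration described above, i.e. resampling the momenta, integrating the fictitious Hamiltonian dynamics and applying a Metropolis correction, can be sketched in a self-contained way on a toy Gaussian target; step size, trajectory length and mass below are illustrative placeholders, not the tuned values of our application.

```python
import numpy as np

def hmc_step(q, log_post, log_post_grad, eps, n_leap, mass, rng):
    """One HMC iteration: sample momenta, leapfrog, Metropolis accept/reject."""
    p = rng.standard_normal(q.shape) * np.sqrt(mass)
    q_new, p_new = q.copy(), p.copy()
    # Leapfrog integration of the fictitious Hamiltonian dynamics.
    p_new += 0.5 * eps * log_post_grad(q_new)
    for _ in range(n_leap - 1):
        q_new += eps * p_new / mass
        p_new += eps * log_post_grad(q_new)
    q_new += eps * p_new / mass
    p_new += 0.5 * eps * log_post_grad(q_new)
    # Metropolis correction for the discretization error of the propagator.
    h_old = -log_post(q) + 0.5 * np.sum(p**2) / mass
    h_new = -log_post(q_new) + 0.5 * np.sum(p_new**2) / mass
    if rng.random() < np.exp(h_old - h_new):
        return q_new
    return q

# Toy target: standard normal posterior in two dimensions.
log_post = lambda q: -0.5 * np.sum(q**2)
grad = lambda q: -q
rng = np.random.default_rng(1)
q = np.zeros(2)
draws = []
for _ in range(5000):
    q = hmc_step(q, log_post, grad, eps=0.2, n_leap=10, mass=1.0, rng=rng)
    draws.append(q.copy())
draws = np.array(draws)
print(draws.mean(axis=0), draws.std(axis=0))  # near [0, 0] and [1, 1]
```

Because the leapfrog propagator is volume-preserving and time-reversible, the Metropolis step needs only the Hamiltonian difference to restore exact detailed balance.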

Note that the presence of pronounced local minima in a high-dimensional phase space might represent an insurmountable obstacle, even for more refined implementations of the HMC method, for example, with automatic tuning of the algorithm hyper-parameters (see Sect.

Using Eq. (

Let us now exploit the different timescales characterizing the three components of the Hamiltonian
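The general idea of a timescale separation, integrating the fast part of the Hamiltonian with a smaller time step than the slow part, can be illustrated with a generic two-level multiple-time-step (RESPA-style) leapfrog. The harmonic split below is purely illustrative and is not the actual Hamiltonian of our inference problem.

```python
import numpy as np

def respa_step(q, p, dt, n_inner, f_slow, f_fast, mass=1.0):
    """One two-level multiple-time-step leapfrog step: the slow force is
    evaluated once per outer step dt, the fast force n_inner times with
    the smaller inner step dt / n_inner."""
    h = dt / n_inner
    p += 0.5 * dt * f_slow(q)
    for _ in range(n_inner):
        p += 0.5 * h * f_fast(q)
        q += h * p / mass
        p += 0.5 * h * f_fast(q)
    p += 0.5 * dt * f_slow(q)
    return q, p

# Illustrative split: a stiff (fast) plus a soft (slow) harmonic potential.
k_fast, k_slow = 100.0, 1.0
f_fast = lambda q: -k_fast * q
f_slow = lambda q: -k_slow * q
energy = lambda q, p: 0.5 * p**2 + 0.5 * (k_fast + k_slow) * q**2

q, p = 1.0, 0.0
e0 = energy(q, p)
for _ in range(1000):
    q, p = respa_step(q, p, dt=0.1, n_inner=10, f_slow=f_slow, f_fast=f_fast)
print(abs(energy(q, p) - e0) / e0)  # small: energy is well conserved
```

The composed map remains symplectic, so the energy error stays bounded even though the expensive slow force is evaluated ten times less often than the fast one.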

As explained in

In this work we apply a HMC method with a stochastic input model (SIP) following

The precipitation data

The discharge flow at the outlet of the catchment was measured with a time resolution of 4 min, and the output observations

Scenario 1 represents a best-case scenario of input data availability, and we shall therefore classify it as the accurate input scenario, while scenario 2 is a typical example of inaccurate and unreliable input data due to both its sparsity and the distance of the P2 rain gauge from the area of interest. The runoff observations

We are particularly interested in the performance of the combined SIP-HMC method in the latter case, characterized by faulty precipitation data, which clearly represents the most challenging scenario and therefore the hardest test for the HMC method. Our work thus consists of three main steps. First, we use the combined SIP-HMC approach described above, with the inaccurate precipitation data

The HMC algorithm is implemented in C

The simulations were run on 2.6–3.7 GHz Xeon-Gold 6142 processors with 196 GB of memory. We observed a relatively short burn-in phase for all inferred parameters, suggesting the possibility of a straightforward (embarrassingly parallel) parallelization of the algorithm, obtained by simply breaking up the Markov chains into smaller independent chains that can then be executed as parallel processes. It is well known that Markov chain Monte Carlo (MCMC) methods, like the HMC algorithm employed here, are indeed very well suited for parallel computing. This kind of approach was proven successful in

In the present work, for Sc2, after an initial single burn-in chain of 75 000 steps, which is then disregarded, we limited ourselves to running four independent Markov chains each of length 100 000 steps based on a serial implementation of the algorithm. For Sc1, which is faster than Sc2 due to the much smaller number of intermediate discretization points (see below for more details), we considered a single chain of 750 000 steps and disregarded the first 150 000 steps. The extension of the current serial version to an OpenMP parallel implementation of the HMC code would be straightforward.
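As a minimal sketch of this embarrassingly parallel strategy, the toy example below runs several independent chains with different seeds and pools their draws; a cheap Metropolis kernel on a standard normal target stands in for the expensive HMC chain, and in a parallel implementation each call would run in its own process or thread.

```python
import numpy as np

def run_chain(seed, n_steps, step=0.8):
    """One independent Metropolis chain targeting a standard normal
    (a cheap stand-in for the expensive HMC chain of the main text)."""
    rng = np.random.default_rng(seed)
    log_p = lambda x: -0.5 * x * x
    x, draws = 0.0, np.empty(n_steps)
    for n in range(n_steps):
        prop = x + step * rng.standard_normal()
        if np.log(rng.random()) < log_p(prop) - log_p(x):
            x = prop
        draws[n] = x
    return draws

# The chains are fully independent, so each call could be dispatched to
# its own process (e.g. one OpenMP thread or MPI rank per chain); here
# we loop serially and simply pool the draws afterwards.
pooled = np.concatenate([run_chain(seed, 20_000) for seed in range(4)])
print(pooled.mean(), pooled.std())  # near 0 and 1
```

After discarding a common burn-in, pooling draws from independent chains is statistically equivalent to one long chain, which is what makes the parallelization trivial.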

We set a fine-grid time step

The algorithm requires tuning of two sets of parameters, that is, the parameters defining the Hamiltonian propagator in the molecular dynamics part of the HMC algorithm (see Eq. 26 in

It should be noted that the integration time interval of the Hamiltonian propagator could be automatically optimized by employing the so-called No-U-Turn Sampler (NUTS)

In Sc2, a full Markov chain of 100 000 steps requires approximately 1 h and 20 min on our hardware, while a chain of 750 000 steps in Sc1 requires about 2.5 h. At each iteration of the chain, the algorithm infers 1449 parameters, that is, eight model parameters (

The algorithm spends

Parameter masses

The Markov chains for the model parameters generated using the unreliable input data of Sc2 are shown in Fig.

Markov chains for the model parameters generated using the faulty input data of Sc2. As explained in Sect.

Markov chains for the model parameters from Sc1. Unlike Sc2, we have a single chain for each parameter in this case.

The two scenarios bear some interesting albeit not surprising differences. In general, the posterior distributions generated in Sc1 tend to be narrower than the corresponding distributions in Sc2, clearly reflecting the accuracy of the precipitation data. The rainfall observational error

Moreover, Fig.

Marginal posterior probability densities for the model parameters from Sc2 (thick black line) and Sc1 (thin grey line). The dashed lines represent the prior densities.

Among the parameters of the hydrological model, the marginal distribution of the retention time

In Fig.

Typical Markov chains for two different points of the stochastic process

In the left panels of Fig.

Comparison of observed and predicted discharges

The observed output peaks are used by the HMC algorithm as an additional source of information about the rain falling over the catchment area during the observation time. This new information, together with the stochastic input model, is used to attempt a reconstruction of the true rainfall pattern. The simulated rainfall and outflow patterns are represented by the medians of their inferred distributions (black line) and an uncertainty given by the 2.5 %–97.5 % quantiles (grey area). The rainfall pattern reconstructed using the inaccurate data of Sc2 (lower left panel) clearly displays the peaks corresponding to the rainfall events that had been missed by the pluviometer P2 located away from the catchment (filled blue squares). Such predicted peaks reproduce the rainfall events detected in Sc1 by the rain gauges P1

The right panels of Fig.

Although not shown here, we have also run the HMC inference without rainfall data at all, i.e. omitting the term

The inferred outflows, shown in the upper panels of Fig.

The goal of this work is to demonstrate that HMC algorithms employing a timescale separation can solve hard inference problems with stochastic hydrological models. In

The combined SIP-HMC method presented here also allows us to estimate, probabilistically and with great accuracy, the “true” average rain input to a hydrological system in the case of highly inaccurate precipitation data, using only prior knowledge and the observed outflow. Runoff data are used by the algorithm as a first-hand source of information about the unknown precipitation over the catchment. This information can override the available, and possibly inaccurate, rainfall data. The reconstructed precipitation is then used to infer the hydrological model parameters, which are thus protected from the deteriorating effect of the uncertainty on the rainfall observations. This approach considerably reduces the bias in the inferred parameters and therefore leads to more reliable runoff predictions, which can in turn be very useful for decision-makers in planning and policy-making.

The use of AD makes the algorithm in principle applicable to more complex models. Indeed, the generalization of the algorithm from the toy model used in

The extension of the HMC method described here to further hydrological models and systems will be the focus of future work. Furthermore, the HMC algorithm presented here is not at all limited to hydrology. It is a very general, efficient, easily parallelizable and scalable algorithm that makes Bayesian inference with expensive stochastic models feasible in spite of their computational cost, with a very broad range of potential applications in the applied sciences that can benefit from stochastic modelling and a fast Bayesian inference method.

The C

The supplement related to this article is available online at:

CA conceived the original idea, and SU and CA designed the overall study and developed the theory. SU developed the code and performed the simulations. SU prepared the manuscript with contributions from CA.

The contact author has declared that neither of the authors has any competing interests.

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

We wish to thank the High Performance Computing team of the Zurich University of Applied Sciences in Wädenswil, Switzerland, for the computational support. We are also grateful to Jörg Rieckermann, Eawag, for the permission to re-use the data from his group, originally published in

This research has been supported by the Swiss National Science Foundation (grant no. 200021_169295).

This paper was edited by Nadav Peleg and reviewed by three anonymous referees.