This paper presents a novel methodology for estimating the unknown discharge hydrograph at the entrance of a river reach when no information is available. The methodology couples an optimization procedure based on the Bayesian geostatistical approach (BGA) with a forward self-developed 2-D hydraulic model. In order to accurately describe the flow propagation in real rivers characterized by large floodable areas, the forward model solves the 2-D shallow water equations (SWEs) by means of a finite volume explicit shock-capturing algorithm. The two-dimensional SWE code exploits the computational power of graphics processing units (GPUs), achieving a ratio of physical to computational time of up to 1000. With the aim of enhancing the computational efficiency of the inverse estimation, the Bayesian technique is parallelized, developing a procedure based on the Secure Shell (SSH) protocol that allows one to take advantage of remote high-performance computing clusters (including those available on the Cloud) equipped with GPUs. The capability of the methodology is assessed by estimating irregular and synthetic inflow hydrographs in real river reaches, also taking into account the presence of downstream corrupted observations. Finally, the procedure is applied to reconstruct a real flood wave in a river reach located in northern Italy.

The definition of discharge hydrographs in specific river sections is still a
relevant hydraulic problem not only for flood modelling purposes but also for
more practical issues related to flood-protection measures, hydropower
plants, water resource management, the design of new structures, etc.
Flood-routing techniques, either hydrological or hydraulic, are extensively
studied and are widely used to estimate discharge hydrographs in downstream
ungauged sites based on data available at upstream gauged stations (forward
propagation). However, the flow hydrograph is often required in a river
section that is completely ungauged and does not have useful upstream
information for its definition. In these cases, discharge hydrographs at
specific sites can be estimated by coupling rainfall-runoff and forward
flood-propagation models. However, rainfall-runoff models
(

In addition to the above procedures, the estimation of an unknown upstream
flow hydrograph based only on downstream information (observations) can be
performed via optimization methods. These techniques aim at finding the
upstream flow hydrograph that, routed downstream, best matches the available
observations.

All the previously cited works adopted 1-D hydraulic models or simplified
hydrological routing schemes in combination with different optimization
procedures. Nevertheless, in many real cases, the complex hydrodynamic field
generated by the flood propagation cannot be accurately described under 1-D
assumptions, and it is necessary to adopt schemes based on the 2-D shallow
water equations, even if this poses the drawback of the computational burden
and requires a detailed terrain survey. However, nowadays, bathymetric data
can be easily obtained from high-resolution digital terrain models (DTM), and
fast 2-D numerical models have been developed. With the purpose of estimating
the discharge hydrograph in an upstream-ungauged river section, having water
level information only in a downstream observation site, this paper extends
the BGA methodology for reverse flow routing from

The paper is organized as follows. Section 2 illustrates the theory of the Bayesian geostatistical approach. Section 3 provides a step-by-step description of the inverse procedure: the parallel implementation scheme, the optimization of the forward model to reduce run times, and the iteration management between the local host and the remote server are described in detail. Section 4 applies the procedure to synthetic test cases concerning the estimation of inflow hydrographs with different shapes in two rivers in northern Italy. Section 5 presents the applicability of the inverse procedure to the reconstruction of a historical flood event. Concluding remarks are outlined in Sect. 6.

The optimization software adopted to solve the reverse flow routing problem
is the bgaPEST (

The crux of the adopted bgaPEST, as well as other methods based on the
Bayesian approach, is Bayes' theorem, which reads
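
In generic notation, Bayes' theorem states that the posterior pdf is proportional to the product of the likelihood and the prior. The symbols below are assumptions adopted for illustration only: s denotes the vector of unknown parameters and y the vector of observations.

```latex
p(\mathbf{s} \mid \mathbf{y}) \;\propto\; L(\mathbf{y} \mid \mathbf{s})\, p(\mathbf{s})
```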

The likelihood function

The prior knowledge about

The prior covariance matrix of the unknown parameters

With the assumptions made, the likelihood and prior terms that compose the
posterior pdf of Eq. (

In the likelihood function, the term

Recalling that the aim of the inverse procedure is to obtain the vector of
the unknown parameters

In case a linear relationship between parameters and observations (linear
forward model) holds, a computationally efficient method to find the best
estimate
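
For a linear forward model, a commonly used form of the geostatistical solution is a single linear (cokriging-type) system. The following is a minimal sketch of that form, not necessarily the exact formulation implemented in bgaPEST. All names are assumptions introduced for illustration: `H` is the sensitivity matrix, `X` the mean (drift) basis, `Q` the prior covariance and `R` the epistemic error covariance.

```python
import numpy as np


def solve_linear_geostatistical(H, X, Q, R, y):
    """Best estimate of s for a linear model y = H s + v (illustrative sketch).

    H: (n, m) sensitivity matrix      X: (m, p) basis of the unknown mean
    Q: (m, m) prior covariance        R: (n, n) epistemic error covariance
    y: (n,)   observations
    """
    n, p = H.shape[0], X.shape[1]
    HX = H @ X
    Qyy = H @ Q @ H.T + R  # covariance of the observations
    # Block system: [Qyy  HX; HX^T  0] [xi; beta] = [y; 0]
    A = np.block([[Qyy, HX], [HX.T, np.zeros((p, p))]])
    rhs = np.concatenate([y, np.zeros(p)])
    sol = np.linalg.solve(A, rhs)
    xi, beta = sol[:n], sol[n:]
    # Estimate = mean part + correction driven by the observations
    return X @ beta + Q @ H.T @ xi


# Toy usage with hypothetical numbers: five parameters observed directly.
t = np.arange(5.0)
Q_demo = 4.0 * np.exp(-np.abs(t[:, None] - t[None, :]) / 3.0)
y_demo = np.array([10.0, 12.0, 15.0, 13.0, 11.0])
s_hat = solve_linear_geostatistical(
    np.eye(5), np.ones((5, 1)), Q_demo, 1e-10 * np.eye(5), y_demo
)
```

With a nearly error-free, direct observation of every parameter, as in this toy usage, the estimate reproduces the observations. Larger values in `R` let the prior smooth the solution.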

Definition of the reverse flood routing problem

Therefore, the sensitivity matrix is evaluated at each iteration as follows
(

Analogously to the linear system in Eq. (

A proper selection of the covariance function structural parameters
(

After having described the theory of the Bayesian geostatistical approach in
Sect. 2, some operational information about the BGA inverse procedure is now
illustrated. As mentioned in the introduction and illustrated in
Fig.

Example of the base run

The BGA algorithm solves the inverse problem by means of the following steps.

First, the unknown parameters and the structural parameters are initialized. The former may all be set equal to a constant discharge value consistent with the considered river, whereas the starting values of the structural parameters are assigned so that the variability between contiguous parameters is small (flat solution, with a high degree of correlation); complexity is then introduced during the optimization process if supported by the data. The variance of the epistemic errors is initialized close to its expected value.

Taking the first guess of the unknown parameters as
the upstream boundary condition, the
hydraulic forward model is run, and the resulting water levels are extracted
at the observation site. This base run, simulated for a given set of
parameters (deriving from the initialization or from previous estimation
steps), is a mandatory step for the Jacobian matrix evaluation, which is
performed at this point in the procedure in order to quantify how each
observation is influenced by the variation of each estimable parameter. In
particular, Eq. (12) is approximated using a finite difference method; hence
each element of the matrix is evaluated as the ratio between the variation of
an observation (numerator) and the imposed variation of a parameter
(denominator), both computed with respect to the base run. Therefore, in
addition to the base run, the hydraulic forward model is further run as many
times as the number of parameters to estimate
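
The finite difference evaluation of the Jacobian described above can be sketched as follows. This is an illustration under stated assumptions, not the bgaPEST implementation: `toy_forward_model`, `KERNEL` and the relative step `rel_step` are hypothetical, and the toy model merely stands in for the 2-D hydraulic code.

```python
import numpy as np

# Hypothetical stand-in for the hydraulic model: a linear, causal routing
# in which each water level depends only on current and earlier inflows.
KERNEL = np.array([[0.5, 0.0, 0.0],
                   [0.3, 0.5, 0.0],
                   [0.2, 0.3, 0.5]])


def toy_forward_model(inflow):
    return KERNEL @ inflow


def finite_difference_jacobian(forward_model, params, rel_step=0.01):
    """One base run plus one perturbed run per parameter."""
    params = np.asarray(params, dtype=float)
    base = np.asarray(forward_model(params), dtype=float)  # base run
    jac = np.empty((base.size, params.size))
    for j in range(params.size):
        perturbed = params.copy()
        # Imposed variation of parameter j (denominator of the ratio)
        delta = rel_step * perturbed[j] if perturbed[j] != 0 else rel_step
        perturbed[j] += delta
        # Variation of every observation with respect to the base run
        jac[:, j] = (np.asarray(forward_model(perturbed)) - base) / delta
    return base, jac


base_levels, jacobian = finite_difference_jacobian(
    toy_forward_model, [100.0, 150.0, 120.0]
)
```

Each perturbed run depends only on the base parameter set, so the runs inside the loop are mutually independent and could be dispatched concurrently. In this toy model the upper triangle of the Jacobian is zero because a later inflow cannot affect an earlier water level.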

In order to exemplify this step, Fig.

This procedure is repeated until convergence or the maximum number of
iterations

The whole BGA procedure previously described is illustrated in
Fig.

Illustration of BGA algorithm in the serial

The most relevant contribution to the total computational time required by
the inverse procedure is ascribed to the forward model runs (i.e. the
computation of each element in the Jacobian matrix) rather than to the
bgaPEST operations. However, since each of the

In this work, the PARFLOOD two-dimensional-GPU numerical model presented in

In the parallel bgaPEST (Fig. ®
Tesla® P100 GPUs hosted by the University of
Parma was adopted. As shown in Fig.

Schematization of the data transfer assuming three parameters and thus three parallel simulations.

Listing 1 provides a detailed description of the “run forward model” shell
file. In order to use the algorithm for different test cases and potentially
on different HPC clusters, all the paths are first declared together with the
relevant variables (number of parameters to estimate, time interval between
parameters, and start and end of the simulation; line 2). Then, the algorithm
(line 3) checks whether the considered run is the base run or one of the runs
required for the Jacobian matrix evaluation, in which a given parameter is varied.
Considering the first

Conversely, the physical time

As pointed out by Eqs. (

Time reduction

In order to perform the simulation, the host logs in to the HPC cluster by means of the SSH protocol (line 9), and a sleep condition allows the login procedure to complete (line 10). The job is then submitted to the queue of the cluster, using external parameters to pass the name of the simulation folder and the restart time (line 11); the submitted job contains the reference to the PBS queue and the link to the executable two-dimensional SWE-GPU code. At the end of the simulation, the water levels at the observation site are automatically extracted. Once the job is submitted, the SSH login is closed (line 12). After all the simulations have been submitted, for each parameter (line 15) the code periodically (line 18) checks via SSH for the presence of the end_file, which signals the end of the simulation (line 20), and waits if it is missing (line 25). Once the simulation is finished, the resulting observations are copied back to the host client (line 28), and the folder is removed from the server (line 29).
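
The submit-and-poll logic described above can be sketched in Python as follows. This is an illustrative sketch, not the authors' shell file. The job script `run_swe.pbs`, the variables `SIM_DIR` and `T_RESTART`, and the result file `observations.dat` are hypothetical names; only `end_file` comes from the description above. The command runner is injectable so that the logic can be exercised without a real cluster.

```python
import shlex
import subprocess
import time


def ssh(host, remote_cmd, run=subprocess.run):
    """Run a command on the remote host and return its exit code."""
    return run(["ssh", host, remote_cmd]).returncode


def submit_job(host, folder, restart_time, run=subprocess.run):
    """Submit one forward run to the PBS queue (job script name is assumed)."""
    # qsub -v passes variables to the job: simulation folder and restart time
    cmd = (f"qsub -v SIM_DIR={shlex.quote(folder)},"
           f"T_RESTART={restart_time} run_swe.pbs")
    return ssh(host, cmd, run)


def wait_and_fetch(host, folder, local_dir, poll_s=30, max_polls=1000,
                   run=subprocess.run, sleep=time.sleep):
    """Poll for end_file, then copy results back and clean up the server."""
    remote = shlex.quote(folder)
    for _ in range(max_polls):
        # 'test -f' exits with 0 only if the end-of-simulation marker exists
        if ssh(host, f"test -f {remote}/end_file", run) == 0:
            run(["scp", f"{host}:{folder}/observations.dat", local_dir])
            ssh(host, f"rm -rf {remote}", run)
            return True
        sleep(poll_s)  # simulation still running: wait before checking again
    return False
```

A driver would typically call `submit_job` once per parameter and then `wait_and_fetch` for each submitted folder, which mirrors the two loops of the described procedure.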

Conversely, the

Run forward model for the parallel bgaPEST scheme.

In the context of applying the BGA method described above, it is worth noting
that reference solutions for inverse problems are by definition unavailable,
since the goal of the methodology is the estimation of an upstream inflow
hydrograph that is unknown at the beginning of the process. Therefore, in
this section the inflow hydrographs in two natural rivers in northern Italy
are estimated, and the reference solutions, which are necessary in order to
validate the inverse procedure, are obtained as follows
(

Exemplification of a test case definition.

Map of the maximum simulated water depths for the Parma River.

Quantitative information about the accuracy of the inverse methodology is
provided by evaluating the differences between the reference

Finally, the estimation error in the peak discharge
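
The accuracy indicators can be computed along the following lines. The Nash–Sutcliffe efficiency is given in its standard form. The peak error is written here as a signed percentage of the reference peak, which is an assumption about the convention. Function names are illustrative.

```python
import numpy as np


def nash_sutcliffe(reference, estimated):
    """Nash-Sutcliffe efficiency: 1 is a perfect match, 0 is no better
    than using the mean of the reference series."""
    ref = np.asarray(reference, dtype=float)
    est = np.asarray(estimated, dtype=float)
    return 1.0 - np.sum((ref - est) ** 2) / np.sum((ref - ref.mean()) ** 2)


def peak_error_percent(reference, estimated):
    """Signed error in the peak value, as a percentage of the reference
    peak (sign convention assumed: positive means overestimation)."""
    ref_peak = float(np.max(reference))
    est_peak = float(np.max(estimated))
    return 100.0 * (est_peak - ref_peak) / ref_peak
```

The same functions apply both to discharge hydrographs (reference versus estimated inflow) and to water levels (observed versus modelled).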

The first test concerns the estimation of a hypothetical discharge hydrograph
at the entrance of the Parma River (northern Italy).
Figure

The bathymetry was derived from a 1 m resolution DTM obtained through a
LiDAR survey carried out under drought conditions. The domain was discretized by
means of a Cartesian grid with cell sizes

The inflow condition to be estimated was assumed as follows
(

Parma River: flow and stage hydrographs at
sections A and C, respectively,

During the estimation, when the sensitivity to the first parameter

The inflow hydrograph duration was limited to 40 h, and it was discretized
using a 2 h time interval (

The inflow hydrograph was estimated first considering true observations (the
variance was set equal to 10

A qualitative assessment of the inverse methodology is obtained by comparing
the reference with the estimated inflow hydrograph, as well as the observed
with the modelled water levels at the observation site. Considering the
simulation without errors in the observations,
Fig.

Parma River: reference and estimated inflow hydrograph

The results of the simulation with random errors corrupting the observations
are depicted in Fig.

Parma River: initial and estimated structural parameters and epistemic error variance.

Parma River: reference and estimated (with a 95

The structural parameters and the epistemic error variance estimated in the
presence and absence of corrupted observations are reported in
Table

The accuracy of the methodology has been quantified by means of the
Nash–Sutcliffe

Parma River: Nash–Sutcliffe

The second test case concerns both a different river reach and a different
shape of the inflow hydrograph. The studied domain includes a 25 km long reach
of the Secchia River (northern Italy) between the outflow of the flood control
reservoir of Rubiera-Campogalliano, located west of the town of Modena (point
A), and the gauging station of Ponte Bacchello (point C); the water level
observations refer to the gauging station of Ponte Alto (point B)
(Fig.

Map of the water depths at the flood peak occurrence on the Secchia River, with indication of the upstream (A) and downstream (C) boundary conditions and the intermediate observation site (B).

The domain was discretized by means of a non-uniform Block-Uniform Quadtree
(BUQ) grid (

The discharge hydrograph to be estimated is the flood wave with a 20-year
return period on the Secchia River, with a peak value of about
780 m

Secchia River: flow hydrograph in section A and flow and stage hydrographs
in section C

As before, the parameters were estimated in a logarithmic space, and their
initial values were calculated by adopting the Linesearch tool of the bgaPEST
(

As shown in Fig.

Secchia River: reference and estimated inflow
hydrograph

The results of the simulation with corrupted observations depicted in
Fig.

Secchia River: reference and estimated (with a 95

The structural parameters and the epistemic error variance estimated in the
presence and absence of corrupted observations are reported in Table

Secchia River: initial and estimated structural parameters and epistemic error variance.

The indicators used for evaluating the accuracy of the methodology are
reported in Table

Secchia River: Nash–Sutcliffe

For this case, some details about the computational characteristics are
reported in Table

Secchia River: characteristics of the simulation.

The computational time of the whole inflow hydrograph simulation (72 h) is
9.62 min, whereas the simulations for evaluating the Jacobian matrix and
testing parameters 2–37 required progressively shorter computational times,
thanks to the restart option illustrated in Sect. 3. When evaluating the
total time required by the inverse procedure, it should be noted that, on an
HPC cluster, the global run time depends on the number of available GPUs.
This test was performed using 10 GPUs, and the computational cost of the
609 runs was about 13 h. Since
the implemented procedure that manages the interaction between host and
server can be used for different HPC clusters, the availability of a cluster
equipped with

The inverse procedure is now validated by investigating the December 2009
flood event on the Secchia River, which is one of the most significant
events that occurred in the last 10 years on this river. The Interregional
Agency for the Po River (AIPo) monitored the river and provided the water
stage hydrographs recorded in the two gauging stations indicated in
Fig.

December 2009 recorded stage hydrographs on the Secchia River at sections B and C.

The studied domain is the same as the one previously adopted for the synthetic inflow; thus, the reader is referred to Sect. 4.2 for information about the bathymetry, initial conditions, and roughness configuration.

As before, the parameters were estimated in a logarithmic space and their
initial values were calculated by adopting the Linesearch tool of the bgaPEST
(

Secchia 2009 event: initial and estimated structural parameters.

Figure

Secchia 2009 event: estimated inflow hydrographs assuming the
epistemic error variance equal to 10

Secchia 2009 event: observed and modelled water levels in section B. The residuals between recorded and estimated values are also reported.

To validate the methodology for this real application, it is worth noting that the upstream section of the river is located immediately downstream of a flood control reservoir equipped with water level sensors. Therefore, the “reference” discharge hydrograph has been obtained from the dam's geometrical data (i.e. number and dimension of the bottom openings, crest length of the spillway, etc.) and the recorded water levels, adopting the classic hydraulic theory of sluice gates and spillways.
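
As an illustration of how such a reference discharge can be derived from recorded levels, the sketch below uses textbook free-flow orifice and weir relations. The specific formulas, the geometry arguments and the coefficient ranges are assumptions for illustration and are not the dam's actual data or the authors' exact relations.

```python
import math

G = 9.81  # gravitational acceleration (m/s^2)


def orifice_discharge(cd, area, head):
    """Free-flow discharge through one bottom opening:
    Q = Cd * A * sqrt(2 g h), with h the head on the opening centre."""
    return cd * area * math.sqrt(2.0 * G * head) if head > 0 else 0.0


def spillway_discharge(cd, crest_length, head):
    """Free overflow on a spillway: Q = Cd * L * h * sqrt(2 g h),
    with h the head above the crest."""
    return cd * crest_length * head * math.sqrt(2.0 * G * head) if head > 0 else 0.0


def outflow_envelope(level, opening_centre, n_openings, opening_area,
                     crest_level, crest_length, cd_orifice, cd_weir):
    """Lower and upper outflow for ranges (min, max) of discharge coefficients.

    Both relations increase monotonically with Cd, so the envelope is
    obtained from the two extreme coefficient values.
    """
    bounds = []
    for cdo, cdw in zip(cd_orifice, cd_weir):
        q = n_openings * orifice_discharge(cdo, opening_area, level - opening_centre)
        q += spillway_discharge(cdw, crest_length, level - crest_level)
        bounds.append(q)
    return min(bounds), max(bounds)
```

Applying `outflow_envelope` to each recorded water level yields a band of equally plausible discharge hydrographs rather than a single curve, which reflects the uncertainty in the coefficients.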

Due to the uncertainty in evaluating the discharge coefficients and the fact
that during flood events a large amount of wood debris reduces the outflow
discharge from the bottom openings (especially during the depletion phase)
and interferes with the overflow spillway, the discharge hydrograph has been
calculated by adopting equally likely coefficients
(Fig.

Secchia 2009 event: comparison among the inflow hydrographs obtained from the inverse procedure using two different Manning coefficients and the envelope of different solutions obtained using the records at the flood control reservoir.

In this work the inverse problem of estimating the unknown inflow hydrograph
in an upstream-ungauged section, having water level information only at
downstream sites, has been solved by means of a Bayesian methodology. The key
aspects in the solution of this problem have been the adoption of a parallel
two-dimensional SWE code running on GPUs and the execution of the
simulations on an HPC cluster. The parallelization of the runs required for
the Jacobian matrix computation and the implementation of an ad hoc
procedure, which allows one to take advantage of any HPC cluster with GPUs,
have provided a remarkable reduction of the computational costs. In a test
case, this parallel procedure reduced the computational time by a factor of 8
compared with running the two-dimensional SWE code on a single GPU. Furthermore,
the analysis of the runtimes has highlighted that the use of a parallel
hydraulic forward routing model is the

The bgaPEST software is open source and available at the
link:

The authors declare that they have no conflict of interest.

This work was partially supported by the Ministry of Education, Universities and Research under the Scientific Independence of young Researchers project, grant number RBSI14R1GP, CUP code D92I15000190001. This research benefits from the HPC (High Performance Computing) facility of the University of Parma, Italy. The Interregional Agency for the Po River (AIPo) is also gratefully acknowledged for providing data. The authors are grateful to the editor, the anonymous reviewer, and Antonis D. Koussis for the valuable suggestions on the early version of this manuscript.

Edited by: Roberto Greco
Reviewed by: Antonis D. Koussis and one anonymous referee