Simultaneous assimilation of water levels from river gauges and satellite flood maps for near-real time flood mapping

Hydro-meteo hazard Early Warning Systems (EWSs) are operating in many regions of the world to mitigate nuisance effects of floods. EWSs performances are majorly impacted by the computational burden and complexity affecting flood prediction tools, especially for ungauged catchments that lack adequate river flow gauging stations. Earth Observation (EO) systems may surrogate to the lack of fluvial monitoring systems supporting the setting up of affordable EWSs. But, EO data, 5 constrained by spatial and temporal resolution limitations, are not sufficient alone, especially at medium-small scales. Multiple sources of distributed flood observations need to be used for managing uncertainties of flood models, but this is not a trivial task for EWSs. In this work, a near real-time flood modelling approach is developed and tested for the simultaneous assimilation of both water level observations and EO-derived flood extents. An integrated physically-based flood wave generation and propagation modelling approach, that implements a Ensemble Kalman Filter, a parsimonious geomorphic rainfall-runoff algorithm 10 (WFIUH) and a Quasi-2D hydraulic algorithm, is proposed. A data assimilation scheme is tested that retrieves distributed observed water depths from satellite images to update 2D hydraulic modelling state variables. Performances of the proposed approach are tested on a flood event for the Tiber river basin in central Italy. The selected case study shows varying performances depending if local and distributed observations are separately or simultaneously assimilated. Results suggest that the injection of multiple data sources into a flexible data assimilation framework, constitute an effective and viable advancement 15 for flood mitigation tackling EWSs data scarcity, uncertainty and numerical stability issues.


Introduction
Floods represent one of the most costly and deadly natural disasters (EM-DAT, 2016), affecting annually on average more than 21 million people and producing economic loss greater than US $ 100 billion (Desai et al., 2015). The ability to understand and predict floods represent a crucial assets of river basin management strategies (Knight and Shamseldin, 2005). Numerical 20 simulations of flood scenarios are used for proper design of structural (e.g. levees, diversion channels, dams, etc.) and nonstructural mitigation measures (e.g. land use regulations, flood zoning, flood proofing, flood forecasting and warning, disaster prevention, preparedness and response) (Thampapillai and Musgrave, 1985). Early Warning Systems (EWSs) are nowadays increasingly used for timely detection of flood events (Kundzewicz, 2013).
scheme that may value all available sources of observations for distributed flood modelling updates. The aim of this work is to mitigate flood prediction uncertainties by combining heterogeneous data and an integrated topographic-hydrologic-hydraulic modelling approach, while maintaining inundation forecasting robustness, scalability and numerical stability. In achieving this goal, novel scientific advances and technical challenges of EO-driven DA approaches for flood prediction are investigated and in particular: A methodology for updating the state variable of a hydraulic model for distributed flood routing in floodplain domains; the gathering of spatially distributed water level observations by means of flood extension processing and detection from 65 satellite images. This work conceptualizes and tests a novel framework for updating state variables of a Quasi-2D hydraulic model adopting the Ensemble Kalman Filter (EnKF) method to take advantage of observations gathered from heterogeneous sources. The Tiber river basin in central Italy is selected as case study that was recently the subject of flood events at the meso-scale level (approximately 100 km 2 of flood-prone domain), to investigate on improved flood modelling performances. 70 The paper is organized as follows: Section 2 describes the adopted hydrologic, hydraulic, and DA modelling methodologies.
Section 3 illustrates the case study, the available data and the proposed DA implementation procedure. Section 4 discusses case study results, while section 5 provides conclusive remarks underlining advantages, limitations and suggested future developments of this research.

Hydrologic and 2D hydraulic modelling
The physically based Quasi-2D hydraulic model (FLO-2D, O'brien et al., 1993) was selected for flood wave routing and propagation. In fact, the regular grid mesh of its computational domain and the simple open format of the input and output files make the model suitable to be integrated in a Data Assimilation framework. The model solves the differential form of the dynamic wave approximation of the de Saint Venant equations with a central, finite difference numerical scheme. The numer-80 ical solution is applied along the river flowpath for in-channel 1D flood wave routing and for out-of-channel unconfined flood propagation considering 8 potential flow directions in a bi-dimensional (2D) domain. Channel and floodplain grid cells are assigned an absolute elevation, defining the floodplain and channel top surface topography. Channel conveyance capacity is considered by assigning to each cell a cross section with top banks associated to the corresponding floodplain cell elevation.
The channel-floodplain flow exchange is simulated for taking into account over-bank and return flows within the riverine sys-85 tem. River and floodplain bridges, culverts, levees and any obstruction within the simulation domain are simulated in FLO-2D by means of rating curve, width and areal reduction factors. Two main boundary conditions were defined for the application of the Quasi-2D model: a) the floodplain domain extent; b) the hydrologic forcing from upstream for both the main river stem (i.e. source node) and for the tributaries. 90 The floodplain computational domain a) is associated to the maximum potential inundation extent. A DTM-based hydrogeomorphic floodplain mapping approach following Nardi et al. (2006) and Nardi et al. (2013) was selected for characterizing this boundary condition. The selected GFPLAIN hydrogeomorphic floodplain extent model identifies the floodplain buffer domain by employing hydrological scaling laws to assess maximum flood energy gradients using the upstream contributing area as scaling factor (Nardi et al., 2018. GFPLAIN allows to restrict the computational domain for the 2D flood 95 model application optimizing the DA approach implementation. Further details of the computational domain for the specific case study are described in Section 3.2. The upstream and tributary flow boundary conditions b) are derived by stage gauge observations (where available) or from the application of a rainfall-runoff model (considering rainfall observations are available at river basin scale). A parsimonious 100 hydrological model tailored for ungauged basins was selected to simulate the hydrologic forcing following Grimaldi et al. (2012) approach. This rainfall-runoff approach is based on the application of the geomorphic characterization of the Istantaneous Unit Hydrograph (IUH) adopting the WFIUH method (Mesa and Mifflin, 1986). In the WFIUH the shape of the river basin response to the rainfall forcing is associated to rainfall drop residency time distribution. The WF distribution may be expressed estimating the flow paths and associated to each flow path the travel time to reach the outlet (Rodriguez-Iturbe and 105 Rinaldo, 1997). The WFIUH distribution is, thus, estimated by applying channel (v c (x)) and hillslope (v h (x)) velocities to their corresponding flow paths L c (x) and L h (x): The WFIUH distribution can be estimated using the DTM as main input information and applying terrain analysis algorithms for river basin hydrologic processing (pit removal-Jenson and Domingue, 1988-, flow direction, flow accumulation - Tarboton   110   et al., 1991) to estimate flow paths at the basin scale. Channel velocities are considered constant according to Grimaldi et al. (2012). The hillslope velocity distributions u h are calculated according to NRCS (1997) as a function of the local slope and land use (Haan et al., 1994;McCuen, 2009). The adopted runoff modelling approach also consider distributed rainfall input and related infiltration losses using the SCS-CN method (Cronshey, 1986). Input rain gauge observations are interpolated using the Thiessen polygon methodology to properly assess distributed rainfall input for the hydrologic model (Thiessen, 1911).

Data Assimilation (DA) framework
A scheme of the whole DA framework with the reference of the related sections is illustrated in Figure 1. The Ensemble Kalman Filter method (EnKF-Evensen, 2003) was selected for DA application on the proposed 2D hydraulic modelling approach. EnKF, widely used in literature for DA, was selected for its efficiency in dealing with the significant non-linear flood dynamics (Reichle et al., 2002). The EnKF model is a sequential DA method that estimates the model state at time t + 1 120 (h t+1 ) based on the observations at the time steps in which they are available. The DA process is characterized by two steps: the forecast step and the updated step, whose variables will be represented respectively with the superscript − for forecasting and + for updating. The method is based on ensemble generations: the forecast (a priori) state error covariance matrix P − t+1 is approximated propagating the ensemble of the model states, according to the model errors expressed as a noise term w t+1 , from the previous time step; at the same time, an ensemble of observations y t+1 at each update time is generated according to their error distribution introducing the noise term η t+1 . The updated probability density function (pdf ) of the model states is given by a combination between data likelihood and forecast pdf of the model states by means of Bayesian update. Specifically, the posterior estimate of the i-element of the ensemble h i+ t+1 is calculated using the observation y i t+1 performing a linear correction with the Kalman filter to the forecast state ensemble members: where K t+1 is the Kalman gain matrix, H(...) t+1 is a propagator relating the state variables to the measured variables and provides the expected value of the output given the model state, v i t+1 is the sample of the observation errors, R y t+1 is the variance of the observation error.
The performance of the ensemble forecast is influenced by the spread of the ensemble (Murphy, 1988;Anderson, 2001), but also by the ensemble size. The size has to be sufficiently large to represent a statistically significant sample, but at the same time 135 it has to be computational efficient considering the purpose of the application (e.g. near-real time updating of a flood model).
In this work, the approach proposed by Anderson (2001) was selected. Therefore, the optimal ensemble size was selected to reach to a Normalized RMSE Ratio (NRR) equal to one.
The EnKF method application for the proposed Quasi-2D distributed hydraulic model was developed as follows. The state 140 variable h t+1 is associated to the water depth in a specific point of the computational floodplain domain. In case the observation is a stage gauge measurement, the state variable position is determined by identifying the closest channel cell. In case the observations is gathered from a satellite image, the EnKF method is applied to both the channel and the floodplain cells falling within the observation domain. The model error w t+1 is estimated considering the uncertainties related to the input forcing I t+1 and the model parameters as explained in Section 2.2.3. The observation y t+1 is a water depth value gathered indirectly 145 by the sensor. For this reason, the observation transition operation H introduced in Eq. (2) is an identity matrix, being a direct relationship between state variables and observations. The updating of the water levels from Static Sensors (SH) located at fixed infrastructures (e.g. bridges, weirs) in the proposed DA framework aims to correct both the channel and the floodplain water levels taking into account the interlinked channelfloodplain dynamics. The channel cell is here used to infer the water level information to floodplain cells in the proximity of the observation applying the EnKF. If the difference between the observed and the simulated water level is significant (e.g. 155 more than 1 meter), numerical surging may occur. To avoid surging, a gain function is applied adopting an approach similar to Madsen and Skotner (2005): where g(x k ) is the gain assigned to the k-cell, A is the gain amplitude (assumed equal to 1), and the g (x k ) term is expressed as: where x obs , x k , x uc , x dc are the linear coordinates along the channel of respectively the cell with the observation, the kcell to be updated, the upstream and downstream bounds for the gain function. The last two terms allow to consider how far the updating could be inferred to correct the flood water profile. Varying downstream or upstream model updates may be implemented to avoid the propagation of local inaccuracies of the model. The gain function also allows to inject into the DA where ∆h(x obs,u ) and ∆h(x obs,d ) are the water level updates respectively in the upstream and downstream stage gauges, g(x k,u ) and g(x k,d ) are the gains relative respectively to the upstream and downstream observation, x obs,u and x obs,d are 175 the linear coordinates along the channel of respectively the upstream and downstream cells of the observation. When the gain function is propagated upstream and the water level correction is positive, a water profile counterslope may be inferred, causing a numerical instability issue in the hydraulic model. To avoid this issue, a further condition was imposed: the absolute water level in the channel cell h + abs (x k ), cannot be lower than the adjacent downstream channel cell h + abs (x k+1 ): The proposed model updating procedure is invoked at each time step when one or more observations become available. The hydraulic simulation is stopped saving distributed floodplain water levels and volume conservation to binary files. Then, the EnKF is applied and the water depth corrections are applied to update model states in the binary files.

Satellite image observations 185
The assimilation of flow depths derived from satellite image processing is developed following 3 main steps: where ρ bg and ρ bm are the reflectance indices of respectively the Green and Mid Infra Red (MIR) bands. For SAR images, the image histogram thresholding methodology is implemented following Brivio et al. 2002.The satellite image processing is optimized by limiting the spatial extent to the pre-identified floodplain domain estimated as described in 2.1. The two raster flood maps (Water extension from SI and from HM ) are, then, quantitatively compared by applying the measure-of-fit F-index (Horritt and Bates, 2001): For the generic k-cell pertatining to the hydraulic modelling domain, the satellite-derived indirectly observed water depth h k o,t at the time t , is expressed as: where F i1 and F i2 are the two best fitting flood maps from the ensemble of the HM compared to the flood extent from the 215 SI, h k m,i1,t and h k m,i2,t are their related the flow depths of the k-cell at time t.

Model errors
The uncertainty related to model errors are numerically managed within the proposed DA by perturbing: the hydrologic forcing input given by the upstream static sensors and the rainfall-runoff modelling output; the hydraulic model parametrization associated to channel roughness expressed by the distributed Manning coefficients.

220
In both cases the flow discharge values at time t of the s-input for the i-element of the ensemble are expressed using the following equation (Weerts and El Serafy, 2006): where Q true s,t is the streamflow value by the s-input at time t, γ is a parameter that accounts for the uncertain estimation of the synthetic discharge, N (0, R s ) is a noise term normally distributed with zero mean and a given variance R s , α s is the 225 coefficient of variation related to the uncertainty of the discharge, b is an exponent related to the type of input flow. Equation 10 infers the intuitive principle that high discharge values are more uncertain than low values.
The uncertainty related to discharge observations gathered from static sensors (SG) is the sum of two components (Clark et al., 2008): the estimation of the water level from the static sensor reading (EW L); the conversion of the water level into 230 discharge using the fluvial cross section rating curve (ERC). In this work, the coefficient of variation related to the static sensor was set to α SG = 0.1, where α EW L = 0.1 (Weerts and El Serafy, 2006;Clark et al., 2008;Rakovec et al., 2012) and α ERC is considered negligible as respect to the α EW L (Baldassarre and Montanari, 2009). The coefficient of variation related to the input provided by the hydrologic model (α I ) can be derived from a validation analysis of the hydrologic model calculating the distribution of the simulated flow errors. For both uncertainties related to SG and I, γ and b values were imposed equal to 1.

235
In addition to the uncertainty due to the hydrologic forcing, the uncertainty related to the channel roughness is also considered as follows (Clark et al., 2008;McMillan et al., 2013): where p i s is the perturbed model parameter for the i-element of the ensemble, p s is the calibrated model parameter and p is 240 the fractional parameter error.

Errors related to stage gauge observations
The errors associated to observations of the SG within the floodplain domain are considered by performing a perturbation of 245 the observed value using a similar approach adopted for perturbing the input flow from stage gauges (Eq. 10). In this case, there is no error due to the rating curve application, considering the water level observations are directly compared to the simulated ones. The water depth for the i-element of the ensemble at time t is given by: where h true SG,t is the observed water level value by the static sensor at time t, N is a noise term normally distributed with zero 250 mean and a given variance R SG , assumed equal to 0.02 m (Schmidt, 2002;Pappenberger et al., 2006).

Errors related to satellite image observations
The procedure adopted for deriving water depth distributions from satellite images is affected by a series of errors that must be taken into account and in particular:

255
− Error in the water surface detection from satellite images this error is due to the water detection technique that could overestimate or underestimate the flood extension. Both multispectral and SAR image processing for water extent mapping require a threshold to apply in respectively the Water Index and the backscatter coefficient. Literature values of these thresholding values could lead to inaccuracies considering optimal threshold values are usually case study or event specific; therefore, a perturbation of the threshold value is performed by adopting a normal distribution with zero mean 260 and a standard deviation derived from literature values (Pierdicca et al., 2009).
− Error of the water surface extraction from the simulated WSE of the hydraulic model. This error is mainly due to the vertical error of the DTM that is used in the water surface elevation interpolation procedure. The generic i-DTM of the ensemble is perturbed generating a vertical error with a normal distribution characterized by a zero mean and a variance that is uniformly distributed between 0 and 0.3 m (U (0, 0.3)) according to literature values (Hodgson and Bresnahan,265 2004; Leon et al., 2014;Brouwer et al., 2017). Considering the proposed normally distributed independent errors does not take into account the spatial continuity of the elevation data (Raaflaub and Collins, 2006;Heuvelink et al., 2007) Where ∆h k 12 , is the water level difference at the k-cell of the two best fitting hydraulic simulations (see Eq. 9); c is a coefficient ranging between 0 and 1, considering that the gentle terrain slopes in floodplains limits the error of water depths derivation in an interval smaller than ∆h k 12 .
3 Case study: available data and DA implementation

2D hydraulic modelling
The bathymetry needed to represent the channel conveyance capacity in the hydraulic model was derived by interpolating available surveyed cross sections. The continuity of the channel-floodplain topographic domain was obtained by using the

DA implementation
The ensemble size for the application of the EnKF model was defined, according to Anderson (2001), by analyzing the available stage gauges and the selected flood events. The optimal ensemble size was set equal to 40.
From the validation of the hydrologic model, the frequency distribution of the relative flow errors was characterized by a mean equal to zero and a standard deviation equal to 0.28 · Q true , thus α I =0.28 (Eq. 10).
The application of Eq. 11 to consider channel roughness parametrization uncertainties resulted in a p s = 0.04m −1/3 /s (ac-  (Figure 4). A threshold value equal to 0 for MNDWI was chosen (Xu, 2006). To apply the EnKF method, the variance of the random noise related to the threshold value of MNDWI was set equal to 0.2.

345
Three sets of DA scenarios were selected: 1. stage gauge observations, 2. satellite image observations 3. both SG and SI.
Two stage gauges (Ponte Felice and Stimigliano gauging stations) were selected for analyzing water level forecasting performance indexes. However, simultaneous observations from 4 stage gauges were adopted for the application of the hydraulic model updating procedure. The 2005, 2010 and 2012 floods were selected for applying the DA framework using stage gauges observations. The Landsat 7 image, described in Section 3 was assimilated for the 2012 flood event. of the domain, that are better represented in higher resolution models (Nicholas and Mitchell, 2003;Neal et al., 2011). The spread of the assimilated ensemble at the stage gauges is very low because of the small error associated to the stage gauge observation as illustrated in Section 2.2.6. The updating of the state variable improves the prediction of the water levels both in terms of reduction of the uncertainty (see the red and blue bands of Figure 5) and NSE, R and Bias performance indexes (Table 1). The performance indexes are calculated considering the stage gauge observations as true values, given their low 360 uncertainty. Bias for both OL and ASS simulations tend to increase above 1 in the downstream stage gauges because of their overestimation of the water levels, especially in the recession limb (that is less important for EWS). For the same reason, for all the simulations, the more the flow is far from to the upstream inflow, the more the R coefficient tends to decay, but in case of ASS simulation, this decay is mitigated. hours. The improvement of simulation NSE and Bias for the SI assimilation case are not significant (Table 2), considering that the updating persists for only few hours, as shown in Figure 8. Figure 9 shows the reduction of the water levels uncertainty and the correction of the mean value along the channel profile at 385 the SI acquisition time. It is noted that the proposed SI acquisition approach allows to update the water depths for the entire domain, while the updating with the SG, as shown in Figure 6 does not guarantee a reduction of the uncertainty for the entire domain. The adopted updating procedure allows to increase the flood extent of 4 km 2 a the time of the SI acquisition ( Figure   10). Despite the smaller water level observation errors given by the stage gauge DA as respect to the SI derived water level uncertainty, the overall change of the mean flood extent between the OL and ASS SI simulations ( 14% ) does not show signif-390 icant differences as respect the Stage gauge DA (9%). for 20 hours after the SI acquisition, while the uncertainty reduction (i.e. the difference in the ensemble amplitude between the OL and ASS SI simulations) persists for 8 hours.

Assimilation of both stage gauge and satellite observations
At the positions where the channel flow gauges are located, it is reasonable to expect that stage gauges observations support 400 better model performances as compared to model updated based on satellite observations. However satellite image observations provide spatially distributed information that are important for flood wave routing in complex domain, especially when flooding impact large unconfined domains. Obtaining multiple stage gauge observations and satellite images constitute the optimal support for DA application for EWSs.

405
To demonstrate the potential benefit of taking advantages of multiple heterogeneous distributed data sources of flood observations, a failure of the static sensors was simulated, assuming an interruption of the gauge measurements during the flood event (i.e. at 11/11/2012, 09:30). This failure is assumed to happen just before the flood peak occurs. In this way it is tested the value of SI observations in covering the lack of standard SG observations. Figure 12 shows the simulated water level time series at the stage locations comparing the OL simulations as respect the DA of the indirect observations from both stage gauges and 410 the satellite images. Following the simulated failure time, it is noted that the amplitude of the ASS ensemble of water levels become gradually larger until it equals the OL simulation ensemble size. Shortly after the peak flow, the assimilation of the SI determine another narrowing of the amplitude, with small increasing of the mean water levels.
A summary of modelling performances is inserted in (Table 3 that (Table 2). This demonstrated the potential benefit of assimilating simultaneously different observations. Bias, RMSE and variance of the ensemble amplitude as a function of lead time are represented in Figure   13. Overall simulation Bias and RMSE improve following the assimilation of the SG, but not after the updating due to the SI.
However, Bias and RMSE improved locally if calculated starting from the acquisition of the SI as showed in Figure 11 . On the 420 other hand, the variance of the amplitude of the ensemble show an improvement of the updated model in terms of reduction of uncertainty right after the assimilation of the SI.

Pros and challenges
The adopted hydraulic model has a coarse spatial resolution (150 m cell size) and its performance can be considered acceptable for high magnitude flood events; however some limitations could raise in representing the flow patterns for low magnitude The simplified rainfall-runoff modelling allowed to generate input flow hydrographs very quickly according to the needs of a near-real time flood modelling purpose. However, the model can be considered appropriate only for small basins characterized by an impulsive response, for which the groundwater component can be neglected, complex topography and flow control structures are absent to avoid equifinality issues during the calibration/validation analysis (Beven, 2006). Furthermore the ap-440 plication of the SCS-CN model at sub-daily time scale (Grimaldi et al., 2013) is a strong limitation and more advanced models should be preferred for reduce the model uncertainties.
Despite the several measures adopted to prevent instability issues, instability can occur during the updating of the water levels from the stage gauges. For this reason, the model is tailored to remove from the ensemble the critical elements casing instability and generate new elements in order to keep the sample size constant. This measure can slow down the model, that should be as 445 fast as possible for a proper near-real time application. The instability issues that sometimes can occur could be due to the fact that FLO-2D model doesn't allow to update the flow velocities but only the flow depths. This limitation doesn't also enable to have control of the volume conservation, that is an important factor to verify the accuracy of the model simulation.
A methodology for indirectly deriving the distribution of the water depths from the water footprint gathered from a satellite image was proposed; this methodology is affected by a series of errors that were taken in to account for assigning a proper 450 reliability to the observation related to the satellite image; this reliability is numerically lower than the one related to stage gauges observations, but at the same time, provides distributed information instead of the local ones given by the static sensors.
The derivation of the water depths from the flood extent gathered by the satellite image was performed with a linear combination of the values given by the ensemble of the results provided by the hydraulic forecasting model. Since the latter has to be updated by an observation that is indirectly derived by the model itself, this approach can be considered disputable; however, practically it demonstrated to not cause instability issues during the model updating, since the distribution of the flow depths is coherent with the model state variable and it has demonstrated to slightly improve the model performance. This approach can be considered as an hybrid methodology of two literature approaches that consider prognostic and diagnostic variables for assimilating satellite derived information (Hostache et al., 2018), taking advantage of the rapid flood detection algorithms adopted for direct assimilating the flood extent and, at the same time, directly retrieving the water levels that are prognostic 460 variables, thus more straightforward to assimilate than the flood extent (Hostache et al., 2018).
During a flood event, the adoption, as observation, of a multispectral image, potentially corruptible by the cloud covering and sunlight is much less likely than the one with a SAR image; however, the proposed approach for the model updating can be applicable regardless the type of image as input observation. The satellite revisit time of the current satellite missions is still a strong limitation, since can be much higher than the travel time of small basins. Furthermore, usually multispectral SAR 465 images require time for being processed. However, new satellite missions and also the combination of more constellations will considerably reduce the revisit time in the near future, allowing to have different images for the same area with a temporal frequency higher than the persistence of the model correction (8 hours showed that the assimilation of stage gauges significantly improved the flood model performances in terms of NSE, R Bias, 475 also reducing simulated inundation extent uncertainties. The assimilation of distributed flood depths, indirectly retrieved from satellite-derived flood extent, provided slightly better modelling performances. Therefore, future tests are needed, taking advantage of increasing availability of satellite-derived flood extents at higher temporal and spatial resolution, to discover the effective capacity of the proposed flexible multi-source DA framework to value a larger EO data availability.
Moreover, the flexibility of the proposed model, to assimilate local and distributed observations, suggests its suitability to 480 use other data sources, also gathered from informal observation systems. In particular, future work will allow to investigate on the use of crowd-sourced observations to apply the proposed flood prediction framework in ungauged basins. Crowdsourcing already proved to be an effective means to improve hydrologic (Mazzoleni et al. 2017) and hydraulic (Annis and