Performance of automated flood inundation mapping methods in a context of flash floods: a comparison of three methods based either on the Height Above Nearest Drainage (HAND) concept, or on 1D/2D shallow water equations

Flash floods observed in headwater catchments often cause catastrophic material and human damage worldwide. Considering the large number of small watercourses possibly affected, the use of automated methods for flood inundation mapping at a regional scale can be of great help for the identification of threatened areas and the prediction of potential impacts of these floods. An application of three mapping methods of increasing level of complexity (HAND/MS, caRtino 1D, and Floodos 2D) is presented herein. These methods are used to estimate the flooded areas of three major flash floods observed 5 during the last ten years in South-Eastern France: the 15th of June 2010 flood on the Argens river and its tributaries (585 km of river reaches), the 3rd of October 2015 flood on small coastal rivers of the French Riviera (131 km of river reaches) and the 15th of October 2018 floods on the Aude river and its tributaries (561 km of river reaches). The common features of the three mapping approaches are their high level of automation, their application based on a high-resolution (5m) DTM, and their reasonable computation times. Hydraulic simulations are run in steady-state regime, based on peak discharges estimated 10 using a rainfall-runoff model preliminary adjusted for each event. The simulation results are compared with the reported flood extent maps and the high water level marks. A clear grading of the tested methods is revealed, illustrating some limits of the HAND/MS approach and an overall better performance of hydraulic models solving the shallow water equations. With these methods, the inundated areas are overall well retrieved, and the errors on water levels remain mostly below 80 cm for the 2D Floodos approach. The most important remaining errors are related to limits of the digital elevation model such as the lack 15 of bathymetric information, uncertainties on embankment elevation and to possible bridge blockages not accounted for in the models. 1 https://doi.org/10.5194/hess-2020-597 Preprint. Discussion started: 4 December 2020 c © Author(s) 2020. CC BY 4.0 License.


Introduction
Flash floods represent a significant part of flood related damages worldwide, particularly in regions prone to large rainfall accumulations over limited duration -typically several hundreds of mm in a few hours. In these regions, the development of 20 efficient risk management policies for small upstream watercourses has become a particularly important issue. However, the capacity to face flash-flood related risks is still highly limited by the very large number of small rivers, and by the specific features of flash floods: high unit discharge values, fast evolution in time, high spatial heterogeneity, and low predictability.
One crucial aspect to mitigate the flash floods risks is to improve the flood hazard mapping on small watercourses, typically with upstream drainage areas starting at a few km 2 . Such information is essential for an appropriate development of preven-25 tion policies and crisis management plans. If available, it may be particularly helpful for stakeholders to: (i) facilitate risk identification and awareness, (ii) speed-up decision-making before and during the crisis.
The development of detailed flash flood hazard mapping still suffers from serious limitations due to the lack of descriptive data for small rivers (topography, bathymetry,..) and their flood regimes. However, a large increase in resolution and accuracy of Digital Terrain Models (DTMs) has been observed in the last few years, particularly with the development of Lidar, and  (Brunner, 2016;Brunner et al., 2018). The structure of the hydraulic model is defined by an automatic positioning of cross-sections at selected distances along the river network, and then extracting the cross-sectional profiles from the DTM.
Although the positions of cross-sections may be modified manually to improve the accuracy of local studies, this possibility was not considered here. A post-treatment of the simulated water longitudinal profiles enables to retrieve the water surface 120 elevations and the water heights on the grid of the DTM. This method has already been evaluated for flash flood forecasting purposes, showing an interesting capacity to represent the observed inundations (Le Bihan et al., 2017). The caRtino 1D version used here corresponds to an evolution (reprogrammation in R) of the initial software.
Since this approach enables a full resolution of SWE equations, in steady state in the presented applications, it accounts for backwater effects and longitudinal channel geometry variations within river reaches. Its main limits, already identified in 125 previous works, lie in the 1D scheme which may not be adapted in areas with complex hydraulic features. The automated application may also be a source of significant errors. Cross-sections may not be positioned perpendicular to the stream main axis in meandering rivers, leading to cross-section shape distortions. Cross-sections may also be truncated leading to ignore locally part of the floodplain in the computations. Headwater losses due to hydraulic singularities such as bridges cannot be easily integrated. No distinction is made between the river bed and the floodplain and the floodplain continuity between 130 successive cross-sections is not embedded in the model. These limits may have a particular importance in areas with very wide floodplains, perched river beds, or at river confluences.

Floodos approach
Floodos is a 2D SWE computation code developed by Davy et al. (2017). It represents the hydrodynamic module of the Eros program, aiming at simulating erosion processes. The SWE resolution method is running directly on the DTM grid, 135 and is based on a particle-based so-called "precipiton" approach, which consists in propagating elementary water volumes on the water surface. The inertial terms are neglected in the SWE resolution scheme, which may result in errors in case of sudden changes in flow direction and in the vicinity of obstacles. However, the method enables a fast computation of the stationary solution thanks to the choice of a judicious numerical scheme. The CPU time changes approximately linearly with the number of pixels of the computation domain (see Davy et al. (2017)). The model has been compared with the widely used 140 2D LISFLOOD-FP model (Bates et al., 2010), showing equivalent results and faster computation times.
The Floodos model requires a careful verification of the convergence, since the choice of a too large precipiton volume may result in a bad convergence and significant errors (overestimation of water levels). The convergence verification has been automated here by using a new version of Floodos, enabling to reduce progressively the precipiton volume during the computation. Three decreasing precipiton volumes, defined in accordance with the criterion proposed by Davy et al. (2017), 145 have been systematically applied within each run to ensure the convergence to the right solution. The capacities of the three mapping approaches to reproduce actually observed inundation patterns (i.e. inundated areas and high water level marks) are compared.

150
The main advantage of this approach is that actual observations are used as reference, when reference expert simulations, with their limits and uncertainties, have often been used in previous similar studies. Hence, a total uncertainty is measured here, including all uncertainties sources in the input data (DTM, but also actual discharge values that are only inaccurately known), parameters (roughness values), and simulation methods.
A possible drawback is that the uncertainties in the input data, particularly in the estimated discharges, may be relatively large 155 and may dominate other sources of uncertainties associated with the simulation methods. For this reason, well documented case studies have been selected here, both in terms of peak discharges of the flood and observed inundation patterns. Particularly, extensive sets of peak discharge estimates on ungauged river sections, gathered within the HyMeX program (Ducrocq et al., 2019), are available for each selected event. These largely complement flood discharge data that are available for a generally limited number of stream gauging stations. Additionally, a comprehensive knowledge of the inundation characteristics is avail-160 able for the selected flood events, thanks to a high number of high water marks (HWM) and to field observations of the limits of the inundated areas (available for the Argens and Aude events).

Flood events selected
The rivers of Southern France are known to be prone to flash floods. This region has experienced a large number of catastrophic flash flood events in the past, including the three floods selected in this study, which are presented on Fig. 2.

165
The first selected flood occurred in the Argens river watershed (2750 km 2 ) on the 15 th of June 2010. It is certainly one of the most catastrophic event observed in the last decades in this region: twenty-five victims and 450 million euros of insured losses were reported (source Caisse Centrale de Réassurance -CCR, vehicles excluded). The flood particularly affected the eastern part of the Argens catchment area, where the maximum accumulated rainfall locally exceeded 400 mm in 36 hours.
Peak discharges were estimated at about 450 m 3 .s −1 on the Nartuby tributary river (222 km 2 ), 480 m 3 .s −1 on the Florieye 170 river (89 km 2 ), and 2500 m 3 .s −1 on the downstream part of the Argens river (Payrastre et al., 2019). The length of the river network selected for the hydraulic simulations is 585 km. 557 high water marks are available for the evaluation, as well as the observed limits of inundated areas.
The second event occurred on the 3 rd of October 2015 and hit several small rivers of the Alpes Maritimes coastline (French Riviera). A storm cell formed at the eastern edge of the Var river and ran along the coastline with a stationary regeneration 175 lasting two hours. A maximum accumulated rainfall of 220 mm in 24 hours (150 mm in 2 hours) was observed in a 30km by 15km band along the coastline. The main rivers such as the Var and Loup rivers were hit only in their downstream part and had limited reactions, but major floods were observed on small coastal rivers such as the Brague river (66 km 2 , peak discharge >400 m 3 .s −1 ), the Riou de l'Argentière river (48 km 2 , >300 m 3 .s −1 ) and the Grande Frayère river (22 km 2 , >180 m 3 .s −1 ).
These floods caused the death of 20 people and considerable material damage, insured losses being estimated at 520 million 180 euros for this event (CCR, vehicles excluded). The river network selected for the hydraulic simulations is 131 km in length.
428 high water marks have been used for the evaluation.
The last event occurred on the 15 th of October 2018 in the intermediate part of Aude river watershed (5050 km 2 ), where a accumulated rainfall of more than 300 mm in 24 hours was locally recorded (Caumont et al., 2020). Several tributaries of the Aude river had very strong flood reactions: Lauquet river (196 km 2 , peak discharge of about 880 m 3 .s −1 ), Trapel river (55 185 km 2 , >300 m 3 .s −1 ), and Orbiel river (253 km 2 , 490 m 3 .s −1 ). These tributaries caused a large flood of the Aude main river immediately downstream the town of Carcassonne. Numerous villages were heavily flooded and suffered large damages. 14 fatalities were reported, several bridges and roads were destroyed, and the insured losses exceeded 200 million euros (CCR, estimation still to be consolidated). The hydraulic simulations were performed on a 569 km river network. 1082 high water marks and the observed limits of inundated areas have been used for the evaluation.

Common input data and simulation workflow
The main steps of the simulation workflow are presented on Fig. 1. The mapping approaches are all implemented on segments of the computation domains (river reaches), the segmentation being based on the structure of the hydrographic networks (confluences). A 5 km 2 upstream catchment surface has been selected as a lower limit to define the river network (1 km 2 for the Alpes Maritimes case study). One independent computation is conducted on each river reach. This principle of segmentation areas. The lidar campaigns are conducted in low flow periods, but in some places, the permanent water surface is captured in the DTM. Fortunately, the low flow discharges in the small Mediterranean rivers considered here are limited -some being ephemeral streams -and the existing DTM generally provide acceptable estimates of the river cross-sections. As the methods may be sensitive to some categories of errors present in DTMs, including for instance the presence of bridges or other structures crossing the rivers and not cleaned up, an automatic pre-treatment has been systematically applied to eliminate any remaining 215 bridges in the river beds.
The peak discharges are estimated based on preliminary rainfall-runoff simulations obtained with the Cinecar distributed model (Naulin et al., 2013). The Antilope J+1 rainfall product of Météo-France (Champeaux et al., 2009), combining radar and point rainfall records, was used as input data. The model was calibrated for each event against available discharge observations to limit as far as possible the errors on peak discharges used as input of hydraulic simulations. Overall, the differences between 220 simulated and observed peak discharge do not exceed +-10% (see Fig. 3). However, observations are mainly based on post-flood surveys and have significant uncertainties.
The same Manning's roughness coefficients are used in all computations and for all river reaches. They are fixed to n=0.066, which can be considered as a reasonable value for flash floods according to the analysis of available post event surveys data in the considered case studies (Lumbroso and Gaume, 2012).
Lower roughness values were also tested, but resulted in negative 225 bias on water levels for the three case studies and the three methods. Success Index (CSI), computed for each river reach: 235 CSI values range from 0% (no common area between simulation and observation) to 100% (perfect match). According to Fleischmann et al. (2019), hydrodynamic model provide locally relevant estimates of flood extent when the CSI exceeds 65%.

Comparison of water surface elevation with high water marks data
The elevations of the simulated water surface and of available high water marks are compared here (see Fig. 4). This results in several hundreds of point differences between simulated and observed water levels. Negative values indicate an underestimation locally (zones 2 to 5): these correspond to external sources of errors, which are common to the 3 mapping methods and will be presented in the discussion section (see Sect. 5.2). Zone number 1 corresponds to an area for which the Floodos approach performs significantly better than the two other ones. This case will be discussed in Sect. 5.1.
Dynamic maps enabling a detailed visualisation of the simulation results for all the case studies and mapping methods are provided (Hocini and Payrastre, 2020), see the data availability section. The next sections provide a synthetic analysis of the 250 evaluation results based on the CSI scores computed at the river reach scale, and on the differences between simulated and observed water surface elevations. A detailed analysis of Fig. 5.a and c shows that the HAND/MS approach may perform similarly to other approaches in some sections, but that very large errors are observed on specific rivers reaches. These differences may be attributed to the large level of simplification of the method, particularly: 1 -the fact that riverbed geometry is averaged at the river section scale, 260 2 -the "boundary" effects between sub-basins which may break the continuity of flow between river sections, particularly at confluences which number is increased here by the level of detail of the river network , and 3 -the absence of representation of backwater effects. However, the discussion section will also illustrate another important cause for these differences (see Sect. 5.1).
Overall similar and satisfactory results are observed for the caRtino 1D and Floodos 2D approaches, with a slight advantage 265 for Floodos, for which the 15% quantiles of CSI values exceed 50 %, and the median CSI values are close to 80%. The largest observed differences between the two methods seem to be concentrated on a limited number of river reaches (Fig. 5.b and d).
They are often observed in a context of wide and flat floodplains, sometimes with presence of dikes. In these cases, the complex https://doi.org/10.5194/hess-2020-597 Preprint. Discussion started: 4 December 2020 c Author(s) 2020. CC BY 4.0 License.
connection between river bed and floodplain, and the non-uniform flow directions, may limit the validity of 1D approach and complexify its automatic adaptation in terms of width and orientation of the cross-sections. An example of such a situation is respectively the 5% and 95% (whiskers), and the 15% and 85% quantiles (boxes) Finally, Fig. 5 also shows that the lower CSIs values (below 50%) often occur in the same river sections for the three methods.
These low values are mainly related to external error sources, which are not related to the computation method used, but rather to input data (peak discharges, DTM, ..., see Sect. 5.2).
The comparison results with high water marks are presented on Fig. 6 due to the choice of roughness coefficients, since a negative bias is also observed with the two other methods. However, the negative bias is systematically higher for the HAND/MS method than for the two other ones, suggesting a systematic tendency of the HAND/MS approach to underestimate water levels. An explanation for this phenomenon is presented in the discussion section. Figure 6. Comparison of simulated water levels and observed high water marks (HWM), for the three methods and the three events. The boxplots represent respectively the 5% and 95% (whiskers), and the 15% and 85% quantiles (boxes) 5 Discussion 285

Origin of the main differences between the three simulation approaches
The hierarchy observed is very similar and consistent between the three case studies. It also appears fully consistent with the level of simplification of the three methods used: logically, the SWE hydraulic methods outperform the HAND/MS approach, and the 2D SWE resolution scheme provides slightly better results than the 1D one, despite the fact that inertial terms are neglected. The results also illustrate that the HAND/MS approach can locally have very similar performance than the two other 290 approaches, but largely fails to retrieve the inundation extent in some cases where the two other approaches perform very well.
To a lesser extent, the caRtino 1D approach also shows a significantly lower performance than Floodos for a limited number of reaches (see Fig. 5). Figure 7 shows an example of a river section for which the three methods lead to significantly different results. This section is located on the Argent-Double river at La Redorte (corresponding to zone 1 on Fig. 4). In this section, the CSI scores are 295 respectively 69% for the Floodos model, 63% for the caRtino 1D model, and 35% for the HANS/MS method. This example illustrates an unexpected limitation of the HAND/MS approach, due to the configuration of the floodplain encountered here: large and flat floodplain on the left hand bank, with a longitudinal slope significantly higher than the transverse slope in the floodplain. In such a situation, a large number of HAND pixels in the floodplain are connected to a drainage point located several hundred meters downstream, with consequently a very large HAND height values. As shown on Fig. 7).d, this results in 300 a large difference between the actual and HAND cross-sectional shapes. The HAND profile shows a sudden increase in HAND elevations which drastically limits the extent of the simulated floodplain ( Fig. 7.a). Figure 7.d also shows that on the right bank, where the transverse slope is largely higher, the shape of the HAND profile is very similar to the actual cross-section. This

Errors induced by the limitations of the DTM
First, Fig. 8 presents two examples of large water level over-estimations mostly due to imperfections in the terrain input data. The first example (Fig. 8.a and b, zone 2 on Fig. 4) corresponds to the Aude river at Carcassonne. In this section the 320 river bathymetry is significant, and the peak discharge of the flood was close to the limit of the riverbed capacity as no significant inundation was observed. In this case, the absence of bathymetric surveys in the DTM has a significant effect on the retrieved cross-sectional shapes and on the river bed capacity, resulting in a significant increase of the simulated water level, and simulated over-bank flows. The second example (Fig. 8.c  at Pezens. In this section dikes are separating the riverbanks and the floodplain. Figure 8.d shows that the relief of the dikes 325 is smoothed out in the 5-m DTM if compared to a higher resolution 1-m DTM. Again in this section, the peak discharge of the flood is close to the capacity of the river bed. The underestimation of the dike crest altitude causes in this case a large overestimation of the flood extent on the right bank of the river.

Local effects of possible bridge blockages or peak discharge errors
On the other hand, large water level underestimations are also observed for some reaches. Two examples are provided on Fig. 9.

330
The first case ( Fig. 9.a, zone 4 on Fig. 4), corresponds to the inundation of Villegailhenc village by the Trapel river. In this case, a bridge located in the village has been partly obstructed and submerged during the flood, causing a large backwater effect and very high water levels (>2m) in the vicinity of the bridge. The bridge was finally destroyed by the flood as shown by the picture on Fig. 9.a. Such important backwater effects, often related to bridge blockages, are erratic phenomenons that cannot be easily forecasted and accounted for in the automatic simulations. They may result in large underestimations of the water 335 levels immediately upstream and downstream the bridges. This situation is encountered at several points in the presented case studies, particularly in sections where the floods were the most intense (estimated return periods often exceeding 100 years).
The second example (Fig. 9.c and d, zone 4 on Fig. 4)   However, Fig. 9.c shows in this case that a reasonable variation in the peak discharge value (set from 84.2 m 3 .s −1 to 116.5 m 3 .s −1 to remain consistent with rainfall observations) is not sufficient to compensate the underestimation effect. Since the selected roughness value (n=0.066) is already relatively high, an underestimation of the locally estimated rainfall intensities is suspected to be at the origin of the errors in this case (Caumont et al., 2020).

Computation times 345
The computation times required on a single CPU (Intel Core i7-7700 3.60 GHz -32 Gb RAM) for the three mapping methods are presented in Table 1. They are well correlated with the length of the river network and the number of river reaches, except for the HAND/MS method which is very fast but less predictable. A factor of ten in average is observed between computation times of the HAND/MS and the caRtino 1D models, and of two in average between the caRtino 1D and the Floodos 2D models.
As expected, the SWE 2D approach is computationally the most expensive. However the computation times remain reasonable 350 for the 5 m resolution used here, and they can still be reduced by using parallel computing. Another important difference is the relative weight of the pre-treatment phase in the total computation time: pre-treatments are largely preponderant for the HAND/MS and caRtino 1D methods, while only the computation phase is present in the Floodos 2D model. This may be seen as an advantage of HAND/MS and caRtino 1D for real time applications, in which only the computation phase has to be repeated, and hence the required computation times are highly limited for these two methods.