Deriving global flood hazard maps of fluvial floods through a physical model cascade

Global flood hazard maps can be used in the assessment of flood risk in a number of different applications, including (re)insurance and large scale flood preparedness. Such global hazard maps can be generated using large scale physically based models of rainfall-runoff and river routing, when used in conjunction with a number of post-processing methods. In this study, the European Centre for Medium Range Weather Forecasts (ECMWF) land surface model is coupled to ERA-Interim reanalysis meteorological forcing data, and resultant runoff is passed to a river routing algorithm which simulates floodplains and flood flow across the global land area. The global hazard map is based on a 30 yr (1979–2010) simulation period. A Gumbel distribution is fitted to the annual maxima flows to derive a number of flood return periods. The return periods are calculated initially for a 25× 25 km grid, which is then reprojected onto a 1 × km grid to derive maps of higher resolution and estimate flooded fractional area for the individual 25 × 25 km cells. Several global and regional maps of flood return periods ranging from 2 to 500 yr are presented. The results compare reasonably to a benchmark data set of global flood hazard. The developed methodology can be applied to other datasets on a global or regional scale.


Introduction
Global flood hazard maps are an important tool in assessing global flood risk (Di Baldassarre et al., 2011, 2012;Hagen and Lu, 2011).They are used in reinsurance, large scale flood preparedness and emergency response and can also be used as benchmarks for future flood forecasting or climate impact assessment (e.g.Kappes et al., 2012;Willis, 2012;SwissRe, 2012).
Flood hazard maps are often only routinely compiled at a national level or river catchment level (see e.g.Hagen and Lu, 2011;Prinos et al., 2009;FEMA, 2003;Porter and Demeritt, 2012), and the uncertainties in these maps are recognized and made explicit to varying degrees (see review by Prinos et al., 2009, or Merwade et al., 2008;McMillan and Brasington, 2008).These smaller scale maps must then be aggregated to larger units, such as continents, in order to gain the large scale perspective (an example for such an initiative is EXCIMAP, 2007;van Alphen et al., 2009).In many countries across the globe such flood hazard maps are not available at the national level (Hagen and Lu, 2011).In addition, the tiling of maps generated by differing methods constructed with varying observations or other data can create considerable inconsistencies, and the resulting uncertainties are not always clear from the finalised product (van Alphen et al., 2009;Prinos et al., 2009).These issues of combining data sets with different provenance and uncertainty are commonly encountered in flood modelling over administrative and political boundaries and can be problematic if not carefully considered (Pappenberger et al., 2011;Thielen et al., 2009a;Bartholmes et al., 2009).The use of large scale hydrological models and atmospheric land surface schemes for producing global scale hydrological products is of increasing interest (Cloke and Hannah, 2011) and have been employed on continental scale to derive flood hazard maps (see for example Barredo et al., 2007) In this work we derive global maps of flood return periods using a homogenous approach across the globe.Producing "consistent" maps of flood hazard can be a first step, for instance, in understanding flood risk at the global scale, although they cannot and should not replace local detailed information for local flood risk studies where it exists.The global hazard maps can then be used in conjunction with vulnerability and impact information to produce global estimates of flood risk which are key information for disaster preparedness (WMO, 2011).Global flood hazard maps are often computed based on geomorphological regression, which uses non-linear regression of easily available geomorphological catchment attributes such as distance to river and upstream catchment area (Mehlhorn et al., 2005;SwissRe, 2012).Such methods have the considerable advantage of being able to calculate flood hazard zones to a very fine resolution using readily available data.An alternative method combines discharge observations with a simple river routing algorithm and flood outline observations (Herold et al., 2011).This method also uses regression to derive properties for ungauged catchments in which no observations exist.In addition, the Dartmouth Flood Observatory has produced global hazard maps based on observations (Brakenridge, 2012).
However, these methods do not exploit available globalscale hydrological data, such as global time series of precipitation, and they also lack in application of hydrological understanding; such as the hydrological processes operating in a river catchment, available spatial information and related physical understanding of land surface properties.They also make an implicit assumption that the method is transferable across a hydrologically and hydraulically diverse global land area.Such knowledge can, however, be included into a cascade of process-based models, using meteorological, hydrological and hydraulic models.The concept of using a model cascade for global flood hazard prediction has been discussed by Winsemius et al. (2012).Physically based model cascades have been successfully employed in short-range, mediumrange, monthly and seasonal forecasting of floods (Pappenberger et al., 2005(Pappenberger et al., , 2011;;Alfieri et al., 2012;Voisin et al., 2011;Thielen et al., 2009b), as well as projections of climate impact on flooding (Cloke et al., 2010(Cloke et al., , 2012)).Barredo et al. (2007) employed this technique on a European continental level to derive flood hazard maps.
In this paper we derive a modelled global flood hazard map using the cascading models simulation approach with the European Centre for Medium Range Weather Forecasts (ECMWF) land surface and river routing model, ERA-Interim reanalysis meteorological data (using a corrected precipitation).The overall aim is to evaluate the derivation of globally consistent flood hazard maps to a resolution of 625 km 2 and 1 km 2 with ECMWF products.This is also a novel approach in evaluation and understanding of a coupled hydro-meteorological system on a global scale as previous studies have either focused on discharge (e.g.Pappenberger et al., 2010) or on observed flooded inundation fraction (e.g.Decharme et al., 2008Decharme et al., , 2011;;Dadson et al., 2010;Yamazaki et al., 2011).This paper describes a proof-of-concept exercise and we carefully consider the limitations of this approach.

Method
In this study we derive global flood hazard maps using a cascading model simulation approach combined with ECMWF products and modelling systems.This cascade comprises four steps: (I) derivation of meteorological forcing data; (II) physically based model chain; (III) extreme value theory to derive return periods; (IV) remapping of results to required resolution.Each component of this cascade has been thoroughly tested with multiple calibration and validation studies.

Derivation of input data: ERA-Interim GPCP forcing
ERA-Interim (hereafter ERAI) is the latest global atmospheric reanalysis produced by ECMWF.ERAI covers the period from 1 January 1979 onwards, and continues to be extended forward in near-real time (Berrisford et al., 2009).ERAI data are freely available for access to researchers via ECMWF's webpage (http://www.ecmwf.int/research/era).Dee et al. (2011) present a detailed description of the ERAI model and data assimilation system, the observations used, and various performance aspects.Balsamo et al. (2011) performed a scale-selective rescaling procedure to improve ERAI precipitation.The procedure corrects ERAI 3-hourly precipitation in order to match the monthly accumulation provided by the Global Precipitation Climatology Project (GPCP) v2.1 product (Huffman et al., 2009) at gridpoint scale.The method uses information from GPCP v2.1 at the scale for which the dataset was provided (for a spatial resolution of 2.5 • × 2.5 • ) and rescales the ERAI precipitation at full resolution (about 0.7 • × 0.7 • ).The advantage of this procedure is that small scale features of ERAI (for instance related to orographic precipitation enhancement) can be preserved while the monthly totals are rescaled to match GPCP (see Balsamo et al., 2011, andSzczypta et al., 2011).

Land surface model HTESSEL
In this study the Hydrology Tiled ECMWF Scheme of Surface Exchanges over Land (HTESSEL; Balsamo et al., 2009Balsamo et al., , 2011) ) (Loveland et al., 2000).The grid box surface fluxes are calculated separately for each tile, leading to a separate solution of the surface energy balance equation and the skin temperature.The latter represents the interface between the soil and the atmosphere.The surface albedo is similar for all land tiles within a grid box except for those covered with snow.Below the surface, the vertical transfer of water and energy is performed using four vertical layers to represent soil temperature and moisture.Soil heat transfer follows a Fourier law of diffusion, modified to take into account soil water freezing/melting (Viterbo et al., 1999).Water movement in the soil is determined by Darcy's Law, and surface runoff accounts for the sub-grid variability of orography (Balsamo et al., 2009).In the case of a partially (or fully) frozen soil, water transport is limited, leading to a redirection of most of the rainfall and snow melt to surface runoff when the uppermost soil layer is frozen.The snow scheme (Dutra et al., 2010) represents an additional layer on top of the soil, with an independent prognostic thermal and mass content.The snowpack is represented by a single snow temperature, snow mass, snow density, snow albedo, and a treatment for snow liquid water in the snowpack.Part of the liquid precipitation is directly intercepted by the canopy (that is evaporated at a potential rate), and the remaining infiltrates into the soil, when snow is not present.When snow is present, liquid water is intercepted by the snowpack, and can freeze.Solid precipitation accumulates on the surface.The first soil layer receives liquid water from excess precipitation as well as melted snow that was not intercepted in the canopy or snowpack.Surface runoff is generated when the first soil layer is partially saturated.In the lowest model layer (2.89 m, constant globally) the boundary condition is free drainage, that produces the sub-surface runoff.Water is extracted from the soil via direct bare ground evaporation (only in the first soil layer), and by vegetation evapotranspiration (coupled to the surface energy balance).HTESSEL is part of the integrated forecast system at ECMWF, with operational applications ranging from the short-range to monthly and seasonal weather forecasts.HT-ESSEL is mainly used for operational forecasts coupled with the atmosphere, but it can also simulate the land surface evolution and exchanges with the atmosphere in stand-alone mode (commonly referred as "offline mode").In offline mode, the model is forced with sub-daily (at least 3-hourly) near-surface meteorology (temperature, relative humidity, wind speed and surface pressure), and radiative (downward solar and thermal radiation) and water fluxes (liquid and solid precipitation).This offline methodology has been widely explored and calibrated in research applications using HT-ESSEL and other land surface and large scale hydrological models (e.g.Dutra et al., 2011;Haddeland et al., 2011).Balsamo et al. (2012) present a detailed description of the simulations set-up and general performance evaluation.

River routing CaMa-Flood
There are many different river routing algorithms which have been developed on a global scale (e.g.Miller et al., 1994;Arora and Boer, 1999;Ducharne et al., 2003) some of which include the explicit representation of flood plains and storage (Decharme et al., 2008(Decharme et al., , 2011;;Dadson et al., 2010).The evaluation of these models either focuses on the impact of flood plains on discharge (e.g.Decharme et al., 2011;Pappenberger et al., 2010) or compares modelled flooded inundation fraction with satellite observations (e.g.Dadson et al., 2010).ECMWF has successfully employed several routing algorithms based on the TRIP model (Balsamo et al., 2011, andPappenberger et al., 2010).Yamazaki et al. (2011) developed this global routing methodology further by including flood plains into the routing algorithm through sub-grid parameterization of the floodplain topography.The sub-grid parameterization is based on a 1 km Digital Elevation Model and all horizontal water transport is modelled by a diffusive wave equation to account for backwater effects.Yamazaki et al. (2011) showed that this new model formulation (called CaMa-Flood) compares favourably to daily measurements of river flow gauging stations across the globe as well as indicating a good agreement between modelled and satellite observed flooded area.Water level simulations by CaMa-Flood in the amazon basin also show a good agreement with remote sensing altimetry (Yamazaki et al., 2012).
The river network construction, river parameters and subgrid scale floodplain profiles are derived by the Flexible Location of Waterways (FLOW) method (Yamazaki et al., 2009).This method is used to upscale a high-resolution flow direction map at 1 km (Global Drainage Basin Database, GBDB, Masutomi et al., 2009) into a coarse-resolution river network map, which is used in the global-scale river routing model.The SRTM30 digital elevation model (30 arc s DEM developed in the Shuttle Radar Topography Mission by NASA) is the DEM input for the FLOW method to derive the sub-grid scale topographic parameters.In the present configuration, the river network map was created at 25×25 km resolution (hereafter model grid).This relationship is shown in Fig. 1.The outlet on the model grid is indicated by a circle.The routing characteristics are based on the upstream catchment, which may span several 25 × 25 km cells.The FLOW method generates catchments with an outlet pixel lying over the model grid, but their areas do not necessarily coincide, depending on the local topography.One of the characteristics is elevation profile which is shown in the bottom panel of Fig. 1, where the topographic height is plotted against the fraction which would be flooded at this height.The FLOW method provides an automatic upscaling of the following parameters: channel length (L, m), surface altitude, distance to downstream cell, catchment area and elevation profile.In addition to these parameters, the channel width (W , m) and bank height (B, m) are also necessary to calculate the river water storage (S = L × W × D, where D is the river water depth) and other derived parameters like the river bed slope.Channel width and bank height cannot be resolved globally from existing databases, and were derived empirically following a power law of the climatological runoff estimates accordingly to Yamazaki et al. (2011).The daily HTESSEL simulated surface and sub-surface runoff, at approximately 80×80 km resolution, is interpolated to the river network resolution using a nearest neighbour approach.

Extreme value theory to estimating return periods
The return period estimation is based on the annual maxima of the river water storage produced by CaMa-Flood.There are many different statistical distributions which can be used in the estimation of flood frequency, ranging from general logistics distributions in countries such as the UK (Reed et al., 2002) to the log Pearson Type III in the USA (IACWD, 1982).In this study, the aim was to apply the same modelling method across the globe constrained by data availability (e.g.time series of only 30 yr available from ERA-Interim) and computational resources (e.g.requirement for dynamic distribution fitting in every cell across the global land area).Therefore the Gumbel distribution (EV1), estimated using Lmoments, was chosen whose two parameters can be easily estimated by the method of moments and which allows the cheap computation of confidence limits for the fitted data.
The method is described in detail in Shaw et al. (2011).The EV1 distribution was computed for the river water storage annual maxima on the model grid and 2, 5, 10, 20, 50, 75, 100, 200, 500 yr return periods calculated.The respective river water storages (S) were converted to river water levels (D) using the river network parameters, river length (L) and width (W ): The river water levels could then be used to establish whether the river has gone out of bank and whether cells are flooded or not flooded.

Remapping of required resolution
The river water storage produced by CaMa-Flood is representative of a sub-catchment whose parameters, in particular the floodplain elevation profile (Yamazaki et al., 2009), are integrated in the model.These sub-grid parameters are derived from the 1 × 1 km cells.Therefore, the river water level can be remapped into the 1 × 1 km grid consistently with the model structure and assumptions.The river water level in each model grid is remapped to the 1 × 1 km grid, allowing the identification of flooded and not-flooded pixels (as displayed in Fig. 1).The approach in this paper is equivalent to a volume filling approach as shown by Winsemius et al. ( 2012), with the additional advantage that the sub-grid topography was an integral component of the river routing model, influencing the river water storage simulations.This information is then upscaled to the model grid to derive fractional coverage by re-aggregating all respective 1 × 1 km cells.

Evaluation of hazard maps 1.7.1 Benchmark data set
A global flood hazard map has been produced for the 2011 Global Assessment Report on Disaster Risk Reduction (Herold et al., 2011;Herold and Mouton, 2011).In this report, peak flow values for 100 yr return periods were estimated for gauged sites and regionalized by clustering observations from river gauging stations and using regression to estimate return periods for ungauged sites.Peak flow values were routed through the catchments to derive flooded areas.These data have then been merged with data from actual flood events observed by the Dartmouth Flood Observatory (www.dartmouth.edu/∼ floods/) to derive maps indicating different return periods.A description of the procedure is given in Herold and Mouton (2011), where some verification for individual catchments is also shown.All data can be downloaded from the Global Risk Data platform (http://preview.grid.unep.ch/).Please note that we do not assume "correctness" of the data set and rather use this data set as a benchmark to establish whether the methodology used in this paper leads to similar results.This benchmark data set is currently used by agencies around the world (UNISDR, 2011).

Benchmark comparison score
In this study, the global flood hazard results are compared to the benchmark data set using three scores which are designed to evaluate extremes: the Equitable Threat Score (ETS, Hogan et al., 2010;Doswell III et al., 1990;Gandin and Murphy, 1992), the Extreme Dependency Score (EDS, Stephenson, 2008) and the Frequency Bias (FB, Gandin and Murphy, 1992).
The ETS is based on a contingency table (see Table 1) and compares the hits and correct negatives events of the benchmark data set to the hits + correct negatives events of the data set created in this study.The score ranges from −1/3 to 1 (perfect score) with 0 indicating that there is no skill (skill indicated by random chance).The score takes account of false alarms and missed events.
The EDS evaluates the association between forecasted and observed rare events based on hits and misses (not explicitly evaluating false alarms).It ranges from −1 to 1 (perfect score) with 0 indicating that there is no skill.The EDS is not sensitive to bias and thus needs to be complemented by the frequency bias.
The FB measures whether the frequency of the two data sets is similar.A FB > 1 indicates that there is a positive bias (over forecasting) of the data set computed in this study in comparison to the benchmark data set (and vice versa).It ranges from 0 to ∞, with 1 indicating a perfect score.

The global flood return period maps
Figure 2 shows the major river basins of the world to aid interpretation and discussion of the results.The total area of floodplains given by a 1000 yr return period is calculated as 1.9 × 10 6 km 2 , which is within the limits of other global estimates ranging from 0.8-2 × 10 6 km 2 (Mitsch and Gosselink, 2000; Ramsar and IUCN [World Conservation Union], 1999).In Fig. 3, the areal fraction of coverage of flooding occurring in the model grid is shown for a 50-yr return period. 1 means that the cell is completely flooded across its area, 0.5 means that 50 % of the area within the cell is flooded and 0 means that the area is not flooded at all.A minimum threshold of 5 % has been set for display purposes.As would be expected, flood hazard at a 50 yr return period shows up as a wide-spread phenomenon occurring at many locations on the globe and many major catchments can be clearly seen in Fig. 3.In addition, some lakes such as the Great lakes in Northern America, which are not explicitly modelled within the routing component, show as 100 % flooded.There are also delta areas which can be clearly seen, for example the Mississippi in North America, the Yangtze and Huang He in China, the Indus on the Indian subcontinent as well as the Ganges and Brahmaputra, the Euphrates 1 means that the cell is flooded to 100 %, 0.5 means that the area within the cell is flooded to 50 % and 0 means that the area is not flooded at all.The figures shows the 50 yr return period.
and Tigris and the Murray Darling in Australia.In Africa the upper Niger catchment, the Lake Chad catchment as well as the Congo show not only in the in the delta area, but particularly inland.In South America, the Amazon and Parana catchments are dominant.Many other areas with high fractional coverage can be seen in Asia (e.g.Volga into the Black Sea or the Kolyma).
Figure 4 shows how flood hazard increases with return period for the 20 largest catchments calculated as the average area of floodplains flooded (maximum extent of floodplains are estimated from computing a 1000 yr return period).The figure shows an average flooding of around 45 % of all floodplains within all major catchments to over 90 % at higher return periods.This information could be used in the calculation of the number of people or properties affected by a flood event of a certain return period and the analysis of this on a global or continental scale (see e.g.Winsemius et al., 2012).
It is of particular interest in many applications to analyse these maps at the continental scale.Figure 5a displays the fractional coverage for a 50 yr return period for Europe.To aid in the interpretation of the results, some of the major rivers of Europe are overlaid as blue lines.The flooded area follows those lines fairly closely (see for example the rivers Po and Danube), indicating that the resulting maps have some credibility.Even smaller rivers which are not explicitly plotted as major rivers can be seen (e.g. the Tisza).The effect of lakes can also be seen as was the case for the global results.These maps are derived from 1 km 2 re-projections, which are shown as an example in Fig. 5b.The similarities between Fig. 5a  SRTM data set as in Herold and Mouton, 2011), however, given the coarse resolution and uncertainties in the many other data inputs, this course of action is not recommended as the uncertainties would be very high, even though a higher resolution image would of course look more attractive (see discussion on hyperresolved modelling in Beven and Cloke, 2012).In all modelling exercises an appreciation of the uncertainties involved is paramount (Pappenberger and Beven, 2006).We demonstrate this uncertainty in Table 2 and Fig. 4.
Figure 5c shows the 50 yr return period map for South America, focusing on the Amazon catchment in particular.As for Fig. 5a, the major rivers are followed, but here the complexity of the channel network in the Amazon basin can be clearly seen.This is encouraging, as such complexity demonstrates the value of sub-grid representation of the channel network.Note that the major lakes such as the Titicaca and Poopo are identified as flooded pixels.
Table 2 shows the average percentage of floodplain flooded for individual river catchments.It is obvious that the fraction increases with increasing return period as seen in Fig. 4. One should take particular notice of the uncertainty bounds around the median.These uncertainty bounds increase with increasing return period, however, they are not very large.This may be explained by the fact that a large uncertainty in discharge does not translate to an equally large uncertainty in extent because of the valley filling phenomenon of floods over a certain magnitude (Schumann et al., 2009;Pappenberger et al., 2006).Uncertainty in the estimation of the return period has to be large enough to cover individual 1 km cells, which may require significant jumps in water level.These findings are also illustrated when the average over all 20 catchments is displayed (Fig. 4).This phenomenon was also observed in previous work on a smaller  scale (McMillan and Brasington, 2008) and should not be seen as equivalent to a reduction in flood risk.Indeed McMillan and Brasington (2008) show that when such information is translated into risk (e.g.number of houses flooded), the uncertainties are large.The reason being that it was just at this point, on the edge of the natural floodplain, that housing/infrastructure densities increased, as they were perceived as safe from frequent flooding.This is a good illustration of the importance of considering the end use of the information.

Comparison with Benchmark data set
The benchmark data set is shown in Fig. 6a and b for a 50 yr return period, which is equivalent to the representations of the model cascade flood hazard depicted in Fig. 5a and b.
A comparison of Figs.6a and 5a shows that the benchmark data set displays greater detail but has a lower intensity of flooding depicted for a 50 yr return period.Figure 6b shows a far more detailed river network than Fig. 5b because it has been computed by a river routing algorithm on a finer scale (90 m SRTM data).However, it also indicates a much lower extent of individual floodplains, which suggests that the 50 yr return period river discharges are calculated to be greater in this study than in the benchmark.This is probably a result of the longer time series in this study or alternatively, may be reasoned by different representations of floodplain and channel storage.
Major rivers are as equally well represented in the benchmark data set as in the data set of this study.Figure 7 directly compares the global flood hazard results with the benchmark set for major catchments (comparison is limited to the return periods, which can be extracted from the benchmark data set < = 75 yr).It is encouraging to observe some clear correlation between the modelled and observed data, as they have been derived by considerably different methodologies.This correlation is shown by the high number of hits and correct negatives, which always exceed false alarms and misses.It is important to note that neither the benchmark nor the present results represent the truth and this exercise seeks to compare them in order to identify differences and allow the exploration of the properties of the data set produced in this study.There are also more hits than false alarms, but more false alarms than misses at higher return periods, which suggests that the methodology in this paper produces larger areas of flooding than the benchmark data set.
The values of the contingency table (hits, misses and false alarms) shown in Fig. 7 can be used to calculate an agreement between the two data sets using scores such as the Equitable Threat Score (ETS).The ETS is displayed in Fig. 8 as an average over the largest 20 catchments.The ETS reaches an optimal score at 1 and is skilful in comparison to a random guess for values above 0.It is reassuring to observe that the ETS is above 0 for all return periods although the flattening of the curve at higher return periods indicates that a random benchmark is not too difficult to beat in this case.
The ETS peaks at a return period of 20 yr, indicating that the maps are in closest agreement at this return period.The Extreme Dependency Score does not illustrate this peak.This is explained by the fact that it does not explicitly incorporate false alarms which increase with increasing return period.It is influenced by a continuous increase in hits and decrease in misses.The score is always above 0, indicating skill at all return periods.This skill maybe purely topographically driven.The Frequency Bias is below 1 for the 2 and 5 yr return period and above 1 for higher return periods.This indicates that the benchmark has a higher number of flooded cells for a return period of 2 and 5 yr, and a lower number of flooded cells than this study's results for larger return periods in comparison to the data set computed in this study.

What did we learn?
This proof-of-concept study has demonstrated the potential for using the products of a modern Numerical Weather Prediction Centre to produce relevant global information on flood hazard.Using a relatively simple but globally consistent methodology produced an encouraging global hazard data set with information on return periods at 25 and 1 km scale, respectively.All tools and products are available for free for research purposes and can be downloaded from various sources on the internet.The global flood hazard maps derived by different methods produce broadly similar results.The uncertainty in the estimation of flood extent is not dominated by the uncertainty in the estimation of the extreme value distribution deployed, but instead it is likely dependent on the parameter uncertainty and model processes.Further definition of the characteristic uncertainties of the maps will be required.

How useful are the results?
A global picture of flood hazard will be very useful for current and future understanding of flood risk.However, this can only be supported by a thorough understanding of the limitations involved.This study ignores many important components such as operating rules of dams and reservoirs, protection structures and so forth.Although reasonably complicated to undertake at a global scale, such effects could be included by sub-grid parameterization.Herold et al. (2011) demonstrates the point with the example of the Bihar floods in 2008, in which a dyke breach causes significant difference between the modelled and observed flood outlines.Herold et al. (2011) carries the clear warning that global models should not be used for local planning.The price of global consistency is local accuracy (for an interesting discussion about the drivers for consistency in flood mapping please see Porter, 2010).However, although results may be wrong on a local scale, they can have a useful credibility on a global scale for large scale assessment.This usefulness and credibility comes through the averaging or coarse graining which is achieved in this study.One should not analyse the behaviour of individual cells but of a group of cells (i.e.catchment).This will then enable going beyond the quantification of hazard, to the derivation of risk and impact maps.Such maps could be used for insurance purposes or the direction of global investment, for example deriving priority regions in which an upgrading of river defence structures may result in the highest return in terms of impact.This information on its own has limited value, although it is essential for combination with other information to produce increased value as one could, for example, not compute hazard without flood frequency.These global maps have an additional advantage of allowing for the provision of initial information about unknown areas.These maps also have some scientific value in that they will allow us to explore global links between flood hazards and global changes and human impacts on flood hazards.
One of the main motivations in the application of the global framework was to achieve globally consistent maps, meaning that a grid point value in Honduras has been derived in the same way as one in Nepal.This is of course only partially true as, for example, the quality and behaviour underlying meteorological forcing is dependent on the local geography (a similar argument can be made for the hydrology).However, this approach still provides a homogeneous framework allowing for the flexibility to improve locally when and where necessary.

Future improvements
This is a proof-of-concept study, which seeks to calculate global flood hazard maps with a coherent methodology using global scale models and data sets.Future work will focus on improving the individual components of the model cascade.
In stage I (derivation of forcing data), the data set used in this study may be substantially improved by using an enhanced correction routine or better correction data (see e.g.Weedon et al., 2012).The ERAInterim reanalysis is too short to properly calculate high return periods and would be better replaced by a longer reanalysis data set, such as the forthcoming ERA-CLIM, (www.era-clim.eu/).A longer time series would be able to capture more extreme events and hence allow for an improved estimation of extremes.The use of  1) for the 20 largest catchments on the globe and different return periods (areas which contain no data in the benchmark are masked).This compares the maps computed in this study to the data of the benchmark study.stochastic weather generators and downscaling could also be a way to increase the quantity/quality of the input data set.
There are many aspects of the land surface scheme of stage II (physically based models) that could be improved (e.g.representation of ground water table see ECMWF, 2010, or consideration of hydrological parameter uncertainty, Cloke et al., 2011).This is valid for the land surface component as well as the river routing model, which may benefit from additional calibration, regionalisation and inclusion of sub-grid representations.For example, important components of the river routing, such as the underlying high resolutions flow direction and DEM, and the river section area (empirically derived), are associated with large uncertainties.Alternative physical and non-physical hydrological schemes could be considered.The physical model cascade has another additional clear disadvantage as it includes a larger number of parameters in comparison to the simpler geomorphological regression.Such complexity leads to considerable uncertainties and equifinalities in the model parameters and structure (Beven and Binley, 1992).In this study we solely estimated uncertainties stemming from the fitting of the extreme value distribution, which will clearly underestimate the total uncertainty.Future studies have to take greater care in the quantification of this uncertainty, which may be difficult given the large number of models and processes involved.
In this study one simple extreme value distribution was assumed for stage III (extreme value theory to derive return periods).Hydrological understanding of flood generating processes suggests that mixed distributions should be used (Woo and Waylen, 1984;Merz and Bloeschl, 2005).Future developments may need to apply a host of different distributions.An alternative method may be possible through extending the data set as mentioned above for stage I.Such an extension would allow the estimation of return periods using continuous simulations (see e.g.Blazkova and Beven, 2002), and this would allow stage III (extreme value theory to derive return periods) to be omitted from this estimation cascade and reduce a major source of uncertainty.
Re-mapping of stage IV to the required resolution in this study is done by interpolating a particular sub-grid parameterisation.Re-mapping requires careful balancing of what is possible (e.g. a 90 m resolved flood hazard map) with what is scientifically justifiable, accepting that resolution alone does not increase the information content (Beven and Cloke, 2012).Future studies should attempt to push the resolution boundary whilst not pretending to be able to do the impossible.
Comparison has been performed against a single global benchmark data set and further comparison is ideally required (also with other methods as mentioned in the introduction).Future analysis should use local data for comparison and employ further scores.In addition, the physical processes that determine the dynamics of flood inundation behaviour should be included in future versions.
Our suggestions of improvements indicate the major sources of uncertainty which are in this modelling chain.These sources are summarized in Table 3 as well as a qualitative ranking of their importance (for a more detailed consideration of uncertainties involved in this type of modelling exercise see Beven, 2012).
Many of the improvements discussed above are areas of active research, which also illustrates the strength of this methodology.Most of the individual components are active parts of operational forecasting chains with clear commitments by individual organisations (such as ECMWF) to improve them.This means that there is a continuous development on the individual components of this system.
We assume by no means that this is the only possibility to derive maps on a global scale.One could also employ probabilistic envelope curves (see e.g.Castellarin et al., 2009;Padi et al., 2011) combined with regionalisation for ungauged catchments (e.g.Blöschl and Sivapalan, 1997;Merz and Blöschl, 2008).In addition, approaches which are currently employed on regional scale could be upscaled (for a review see Prinos et al., 2009).

Conclusions
The aim of this paper is to demonstrate a methodology to derive global flood hazard maps which are derived by a consistent approach across the globe.This study is based on products of the European Centre for Medium range Weather Forecasts and uses models and data which are freely available.The application of a methodology on a global scale naturally includes many assumptions and therefore this study has to be seen as a proof of concept.
In this paper the flood hazard maps for different return periods are derived from a cascade of models and data.The major source of the atmospheric forcing is derived from reanalysis data (ERA Interim) corrected with observations (GPCP data).These inputs are used as boundary conditions to an operational land surface scheme (HTESSEL) whose results in turn are fed into a river routing algorithm which simulates and represents floodplains (CaMa-Flood).A map of global river water storage on a 25 km scale is produced (the 25 km contains sub-grid parameterization from a 1 km resolved grid).Return periods up to 1000 yr are computed by fitting a Gumbel distribution to the river water storage.River water storage is converted into river water level (through the river cross section parameters and channel length) and flooding is remapped onto a 1 km grid.We demonstrate that the resulting maps are physically plausible by showing the analyses of global and continental maps of the 50 yr return period.Uncertainty in this study is estimated from the fitting of the distribution and is relatively low compared to that expected.This is explained by the fact that uncertainty in river water storage is somewhat dampened if mapped into a flood inundation.This study also compares the results to a benchmark produced by Herold and Mouton (2011) using a different methodology.In general, the benchmark has a higher number of flooded cells for a return period of 2 and 5 yr and a lower number of flooded cells for larger return periods (> 20 yr) in comparison to the data set computed in this study.
The results of this study indicate that the approach in this paper is feasible and can produce realistic global flood hazard maps of various return periods.It can be used to both gain a global overview and prompt further research on the local scale.Limitations can be overcome by addressing each component of the system individually.The approach has the great advantage that it benefits from continuous model developments and improvements as most components are part of an operational forecast chain.

FFig. 1 .
Fig. 1.Illustrating the sub-grid parameterization.The coarse grid on the top figure represents the 25 × 25 km cells.The catchment elevation map is shown on a 1 × 1 km grid.The outflow is indicated by a red circle.The river routing properties for the cell with this outflow are derived from the 1 × 1 km grid.Such a property is shown in the bottom plot in which elevation vs. catchment fraction is plotted.

FFig. 3 .
Fig. 3. Fractional coverage of flooding of 25 km by 25 km cells.1 means that the cell is flooded to 100 %, 0.5 means that the area within the cell is flooded to 50 % and 0 means that the area is not flooded at all.The figures shows the 50 yr return period.

Fig. 4 .
Fig. 4. Average flooded area of the largest 20 catchments.The 5th and 95th percentile derived from the estimation of the Gumbel distribution is displayed as dotted lines.

F
Fig. 5a.Fractional coverage of flooding of 25 km by 25 km cells for a 50 yr return period focusing on Europe only. 1 means that the cell is flooded to 100 %, 0.5 means that the area within the cell is flooded to 50 % and 0 means that the area is not flooded at all.Major European rivers are shown as blue lines.All white areas indicate a percentage of flooding of less than 5 %.

Fig. 5c .
Fig. 5b.Flooding of 1 km by 1 km cells for a 50 yr return period focusing on Europe only (binary map).Major European rivers are shown as blue lines.
Fig. 6a.Fractional coverage of flooding of 25 km by 25 km cells for the benchmark data set focusing on Europe only. 1 means that the cell is flooded to 100 %, 0.5 means that the area within the cell is flooded to 50 % and 0 means that the area is not flooded at all.Scandinavia contains no data in this data set.Major European rivers are shown as blue lines.
Fig. 6b.Flooding of 1 km by 1 km cells for a 50 yr return period focusing on Europe only (binary map) for the Benchmark data set.Scandinavia contains no data in this data set.Major European rivers are shown as blue lines.

FFig. 7 .
Fig. 7.Comparison of percentage of hits, misses, false alarms and correct negatives (definition in Table1) for the 20 largest catchments on the globe and different return periods (areas which contain no data in the benchmark are masked).This compares the maps computed in this study to the data of the benchmark study.

Fig. 8 .
Fig.8.Equitable Threat Score of different return periods with a benchmark model as "observations".An ETS of larger than 0 is skilfull and the higher values are better.
energy fluxes and the temporal evolution of soil temperature, moisture content and snowpack conditions.At the interface to the atmosphere each grid box is divided into fractions (tiles), with up to six fractions over land (bare ground, low and high vegetation, intercepted water, shaded and exposed snow).Vegetation types and cover fractions are derived from an external climate database, based on the Global Land Cover Characteristic is used.HTESSEL computes the land surface response to atmospheric forcing, and estimates the surface water and Hydrol.Earth Syst.Sci., 16, 4143-4156, 2012 www.hydrol-earth-syst-sci.net/16/4143/2012/

Table 3 .
Sources of Uncertainty within the physical modelling cascade.