Propagation of hydrometeorological uncertainty in a 1 model cascade framework to inundation prediction 2 3

8 The purpose of this investigation is to study the propagation of meteorological uncertainty 9 within a cascade modelling approach to flood mapping. The methodology was comprised of a 10 Numerical Weather Prediction Model (NWP), a distributed rainfall-runoff model and a 11 standard 2D hydrodynamic model. The cascade of models is used to reproduce an extreme 12 flood event that took place in the Southeast of Mexico, during November 2009. The event 13 was selected as high quality field data (e.g. rain gauges; discharge) and satellite imagery are 14 available. Uncertainty in the meteorological model (Weather Research and Forecasting 15 model) was evaluated through the use of a multi-physics ensemble technique, which considers 16 variations in the specific setup options twelve parameterization schemes to determine a given 17 precipitation event. The resulting precipitation fields are used as input in a distributed 18 hydrological model, enabling the determination of different hydrographs associated to this 19 event. Lastly, by means of a standard 2D hydrodynamic model, flood hydrographs are used as 20 forcing conditions to study the propagation of the meteorological uncertainty to an estimated 21 inundation area. Results show the utility of the selected modelling approach to investigate 22 error propagation within a cascade of models. Moreover, the evolution of skill within the 23 model cascade shows a complex aggregation of errors between models, suggesting that in 24 valley-filling events hydro-meteorological uncertainty affects inundation depths in a higher 25 degree than that observed in estimated flood extents. 26


Introduction
Hydro-meteorological hazards can have cascading effects and far-reaching implications on water security, with political, social, economic and environmental consequences.Millions of people worldwide are forcibly displaced as a result of natural disasters, creating political tensions and social needs to support them.These events, observed in developed and developing nations alike, highlight the necessity to generate a better understanding on what causes them and how we can better manage and reduce the risk.
The assessment of flood risk is an activity that has to be carried out under a framework full of uncertainty.The source of these uncertainties may be ascribed to the involvement of different and often rather complex models and tools, in the context of environmental conditions that are at best partially understood (Hall, 2014).In addition to this, flooding events are dynamic over a range of timescales, due to climate variability and socio-economic changes, among others, which further increases the uncertainty in the projections.Therefore, numerous types of uncertainties can arise when using formal models in the analysis of risks.
Uncertainty is often categorised as aleatory or epistemic (Hacking, 2006): aleatory is an essential, unavoidable unpredictability, and epistemic uncertainty reflects lack of knowledge or the inadequacy of the models to represent reality.In the context of any modelling framework, epistemic uncertainties may be ascribed to the definition of model parameters and to the model structure itself (limited knowledge).
In a technological era characterised by the advent of computers, there is an increased ability of more detailed hydrological and hydraulic models.Their use and development has been motivated as they are based on equations that have (more or less) physical justification; and allow for a more de-tailed spatial representation of the processes, parameters and predicted variables (Beven, 2014).However, there are also disadvantages, these numerical tools take more computation time and require the definition of initial conditions, boundary conditions and parameter values in space and time, generally, at a level of detail for which such information is not available even in research studies.Moreover, these models may be subjected to numerical problems such as numerical diffusion and instability.All of these disadvantages can be interpreted as sources of uncertainty in the modelling process.
Due to the wide range of uncertainty sources in the flood risk assessment process, it is of great interest to investigate the propagation and behaviour of these different uncertainties from the start of the modelling framework to the result.The size of registered damages and losses in recent events around the world reveal the urgency of doing so even under a context of limited predictability.
In September 2013, severe floods were registered in Mexico as a result of the exceptional simultaneous incidence of two tropical storms, culminating in serious damage and widespread persistent flooding (Pedrozo-Acuña et al., 2014a).This unprecedented event is part of a recent set of extreme flood events over the last decade caused by record-breaking precipitation amounts across central Europe (Becker and Grünewald, 2003), the UK (Slingo et al., 2014), Pakistan (Webster et al., 2011), Australia (Ven den Honert and McAneney, 2011), the northeastern USA (WMO, 2011), Japan (WMO, 2011) and Korea (WMO, 2011).In all cases, the immediate action of governments through the implementation of emergency and action plans was required.The main aim of these interventions was to reduce the duration and impact of floods.In addition, risk reduction measures were designed to ensure both a better flood management and an increase in infrastructure resilience.
One key piece of information in preventing and reducing losses is given by reliable flood inundation maps that enable the dissemination of flood risk to the society and decision makers (Pedrozo-Acuña et al., 2015).Traditionally, this task requires the estimation of different return periods for discharge (Ward et al., 2011) and their propagation to the floodplain by means of a hydrodynamic model.There is currently a large range of models that can be used to develop flood hazard maps (Horrit and Bates, 2002;Horrit et al., 2006).
The aforementioned accelerated progress of computers has given way to the development of model cascades to produce hydrological forecasts, which make use of rainfall predictions from regional climate models (RCMs) with sufficient resolution to capture meteorological events (Bartholmes and Todini, 2005;Demeritt et al., 2010).Within this approach, the coupling of different operational numerical models is carried out, using numerical weather prediction (NWP) with radar data for hydrologic forecast purposes (Liguori and Rico-Ramirez, 2012;Liguori et al., 2012) or NWP with hydrological and hydrodynamic models to deter-mine inundation extension (Pappenberger et al., 2012;Cloke et al., 2013;Ushiyama et al., 2014).
The use of RCMs in climate impact studies on flooding has been reported by Teutschbein and Seibert (2010) and Beven (2011), noting that despite their usefulness, the spatial resolution of models ( ∼ 25 km) remains too coarse to capture the spatial resolution of precipitation.This is particularly important, as higher resolution is needed to effectively model the hydrological processes essential for determining flood risk.To overcome this limitation, the utilisation of dynamic downscaling in these models has been significantly growing (Fowler et al., 2007;Leung and Qian, 2009;Lo et al., 2008).
Significant challenges remain in the foreseeable future; among these, the inherent uncertainties in the predictive models are likely to have an important role to play.For example, it is well known that the performance skill of NWPs deteriorates very rapidly with time (Lo et al., 2008).To overcome this, the long-term continuous integration of the prediction has been subdivided into short-simulations, involving the re-initialisation of the model to mitigate the problem of systematic error growth in long integrations (Giorgi, 1990(Giorgi, , 2006;;Qian et al., 2003).Moreover, the use of ensemble prediction systems to obtain rainfall predictions for hydrological forecasts at the catchment scale is becoming more common among the hydrological community as they enable the evaluation and quantification of some uncertainties in the results (Buizza, 2008;Cloke and Pappenberger, 2009;Bartholmes et al., 2009).In these studies, an ensemble is a collection of forecasts made from almost, but not quite, identical initial conditions.
A key question that arises when using a cascade modelling approach to flood prediction or mapping is how uncertainties associated to meteorological predictions of precipitation propagate to a given flood inundation map?Previous work has been devoted to the examination of uncertainties in the results derived from different ensemble methods, which address differences in the initial conditions in the NWP or even differences in using a single-model ensemble vs. multi-model ensemble (Pappenberger et al., 2008;Cloke et al., 2013;Ye et al., 2014).However, less attention has been paid to the behaviour of errors within a model chain that aims to represent a flood event occurring at several spatial scales.In order to understand how errors propagate in a chain of models, this investigation evaluates the transmission of uncertainties from the meteorological model to a given flood map.For this, we utilise a cascade modelling approach comprised by a NWP model, a rainfall-runoff model and a standard 2-D hydrodynamic model.This numerical framework is applied to an observed extreme event registered in Mexico in 2009 for which satellite imagery is available.The investigated uncertainty is limited to the model parameter definition in the NWP model, by means of a multi-physics ensemble technique considering several multi-physics parameterisation schemes for the precipitation (Bukovsky and Karoly, 2009).The resulting precipitation fields are used to generate spaghetti plots by means of a distributed hydrological model, enabling the propagation of meteorological uncertainties to the flood hydrograph.Hence, the resulting hydrographs represent the runoff associated to each precipitation field estimated with the NWP.In order to complete the propagation of the uncertainty through the cascade of models to the flood map, the hydrographs are used as forcing in a standard 2-D hydrodynamic model.
However, it is acknowledged that each of the other models (hydrological and hydrodynamic) within the model cascade, will introduce other epistemic and random uncertainties to the result.In order to reduce their influence, the numerical setup of both these models is constructed with the best available data (e.g.lidar for the topography) and following recent guidelines for the assessment of uncertainty in flood risk mapping (Beven et al., 2011).In this way, the uncertainty associated to the meteorological model outputs is propagated through the model cascade from the atmosphere to the flood plain.Thus, the aim of this investigation is to study the uncertainty propagation from the meteorological model (due to model parameters) to the determination of an affected area impacted by a well-documented hydro-meteorological event.
This work is organised as follows: Sect. 2 provides a description of both, the study area and the extreme hydrometeorological event, which are employed to test our cascade modelling approach; Sect. 3 introduces the methodology, incorporating a brief description of the selected models setup.Additionally, we incorporate a description of the multi-physics ensemble technique used to quantify and limit the epistemic uncertainty in the NWP model.The resulting precipitation fields, hydrographs and flood maps are compared with available field data and satellite imagery for the event.In Sect.4, a discussion of errors along the model cascade is also presented with some conclusions and future work.

Case study
The selected study area is within the Mexican state of Tabasco, which in recent years has been subjected to severe flooding, as reported by Pedrozo-Acuña et al. (2011, 2012).This region comprises the area of Mexico with the highest precipitation rate (2000-3000 mm year −1 ), which mostly occurs during the wet season of the year between May and December.The rainfall climatology is also influenced by the incidence of hurricanes and tropical storms arriving from the north.
In this paper, the extreme hydro-meteorological event selected for the analysis corresponds to that registered in the early days of November 2009 in the Tonalá River.As it is shown in Fig. 1, the river is located in the border of Tabasco and Veracruz and during the event the substantial rainfall intensity provoked its overflowing leaving extensive inundated areas along its floodplain.The top panel of Fig. 1 shows the geographical location of the catchment, with an area of 5021 km 2 , as well as the location of 18 weather stations installed within the region by the National Weather Service.The event was the result of heavy rain induced by the cold front number 9, which persisted for 4 days along Mexico's gulf coast, forcing more than 44 000 people to evacuate their homes and affecting more than 90 communities.High intensities in rainfall were recorded in rain gauges from 31 October to 03 November, with cumulative daily precipitation values reporting more than 270 mm.The river is approximately 300 km long and before discharging into the Gulf of Mexico, it receives additional streamflow from other smaller rivers such as Agua Dulcita in Veracruz, and Chicozapote in Tabasco.The bottom panel of the same figure illustrates the lower Tonalá River, where severe flooding was registered, as it is shown in the photographs on the right.The yellow, blue and red dots on the panel represent the locations from which the photographs were taken.
The hydrometric data, in combination with the satellite imagery for the characterisation of the affected areas, enabled an accurate investigation of the causes and consequences that generated this flood event.The high quality of the available information allowed for the application of a cascade modelling approach comprised by state-of-the-art meteorological, hydrological and hydrodynamic models.This numerical approach is utilised with the intention to carry out an assessment of the modelling framework, with particular emphasis on the propagation of the epistemic uncertainty from the meteorological model to the spatial extent of an affected area.Such investigation paves the road for a more honest transfer of knowledge to decision-makers who will consider the reliability of the model results.

Methodology and results
The methodology is comprised of a NWP model, a distributed rainfall-runoff model and a standard 2-D hydrodynamic model.It is anticipated that the selected modelling approach will support the advance of the understanding of the connections among scales, intensities, causative factors, and impacts of extremes.This model cascade with stateof-the-art numerical tools representing a hydrological system enables the development of a framework by which an identification of the reliability of simulations can be undertaken.It should be noted that the model cascade contains several sources of uncertainty at every level of the numerical framework (meteorological, hydrological and hydrodynamic).However, the uncertainty evaluation is only carried out at the meteorological and hydrological levels of the model chain.This enables the investigation of how errors originated in the rainfall prediction interact at a catchment level and propagate to determine a given inundation area and depth.Therefore, the aim is not to reproduce an observed extreme event but to use a state of the art numerical framework to examine how errors aggregate in a hindcast scenario.
An uncertainty assessment is not carried out at the hydrodynamic level of the model cascade.Instead, the 2-D hydrodynamic model is setup following recommendations of published guidelines for the best possible representation of the case study and more specifically with regards to the selected spatial resolution, boundary conditions and roughness values (see Asselman et al., 2008).
The proposed investigation is important as uncertainties are cascaded through the modelling framework, in order to provide better understanding on how errors propagate within models working at different temporal and spatial scales.It is acknowledged that this information would enhance better flood management strategies, which would be based on the honest and transparent communication of the results produced by a modelling system constrained by intrinsic errors and uncertainties.

Meteorological model
Simulated precipitation products from NWPs typically show differences in their spatial and temporal distribution.These differences can considerably influence the ability to predict hydrological responses.In this sense, in this study we utilise the advanced research core of the Weather Research and Forecasting (WRF) model Version 3.2.The WRF model is a fully compressible non-hydrostatic, primitive-equation model with multiple nesting capabilities (Skamarock et al., 2008).
As it is shown in Fig. 2, the model setup is defined using an interactive nested domain inside the parent domain.This domain is selected in order to simulate more realistic rainfall, with the inner frame enclosing the Tonalá River catchment within a 4 km resolution.The 4 km horizontal resolution is considered good enough to compute a mesoscale cloud system associated to a cold front.It is shown that this finer grid covers the central region of Mexico, while in the vertical dimension, 28 unevenly spaced sigma levels were selected.The initial and boundary conditions were created from the NCEP global final analysis (FNL) with a time interval of 6 h for the initial and boundary conditions.Each of the model simulations was reinitialised every 2 days at 12:00 UTC, considering a total simulation time from 27 October 2009 until 13 November 2009.
Epistemic uncertainty is considered in the WRF model by means of the sensitivity of the results for precipitation due to variations in the model setup.For this, we utilise a multi-physics ensemble technique proposed by Bukovsky and Karoly (2009), where the sensitivity of simulated precipitation in the model results is examined through variations in the specific setup options by means of 23 different combinations.The comparison of computed precipitation fields against real measurements from weather stations within the catchment enabled the quantification of uncertainty in the meteorological model for this event.Table 1 shows a summary of the different multi-physics parameters used in the WRF model to generate the physics ensemble.As it is shown in this table, there is a large discrepancy in the model skill results for all 23 simulations.Error metrics reported in this table are computed using information from all available stations within the numerical domain, which comprised stations that are outside the area of the catchment.It is demonstrated that only 13 of these model runs report a positive Nash-Sutcliffe coefficient (NSC), which indicates a better accuracy for those realisations.In contrast, model runs with negative NSC were dismissed for the numerical reproduction of the event, as this condition is a clear indicator that the observed mean is a better predictor than the model.
Therefore, meteorological model runs that comply with a criteria defined by a NSC > 0.3 and a correlation coefficient (Cor) > 0.8 (for the whole numerical domain) are utilised to investigate the propagation of meteorological uncertainties through the modelling framework.This criteria narrows  down the meteorological model runs to 12, which will be cascaded to the hydrological model stage to attain streamflow predictions.In this approach, the selected 12 multi-physics ensemble runs of the model represent a plausible and equally likely state of the system in the future.
Figure 3 illustrates the cumulative precipitation curves computed for each of the 23 model runs of the multi-physics ensemble at four different stations located within the catchment.In this figure differences in the spatial distribution and intensity of precipitation are evident.Moreover, the selected 12 members by the criteria (NSC > 0.3 and Cor > 0.8) are illustrated by the blue solid lines, while the grey solid lines show those members that were rejected by it.Notably, dismissed members tend to underestimate the amount of precipitation in all four locations that are presented in this figure.For completeness, the rainfall measurements at each meteorological station are also shown by the black solid line, while the red dotted line depicts the mean value of the selected model runs to be propagated through the model cascade.If the 12 selected members are considered in the estimation of ensemble metrics at each station, it is shown that at station no.27075 the spread of the estimated cumulative precipitation curves is limited and quantified by a NSC = 0.917 and a NRMSE = 10.7 % (normalised root mean square error), indicating a good skill of the selected WRF precipitation estimates at this point.In contrast, at station no.27007 the spread of the cumulative precipitation is large and characterised by a NSC = 0.766 and a NRMSE = 19.4%, showing less skill in the model performance than that observed in the previous case.The observed differences of estimated precipitation for this event highlight the importance of incorporating ensem-ble techniques in the reproduction of precipitation with this type of models.
Figure 4 illustrates the cumulative precipitation fields computed for each of the 12 selected members of the multiphysics ensemble, where differences in the spatial distribution and intensity of precipitation were evident.These results suggest that for this event, the precipitation field estimated with the WRF was highly sensitive to the selection of multiphysics parameters.To revise in more detail the performance of the WRF in reproducing this hydro-meteorological event, the estimated cumulative precipitation by each of the selected 12 members of the multi-physics ensemble was compared against measurements at the 18 weather stations located within and close to the Tonalá catchment.
Table 2 presents a summary of the most well-known error metrics calculated at each weather station and for each member of the ensemble.Among these are the NRMSE, Bias, Nash-Sutcliffe coefficient (NSC), and the correlation coefficient (Cor).The columns show the local value of each coefficient for a given member of the ensemble (M1, . .., M12).As shown in all columns (i.e.member runs), the error metrics have a great spatial variability, hence, indicating the regions of the study area where the model performs better.To illustrate the performance of this ensemble technique at each weather station, the ensemble average of these error metrics is introduced in the last column and is written within the symbols < >.Again, the spatial variability of the metrics is evident.The two bottom rows in each sub-table correspond to the average of the ensemble averages for the whole catchment and for the all the stations.It is shown that when the average of all stations is taken into account the skill decreases.However, in this investigation the error that is of interest is the one corresponding to the average of those weather stations located within the catchment, as these will be used as input in the hydrological model.This will enable the propagation of errors in the meteorological model within the model cascade.For clarity, in the same table the stations within the catchment are highlighted in blue.
A question that has been seldom explored in the literature, is how the uncertainty in the prediction of the precipitation (i.e.errors described in this section) cascade into an estimated flood hydrograph determined by a distributed hydrological model.In this sense, the next step in this work, considers the non-linear transfer of rainfall to runoff using a distributed rainfall-runoff model.For this, we employ each one of the selected 12 precipitation fields derived from the WRF as input to determine the associated river discharge with the hydrological model.

Hydrological model
The hydrological model used in this study was applied to the Tonalá River catchment in an early work presented by Rodríguez-Rincón et al. (2012).This numerical tool was developed by the Institute of Engineering -UNAM (Domínguez et al., 2008), and comprises a simplified gridbased distributed rainfall-runoff model.The model has been previously applied with success in other catchments in Mexico (e.g.Pedrozo-Acuña et al., 2014b).
The model is based on the method of the Soil Conservation Service (SCS) with a modification that allows for the consideration of soil moisture before and after rainfall events.The parameters that are needed for the definition of a runoff curve number within the catchment are the hydrological soil group, land use, pedology and the river drainage network.Figure 5 shows for the Tonalá River catchment, the spatial definition of the river network (centre panels) and the runoff curve (right panels).For the numerical setup of the hydrological model, we employ topographic information from a lidar data set, from which a 10 m resolution digital elevation model (DEM) is constructed.
There are two main hypothesis that underpin the SCS curve number method.Firstly, it is assumed that for a single storm and after the start of the runoff, the ratio between actual soil retention and its maximum retention potential is equal to the ratio between direct runoff and available rainfall.Secondly, the initial infiltration is hypothesised to be a fraction of the retention potential.
Thus, the water balance equation and corresponding assumptions are expressed as follows: where P is rainfall, P e effective rainfall, I a is the initial abstraction, F a is the cumulative abstraction, S is the potential maximum soil moisture retention after the start of the runoff and λ is the scale factor of initial loss.The value of λ is related to the maximum potential infiltration in the basin.Through the combination of Eqs. ( 1)-( 3) and expressing the initial abstraction (I a ) by 0.2 • S we have where the value of S (cm) is determined by CN is the runoff curve number, as defined by the United States Department of Agriculture (USDA-SCS, 1985).Values for this parameter vary from 30 to 100, where small numbers indicate low runoff potential and larger numbers indicate an increase in runoff potential.Thus, the permeability of the soil is inversely proportional to the selected curve number.Another parameter that allows for the modification of the curve number is the soil water potential given by F s , following S = S • F s .The model includes a parameter to reproduce the effects of evaporation on the ground saturation (F o ).This parameter is useful when the event to be reproduced lasts for several days; however, due to the duration of this event it is assumed equal to 0.9 in all cases.The computation of the runoff in the basin is carried out through the addition of the runoff estimated in each cell to then construct a general hydrograph (see Rodríguez-Rincón et al., 2012).With regards to the definition of values for the other two free parameters in the hydrological model (λ and F s ), a traditional calibration process is implemented.For this, we utilise flood hydrographs from past extreme events (2001, 2005, 2007, 2008, 2009 and 2011) observed in this river.For these events, we employ as rainfall input the registered precipitation at the same four weather stations that are within the river catchment; their location is shown in the top panel of Fig. 1.Therefore, we determine six sets of free parameters that are good enough to represent the rainfall-runoff relationship in this catchment.The selected sets of values are illustrated in Table 3, where the correlation coefficient and NSC are also reported for each of the years.It is shown that in all the events, the selected set of parameters ensures a good correlation against the observed discharge which is given by Cor > 0.7, as well as a positive NSC (accuracy).
It is well known that both the amount and distribution of rainfall can significantly affect the final estimated river discharge (Ferraris et al., 2002;de Roo et al., 2003;Cluckie et al., 2004).In consequence, the propagation of meteorological uncertainty to the rainfall-runoff model is carried out using the 12 WRF rainfall precipitation ensembles as an input in the hydrological model, considering the six sets of free parameters reported in Table 3.This procedure enabled the generation of 72 hydrographs that could represent the 2009 event with different skill.Error metrics of all the computed hydrographs are reported in Table 4.
For completeness, Fig. 6a illustrates the 72 computed hydrographs for the Tonalá River catchment in relation to the  of the NSC for selected hydrographs in Table 4 illustrate the resulting differences in skill resulting from the combination of different setups in the hydrological model with the multiphysics ensemble.For instance, in the rows corresponding to the parameters determined for the 2011 event, member M12 indicates a NSC = 0.738 showing a poorer skill at reproducing the river discharge with the precipitation derived from this member, in comparison to that registered for member M2 with NSC = 0.938.The change in the values of the NSC indicates that results from the regional weather model can be enhanced or weakened by the performance of the hydrological model.The utilisation of the 31 selected hydrographs in a 2-D hydrodynamic model enables the study of the propagation of errors within the cascade of models.In particular, for estimating the flood extent during this extreme event.

Flood inundation model
Several 2-D hydrodynamic models have been developed for simulating extreme flood events.However, any model is only as good as the data used to parameterise, calibrate and validate the model.In general, 2-D models have been regarded as suitable for simulating problems where inundation extent changes dynamically through time, as they can easily repre-  Bates and Horritt, 2005).The use of these numerical tools has become commonplace when flows produce a large areal extent, compared to their depth, and where there are large lateral variations in the velocity field (Hunter et al., 2008).In this study, given the size of the study area the modelling system utilised is comprised of the flow model MIKE 21 with flexible mesh (FM).This numerical model solves the 2-D Reynolds-averaged Navier-Stokes equations invoking the approximations of Boussinesq and hydrostatic pressure (for details see DHI, 2014).The equations are solved at the centre of each element in the model domain.
The numerical setup is based on a previous work on the study area (Pedrozo-Acuña et al., 2012), with selected resolutions for the elements of the mesh with a size that guarantees the proper assimilation of a 10 m DEM to characterise the elevation in the floodplain.The topographic data has been regarded as the most important factor in determining water surface elevations, base flood elevation, and the extent of flooding and, thus, the accuracy of flood maps in riverine areas (NRC, 2009).Therefore, the elevation data used in this study correspond to a lidar data set provided by INEGI (2008).The choice of a 10 m DEM is based on recommendations put forward by the Committee on Floodplain Mapping Technologies, NRC (2007) and Prinos et al. (2008).As such, a DEM ensures both accuracy and detail of the ground surface.The model domain is illustrated in Fig. 7, along with the numerical mesh and elevation data; it comprises the lower basin of the Tonalá River and additional main water bodies.The colours represent the magnitude of the elevation and bathymetric data assimilated in the numerical mesh, where warm colours identify high ground areas and light blues represent bathymetric data.The integration of high-quality topographic information in a 2-D model with enough spatial resolution enables the investigation of the propagation of the meteorological uncertainty to the determination of the flood extent.Moreover, as it is illustrated in Fig. 7, the numerical mesh considers three boundary conditions.These are the input flow boundary where the hydrograph from the rainfall-runoff model is set (red dot); the Tonalá's river mouth, where the astronomical tide occurs for the period of the event (27 October-12 November 2009) (yellow dot), and the Agua Dulcita River set, where a constant discharge of 100 m 3 s −1 is introduced (blue dot).The astronomical tide (microtidal in nature with tidal range < 1 m) is determined using the monthly tidal forecast at a nearby point, which is published by CICESE (Centro de Investigación Científica y de Educación Superior de Ensenada) and is available at http://predmar.cicese.mx/calmen.php.
On the other hand, hydraulic roughness is a lumped term known as Manning's coefficient that represents the sum of a number of effects, among which are skin friction, form drag and the impact of acceleration and deceleration of the flow.The precise effects represented by the friction coefficient for a particular model depend on the model's dimensionality, as the parameterisation compensates for energy losses due to unrepresented processes and the grid resolution (Bates et al., 2014).The lack of a comprehensive theory of "effective roughness" has determined the need for calibration of friction parameters in hydraulic models.Furthermore, the determination of realistic spatial distributions of friction across a floodplain in other studies, have shown that only one or two floodplain roughness classes are required to match current data sources (Werner et al., 2005).Indeed, this suggests that application of complex formulae to establish roughness values for changed floodplain land use are inappropriate until model validation data are improved significantly.Therefore, in this study hydraulic roughness in the floodplain is assumed to be uniform and different from the main river channel; in this sense two values for the Manning number are used, one for the main river channel (M = 32 m 1/2 s −1 ) and another for the floodplain (M = 28 m 1/2 s −1 ).
It should be noted that several investigations confirm that there is significant uncertainty associated with flood extent predictions using hydraulic models (e.g.Aronica et al., 1998Aronica et al., , 2002;;Bates et al., 2004;Pappenberger et al., 2005Pappenberger et al., , 2006Pappenberger et al., , 2007;;Romanowicz and Beven, 2003).These uncertainties may be ascribed to differences in spatio-temporal resolutions or the hydraulic roughness that is used in the hydraulic model.In this investigation, however, a more detailed analysis of the different sources of uncertainty in the hydraulic model is not implemented.The numerical setup of the hydraulic model is built following published guidelines for an accurate representation of the case study (see Asselman et al., 2008), which enables us to build the discussion on how an uncertainty generated at the meteorological stage of the model chain propagates and influences a resulting flooded area and depth.
In order to assess whether the 2-D model is able to reproduce the flood extent observed in 2009, numerical results of flood extent are compared against the affected area determined from a SPOT image (resolution of 124 m).This practice is widely used in the literature to evaluate the results from inundation models and to compare its performance (Di Baldassarre et al., 2010b;Wright et al., 2008).
Figure 8a introduces the result of the hydrodynamic simulation for each of the 31 selected hydrographs, which resulted from the utilisation of the rainfall-runoff model using as input the WRF multi-physics ensemble output.The illustrated flood map summarises the 31 different scenarios of the inundation area that could result from the characterisation of precipitation with the WRF model.Each of these flood maps can also be associated to a probability enabling the representation of a probabilistic flood map, shown in the figure.This allows for the identification of the areas highly vulnerable to flooding from this event.Additionally, Fig. 8b introduces the infrared SPOT satellite image of 12 November 2009, which is used for comparison against the produced flood maps derived from running the 31 hydrographs as inputs in the 2-D model.Notably, in the numerical results, the blue area identifies the region of the domain that is most likely to be flooded (90 %), the comparison of this area with the observed inundation in the satellite image shows a good skill of the model chain for reproducing the registered flood in the study area.
Despite the variability in the estimated peak discharge utilised as input in the different hydrodynamic runs, inundation results show similar affected areas in all realisations (only with small differences in its size).This is verified in the results shown in Fig. 9a, where the relationship between peak discharge of the 31 hydrographs, is plotted against the size of the maximum-flooded area.The distribution of points in this graph clearly indicates that although there are differences in the estimated peak flow (see histogram in Fig. 9b), in most cases the resulting size of the inundated area is similar.The histogram plot shown in Fig. 9c indicates a clear concentration of numerically derived flooded areas with a size larger than 130 km 2 .Indeed, the mean value of the maximum-flooded estimated area is 138.94 km 2 , while the standard deviation is 16.09 km 2 .
These results support that the hydraulic behaviour in all hydrodynamic simulations was indeed very similar, regardless of the peak discharge of the hydrograph.It is reflected that this may be the result of induced hydrodynamics by a valley-filling flood event, which is identified with the relatively high floodplain-area-to-channel-depth ratios in all simulations.Hence, all possible hydrographs generated with the hydrological model show similar levels of lateral momentum exchange between the main channel and floodplain.For this reason, the predictive performance of all hydrodynamic sim-  ulations used to reproduce the inundation extent appears to be good (see Table 5).
The estimation of several error metrics in these results was performed using binary flood extent maps, where the comparison is based on the generation of a contingency table, which reports the number of pixels correctly predicted as wet or dry.From this, measures of fit such as Bias, false alarm ratio (FAR), probability of detection (POD), probability of false detection (POFD), critical success index (CSI) and the true skill statistics (TSS) are estimated.Table 5 introduces the results for all 31 members and error metrics.Clearly, there is little variability in the performance of the model for each of the runs, showing that there has been a small propagation of the error to the flood map.The ensemble average of these quantities is also illustrated in the last column of the table, where Bias = 1.013,FAR = 0.189, POD = 0.819, POFD = 0.180; CSI = 0.686 and TSS = 0.639.As noted before, these results indicate an apparent good skill of the model chain for reproducing the flood extent, due to the incidence of this extreme event.It should be borne in mind, however, that some misclassification errors may also be included in the observed flooded area due to specular reflections that may classify some wet vegetation as water or open water as dry land.In consequence, flood extent maps should be used with caution in assessing model performance (Di Baldassare, 2012).This is particularly true during high-magnitude events where the valley is entirely inundated, such as the case study of this investigation where small changes in lateral flood extent may produce large changes in water levels.
In this sense, it has been argued that flood extent maps are not useful for model assessment (Hunter et al., 2005) and high water marks are more useful to evaluate model performance.Unfortunately, for the case study, information of inundation depths was not available.Despite this fact, a further revision of simulated inundation depths is also carried out.For this, 10 points distributed within the numerical domain are selected.These are illustrated by the coloured dots in Fig. 10, along with the values of mean water depth in all 31 simulations (red solid line).In all cases, a high variability in the estimated inundation depth on the floodplain is depicted (with values varying between 1.5 and 3 m).This result supports that in the case of valley-filling flood events, there is a higher sensitivity to errors in the vertical dimension of the flood.
On the one hand, this demonstrates that the geomorphological characteristics of the site (e.g.low-lying area, smooth slopes in the river channel and floodplain) are dominant in the accurate determination of the magnitude of an inundated area, regardless of the peak discharge.This implies that for this type of rivers and when predicting inundation extent, it may be more important to have a good characterisation of the river and floodplain (e.g.high quality field data and a lidarderived DEM), than a good characterisation of the rainfallrunoff relationship.Current approaches to flood mapping have pointed out that in order to produce a scientifically justifiable flood map, the most physically realistic model should be utilised (Di Baldassarre et al., 2010).Nevertheless, even with these models the amount of uncertainty involved in the determination of an affected area is important and should be quantified.

Discussion and conclusions
Flood risk mapping and assessment are highly difficult tasks due to the inherent complexity of the relevant processes, which occur in several spatial and temporal scales.As pointed out by Aronica et al. (2013), the processes are subject to substantial uncertainties (epistemic and random), which emerge from different sources and assumptions, from the statistical analysis of extreme events, and from the resolution and accuracy of the DEM used in a flood inundation model.By acknowledging that all models are an imperfect representation of the reality, it is important to quantify the impact of epistemic uncertainties on a given result.
The utilised methodology was comprised of a NWP model, a distributed rainfall-runoff model and a 2-D hydrodynamic model.Thus, the numerical framework contains several sources of uncertainty at every level of the model cascade (meteorological, hydrological and hydrodynamic).The quantification of uncertainty was only carried out at the meteorological and hydrological levels of the model chain; from which non-behavioural ensemble members were removed based on the fit with observed data.In contrast, at the hydrodynamic level, the numerical model was setup in a deterministic way following recommendations of published guidelines for a good representation of the case study, more specifically with regards to the selected spatial resolution, boundary conditions and roughness values (see Asselman et al., 2008).This was done as uncertainties at this level have been mainly ascribed to issues with model implementations and definition of free parameters (Beven et al., 2011).This enabled the assessment of uncertainty and its propagation, from a modelled rainfall event to a predicted flooded area and depth.
At the meteorological level, a multi-physics ensemble technique was utilised to evaluate the generation of epistemic uncertainties (designed to represent our limited knowledge of the processes generating precipitation in the lower troposphere).While in the hydrological model a multi-response validation was implemented by means of the definition of six sets of plausible parameters from past flood events.This was done in order to reduce the dimensionality of the parameter calibration problem (see Gupta et al., 2009).This procedure was preferred over a GLUE (generalised likelihood uncertainty estimation) analysis (e.g.Pedrozo-Acuña et al., 2015), as the investigation was aimed to understand the propagation of uncertainty along the model chain.
It should be borne in mind that it is not easy to disaggregate the many sources of uncertainty within the model cascade.Thus, it is necessary to make assumptions about how to represent uncertainty.Therefore, the assessment of hydrometeorological model performance at the three levels is carried out through the estimation of skill scores.
Figure 11 presents a summary of the propagation of two well-known error metrics, Bias (top panel) and NSC/TSS (bottom panel).These metrics were selected as they enable a direct comparison of their values at each of the stages within the model cascade.In both metrics, the evolution of the confidence limits is illustrated by the size of the bars.Their evolution from the meteorological model to the hydrological model show an aggregation of meteorological uncertainties with those originated from the rainfall-runoff model.However, the skill is considerably improved from a mean value of 0.65 in the meteorological model to 0.793 in the hydrological model.In the last stage of the model chain (hydrodynamic model), the confidence limits of the results show an apparent improvement in model skill.However, it should be noted that this may be ascribed to the complex aggregation of errors in valley-filling events, which is verified in the observed sensitivity of the simulated inundation depths.The mean value of the skill is reduced to TSS = 0.639.The results provide a useful way to evaluate the hydro-meteorological uncertainty propagation within the modelling cascade system.
Bias and NSC/TSS error metrics (Fig. 11) revealed discrepancies between observations and simulations throughout the model cascade.For instance, an increase in the NSC from the rainfall to the flood hydrograph implies that the hydrological model is more sensitive (wider uncertainty bars) to its main input (precipitation) than the WRF model is to the set of micro-physics parameterisations.However, the uncertainty bounds in the hydrological model imply a high sensitivity of hydrographs to both errors from the meteorological model and its numerical setup with free parameters (amplifying the uncertainty).This is observed in the spaghetti plot shown in Fig. 6a, where large uncertainty bounds were identified.In order to reduce errors from the interaction of uncertainties coming from both models, these bounds were reduced with the selection of 31 hydrographs that comply with Cor > 0.7 and NSC > 0.6 (see Fig. 6b).It is reflected that the estimated error in the meteorological model may reflect a spatial scaling issue (comparing observations from rain gauges to simulations at the mesoscale).
Results concerning predictions of inundation extent indicate an apparent good skill of the model chain at reproducing the flood extension.The propagation of uncertainty and error from the hydrological model to the inundation area revealed that it is necessary to assess model performance not only for flood extension purposes but also to estimate inundation depths, where results indicate a higher variability (e.g. increase in the error).This last modelling step is quite important given the consequences for issuing warning alerts to the population at risk.
The similar magnitude in inundation extents of all numerical results indicated the predominance of a valley-filling flood event, which was characterised by a flooded area strongly insensitive to the input flood hydrograph.While this can be explained by the limited effect that the volume overflowing the riverbanks and reaching the floodplain will have on the maximum inundation area, the difference between the observed and the simulated flooded area remains important (TSS = 0.639).
It should be pointed out that this methodology contains more uncertainties that were not considered or quantified in the generation of flood extent maps for this event.Results showed that a large amount of uncertainty exists in the NWP model, and such uncertainty can propagated and aggregated at the catchment level.Members of the ensemble were shown to differ significantly in terms of their cumulative precipitation, spatial distribution, river discharge, inundation depths and areas.The evolution of skill within the model cascade shows a complex aggregation of errors between models, suggesting that in valley-filling events hydro-meteorological uncertainty has a larger effect on inundation depths than that observed in estimated flood inundation extents.
It is advised that in the future, attention should be given to the assessment of hydro-meteorological uncertainty in a similar numerical framework applied to catchments with different morphological settings.The assessment of the error propagation within the model cascade is seen as a good step forward in the communication of uncertain results to society.However, as shown in this work, an improvement in model prediction during the first cascade step (rainfall to runoff) can be reverted during the second cascade step (runoff to inundation area) with important consequences for early warning systems and operational forecasting purposes.Finally, the proposed numerical framework could be utilised as a robust alternative for the characterisation of extreme events in ungauged basins.

Figure 1 .
Figure 1.Top panel: location of the Tonalá River basin in Mexico; the yellow line represents the boundary limits of the catchment; blue dots: locations of weather stations; red dot: streamflow gauge.Bottom panel: close up of the study area and photographs of observed impacts; yellow, blue and red dots represent the locations at which the photos were taken.

Figure 2 .
Figure 2. Numerical setup of the WRF with a nested domain covering Mexico.Domain 1: 25 km resolution; Domain 2: 4 km resolution; the orange region illustrates the Tonalá catchment.

Figure 3 .
Figure 3.Comparison of cumulative precipitation estimated by the 23 model runs of the WRF multi-physics ensemble.Blue solid line: selected members with NSC > 0.3; grey solid line: disregarded members with NSC < 0.3; red dotted line: mean of the selected 12 members; black solid line: measurements at each of the four weather stations from 27 October 2009 to 12 November 2009.

Figure 4 .Figure 5 .
Figure 4. Cumulative precipitation fields estimated by the WRF model using the selected 12 members of the multi-physics ensemble (27 October 2009-12 November 2009).

Figure 6 .
Figure 6.(a) 72 hydrographs computed using the rainfall-runoff model with six sets of parameters and 12 WRF ensemble precipitation fields as input data; (b) 31 selected hydrographs to serve as input in the hydrodynamic model; grey lines illustrate the ensemble members and the blue dashed line shows the measured river discharge for the event.

Figure 7 .
Figure 7. Model domain along with the numerical mesh and elevation data in the study area.Boundary conditions are represented by the blue dot: Agua Dulcita river; red dot: input hydrograph; yellow dot: river mouth.

Figure 8 .
Figure 8. Data vs. model comparison of flood extent; (a) probabilistic flood map derived from the ensemble runs with the hydrodynamic model; (b) infrared SPOT image corresponding to 15 November 2009.

Figure 9 .
Figure 9. (a) Maximum-flooded area vs. peak discharge estimated for all 31 hydrodynamic simulations of the 2009 flood event; (b) histogram of peak discharges; (c) histogram of estimated size of the maximum-flooded area.

Figure 10 .
Figure 10.Estimated maximum inundation depths at different locations within the floodplain.Red line represents the median.Bars correspond to the standard deviation.Upper and lower limits of the box are the values of the 25th and 75th percentiles, respectively.Crosses depict outliers.

Table 1 .
Ensemble members defined for the multi-physics WRF ensemble (rows in bold indicate selected members; TDM -Thermal Diffusion Land Surface Model).

Table 2 .
Error metrics in the estimation of precipitation by members of the multi-physics ensemble (rows in bold indicate the stations located within the Tonalá catchment).

Table 3 .
Flood events in the Tonalá River used in the calibration process of free parameters in the hydrological model, along with computed error metrics.
meteorological model can be compensated by an error in the hydrological model and vice versa.To illustrate this in more detail, average values of the calculated error metrics for the 31 selected hydrographs are estimated and reported in Table 4, with NSC = 0.79, Cor = 0.96 and Bias = 1.11.Values

Table 4 .
Error metrics in the estimation of river discharge by the rainfall-runoff model using six parameter sets and 12 members of the multi-physics ensemble (those selected are shown in bold with NSC > 0.6 and Cor > 0.7).

Table 5 .
Error metrics in the estimation of river discharge by the hydrodynamic model using the 31 members of the multi-physics ensemble.Comparison of flooded areas between numerical results from running ensemble members vs. observed www.hydrol-earth-syst-sci.net/19/2981/2015/ Hydrol.Earth Syst.Sci., 19, 2981-2998, 2015