Bayesian uncertainty assessment of flood predictions in ungauged urban basins for conceptual rainfall-runoff models

Abstract. Urbanization and the resulting land-use change strongly affect the water cycle and runoff processes in watersheds. Unfortunately, small urban watersheds, which are most affected by urban sprawl, are mostly ungauged. This makes it intrinsically difficult to assess the consequences of urbanization. Most of all, it is unclear how to reliably assess the predictive uncertainty given the structural deficits of the applied models. In this study, we therefore investigate the uncertainty of flood predictions in ungauged urban basins from structurally uncertain rainfall-runoff models. To this end, we suggest a procedure to explicitly account for input uncertainty and model structure deficits using Bayesian statistics with a continuous-time autoregressive error model. In addition, we propose a concise procedure to derive prior parameter distributions from base data and successfully apply the methodology to an urban catchment in Warsaw, Poland. Based on our results, we are able to demonstrate that the autoregressive error model greatly helps to meet the statistical assumptions and to compute reliable prediction intervals. In our study, we found that predicted peak flows were up to 7 times higher than observations. This was reduced to 5 times with Bayesian updating, using only a few discharge measurements. In addition, our analysis suggests that imprecise rainfall information and model structure deficits contribute most to the total prediction uncertainty. In the future, flood predictions in ungauged basins will become more important due to ongoing urbanization as well as anthropogenic and climatic changes. Thus, providing reliable measures of uncertainty is crucial to support decision making.


Introduction
Urbanization and the resulting land-use change strongly affect the water cycle in watersheds (Rosso and Rulli, 2002; Ott and Uhlenbrook, 2004; Shepherd, 2005; Brath et al., 2006; Clarke, 2007; Quilbé et al., 2008; Barron et al., 2011; Jung et al., 2011; Schaefli et al., 2011). By 2020 it is estimated that more than 80 % of European citizens will be living in urban agglomerations and there is no apparent slowing in this trend (EEA, 2006). Probably the most obvious consequence of urbanization is that semi-natural pervious lands are substituted by sealed ones, which changes the hydrology of the urbanized basin and not only increases flood risk, but also impairs the chemical and ecological status of receiving water bodies through erosion and increased pollution (Dietz and Clausen, 2008).
To assess flood risk and mitigation strategies, urban planners rely on models which predict the runoff from a given rain event, design storm or long-term precipitation record. Unfortunately, small urban watersheds in areas of urban sprawl are mostly ungauged (Sivapalan, 2003) and where data are available, records often contain only a few years of the most basic hydrological variables, such as rainfall and streamflow. This makes it intrinsically difficult to assess the consequences of urbanization and therefore predictions of such ungauged or poorly gauged basins are considered highly uncertain (Sivapalan et al., 2003; Wagener and Gupta, 2005). In ungauged catchments, the lack of data also prohibits the use of detailed, physically-based models, and simple conceptual models with only a few parameters are often the only feasible tool to predict the consequences of future urbanization (Kumar et al., 2007; Gironás et al., 2009; Sikorska and Banasik, 2010; Bocchiola et al., 2011; Khaleghi et al., 2011). A clear advantage of using such models is that their parameters can often be related to the physical catchment characteristics (Kapangaziwiri and Hughes, 2008) and therefore can be directly obtained for ungauged catchments. The price, on the other hand, is the increased uncertainty due to model structure deficits (Seibert and Beven, 2009). It is commonly accepted that the uncertainty of predicted flows stems from parameter uncertainty, model structure error, measurement error, and uncertain inputs to the model (Kavetski et al., 2006a, b; Ajami et al., 2007).

Published by Copernicus
In the context of urban planning and flood prediction, a reliable measure of uncertainty in predicted runoff is of vital interest. It is current practice to map prediction uncertainties entirely to parameter uncertainties and propagate them through the model (Wagener and Gupta, 2005; Ajami et al., 2007; Vrugt et al., 2008a). A popular example of this approach is Generalized Likelihood Uncertainty Estimation (GLUE) (Beven and Freer, 2001). However, as shown by Ajami et al. (2007), ignoring either input forcing error or model structural uncertainty may lead to unrealistic model simulations and associated uncertainty bounds that do not consistently capture and represent the real-world behaviour of the watershed. It has been demonstrated that Bayesian statistics is conceptually more satisfying than other approaches of uncertainty analysis (Mantovan and Todini, 2006; Yang et al., 2008). One advantage of formal Bayesian approaches is the possibility to disentangle the effect of forcing, parameter and model structural error on total predictive uncertainty, which cannot be achieved with GLUE (Vrugt et al., 2008a).
Bayesian statistics requires an explicit formulation of the error process. This error process typically represents the input, structural and measurement uncertainty together. Unfortunately, it has been shown that the assumption of independent and normally distributed residuals, although mathematically convenient, is often violated (Sorooshian and Dracup, 1980; Kuczera et al., 2006; Cawley et al., 2007; Balin et al., 2010). A promising alternative is the lumped continuous autoregressive error model proposed by Yang et al. (2007), which is based on more realistic assumptions but has not been widely recognized so far.
Such a lumped error process is usually sufficient to compute the prediction uncertainty. However, a separate treatment of the uncertainty sources makes it possible to quantify the contribution of each to the output uncertainty. This is useful to assess to what extent the prediction uncertainty can be reduced by reducing the uncertainty of a particular source. With regard to the importance of the individual sources of uncertainty, it is reported that measurement errors of the runoff, while acknowledged, are often considered to be relatively small and on the order of about 5 % (Leonard et al., 2000; Di Baldassarre and Montanari, 2009). It is worth noting here that one should only assume small measurement errors when they are supported by calibration or reference measurements. Otherwise, the error may be large, especially for flood flows when runoff is calculated from measured water stages with rating curves that are extrapolated beyond the observation range (Di Baldassarre and Montanari, 2009; McMillan et al., 2010; Montanari and Di Baldassarre, 2011). In contrast, Gourley and Vieux (2006) state that model structure errors and input uncertainty can be the most significant sources of uncertainty in predicted flows.
The uncertainty of forcing input, such as rainfall, is well recognized but rarely considered in hydrological modelling (Kavetski et al., 2002; Kuczera et al., 2006). Unfortunately, this is particularly important for ungauged catchments where rain gauges, if available, are often sparse and do not capture the spatial variability of precipitation (Kavetski et al., 2006a; Bárdossy and Das, 2008; Moulin et al., 2009; McMillan et al., 2011). A promising approach to treat the introduced input error, originally proposed by Kavetski et al. (2006a) and adopted by others (e.g. Vrugt et al., 2008a, b; McMillan et al., 2011), is to tackle the uncertainty of the precipitation measurements by estimating event-specific parameters (rainfall multipliers).
Similarly, other procedures have been proposed to assess model structure errors or to simultaneously evaluate different sources of uncertainty. Many of them adopt a Bayesian viewpoint, such as the Simultaneous Parameter Optimization and Data Assimilation (SODA) (Vrugt et al., 2005), or the Integrated Bayesian Uncertainty Estimator (IBUNE) (Ajami et al., 2007). SODA merges the input and model structure uncertainty together into a single forcing term. IBUNE treats input error parameters as constant over time, which might not be appropriate as it was shown that the estimated rainfall multipliers typically vary from event to event (Vrugt et al., 2008a; McMillan et al., 2011). Another promising approach to decompose uncertainties into contributing sources was presented by Renard et al. (2011), who applied a hierarchical structural error model with a single stochastic parameter.
Besides the possibility to separate the sources of uncertainty, Bayesian statistics has another feature that makes it appealing for application in ungauged catchments: it is possible to incorporate knowledge about the parameters from various sources, such as expert knowledge or previous results, as a probability distribution. This prior distribution can be subsequently updated if measurement data become available (Beck and Katafygiotis, 1998; Sivia and Skilling, 2006; Zhang et al., 2011).
While in other applications of Bayesian statistics the definition of a prior distribution, e.g. through the elicitation of expert knowledge, is a research field of its own (Winkler, 1967; O'Hagan, 1998; Garthwaite et al., 2005), most hydrological studies disregard this aspect. More often than  (McIntyre et al., 2005) or the software's user manual (Yang et al., 2007). For modelling ungauged catchments, further difficulties arise from unavailable base data and, to the best of our knowledge, a concise approach of formulating the prior knowledge on hydrological model parameters is missing.
In this paper, our aim is therefore to investigate the uncertainty of flood predictions in ungauged urban basins with structurally uncertain rainfall-runoff models. Specifically, we apply current state-of-the-art approaches to explicitly account for both input uncertainty and model structure deficits. In addition, we propose a concise procedure to derive prior parameter distributions from base data. Our study is innovative in three distinct aspects:
1. For Bayesian inference with a Unit Hydrograph-type model, we use a likelihood function that combines a Box-Cox data transformation with a continuous-time autoregressive error model. Additionally, we explicitly account for input uncertainty using rainfall multipliers. To the best of our knowledge, this is the first time that this has been done for a small urban ungauged catchment.
2. We support the concise formulation of prior knowledge by combining five different methods to derive model parameters from base data. This approach is readily transferable to model other ungauged catchments with this type of model.
3. We assess the importance of parameter uncertainty, input and model structure error on the uncertainty of predicted flows and use scenario analysis to derive practical recommendations regarding the performance of the methods for prior knowledge generation.
Our approach was tested on a case study from the Sluzew Creek catchment in Warsaw, Poland, which has undergone rapid urbanization in the last three decades and has been strongly affected by urban flood flows and soil erosion in recent years. As no routine monitoring data of precipitation or discharge are available for the Sluzew Creek, we performed a dedicated monitoring campaign to have a thorough basis for this analysis. Our results clearly show that predictions in ungauged basins remain a difficult task: after calibration uncertainties in peak flow are high and up to 5 times larger than observed values. This is mainly due to imprecise rainfall information and the simplistic model structure.
The remainder of the article is structured as follows: in Sect. 2 we present the conceptual rainfall runoff model and details on the Bayesian parameter estimation. In Sect. 3, the Sluzew Creek case study catchment is described and the experimental design of the monitoring is given. In Sect. 4, we present the results. Finally, we discuss the results and draw conclusions in Sects. 5 and 6.

Conceptual modelling in ungauged basins
As mentioned above, modelling ungauged or poorly gauged catchments is a difficult task due to the lack of measurement data. Therefore, different conceptual rainfall-runoff models have been applied to predict the magnitude of floods.
The most frequently applied runoff models for ungauged or poorly gauged catchments rely on the Soil Conservation Service Curve Number (SCS-CN) (USDA-SCS, 1986, 1989; Walker et al., 2000; Rosso and Rulli, 2002; Mishra and Singh, 2003; Hawkins et al., 2009; Soulis et al., 2009). The SCS-CN accounts for most runoff-producing characteristics of a watershed such as soil type, land use and treatment, surface and antecedent moisture conditions while its parameter can be derived from physical properties of the catchment. Therefore, it is popular for modelling in ungauged catchments (Mishra and Singh, 2003; Banasik et al., 2008; Hawkins et al., 2009; Soulis et al., 2009; Sikorska and Banasik, 2010).
In this study we applied a conceptual model that combines the SCS-CN method with a commonly applied unit hydrograph model (Kumar et al., 2007; Ahmad et al., 2009, 2010; Gironás et al., 2009; Khaleghi et al., 2011) and its instantaneous form (IUH) proposed by Nash (1957) to convolve effective rainfall into direct runoff at the outlet of the catchment:
$$Q(t) = \int_0^{t_\mathrm{eff}} P_e(\phi)\, h(t-\phi)\, \mathrm{d}\phi, \qquad (1)$$
where $Q(t)$ is the runoff, $P_e(\phi)$ is the unit volume of the instantaneous hyetograph, and $t_\mathrm{eff}$ is the duration of the effective rainfall. The unit hydrograph $h(t)$ is expressed as:
$$h(t) = \frac{A}{\Delta t} \int_{t-\Delta t}^{t} u(\phi)\, \mathrm{d}\phi, \qquad (2)$$
$$u(\phi) = \frac{1}{k\,\Gamma(N)} \left(\frac{\phi}{k}\right)^{N-1} \exp\left(-\frac{\phi}{k}\right), \qquad (3)$$
where $A$ is the area of the catchment, $\Delta t$ is the time interval, and $u(\phi)$ is the instantaneous unit hydrograph defined by a gamma probability density function (Nash, 1957). $N$ is the number of identical linear reservoirs with retention time $k$. The effective rainfall $P_e(\phi)$ in Eq. (1) is computed with the SCS-CN method:
$$P_e(\phi) = \begin{cases} \dfrac{\left(P(\phi) - I\right)^2}{P(\phi) - I + S}, & P(\phi) > I, \\ 0, & \text{otherwise}, \end{cases} \qquad (4)$$
where $P(\phi)$ is the total rainfall at time $\phi$, $S$ is the maximal potential retention of a catchment, and $I$ is the initial abstraction. Similar to the parameters of the SCS-CN, the parameters and characteristic values of the IUH can be linked to catchment properties (see Sect. 2.5). Therefore, a conceptual runoff model allows for predictions even if no runoff data are available to calibrate the model.
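The model chain described above (SCS-CN effective rainfall routed through a Nash cascade) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function names, the midpoint discretization of the IUH, the application of the SCS-CN relation to cumulative rainfall, and the example hyetograph are all hypothetical choices; only the catchment area, the order of magnitude of N, k and S, and the 5 % initial abstraction ratio are taken from the text.

```python
import math
import numpy as np

def scs_cn_effective_rainfall(p_cum, s, i_a):
    """Cumulative effective rainfall (mm) from cumulative rainfall (mm)."""
    p_cum = np.asarray(p_cum, dtype=float)
    excess = np.maximum(p_cum - i_a, 0.0)      # rainfall above the initial abstraction
    return excess ** 2 / (excess + s)

def nash_iuh(t, n, k):
    """Gamma-shaped instantaneous unit hydrograph u(t) in 1/h (zero for t <= 0)."""
    t = np.asarray(t, dtype=float)
    u = np.zeros_like(t)
    pos = t > 0
    u[pos] = (t[pos] / k) ** (n - 1) * np.exp(-t[pos] / k) / (k * math.gamma(n))
    return u

def direct_runoff(rain_mm, dt_h, area_km2, n, k, s, i_a, n_out):
    """Direct runoff (m3/s) at n_out time steps for one event hyetograph."""
    pe_cum = scs_cn_effective_rainfall(np.cumsum(rain_mm), s, i_a)
    pe = np.diff(pe_cum, prepend=0.0)          # effective rainfall per step (mm)
    t_mid = (np.arange(n_out) + 0.5) * dt_h    # midpoint rule (assumed discretization)
    uh = nash_iuh(t_mid, n, k) * dt_h          # fraction of volume released per step
    q_mm = np.convolve(pe, uh)[:n_out]         # discrete convolution (mm per step)
    return q_mm * area_km2 * 1000.0 / (dt_h * 3600.0)  # 1 mm over 1 km2 = 1000 m3

# Hypothetical 10-min hyetograph (mm per step); parameter values are illustrative.
rain = np.array([0.0, 2.0, 5.0, 8.0, 3.0, 1.0])
dt = 1.0 / 6.0
q = direct_runoff(rain, dt, area_km2=26.9, n=3.21, k=1.78,
                  s=55.0, i_a=0.05 * 55.0, n_out=600)
```

A useful sanity check for such a sketch is mass balance: the runoff volume integrated over a sufficiently long window should match the effective rainfall depth multiplied by the catchment area.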

Bayesian prediction and updating
The calculation of the prediction uncertainties is based on the likelihood function $p(y|\theta)$ and the distribution of the parameters (see below). The likelihood function describes the probability (density) of observing the data $y = (y_{t_0}, y_{t_1}, \ldots, y_{t_n})$ given the model and parameters $\theta$. Consequently, the observed output is a random variable. This is commonly modelled by combining a deterministic model with a random error term (see Sect. 2.3).
The parameters are not known precisely but the current knowledge of the parameters is described by the probability density function $p(\theta)$. Section 2.5 describes a concise way to formulate this prior knowledge for a data-scarce catchment by eliciting as much information as possible from the available base data. The predictive distribution of the model is then calculated by marginalizing the joint distribution of the runoff $y$ and the parameters $\theta$:
$$p(y) = \int p(y|\theta)\, p(\theta)\, \mathrm{d}\theta. \qquad (5)$$
If calibration data $y_C$ are available, the distribution of the parameters $\theta$ is updated by applying Bayes' theorem:
$$p(\theta|y_C) = \frac{p(y_C|\theta)\, p(\theta)}{\int p(y_C|\theta')\, p(\theta')\, \mathrm{d}\theta'}. \qquad (6)$$
The posterior distribution $p(\theta|y_C)$ now describes the updated knowledge about the parameters: a combination of the prior knowledge and the data (Gelman et al., 1996). Using the posterior distribution $p(\theta|y_C)$ in Eq. (5), the predictive distribution becomes:
$$p(y|y_C) = \int p(y|\theta)\, p(\theta|y_C)\, \mathrm{d}\theta. \qquad (7)$$
Note that if calibration data are not available, the formulated knowledge of parameters remains untouched and only Eq. (5) is employed to calculate the model's predictive distribution.
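The two steps described above, updating the prior with calibration data and marginalizing over the parameters to obtain a predictive distribution, can be illustrated with a deliberately simple one-parameter toy problem. The sketch below is an assumption-laden illustration and not the rainfall-runoff model of this study: the linear model, the grid approximation, the Gaussian error with known standard deviation, and all names and numbers are hypothetical.

```python
import numpy as np

def grid_posterior(theta_grid, prior_pdf, log_lik):
    """Posterior density on a uniform grid: prior times likelihood, renormalised."""
    log_post = np.log(prior_pdf) + np.array([log_lik(th) for th in theta_grid])
    post = np.exp(log_post - log_post.max())   # subtract max for numerical stability
    step = theta_grid[1] - theta_grid[0]
    return post / (post.sum() * step)

def predictive_samples(theta_grid, pdf, simulate, size, rng):
    """Monte Carlo marginalisation: draw theta from pdf, then simulate an output."""
    weights = pdf / pdf.sum()
    thetas = rng.choice(theta_grid, size=size, p=weights)
    return np.array([simulate(th, rng) for th in thetas])

rng = np.random.default_rng(1)
x = np.array([1.0, 2.0, 3.0, 4.0])                       # toy inputs
y_c = 3.0 * x + rng.normal(0.0, 0.5, size=x.size)        # synthetic calibration data

theta = np.linspace(0.01, 6.0, 600)
prior = np.exp(-0.5 * ((theta - 2.0) / 1.0) ** 2)        # toy prior centred on 2
prior /= prior.sum() * (theta[1] - theta[0])

log_lik = lambda th: -0.5 * np.sum(((y_c - th * x) / 0.5) ** 2)
posterior = grid_posterior(theta, prior, log_lik)

simulate = lambda th, r: th * 5.0 + r.normal(0.0, 0.5)   # prediction at a new input
y_prior = predictive_samples(theta, prior, simulate, 2000, rng)      # no calibration data
y_post = predictive_samples(theta, posterior, simulate, 2000, rng)   # after updating
```

In this toy setting the predictive spread of `y_post` is much narrower than that of `y_prior`, which mirrors the reduction in prediction uncertainty expected from Bayesian updating with a few observations.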

Likelihood function
It is often stated in the literature that the assumption of independent, normally distributed errors does not hold for hydrological models. Due to model structure deficits, the residuals are often heavily auto-correlated (Sorooshian and Dracup, 1980; Romanowicz et al., 1994; Cawley et al., 2007; Yang et al., 2007, 2008). Therefore, we have constructed the likelihood of the model described in Sect. 2.1 with a continuous representation of an AR(1) process together with a Box-Cox transformation as proposed by Yang et al. (2007, 2008). See Appendix A for details.
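To make the structure of such a likelihood concrete, the following Python sketch combines a Box-Cox transformation with a continuous-time AR(1) (Ornstein-Uhlenbeck type) error process. It is a generic formulation written under assumptions, since the details are deferred to Appendix A: the two-parameter form of the transformation, the parameter names `lam1` and `lam2`, the stationary distribution assumed for the first residual, and the example values are not taken from the source.

```python
import numpy as np

def box_cox(y, lam1, lam2):
    """Two-parameter Box-Cox transformation (assumed form)."""
    y = np.asarray(y, dtype=float)
    if lam1 == 0.0:
        return np.log(y + lam2)
    return ((y + lam2) ** lam1 - 1.0) / lam1

def log_likelihood_ar1(t, y_obs, y_mod, sigma, tau, lam1, lam2):
    """Log-likelihood of observations under a continuous-time AR(1) error model.

    sigma: asymptotic standard deviation of the transformed residuals
    tau:   characteristic correlation time (same unit as t)
    """
    t = np.asarray(t, dtype=float)
    y_obs = np.asarray(y_obs, dtype=float)
    eta = box_cox(y_obs, lam1, lam2) - box_cox(y_mod, lam1, lam2)
    rho = np.exp(-np.diff(t) / tau)            # correlation between consecutive residuals
    cond_mean = rho * eta[:-1]
    cond_var = sigma ** 2 * (1.0 - rho ** 2)
    # first residual: stationary normal distribution (assumption)
    ll = -0.5 * np.log(2.0 * np.pi * sigma ** 2) - 0.5 * eta[0] ** 2 / sigma ** 2
    # remaining residuals: normal, conditional on the previous residual
    ll += np.sum(-0.5 * np.log(2.0 * np.pi * cond_var)
                 - 0.5 * (eta[1:] - cond_mean) ** 2 / cond_var)
    # Jacobian of the transformation, so the density refers to untransformed flows
    ll += np.sum((lam1 - 1.0) * np.log(y_obs + lam2))
    return float(ll)

# Hypothetical example values
t = np.arange(5.0)
y_obs = np.array([1.0, 2.0, 4.0, 3.0, 1.5])
y_mod = np.array([1.2, 1.8, 3.5, 3.2, 1.4])
ll = log_likelihood_ar1(t, y_obs, y_mod, sigma=0.4, tau=2.0, lam1=0.3, lam2=0.0)
```

For a correlation time that is very short compared with the sampling interval, the expression reduces to the familiar likelihood for independent normal residuals in the transformed space.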

Input error model
The introduced error process could represent the lumped uncertainty of the model structure deficits, measurement errors, and input uncertainty (Chatfield, 1996). However, additional insights can be gained if different sources of uncertainty are treated separately. It is well known that precipitation measurements contain errors, usually because point measurements from sparse gauges are uncertain due to the significant spatial and temporal variability of rainfall fields (Kavetski et al., 2006a; Bárdossy and Das, 2008; Moulin et al., 2009; McMillan et al., 2011). Such spatial variation cannot be captured by traditional rain gauges. Additionally, in many situations only a single rain gauge is located close enough to be used. Consequently, the model input might be highly uncertain. This uncertainty propagates through the model and can lead to large output uncertainty. Therefore, it is sensible to consider this error in the model input. On the other hand, it is a reasonable assumption that the measurement error of the runoff is negligibly small compared to model structure deficits and input uncertainties (Gourley and Vieux, 2006). Furthermore, the rating curve error strongly depends on the case study and may be significantly reduced by carefully maintaining the gauging station (McMillan et al., 2010) (see Sect. 3.2). We therefore treat only input uncertainty separately. However, further error decomposition within a Bayesian context is possible and has been applied to different extents in recent years (Renard et al., 2011; Honti et al., 2012; Reichert and Schuwirth, 2012).
As proposed by Kavetski et al. (2006a), we tackle the uncertainty of the precipitation measurements with individual rainfall multipliers $\zeta_j$ for each storm event as illustrated in Fig. 1. The product of $\zeta_j$ and the measured precipitation is then used as input for the model. Every event has a separate factor, as uncertainties in effective rainfall vary depending on the characteristic of the rainfall event. We furthermore assume that $\zeta_j$ is lognormally distributed with an expected value of one, which was shown by McMillan et al. (2011) to be a good approximation (for more details see Appendix B).
Note that this approach requires event-based modelling. Because a separate rainfall multiplier must be inferred for each analysed storm event, the number of parameters increases with the number of events.
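The rainfall multiplier idea can be sketched in a few lines of Python. The example assumes that the spread of the multipliers is specified as the standard deviation of log(ζ), which is one possible parameterization and not necessarily the one used in Appendix B; the function names and the hyetographs are hypothetical.

```python
import numpy as np

def sample_rainfall_multipliers(n_events, sd_log, rng):
    """Lognormal multipliers with expected value one.

    For a lognormal variable E[zeta] = exp(mu + sd_log**2 / 2),
    so mu = -sd_log**2 / 2 gives E[zeta] = 1.
    """
    return rng.lognormal(mean=-0.5 * sd_log ** 2, sigma=sd_log, size=n_events)

def apply_multipliers(events, multipliers):
    """Scale each observed event hyetograph by its own multiplier."""
    return [z * np.asarray(rain, dtype=float) for z, rain in zip(multipliers, events)]

rng = np.random.default_rng(0)
observed_events = [np.array([0.5, 2.0, 1.0]), np.array([3.0, 4.5, 0.5, 0.2])]  # mm per step
zeta = sample_rainfall_multipliers(len(observed_events), sd_log=0.3, rng=rng)
inferred_events = apply_multipliers(observed_events, zeta)
```

Each additional event adds one multiplier to the parameter vector, which is why the dimension of the inference problem grows with the number of events.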

Formulation of prior knowledge
The specification of the prior distribution of parameters is, even for experts, a difficult task as no explicit rules exist (O'Hagan, 1998; Scholten et al., 2012). The aim here is to find the distribution that best reflects the current knowledge.
Fig. 1. Schematic representation of the rainfall multipliers and the rainfall-runoff model. $j$ refers to the number of the rainfall event, $X_j^O(t)$ and $Y_j^O(t)$ are the observed rainfall and runoff for the $j$th event at time $t$, respectively. $X_j(t)$ is the inferred rainfall and $Y_j(t)$ the modelled runoff for the $j$th event. $\zeta_j$ is the rainfall multiplier for the $j$th event. $\theta$ are the parameters of the model and $\theta_\zeta$ the parameters of the rainfall multiplier distribution (mean and variation) over all $j$ events.
Hydrol. Earth Syst. Sci., 16, 1221-1236, 2012 www.hydrol-earth-syst-sci.net/16/1221/2012/
For ungauged catchments, no measured flow is available and parameters have to be estimated from other sources of information. In the literature there are three common approaches to specify the knowledge on model parameters, which all have their difficulties. The first approach is to obtain parameter values directly from GIS data, topographic maps or tabulated values from the literature (Merz and Blöschl, 2004). The second one is to directly use parameters estimated on gauged catchments with similar characteristics (Seibert and Beven, 2009). Finally, parameters can be derived by empirical equations from readily available data through a regionalization process (McIntyre et al., 2005).
The disadvantage of the first method is that it can only be used to obtain physically-based parameters, such as the area of a catchment or the Curve Number (based on land use and soil characteristics maps) (USDA-SCS, 1986, 1989), which can be biased due to outdated data. The second method raises the question of how the "similarity" of two catchments can be assured (Oudin et al., 2010; Patil and Stieglitz, 2011). The third method is promising, because it also links non-physically based parameters to catchment characteristics, such as length of the stream, slope, or impervious area (Madsen et al., 2002). However, with all three approaches no statement about the uncertainty of the obtained parameters can be made. Therefore, we propose to extend the third approach by using several empirical equations in parallel and constructing prior distributions from the population of obtained parameter values. Here, we combine five empirical relations to obtain values for N and k from catchment characteristics, which we label as: (i) SCS, (ii) Lutz, (iii) Rao, (iv) Geomorphologic IUH (GIUH), and (v) Geomorphoclimatic IUH (GCIUH). The corresponding equations for all methods are given in Table 1. Other methods (Haan et al., 1994; Bhunya et al., 2003; Jain et al., 2006; Singh, 2007) are not suitable as they relate IUH characteristics to discharge properties that are not available for ungauged catchments. However, they may be included if such data are available.
The SCS method is the most common method to inform IUH characteristics (USDA-SCS, 1986, 1989). Originally, it was developed for small agricultural watersheds (<16 km²); however, it accounts for different types of land use and has since then been adopted for urban and forest watersheds (Banasik et al., 2008; Seibert and Beven, 2009; Soulis et al., 2009).
In a similar study, Lutz (1984) analyzed over 950 rainfall-runoff events from 75 watersheds located in the Southwest of Germany with an area up to 250 km². This method relates the parameters to stream properties and the ratio of forest and urbanized areas within the watershed.
An approach developed directly for small urban catchments was proposed by Rao et al. (1972), who explicitly took into account the degree of urbanization by relating the total area of the watershed to the fraction of impervious area.
The GIUH approach is based on numerical experiments with a detailed physically based watershed model on four basins in Venezuela and Puerto Rico with areas from 3 to 103 km² (Rodríguez-Iturbe and Valdés, 1979; Hall et al., 2001). In this approach the parameters are described as a function of watershed geomorphology and the dynamic parameter by the average peak flow velocity ν, which is then related to rainfall and stream properties.
A variation of the GIUH approach is GCIUH, which was developed to relate the parameters only to geomorphologic and climatic data (Nowicka and Soczynska, 1989; Hall et al., 2001).
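The IUH characteristics $t_p$ (time to peak) and $u_p$ (peak ordinate) are related to the Nash model parameters $N$ and $k$ through the gamma density of the Nash IUH:
$$t_p = (N-1)\,k, \qquad u_p = \frac{(N-1)^{N-1}\, e^{-(N-1)}}{k\,\Gamma(N)},$$
and $\mathrm{Lag} = N \cdot k$.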
The mean and variance of the parameters obtained by the five described methods are used to fit lognormal distributions as priors for N and k using the method of moments. The retention capacity S is related to the Curve Number (CN), which can be derived from GIS data (USDA-SCS, 1986, 1989; Walker et al., 2000; Mishra and Singh, 2003; Hawkins et al., 2009; Soulis et al., 2009):
$$S = 25.4 \left(\frac{1000}{CN} - 10\right),$$
with S in mm. The initial abstraction (I) from Eq. (4) is specific for every rainfall event and therefore difficult to estimate in advance. However, it can be related to S through the ratio factor, which for urban catchments is typically equal to 5 % of S (Hawkins et al., 2009).
Table 1. Empirical equations of the five methods (SCS, Lutz, Rao, GIUH, GCIUH); the equations themselves are not recoverable here. Recoverable constants and sources: $P_2$ = 0.64, $P_3$ = 1.04 (Lutz, 1984); (Rodríguez-Iturbe et al., 1982; Hall et al., 2001).
Notes: $L$: length of the stream from the water gauge to the watershed ridge (km); $J_z$ and $J_g$: average slope of the catchment (%) and average slope of the stream (-), $s = J_g$; $L_c$: length of the stream to the central point, assumed to be equal to 0.5 $L$; $U$ and $W$: ratio of urbanized and forest areas (%); $P_1$: parameter dependent on the roughness of the stream; $P_2$ and $P_3$: dependent on the interval of estimation; Lag: lag time (h); $A$: catchment area (km²); $U$: fraction of the impervious area in the catchment (-); $P_e$ and $D_e$: amount (mm) and duration (h) of effective rainfall; $i_r$ and $t_r$: effective rainfall intensity (cm h⁻¹) and its duration (h); $A_\Omega$, $B_\Omega$, $L_\Omega$: area, width and length of the highest order stream (km², m, km); $R_A$, $R_B$ and $R_L$: Horton area, bifurcation and length ratios of the catchment (Tarboton, 1996); $\nu$: average peak flow velocity (m s⁻¹); $n$: Manning roughness coefficient (m^(-1/3) s).
For the watershed characteristics A and CN, an error due to inaccurate maps may be considered. However, while A usually remains constant for a catchment over time, S may vary, so a sufficiently wide prior distribution should be provided for it. We therefore assumed a normally distributed error with a standard deviation of 10 % of the mean for A and, based on CN, a lognormal distribution with a mean of 55 and a standard deviation of 30 for S.
The prior distribution for the standard deviation σ of the error model is difficult to define as σ must represent a combination of both model structure deficits and measurement errors. To reflect this, a wide distribution was selected. Similarly, a wide distribution was proposed for the characteristic correlation time of the autoregressive process τ (Table 1). Since it is extremely difficult to evaluate and formulate knowledge of the mutual interactions between all parameters beforehand, we assumed independence between all parameters, which is common practice (Yang et al., 2007; Reichert and Schuwirth, 2012).

Assessing prediction performance
In the context of floods, the predicted peak flow is the most important model result. For the Sluzew Creek, it can be readily transferred to stream water levels and flooded areas during a flood event. Specifically, we used the peak flow and its 80 % interquantile range to assess the model performance.
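Given an ensemble of simulated hydrographs, the peak flow statistic described above can be computed as in the following Python sketch. It assumes that the 80 % range is bounded by the 10 % and 90 % quantiles of the simulated peaks; the function name and the synthetic ensemble are hypothetical.

```python
import numpy as np

def peak_flow_interval(q_sim, level=0.8):
    """Lower bound, median and upper bound of the simulated peak flows.

    q_sim: array of shape (n_simulations, n_time_steps) with simulated discharge.
    level: central probability mass covered by the interval (0.8 -> 10 % to 90 %).
    """
    peaks = np.max(np.asarray(q_sim, dtype=float), axis=1)
    lower, upper = np.quantile(peaks, [(1.0 - level) / 2.0, (1.0 + level) / 2.0])
    return float(lower), float(np.median(peaks)), float(upper)

# Synthetic ensemble: 101 hydrographs whose peaks are 0, 1, ..., 100 (illustrative only)
q_sim = np.column_stack([np.zeros(101), np.arange(101.0), np.zeros(101)])
lower, median, upper = peak_flow_interval(q_sim)
```

Dividing the interval bounds by an observed peak gives a ratio that can be compared directly with statements such as "up to 5 times larger than observed values".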

Scenario analyses
To assess the individual error contributions and the gain of information from observations, the prediction uncertainty was analysed for four scenarios, which reflect different degrees of data availability and knowledge of the modeller. Scenario A describes a typical case of a completely ungauged basin, where no flow data are available for calibration. Here, the runoff is predicted using only the prior distribution. For Scenario B, the prior distribution has been updated with calibration data of 14 rain events. For the flood predictions of B, we used the estimated standard deviation over all rainfall multipliers (σ ζ ) instead of the estimated individual rainfall multipliers for every rainfall event. Thereby inferred rainfalls were sampled from the posterior parameter space over all rainfall multipliers. This is the best option to predict the runoff of a future rain event, for which an appropriate multiplier cannot be known in advance. In addition, we estimate the individual contributions of input uncertainties (Scenario C) and parameter uncertainties (Scenario D) to the total prediction uncertainty. Scenario C is similar to B but disregards input uncertainty by setting σ ζ to zero, which illustrates the effect of the uncertain precipitation measurements. Scenario D is similar to scenario A, only that the parameters were derived with the Rao method and considered exactly known. This scenario illustrates the impact of the parameter uncertainty (see Table 3).

Implementation details
The model was implemented in R (R Development Core Team, 2011). We sampled from the posterior probability distribution using an adaptive Markov chain Monte Carlo (MCMC) sampler (Vihola, 2011). The sampler of Vihola (2011) adjusts the covariance matrix of the jump distribution to achieve a defined rejection rate and thus supports efficient sampling, but other algorithms could have been applied as well (Gilks et al., 1995; Brooks et al., 2011). Convergence to the stationary distribution was achieved by running 72 chains in parallel with 50 000 samples each. The number of chains was chosen in preliminary trials to ensure good coverage of the parameter space. The prediction uncertainty bands are based on 1000 Monte Carlo simulations for each of the described scenarios.
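The principle of an adaptive sampler that tunes its jump distribution towards a target acceptance rate can be sketched as follows. This Python code is a simplified illustration in the spirit of a robust adaptive Metropolis scheme and is not the R implementation used in the study: the step-size rule, the target acceptance rate of 0.234, the function name, and the Gaussian test target are assumptions.

```python
import numpy as np

def ram_sampler(log_post, theta0, n_iter, target_accept=0.234, gamma=2.0 / 3.0, rng=None):
    """Adaptive Metropolis sampler that coerces the acceptance rate to target_accept."""
    rng = np.random.default_rng() if rng is None else rng
    theta = np.asarray(theta0, dtype=float)
    d = theta.size
    s = np.eye(d)                              # Cholesky factor of the jump covariance
    lp = log_post(theta)
    chain = np.empty((n_iter, d))
    n_accepted = 0
    for i in range(1, n_iter + 1):
        u = rng.standard_normal(d)
        proposal = theta + s @ u
        lp_prop = log_post(proposal)
        alpha = float(np.exp(min(0.0, lp_prop - lp)))   # acceptance probability
        if rng.random() < alpha:
            theta, lp = proposal, lp_prop
            n_accepted += 1
        # grow the jump covariance along u if accepting too often, shrink it otherwise
        eta = min(1.0, d * i ** (-gamma))               # decaying adaptation step (assumed)
        m = s @ (np.eye(d) + eta * (alpha - target_accept) * np.outer(u, u) / (u @ u)) @ s.T
        s = np.linalg.cholesky(m)
        chain[i - 1] = theta
    return chain, n_accepted / n_iter

# Test target: standard bivariate normal (illustrative, not the posterior of the study)
chain, acceptance = ram_sampler(lambda x: -0.5 * float(np.sum(x ** 2)),
                                [3.0, -3.0], 20000, rng=np.random.default_rng(0))
```

Running several such chains from dispersed starting points and discarding an initial burn-in, as done here with many parallel chains, is a common way to check that the chains have reached the stationary distribution.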

Test catchment
As a test catchment, the upper part of the Sluzew Creek basin was chosen; it is located in the city of Warsaw (Poland) and has an area of about 26.9 km² ($A_\mathrm{red}$: 18.3 km²) (see Fig. 2). In the last three decades, it has undergone rapid urbanization. As a consequence, it is strongly affected by urban flood flows (every second year) and soil erosion (WAU, 2002; Banasik et al., 2008). The average annual precipitation in this part of the city is about 540 mm and the average daily temperature varies from −3 °C in January to +18 °C in July (WAU, 2002; Majewski et al., 2010). As a lowland watershed, no steep slopes exist and the elevation varies from 95 m to 110 m above sea level. Thus, the topography of the watershed does not have a major influence on the surface runoff, which instead is dominated by the land use type (Barron et al., 2011). Urban areas cover 58.7 % of the catchment and the percentage of impervious areas in the whole catchment is 32 %. As a small ungauged basin, no routine monitoring data of precipitation and discharge are available and we implemented our own monitoring program.

Data collection
Rainfall at three locations and the runoff at the outlet of the catchment were observed for three hydrological years with a temporal resolution of 10 min.
For our analysis, we selected 14 rainfall-runoff events from the period 2007-2009. The selection and separation of the events were based on both the total areal-averaged precipitation per event (>3 mm) and the maximum observed discharge at the outlet (>1 m³ s⁻¹). Events with discontinuous rainfall and events during the winter period, when potential snowmelt can significantly contribute to runoff, were excluded from the analysis.
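The selection rules above can be written as a simple filter. The sketch below is only an illustration: the `Event` record, its field names and the three example events are hypothetical, while the two thresholds are those stated in the text.

```python
from dataclasses import dataclass

@dataclass
class Event:
    name: str
    areal_rain_mm: float    # total areal-averaged precipitation of the event
    peak_flow_m3s: float    # maximum observed discharge at the outlet
    continuous_rain: bool   # False if the rainfall was discontinuous
    winter: bool            # True if snowmelt may contribute to runoff

def select_events(events, min_rain_mm=3.0, min_peak_m3s=1.0):
    """Keep events that pass both thresholds and both exclusion rules."""
    return [
        e for e in events
        if e.areal_rain_mm > min_rain_mm
        and e.peak_flow_m3s > min_peak_m3s
        and e.continuous_rain
        and not e.winter
    ]

# Hypothetical events, for illustration only
events = [
    Event("E1", 12.4, 2.1, True, False),
    Event("E2", 2.0, 1.5, True, False),   # too little rain
    Event("E3", 8.0, 3.0, True, True),    # winter event
]
selected = select_events(events)
print([e.name for e in selected])
```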
For the stream gauge, we additionally possessed detailed hydraulic information on the cross section. Furthermore, the variation in the rating curve due to seasonal and alluvial changes within the channel was considered negligible. First, all analyzed events occurred during the spring and summer seasons and were clearly storm-related. Second, the Sluzew Creek is a rather small catchment. We therefore assigned a low uncertainty to the rating curve and a small error to the observed runoff.

Table notes: values selected following Reichert and Mieleitner (2009) based on the analysis of the innovations; c: σ_ζ relates to the standard deviation of each rainfall multiplier, identical for all multipliers; d: n is the number of selected rainfall-runoff events.

Prior distribution
The prior distribution for the parameters of the IUH and the watershed characteristics was derived as described in Sect. 2.5. The obtained prior distributions are summarized in Table 2 (see also Fig. 3). We find that the values for N obtained with the empirical formulas vary roughly by a factor of 2, whereas the results for k vary by a factor of 4. The resulting lognormal distribution for N has a mean of 3.21 and a standard deviation of 0.97; the one for k has a mean of 1.78 h and a standard deviation of 0.86 h.
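To sample from such a prior, the reported moments have to be converted into the parameters of the underlying normal distribution. The sketch below does this by moment matching, under the assumption that the reported means and standard deviations refer to the original (not the logarithmic) scale; the function name is hypothetical.

```python
import math

def lognormal_params(mean, sd):
    """Return (mu_log, sigma_log) of a lognormal with the given mean and sd."""
    sigma_log_sq = math.log(1.0 + (sd / mean) ** 2)
    mu_log = math.log(mean) - 0.5 * sigma_log_sq
    return mu_log, math.sqrt(sigma_log_sq)

# Moments reported for N (dimensionless) and k (hours)
mu_n, sigma_n = lognormal_params(3.21, 0.97)
mu_k, sigma_k = lognormal_params(1.78, 0.86)
print(mu_n, sigma_n, mu_k, sigma_k)
```

If the reported values were instead defined on the log scale, they could be used directly and no conversion would be needed.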

Bayesian parameter estimation
The model was calibrated with seven parameters: N and k of the IUH model, A and S for the watershed characteristics, σ and τ of the error model, and σ_ζ of the rainfall multipliers. Additionally, all 14 rainfall multipliers were inferred together with the model parameters. The marginal posterior distributions of most parameters were found to be close to the prior but, as expected, with smaller variances (Fig. 3). An exception was the asymptotic standard deviation of the error process, σ, for which more information was gained from the data: not only was its variance greatly reduced, but its mode also shifted from 1 to 0.4. Interestingly, the posterior standard deviation of all rainfall multipliers (σ_ζ) increased compared to the prior. This means that the input error may have been slightly underestimated by the prior knowledge. Further explanation is provided in the Discussion (Sect. 5). The mode of the estimated rainfall multipliers varied from 0.58 to 1.70 with a mean of 0.96 for all events (see Supplement). For large events with greater observed precipitation amounts, the multipliers were closer to one, which indicates more accurate rainfall measurements. For events with lower observed rainfall, the input error was found to be higher (Supplement).
The diagnostic plot of model residuals and innovations is presented in Fig. 4. Not surprisingly, the residuals show a strong autocorrelation (Fig. 4, top row). This highlights the fact that the assumption of simpler likelihood functions with independent error terms would be clearly violated. The assumption of the continuous AR process with independent innovations is fulfilled much better (Fig. 4, bottom row), even if some weak autocorrelation is still observed for event 12. Moreover, the time series of standardized observed innovations of the autoregressive error model show reduced heteroscedasticity compared to the residuals (not shown).
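The contrast between autocorrelated residuals and nearly independent innovations can be reproduced with synthetic data. The sketch below assumes the usual exact discretisation of an Ornstein-Uhlenbeck process, in which each error equals the previous one damped by exp(−Δt/τ) plus an independent innovation. The values Δt = 10 min, τ = 400 min and σ = 0.4 follow the text; the function names and the series length are hypothetical.

```python
import numpy as np

def simulate_ou_errors(n, dt, tau, sigma, seed=0):
    """Simulate an autocorrelated error series with asymptotic sd sigma."""
    rng = np.random.default_rng(seed)
    phi = np.exp(-dt / tau)                     # damping over one time step
    sd_innov = sigma * np.sqrt(1.0 - phi ** 2)  # sd of the innovations
    eps = np.empty(n)
    eps[0] = rng.normal(0.0, sigma)
    for i in range(1, n):
        eps[i] = phi * eps[i - 1] + rng.normal(0.0, sd_innov)
    return eps

def innovations(eps, dt, tau):
    """Remove the part of each error that is explained by its predecessor."""
    phi = np.exp(-dt / tau)
    return eps[1:] - phi * eps[:-1]

def lag1_autocorr(x):
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))

eps = simulate_ou_errors(n=5000, dt=10.0, tau=400.0, sigma=0.4)
innov = innovations(eps, dt=10.0, tau=400.0)
print(lag1_autocorr(eps), lag1_autocorr(innov))
```

With real residuals, the same two lag-1 coefficients give a quick numerical counterpart to the top and bottom rows of a diagnostic plot.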

Predictive uncertainties of flood discharges
The performance and uncertainties of the model predictions were assessed under the four scenarios defined in Sect. 2.7. First, the model accuracy was evaluated with the relative error of the predicted to the observed peak flow (Table 3). As expected, this error was highest for Scenario D. Using prior parameter distributions derived from different methods (Scenario A) instead of a single method (Scenario D) makes it possible to better account for uncertainties in the parameters and slightly improves the accuracy of the peak flow estimation. The accuracy may be further improved through calibration with runoff data (Scenario B).
Second, the predictive uncertainties were calculated for the different scenarios (Fig. 5). Solid lines correspond to predictions using the mode of the posterior density. Gray bands depict the 80 % predictive intervals. For scenarios where no runoff data are available, the uncertainty bands reached up to 7 times higher than the observed peak flows, which is large. Calibration with data reduces this to 5 times (Scenario B). However, its 80 % prediction interval is still wide compared to the observed data. Scenario C illustrates that the contribution of the input uncertainty is important, because the uncertainty bands are 50 % narrower than in Scenario B. Scenario D shows that the parameter uncertainty does not contribute substantially to the predictive uncertainty for this ungauged catchment.
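An 80 % predictive band of this kind is obtained from the 10 % and 90 % quantiles of the Monte Carlo ensemble at each time step. The sketch below illustrates the computation on a synthetic ensemble; the hydrograph shape, the lognormal scatter and all names are hypothetical and serve only to make the example runnable.

```python
import numpy as np

def predictive_band(simulations, level=0.80):
    """Pointwise central predictive interval from an ensemble.

    simulations: array of shape (n_runs, n_times), one row per Monte Carlo run.
    """
    lower_q = 100.0 * (1.0 - level) / 2.0
    upper_q = 100.0 - lower_q
    lower, upper = np.percentile(simulations, [lower_q, upper_q], axis=0)
    return lower, upper

# Synthetic ensemble: 1000 runs scattered around a hypothetical hydrograph
rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 50)
hydrograph = 0.2 + 3.0 * t * np.exp(-t)
ensemble = hydrograph * rng.lognormal(mean=0.0, sigma=0.4, size=(1000, t.size))

lower, upper = predictive_band(ensemble, level=0.80)
inside = np.mean((ensemble >= lower) & (ensemble <= upper))
print(inside)
```

The ratio of `upper` to the observed peak flow is the kind of quantity that is compared across scenarios in the text.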

Discussion
In this study, we present an approach to assess the prediction uncertainties of a conceptual rainfall-runoff model in ungauged urbanized catchments. The above results show that the prediction uncertainty is rather large and dominated by input uncertainty and model structure errors. Here, we would like to discuss four important aspects, namely (i) the obtained results for the prior and posterior parameter distribution, (ii) the choice of the likelihood function and the consideration of input uncertainty, (iii) uncertainty contributions and difficulties in their assessment, and (iv) problems with assessing the consequences of urbanization and modelling in ungauged basins, with a brief outlook on future challenges.
With regard to point (i), to derive a useful prior distribution, we propose the use of five different empirical methods. As described above, the parameter values varied significantly depending on the choice of the empirical method. This indicates that the use of a single empirical method most probably leads to biased flow predictions, which usually overestimate the runoff from big events and underestimate the runoff during small events. However, as the largest uncertainty is contributed by input uncertainties (shown by a comparison of Scenarios B & C) and model structure errors, the predictions are not very sensitive to the prior distributions of the model parameters (shown by Scenarios A & D). While we can suggest a concise approach to derive a prior for the model parameters, obtaining prior distributions for the parameters of the error model (σ and τ) is difficult. While τ can be interpreted as the memory effect of the catchment, σ captures both the model structure error and the measurement error and has no physical meaning. Therefore, the prior distribution of σ is best taken directly from calibrated models.
As such information is not available so far, we hope that the results from our case study could represent a valuable contribution. For larger or more rural catchments, we recommend choosing a conservative (i.e. wide) prior distribution for σ to avoid overconfidence.
In our case study, we furthermore find that the modes of the obtained posterior distributions lie within the expected ranges. If the model is calibrated to each rain event separately, the retention S is comparatively larger for "small" storm events with less than 12 mm rainfall. This explains the wide posterior distribution for this parameter and corresponds to recent findings (Hawkins et al., 2009; Soulis et al., 2009). With regard to the correlation time τ of the autoregressive process, we obtained most probable values around 400 min, which is reasonable for a small urbanized catchment. Interestingly, the posterior mode of σ is less than half that of the prior, which was rather unexpected and is discussed further below.
With respect to point (ii), the results of Bayesian parameter estimation are only meaningful if the assumptions of the error model are fulfilled. Here, reasonable uncertainty bands were obtained with the proposed autoregressive error process. The fact that larger flows have higher uncertainties than dry weather flows confirms our expectations. We find that the applied error model is very convenient, because it is straightforward to implement. Due to the continuous form, it is suitable for data that are not equally spaced in time (e.g. due to missing values).
Input uncertainty was considered by using rainfall multipliers. Inferring one rainfall multiplier per rain event from the observed rain and the runoff has several advantages. First, it limits the number of parameters to be inferred to a manageable number. Second, it allows for a better fit to the data (Supplement) and an estimation of the uncertainty in the input. This uncertainty must then be considered in the prediction uncertainty. The main limitation of rainfall multipliers is that they fail when no precipitation has been observed for a runoff event. While we took great care in our study to eliminate this problem by an experimental design with multiple rain gauges, this can be relevant in practical applications. The posterior rainfall multipliers were found to vary around one, whereas the standard deviation σ_ζ of all multipliers was found to be relatively high (about 0.4). Consequently, the uncertainties linked to the input error are important. However, it must be noted that the rainfall multipliers ultimately increase the flexibility of the model and thus partly compensate for model structure deficits.
Regarding (iii), a careful interpretation is required. The observed dominance of input uncertainty over other contributing sources may be caused by mutual interactions between input and model structure errors. This is especially true when little knowledge is available, but the effect may be minimized by precise prior information on input uncertainty (Renard et al., 2010, 2011). Therefore, we agree with Seibert and Beven (2009) that the inferred rainfall should not be interpreted as "real" rainfall, but as the estimated input for the applied model. Furthermore, the introduced error model lumps runoff measurement errors, model structure deficits and rating curve error. The decomposition of these is not straightforward. However, as the measurement and rating curve errors can here be assumed insignificant (see Sect. 3.2), the estimated uncertainties are dominated by model structure deficits. Conversely, if insufficient information on the rating curve is available, the corresponding error may be significant (McMillan et al., 2010).
With regard to point (iv), we found that the uncertainties in peak flows are about 5 times higher than the observed values, which is large and raises concerns for practical applications of IUH-type models. On the one hand, we are able to show that the parameter uncertainty can be greatly reduced and the prediction improves even with only a few discharge measurements. On the other hand, for the Sluzew Creek, putting more effort into flow monitoring or collecting long-term records will most probably not improve the results significantly, because model structure deficits and input uncertainty remain. Reducing the input uncertainty seems most promising, but typically has some cost attached to it. For example, more detailed rainfall information requires investments into a denser network of rain gauges, weather radar or retrieving data from microwave links (Berne and Uijlenhoet, 2007; McMillan et al., 2011; Wang et al., 2012). In our case, the model structure can probably also be improved, but in general this is tied to the availability of runoff data for calibration. In totally ungauged catchments, however, one is limited to models where the parameters are roughly known or can be derived with empirical methods. This is particularly true if the conditions of the basin are expected to change, for example due to urban growth, which is currently especially relevant in Eastern European cities (EEA, 2006). With continuing urbanization, even a complex model calibrated to current conditions will not reliably predict future runoff. Therefore, despite their limitations, simple conceptual models are justified when it is straightforward to derive parameters and predictions for different future scenarios. In summary, we agree with predecessors (Sivapalan et al., 2003; Seibert and Beven, 2009; Reichert and Mieleitner, 2009) that hydrological modelling in small ungauged or poorly gauged catchments is not a trivial task and find, once more, that the predictions are very uncertain.
Unfortunately, in many cases there is no alternative to predictions of models that cannot be verified with data (Kapangaziwiri and Hughes, 2008). Therefore, we believe that it is especially important in these situations to quantify the prediction uncertainty. Furthermore, the uncertainty must be communicated to the decision makers and, if possible, taken into account in the urban planning process (Ramos et al., 2010). This, again, is especially important for urbanized catchments, where the economic consequences of floods can be severe.
Besides causing economic losses due to floods, urban growth usually affects the receiving water quality through point and non-point source pollution. In this regard, wet-weather pollution, which is often associated with the amount of total suspended solids, is especially crucial. Future work should therefore investigate the uncertainties of water quality impairments (e.g. from sediment loads) in ungauged catchments under urbanization. Promising approaches for multi-objective calibration within a Bayesian framework that would lend themselves to this task have recently been suggested (Dietzel and Reichert, 2010; Reichert and Schuwirth, 2012; Honti et al., 2012).

Conclusions
In this study, we investigated the uncertainty of flood predictions in ungauged urban basins with structurally uncertain rainfall-runoff models. We used Bayesian statistics to explicitly account for parameter uncertainty, input uncertainty and model structure deficits together with measurement errors and successfully demonstrated the approach on an urban catchment in Warsaw, Poland. Based on our results we conclude that:
- The proposed procedure to derive prior distributions for the model parameters from base data by combining five different empirical methods is concise. It delivers meaningful results and is readily transferable to other ungauged catchments. In contrast, it is difficult to specify prior distributions for the parameters of the error model, which do not necessarily have a physical meaning. Our results for the latter might therefore be beneficial for other studies in similar basins.
- The statistical assumption of independent and normally distributed residuals does not hold for simple hydrological models because of model structure deficits. In our case, it was possible to meet the statistical assumptions much better using a Box-Cox data transformation with a continuous-time autoregressive error model. This lumped error process is convenient and straightforward to implement.
- Flood predictions of IUH models in ungauged basins are difficult because predictive uncertainties are large. In our study, we found that predicted peak flows were up to 7 times higher than observations. This was reduced to 5 times with only a few discharge measurements.
- The separation of uncertainties is beneficial because it makes it possible to assess the individual error sources. The major contribution to the predictive uncertainties is input uncertainty, i.e. imprecise rainfall information. Model structure deficits rank second, whereas parameter uncertainties were found to be less important. Flow predictions will improve most with better rainfall information, for example from a denser network of rain gauges or microwave links.

Appendix A

$$\frac{dg}{dy} = (y + \lambda_2)^{\lambda_1 - 1} \qquad \text{(A5)}$$

where y is the runoff (observed or modelled), z is the forward-transformed runoff, and λ_1 and λ_2 are the Box-Cox transformation parameters. Note that g includes the identity (λ_1 = λ_2 = 1) and a log-transformation (λ_1 = λ_2 = 0) as special cases; y + λ_2 and z must be larger than zero for all values of y and z. The simplest assumption for the error process in Eq. (A2) is that ε_{t_i} is independently and identically distributed. However, it is often reported that this assumption does not hold for runoff (Yang et al., 2007, 2008). To consider autocorrelated error terms, a continuous autoregressive error model based on the Ornstein-Uhlenbeck process proposed by Yang et al. (2007, 2008) was used. Thereby, independence and normality are assigned not to the errors themselves but to the random disturbances, called innovations (Chatfield, 2003; Yang et al., 2007, 2008), where σ_{I_{t_i}} is the standard deviation of the innovation I_{t_i}, σ is the standard deviation of the error ε, and τ is the characteristic correlation time. In combination with the Box-Cox transformation, the following likelihood function results:

$$p(y \mid \theta, R) = \frac{1}{\sqrt{2\pi}} \, \frac{1}{\sigma} \exp\left( -\frac{1}{2} \left[ g(y_{t_0}) - g(y_{t_0}(\theta)) \right] \dots \right.$$

where y_t is an observation and y_t(θ) is the simulated response of the model at time t.
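The error model described in this appendix can be sketched in code. The following Python functions are an illustration under stated assumptions, not the implementation used in the study: the transformation is taken to be the two-parameter Box-Cox form g(y) = ((y + λ_1-th power of (y + λ_2)) − 1)/λ_1 written out in `box_cox`, whose derivative matches Eq. (A5), and the errors are assumed to follow the usual exact discretisation of an Ornstein-Uhlenbeck process, with the first error drawn from the stationary distribution. All function and argument names are hypothetical.

```python
import numpy as np

def box_cox(y, lam1, lam2):
    """Two-parameter Box-Cox transform: ((y + lam2)**lam1 - 1) / lam1, log for lam1 = 0."""
    y = np.asarray(y, dtype=float)
    if lam1 == 0.0:
        return np.log(y + lam2)
    return ((y + lam2) ** lam1 - 1.0) / lam1

def box_cox_derivative(y, lam1, lam2):
    """dg/dy as in Eq. (A5)."""
    return (np.asarray(y, dtype=float) + lam2) ** (lam1 - 1.0)

def log_likelihood(t, y_obs, y_sim, sigma, tau, lam1, lam2):
    """Log-likelihood with autocorrelated errors in the transformed space."""
    t = np.asarray(t, dtype=float)
    eps = box_cox(y_obs, lam1, lam2) - box_cox(y_sim, lam1, lam2)
    phi = np.exp(-np.diff(t) / tau)  # damping between consecutive observations
    # First error: stationary sd sigma; later ones: sd of the innovations
    sd = np.concatenate(([sigma], sigma * np.sqrt(1.0 - phi ** 2)))
    innov = np.concatenate(([eps[0]], eps[1:] - phi * eps[:-1]))
    log_norm = -0.5 * np.log(2.0 * np.pi) - np.log(sd) - 0.5 * (innov / sd) ** 2
    # Jacobian of the transformation, evaluated at the observations
    log_jac = np.log(box_cox_derivative(y_obs, lam1, lam2))
    return float(np.sum(log_norm + log_jac))

# Hypothetical, irregularly spaced example (times in minutes, flows in m3/s)
t = np.array([0.0, 10.0, 20.0, 40.0, 50.0])
y_obs = np.array([0.5, 1.2, 2.8, 1.9, 1.1])
y_good = np.array([0.5, 1.1, 2.9, 1.8, 1.1])
y_poor = 2.0 * y_obs
ll_good = log_likelihood(t, y_obs, y_good, sigma=0.4, tau=400.0, lam1=0.35, lam2=0.0)
ll_poor = log_likelihood(t, y_obs, y_poor, sigma=0.4, tau=400.0, lam1=0.35, lam2=0.0)
print(ll_good > ll_poor)
```

Because only time differences enter through `np.diff(t)`, the same function handles records with gaps, which is the practical advantage of the continuous-time formulation.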