Assessing rating-curve uncertainty and its effects on hydraulic model calibration

This study considers the overall uncertainty affecting river flow measurements and proposes a framework for analysing the uncertainty of rating-curves and its effects on the calibration of numerical hydraulic models. The uncertainty associated with rating-curves is often considered negligible relative to other approximations affecting hydraulic studies, even though recent studies point out that rating-curve uncertainty may be significant. This study refers to a ∼240 km reach of the River Po and simulates ten different historical flood events by means of a quasi-two-dimensional (quasi-2-D) hydraulic model in order to generate 50 synthetic measurement campaigns (5 campaigns per event) at the gauged cross-section of interest (i.e. the Cremona streamgauge). For each synthetic campaign, two different procedures for rating-curve estimation are applied after corrupting simulated discharges according to the indications reported in the literature on the accuracy of discharge measurements, and the uncertainty associated with each procedure is then quantified. To investigate the propagation of rating-curve uncertainty to the calibration of Manning's roughness coefficients, further model simulations are run downstream of the Cremona cross-section. Results highlight the significant role of extrapolation errors and how rating-curve uncertainty may be responsible for estimating unrealistic roughness coefficients. Finally, the uncertainty of these coefficients is analysed and discussed relative to the variability of Manning's coefficient reported in the literature for large natural streams.


Introduction
During the last decades, increased computational resources and advances in numerical modelling have led to the spread of hydrological and hydraulic models of differing complexity (e.g. one-dimensional, 1-D, models: MIKE11, Danish Hydraulic Institute, 2003; HEC-RAS, Hydrologic Engineering Center, 2001; quasi-two-dimensional, quasi-2-D, or fully 2-D models: LISFLOOD-FP, Bates and De Roo, 2000; TELEMAC, Galland et al., 1991). Nevertheless, the capability of mathematical models to reproduce the hydraulic behaviour of natural rivers is closely related to the availability and accuracy of the observed streamflow data used for calibrating and validating the models themselves. In this context streamflow data play a dominant role, and the accurate set-up of a stage-discharge relation at a specific gauging station becomes of utmost importance for the reliability of results (e.g. Pappenberger et al., 2006; Herschy, 2002).
Usually, the streamflow hydrograph for a specific gauging station and flood event is calculated by converting measured water levels into flow rates by means of an existing stage-discharge relation, or rating-curve. The curve is generally calibrated over a series of h(t) − Q(t) pairs, where h(t) is the water level measured at time t and Q(t) the concurrent river discharge, which, in turn, is often estimated through the velocity-area method (Herschy, 1999; Fenton and Keller, 2001). Even though Q(t) values are not direct measurements, but rather estimates of the real and unknown discharge values, they are seldom associated with a statement of their uncertainty in practical applications (Herschy, 2002). For instance, uncertainty affects the velocity-area method (the most widely used method for discharge records, see European ISO EN Rule 748, 1997, ISO 748:97; Sauer and Meyer, 1992), the mathematical interpolation of h(t) − Q(t) pairs, as well as the extrapolation of the curve beyond observed data (see also Pelletier, 1987, and references therein). Furthermore, the construction of stage-discharge relationships is based on several assumptions, some of which inevitably introduce simplifications and errors. Inaccuracy may arise, for example, from instruments not always working in ideal conditions (Schmidt, 2002). Besides, errors may be associated with measurements of water level and of the width of the river cross-section (Sefe, 1996). The velocity-area method for discharge estimation introduces a number of approximations that are associated, for example, with the finite number of verticals into which the cross-section is divided and with the limited number of measurement points along each vertical (see e.g. Herschy, 2002; ISO 748:97); wind and suspended sediments could also alter velocity measurements. In particular, the application of the velocity-area method refers to a hypothetical steady-flow condition, which does not guarantee an accurate estimation of the real unsteady stage-discharge relationship, especially in cases of very mild river slopes characterised by wide loop-rating curves (e.g. Dottori et al., 2009). Again, the geometry of the gauging cross-section is assumed to be stable in time, even though significant changes may occur during high flood events due to erosion, sediment transport and deposition.
The literature reports several studies focussing on the analysis of the different error sources and the global uncertainty affecting discharge measurements and rating-curve construction (e.g. Di Baldassarre and Montanari, 2009; Pappenberger et al., 2006; Di Baldassarre and Claps, 2011). Leonard et al. (2000), Schmidt (2002) and Herschy (2002), for example, indicated that errors in discharge measurements are approximately 6 % of the flow value provided by the current meter. Pelletier (1987) reviewed more than 140 publications and maintained that, depending on many operational factors (e.g. number of verticals and sampling points, current velocity, exposure time of instruments, location of the gauged section and many others), the uncertainty of discharge measurements might be as high as 20 % of the observed value. More recently, the International Organization for Standardization provided an estimate of the overall uncertainty affecting discharge measurements due to the application of the velocity-area method (ISO 748:97).
Also, resorting to rating-curves to convert river stage levels into flow rates inevitably introduces an additional source of uncertainty that depends on the number of field observations available and on the mathematical expression adopted to describe them. Furthermore, since discharge measurements are often impracticable during high floods, extrapolation errors are generally introduced. Interpolation and extrapolation errors are generally not negligible. For instance, Di Baldassarre and Montanari (2009) estimated average interpolation and extrapolation errors for a reach of the River Po through steady-state simulation and quantified them as 1.7 % and 13.8 % of Q(t), respectively. Despite that, flow hydrographs calculated by means of rating-curves are often used as error- and uncertainty-free upstream boundary conditions in numerical hydraulic modelling.
Even though recent years have seen increasing attention from researchers to uncertainty in hydrology and its effects on hydrological modelling (see e.g. Montanari, 2007; Montanari and Brath, 2004; Pappenberger et al., 2006), only a few attempts have been made to evaluate the effects of streamflow data uncertainty on numerical hydraulic modelling, although these effects could significantly impact or undermine the reliability of the numerical models themselves (see e.g. Di Baldassarre and Montanari, 2009). Agencies in charge of hydroclimatic monitoring usually do not provide users with indications of the uncertainty associated with rating-curves, and instead treat observed data in a deterministic way.
This analysis addresses three main goals, which are reflected in the structure of the manuscript: (1) to develop a numerical procedure for quantifying the uncertainty associated with a given rating-curve; (2) by applying the procedure proposed at point (1), to compare the uncertainty associated with two different approaches to rating-curve estimation; (3) to analyse how rating-curve uncertainty propagates to Manning's roughness coefficients during the calibration of numerical hydraulic models.
The three goals are addressed for the Cremona streamgauge, located along the middle-lower reach of the River Po in Italy.

Objectives and methods
Our study refers to discharge data evaluated through the velocity-area method, one of the most widely used techniques for the determination of discharge in natural rivers (see e.g. Herschy, 1978; Pelletier, 1987; Sauer and Meyer, 1992). The literature reports many different approaches for accurately measuring river streamflows and for constructing rating-curves (see e.g. Rantz et al., 1982; Dottori et al., 2009; Perumal et al., 2010).
Although the literature presents a number of mathematical expressions for relating water levels to flow rates in a given cross-section (see e.g. Ackers et al., 1978; Petersen-Øverleir, 2004; Franchini and Ravagnini, 2007), we preferred to refer to the power law (Eq. 1), in light of its simplicity and wide use (e.g. Petersen-Øverleir, 2004; Schmidt and Yen, 2009). The power law expresses streamflow Q as a power function of (h − e), where h is the water level above a vertical reference and e is the level corresponding to zero flow rate above the same reference.
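For reference, a power-law stage-discharge relation consistent with the definitions of h and e given above is commonly written in the form below. The symbols \alpha (coefficient) and \beta (exponent) are assumed names for the fitted parameters and are not taken from the text.

```latex
Q = \alpha \, (h - e)^{\beta}, \qquad h > e
```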

Uncertainty of discharge measurements
No discharge measurement in an open-channel cross-section is free of errors. While it is not possible to predict this error exactly, its likely magnitude may be estimated by analysing the individual velocity measurements that are required to estimate the river discharge. ISO 748:97 provides some quantitative indications on the main error sources. These indications are summarized by an equation combining the following terms: X_b expresses the random uncertainty related to the measurement of cross-section width, and X_d represents the uncertainty in the measurement of water depth along each vertical into which the river cross-section is divided. Furthermore, several error sources are associated with the measurement of the mean stream velocity through a current-meter: X_e is related to the duration of the measurement, X_m depends on the number of verticals, X_p is a function of the number of measurement points along each vertical, and X_c is associated with current-meter calibration.
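The combination of these error sources can be sketched as follows. The snippet assumes the simplified root-sum-of-squares form commonly quoted from ISO 748 for m verticals carrying roughly equal segment discharges, X_Q = sqrt(X_m^2 + (X_b^2 + X_d^2 + X_e^2 + X_p^2 + X_c^2)/m). The symbol X_Q, the variable m, the function name and the input percentages are illustrative assumptions, not values from the text.

```python
import math

def combined_discharge_uncertainty(x_m, x_b, x_d, x_e, x_p, x_c, m):
    """Combine percentage uncertainties into an overall discharge uncertainty X_Q.

    Assumed form: X_Q = sqrt(X_m^2 + (X_b^2 + X_d^2 + X_e^2 + X_p^2 + X_c^2) / m),
    where m is the number of verticals.
    """
    if m < 1:
        raise ValueError("m must be at least 1")
    per_vertical = x_b**2 + x_d**2 + x_e**2 + x_p**2 + x_c**2
    return math.sqrt(x_m**2 + per_vertical / m)

# Hypothetical component uncertainties (percent), for illustration only.
x_q = combined_discharge_uncertainty(
    x_m=5.0, x_b=1.0, x_d=1.0, x_e=6.0, x_p=7.0, x_c=2.0, m=20
)
print(f"overall uncertainty: {x_q:.2f} %")
```

The per-vertical terms are divided by m because they are treated as independent from one vertical to the next, so their contribution shrinks as more verticals are used.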
Under the assumption that measurement errors are normally distributed, ISO 748:97 indicates that the uncertainty interval of discharge measurements is equal to 5.3 % of the discharge value at the 95 % confidence level when at least 20 verticals are considered. This means that in 95 % of cases the true streamflow lies within ±5.3 % of the calculated value (i.e. within ±0.053 times that value), which corresponds to a standard deviation of about 2.7 % (5.3 %/1.96).
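The 2.7 % standard deviation adopted for the synthetic errors follows from reading the 5.3 % figure as the half-width of a two-sided 95 % interval of a normal distribution. A quick check under that reading (the function name is hypothetical):

```python
from statistics import NormalDist

def std_from_interval(half_width_percent, confidence=0.95):
    """Standard deviation of a zero-mean normal error whose two-sided
    interval at the given confidence level has the given half-width."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # about 1.96 for 95 %
    return half_width_percent / z

sigma = std_from_interval(5.3)
print(f"standard deviation: {sigma:.1f} %")  # prints "standard deviation: 2.7 %"
```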

Rating-curve construction
A rating-curve, or stage-discharge relation, is identified for a given cross-section by interpolating measured discharges and concurrent observations of water depths. Since rating-curves are normally used to convert river stage observations into discharge values, uncertainty in these curves results in errors in streamflow hydrographs, which, in turn, practitioners may use for a number of practical applications.
European ISO EN Rule 1100-2 (1998, ISO 1100-2:98) provides guidelines for correct rating-curve construction, indicating the optimal characteristics and amount of measured data. In particular, the rule indicates that a measurement campaign should consist of at least 15 h(t) − Q(t) pairs, uniformly distributed within the range of measurable streamflows, given that, for practical reasons, no measurements are generally taken during large to extreme flood events (e.g. Kuczera, 1996; Rantz et al., 1982). In our study we explicitly refer to these indications (see Sect. 2.3).
Concerning the actual rating-curve construction, previous studies point out the importance of the extrapolation error associated with the use of the curve beyond the range of observed data (e.g. Di Baldassarre and Montanari, 2009; ISO 1100-2:98; Herschy, 2002). Uncertainty due to extrapolation may vary significantly depending on the approach used for the construction of the curves. In order to better understand this component of uncertainty we consider two different approaches to rating-curve construction, which we term the Traditional and Constrained approaches.
The Traditional approach follows ISO 1100-2:98 guidelines and refers in our study to the power law (Eq. 1). Equation parameters are estimated over a set of at least 15 h(t) − Q(t) pairs by means of the least squares method. It is worth noting here that the uncertainty of Traditional rating-curves due to extrapolation might be particularly significant.
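A minimal sketch of a least-squares fit in the spirit of the Traditional approach is given below. It makes two simplifying assumptions that are not stated in the text: the zero-flow level e is treated as known, and residuals are minimised in log space (which turns the power law into a straight line) rather than on Q directly. The names alpha, beta and fit_power_law, as well as the data, are hypothetical.

```python
import math

def fit_power_law(levels, discharges, e):
    """Least-squares fit of log Q = log(alpha) + beta * log(h - e).

    Returns (alpha, beta). The zero-flow level e is treated as known.
    """
    x = [math.log(h - e) for h in levels]
    y = [math.log(q) for q in discharges]
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = math.exp(y_mean - beta * x_mean)
    return alpha, beta

# Hypothetical campaign: 15 error-free pairs generated with
# alpha = 120, beta = 1.6 and e = 0.5 m, so the fit should recover them.
levels = [3.0 + 0.4 * i for i in range(15)]
discharges = [120.0 * (h - 0.5) ** 1.6 for h in levels]
alpha, beta = fit_power_law(levels, discharges, e=0.5)
print(alpha, beta)
```

Because nothing ties the curve down beyond the largest observed stage, small changes in the fitted exponent translate into large differences once the curve is extrapolated.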
The Constrained approach uses the largest discharge observation in the set together with the associated water level to calibrate an ad hoc 1-D steady-state hydraulic model that extends upstream and downstream of the cross-section of interest (i.e. the considered gauging station) to limit the effects of boundary conditions (the length of the reach may vary depending on local conditions, see e.g. Castellarin et al., 2009). The calibrated 1-D model is then used to evaluate the maximum discharge capacity Q_max of the cross-section of interest (maximum steady-state discharge contained within lateral embankments) and its corresponding water level. The additional pair h_max − Q_max is then used to constrain the estimation of the parameters of Eq. (1), which are identified by fitting the (at least) 15 observed h(t) − Q(t) pairs by minimizing the sum of squared residuals while concurrently forcing the curve through h_max − Q_max. Concerning this approach, Di Baldassarre and Claps (2011) performed some numerical experiments on the applicability of a rating curve to high flood events. They pointed out that the indirect measurement of discharges beyond the measurement range should rely on a physically based model rather than on the Traditional approach of extrapolating rating-curves, also suggesting that the use of a calibrated hydraulic model to extrapolate the rating-curve could be a good operational strategy for reducing overall uncertainty.
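The constrained fit can be sketched in the same simplified setting: e is assumed known, residuals are minimised in log space, and the curve is forced exactly through (h_max, Q_max). With these assumptions the constraint fixes the coefficient, so only the exponent remains free and is obtained from a regression through the origin. All names and numbers are hypothetical; in the study Q_max comes from the 1-D model, which is not reproduced here.

```python
import math

def fit_constrained_power_law(levels, discharges, e, h_max, q_max):
    """Fit Q = alpha * (h - e)**beta so that the curve passes exactly
    through (h_max, q_max); beta minimises squared residuals in log space."""
    x0, y0 = math.log(h_max - e), math.log(q_max)
    dx = [math.log(h - e) - x0 for h in levels]
    dy = [math.log(q) - y0 for q in discharges]
    # Regression through the origin in coordinates shifted to the constraint point.
    beta = sum(a * b for a, b in zip(dx, dy)) / sum(a * a for a in dx)
    alpha = q_max / (h_max - e) ** beta
    return alpha, beta

# Hypothetical data: 15 pairs and a constraint point on the same power law.
e = 0.5
levels = [3.0 + 0.4 * i for i in range(15)]
discharges = [120.0 * (h - e) ** 1.6 for h in levels]
h_max = 12.0
q_max = 120.0 * (h_max - e) ** 1.6
alpha_c, beta_c = fit_constrained_power_law(levels, discharges, e, h_max, q_max)
print(alpha_c, beta_c)
```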

Assessment of rating-curves global uncertainty
We propose to evaluate the global uncertainty associated with a given rating-curve by referring to a number of synthetic discharge measurement campaigns, each one consisting of 15 synthetic h(t) − Q*(t) pairs (see ISO EN Rule 1100-2, 1998). The synthetic "true" h(t) − Q(t) pairs are generated by means of numerical simulations through a suitable numerical hydrodynamic model, for which the study streamgauge represents an internal cross-section. Synthetic streamflow observations Q*(t) are then obtained by corrupting simulated discharges Q(t) at the cross-section of interest with a normally distributed random error with zero mean and 2.7 % standard deviation (see ISO EN Rule 748, 1997 and Sect. 2.1). The Traditional and Constrained approaches can be applied to fit Eq. (1) to all synthetic measurement campaigns. The variability of the resulting rating-curves enables one to define the 90 % confidence intervals around the expected rating-curve for the study streamgauge and for each of the approaches.
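The corruption step and the construction of a 90 % envelope can be sketched as below. This is an illustration under stated assumptions: the error is applied multiplicatively as Q* = Q (1 + eps) with eps drawn from N(0, 0.027^2), percentiles use linear interpolation, and, to keep the example short, the corrupted discharges themselves stand in for the fitted rating-curves that the procedure would evaluate on a common stage grid. Function names and discharge values are hypothetical.

```python
import random

def corrupt(discharges, rel_sd=0.027, rng=random):
    """Multiply each discharge by (1 + eps), with eps ~ N(0, rel_sd^2)."""
    return [q * (1.0 + rng.gauss(0.0, rel_sd)) for q in discharges]

def percentile(values, p):
    """Linear-interpolation percentile (p in [0, 100]) of a list."""
    s = sorted(values)
    k = (len(s) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (k - lo)

def band(curves, lower=5.0, upper=95.0):
    """Per-stage lower, median and upper percentiles across curves
    that are all evaluated on the same stage grid."""
    columns = list(zip(*curves))
    return ([percentile(c, lower) for c in columns],
            [percentile(c, 50.0) for c in columns],
            [percentile(c, upper) for c in columns])

rng = random.Random(42)                  # fixed seed for repeatability
true_q = [1000.0, 2000.0, 4000.0]        # hypothetical simulated discharges
curves = [corrupt(true_q, rng=rng) for _ in range(50)]
q05, q50, q95 = band(curves)
```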
It is worth remarking here that the proposed approach quantifies rating-curve global uncertainty under a series of fundamental assumptions: the overall measurement error is normally distributed; the hypothesized current-meters work in ideal conditions and systematic errors are excluded; flow is orthogonal to the cross-section; the river-bed geometry is stable; sediment transport and wind are neglected; the effects of unsteady flow conditions (hysteresis in unsteady rating-curves) are neglected, as are the effects due to seasonal variation of the Manning roughness coefficient (see e.g. Di Baldassarre and Montanari, 2009).
An assessment of overall rating-curve uncertainty is a valuable piece of information that can be of use in a number of practical applications. For instance, the classical literature presents Manning's roughness coefficients as physically interpretable parameters that can be identified on the basis of river-bed characteristics (e.g. vegetation, sinuosity, sediment diameter, etc., Chow, 1959). Recent studies point out that the roughness coefficient should rather be regarded as a mere calibration coefficient, which compensates for several error sources while describing roughness conditions (e.g. structure of the model, uncertainty in input data and boundary conditions, accuracy of the description of river-bed geometry, etc.). As a result, calibration of roughness coefficients may identify optimal values that are not physically interpretable or justifiable (see e.g. Di Baldassarre et al., 2010). As an application example, we illustrate how uncertainty in streamflow hydrographs propagates to Manning's roughness coefficient, n, during the calibration of hydrodynamic models.

Study area
Our analysis focuses on the streamgauge located in Cremona, along the Po River (see Fig. 1). The River Po, the longest Italian river, flows ∼650 km eastward across northern Italy, from the western Alps to the Adriatic Sea near Venice. Its drainage basin area, ∼71 000 km^2, is the largest in Italy. Cremona belongs to the Po's middle-lower reach (see Fig. 1), which is characterised by a stable main channel with a width ranging from 200 to 500 m. The floodplain, whose overall width varies from 200 m to 5 km, is confined by two continuous artificial main embankments. The embanked floodplain is densely cultivated, and a large portion of it is protected against frequent flooding by a complex system of minor dykes (dyke-protected floodplains), which are mainly located between Cremona and Borgoforte (total retention volume: ∼450 Mm^3; Castellarin et al., 2011a,b). These features make a standard one-dimensional hydrodynamic model unsuitable for representing the complex hydraulic behaviour of the system (i.e. main channel, secondary channels and dyke-protected floodplains) during major flood events (Castellarin et al., 2011a).
In October 2000 the River Po and some of its major tributaries experienced the second most important flood event of the last 50 yr, producing a peak flow of about 12 240 m^3 s^-1 at Piacenza, 11 850 m^3 s^-1 at Cremona, and 9750 m^3 s^-1 at Pontelagoscuro. The flood event is well documented in terms of water level and flow hydrographs (Castellarin et al., 2011a,b).

Hydrodynamic models
We use two different quasi-two-dimensional (quasi-2-D) hydraulic models and a simplified 1-D model in our study. Both quasi-2-D models are built using the UNET code (Barkau, 1997), which numerically solves the Saint-Venant equations through the classical Preissmann implicit four-point finite difference scheme, but they refer to two different reaches of the Po River. The first quasi-2-D model refers to the reach from Piacenza to Pontelagoscuro (see Fig. 1) and is used in the study to generate synthetic measurement

Reference model for rating-curve identification (Piacenza-Pontelagoscuro model)
The Piacenza-Pontelagoscuro quasi-2-D model covers a ∼240 km reach of the River Po from Piacenza, the upper cross-section, to Pontelagoscuro (outlined in grey in the lower panel of Fig. 1). Dyke-protected floodplains are modelled as storage areas, connected to the main channel by means of lateral weirs, which represent the minor dyke elevations. All geometric data needed for the implementation of the model are retrieved by analysing a 2 m DTM in a GIS environment (see Castellarin et al., 2011a,b for details).
The numerical model was calibrated for the flood event of October 2000 in light of the event magnitude, whose recurrence interval is ∼50 yr, and the completeness of the available flood data, which include streamflow hydrographs for the major tributaries represented as lateral inflows (see Figs. 1 and 2; Castellarin et al., 2011a,b). The calibration focused on the reproduction of high water marks surveyed in the flood aftermath at 132 cross-sections, and of stage hydrographs at three internal cross-sections (Casalmaggiore, Boretto and Borgoforte). The model adopts three Manning's roughness coefficients: n_f for unprotected floodplains, n_u for the upper ∼170 km reach, and n_l for the lower ∼70 km reach (see Fig. 1); the subdivision of the study reach into an upper and a lower portion reflects the morphology of the river-bed. The best performance is obtained with n_f = 0.1 m^-1/3 s, n_u = 0.041 m^-1/3 s, and n_l = 0.032 m^-1/3 s (see Table 1, Calibration Event, CE). These values agree with those recommended in the literature for large rivers (see e.g. Pappenberger et al., 2006; Di Baldassarre et al., 2009) and with roughness coefficients obtained for the same reach in previous studies (see e.g. Castellarin et al., 2009, 2011a).
A. Domeneghetti et al.: Assessing rating-curve uncertainty and its effects on hydraulic model calibration
We then used the Piacenza-Pontelagoscuro model for simulating 10 significant historical flood events observed from 1951 to 1982, for which discharge hydrographs are available at the Piacenza streamgauge. Figure 2 (right panel) reports, as an example, the flow hydrographs observed in 1951 (largest observed peak flow: 12 850 m^3 s^-1) and 1970 (smallest peak flow of the set: 2700 m^3 s^-1). From these 10 events we generated 50 synthetic field campaigns (5 for each simulated flood event) at the internal cross-section of Cremona, which is located 47 km downstream of Piacenza. Each synthetic field campaign consists of 15 discharge-water level pairs, randomly selected within the flood wave during both rising and recession limbs and for discharge values 1000 m^3 s^-1 ≤ Q ≤ 6000 m^3 s^-1, which is the interval of streamflow values for which discharge measurements are practically executable at the Cremona streamgauge (e.g. Di Baldassarre and Montanari, 2009). Discharge values retrieved from model simulations were then corrupted as described and used as the data set for rating-curve construction.
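The selection of a synthetic campaign can be sketched as a random draw of 15 pairs restricted to the measurable discharge range. The function name and the linear stage-discharge series used as a stand-in for a simulated flood wave are hypothetical; the sketch does not reproduce how the study balances rising and recession limbs.

```python
import random

def sample_campaign(pairs, n=15, q_min=1000.0, q_max=6000.0, rng=random):
    """Randomly pick n (h, Q) pairs with q_min <= Q <= q_max
    from the pairs simulated for one flood event."""
    eligible = [(h, q) for h, q in pairs if q_min <= q <= q_max]
    if len(eligible) < n:
        raise ValueError("not enough pairs in the measurable range")
    return rng.sample(eligible, n)

# Hypothetical simulated pairs (stage in m, discharge in m^3 s^-1).
pairs = [(20.0 + 0.001 * q, float(q)) for q in range(500, 8000, 50)]
rng = random.Random(1)  # fixed seed for repeatability
campaigns = [sample_campaign(pairs, rng=rng) for _ in range(5)]  # 5 campaigns per event
```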
Referring to the presented procedure for the evaluation of synthetic campaigns, it is worth noting that all h(t) − Q(t) pairs were reproduced by the calibrated quasi-2-D model using a single calibrated Manning's coefficient. The evaluation of h(t) − Q(t) pairs using a quasi-2-D model calibrated for high-flow events inevitably introduces uncertainty, which is expected to be higher for low-flow conditions. Moramarco and Singh (2010) analysed this aspect by evaluating the trend of Manning's coefficient for two river sites along the Tiber River, and they highlighted that the n value decreases with increasing flow depth (and hence increasing discharge), showing an asymptotic behaviour for high water levels. The same behaviour has been observed at the Cremona river cross-section, where the Piacenza-Pontelagoscuro quasi-2-D model has been calibrated for different steady-flow conditions. Referring to a set of observed h(t) − Q(t) pairs corresponding to different flow conditions (ARPA-RER, 2006), Fig. 3 reports calibrated roughness coefficients at the Cremona cross-section. As also observed by Moramarco and Singh (2010), n values decrease with increasing flow rates, tending asymptotically to a constant value (n ∼ 0.044 m^-1/3 s) for high-flow conditions. Once the water depth in the river exceeds the level at which this asymptote is reached, the calibrated Manning coefficient can be considered constant.
The vertical line in Fig. 3 (grey dashed line) defines the lower bound of the flow-rate range considered in the synthetic campaigns (1000 m^3 s^-1 ≤ Q ≤ 6000 m^3 s^-1). As one may note, almost all h(t) − Q(t) pairs used for rating-curve construction refer to hydraulic conditions where the Manning value has already reached the asymptote. On the basis of these results we can assume that the range of flows considered in this study is not affected by a decrease of Manning values, enabling us to consider the synthetic measurements free of distortions and suitable for rating-curve construction. Nevertheless, this is an important point, particularly when medium and low flows are considered, and it will be addressed in future analyses.
Crosses in both panels of Fig. 4 illustrate two examples of synthetic measurement campaigns, whereas the grey dots in Fig. 5 represent the full set of simulated discharge-water level pairs for all 10 flood events at a half-hourly timescale, showing the loop-rating (hysteresis) that characterizes the unsteady stage-discharge relation at natural cross-sections in relatively flat-sloped streams.
Concerning this set of simulations, we are aware that the considered flood events span a large time interval, within which anthropogenic or natural modifications of the river-bed geometry have certainly occurred. Nevertheless, the quasi-2-D hydrodynamic model calibrated for the year 2000 is used here as a tool to generate realistic h(t) − Q(t) pairs at the Cremona cross-section, while the observed flood events are used as plausible hydrological boundary conditions.
The set of 50 synthetic campaigns, each with 15 pairs of h(t) − Q(t) values, was first used for rating-curve construction under the Traditional approach and then supplied to the 1-D model.

Model for constraining the empirical rating-curve (Cremona 1-D model)
Considering the steady-state Cremona 1-D model (black box and line in the lower panel of Fig. 1), the Constrained approach for rating-curve construction (see Sect. 2.2) is based on the estimation of the h_max − Q_max pair, in which Q_max represents the maximum river discharge capacity at the Cremona cross-section. The h_max − Q_max pair is estimated by means of the 1-D model by setting the freeboard at the Cremona cross-section to zero. The 1-D model extends ∼10 km upstream and ∼50 km downstream of the Cremona streamgauge to exclude influences of boundary conditions at the cross-section of interest, and is calibrated with reference to the h(t) − Q*(t) pair showing the largest corrupted discharge value. We repeated the 1-D model calibration for every synthetic measurement campaign (i.e. 50 times, on the basis of 50 different h(t) − Q*(t) pairs) to obtain an h_max − Q_max pair for each campaign. In this way the 1-D model provides additional information to physically constrain the identification of the mathematical expression representing the rating-curve.

Model for assessing the propagation of rating-curve uncertainty (Cremona-Pontelagoscuro model)
Finally, we adopted the Cremona-Pontelagoscuro quasi-2-D model, which extends from Cremona to Pontelagoscuro (∼190 km), to assess the propagation of rating-curve uncertainty to calibrated Manning's coefficients through numerical simulation. Each calibration of the Cremona-Pontelagoscuro model focussed on the identification of n_u and n_l, while n_f is assumed invariable and equal to 0.1 m^-1/3 s. We calibrated the Cremona-Pontelagoscuro model for the 2000 flood event by optimizing the output of the model relative to high water marks recorded at 102 cross-sections and stage hydrographs observed at three internal cross-sections. Keeping n_f constant simplifies the analysis and, above all, is in agreement with experiences reported in the literature, which show that the performance of 1-D and quasi-2-D models is in many cases relatively insensitive to floodplain roughness (Pappenberger et al., 2006; Castellarin et al., 2009, 2011a).

Global uncertainty of Cremona's rating-curve
Considering the results obtained from the Traditional approach to rating-curve construction, the left and right panels of Fig. 4 report two examples of empirical rating-curves (thin black lines) constructed by fitting Eq. (1) to synthetic data (black stars) for two events characterized by different magnitudes, showing a large extrapolated portion without any data, i.e. for Q > 6000 or 3000 m^3 s^-1, respectively. Figure 5 reports the non-parametric estimate of a steady-state rating-curve (blue line in the figure, hereafter referred to as the "normal rating-curve"), obtained as a recursive running mean (window width: 10 Q(t) values; 4 iterations) of all h(t) − Q(t) pairs simulated for the historical events (grey circles) by means of the calibrated Piacenza-Pontelagoscuro model.
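A recursive running mean of this kind can be sketched as follows, using the stated window of 10 values and 4 iterations. The details are assumptions made for illustration: pairs are sorted by stage, only Q is smoothed, the window is approximately centred and shrinks near the ends of the series, and the function names and data are hypothetical.

```python
def running_mean(values, window=10):
    """Approximately centred moving average; the window shrinks near
    the ends of the series."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out

def normal_rating_curve(pairs, window=10, iterations=4):
    """Sort (h, Q) pairs by stage and smooth Q repeatedly with a running mean."""
    ordered = sorted(pairs)
    h = [p[0] for p in ordered]
    q = [p[1] for p in ordered]
    for _ in range(iterations):
        q = running_mean(q, window)
    return list(zip(h, q))

# Hypothetical looped pairs: discharge alternates around a smooth trend.
pairs = [(20.0 + 0.05 * i, 1000.0 + 40.0 * i + (150.0 if i % 2 else -150.0))
         for i in range(100)]
curve = normal_rating_curve(pairs)
```

Repeating the smoothing narrows the scatter of the looped pairs around a single-valued curve while keeping one point per original stage value.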
The left panel of Fig. 6 reports the median empirical rating-curve (red dashed line), together with the 5th and 95th percentiles of the 50 empirical rating-curves identified through the Traditional approach (black lines), hereafter also referred to as 5 TRC (Traditional Rating-Curve) and 95 TRC. Figure 6, left panel, also reports the normal rating-curve estimated at the same gauged section (blue line; same as in Fig. 5) and the full set of simulated h(t) − Q(t) pairs (grey circles; same pairs reported in Fig. 5). The comparison presented in Fig. 6, left panel, shows a rather significant negative bias for both the 5 TRC and 95 TRC rating-curves for discharge values higher than 4000-6000 m^3 s^-1. The left and right panels of Fig. 4 also illustrate two examples of empirical rating-curves identified by applying the Constrained approach (grey lines). As illustrated in Fig. 4, the Constrained approach fits Eq. (1) to the 15 synthetic h(t) − Q*(t) pairs while simultaneously forcing the equation through h_max − Q_max (black dot in Fig. 4). h_max is constant and represents the elevation of the lowest embankment crest at the Cremona cross-section, while Q_max depends on the calibration of the Cremona 1-D model for the particular set of 15 synthetic measurements (see previous section). The right panel of Fig. 6, similarly to the left panel, presents the normal rating-curve and the full set of simulated h(t) − Q(t) pairs, together with the median empirical rating-curve (red dashed line) and the 90 % confidence interval relative to the Constrained approach (black lines). Minimum and maximum Q_max values are illustrated as error bands, while the box-plot represents the whole distribution of Q_max values: the central line is the median value (∼12 330 m^3 s^-1), the box represents the interquartile range, IQR (50 % of the empirical values around the median), while whiskers indicate the extent of the sample aside from outliers (circles), defined as the values located more than 1.5 times the IQR from the upper or lower edge of the box.
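The box-plot quantities described above can be computed as in the sketch below. The quartile convention (the default "exclusive" method of Python's statistics.quantiles) is an assumption, since the text does not specify one, and the sample values are hypothetical rather than Q_max values from the study.

```python
from statistics import median, quantiles

def boxplot_summary(values):
    """Median, IQR, whisker ends and outliers using the 1.5 * IQR rule."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {
        "median": median(values),
        "iqr": iqr,
        "whiskers": (min(inside), max(inside)),
        "outliers": outliers,
    }

# Hypothetical sample with one clear outlier.
summary = boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 100])
print(summary)
```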
Figure 7 illustrates the bias of the Traditional (grey line) and Constrained (dashed line) median rating-curves relative to the normal rating-curve (blue curve in Figs. 5 and 6). Concerning the Traditional approach, underestimation prevails for our case study (negative bias) and the bias increases in absolute value with streamflow, showing a value smaller than −30 % for 12 000 m^3 s^-1. Concerning the Constrained approach, the bias is limited (∼ ±10 % for the streamflow values of interest); overestimation prevails for low streamflow values (i.e. 6000-9000 m^3 s^-1), while, for streamflow values higher than 9000 m^3 s^-1, the bias is negative (underestimation).
A comparison of the left and right panels of Fig. 6 shows that the application of the Constrained approach significantly narrows the confidence interval relative to the Traditional approach. This aspect is highlighted in Fig. 8, which depicts the width of the 90 % confidence intervals for the Traditional and Constrained approaches in terms of relative deviations from their median rating-curve as a function of river discharge. The Traditional approach shows a symmetric 90 % confidence interval (grey line in Fig. 8), while the confidence band is asymmetric for the Constrained approach.
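The two quantities plotted in Figs. 7 and 8 can be written as simple percentage deviations. The sketch below assumes that bias is measured relative to the normal rating-curve and that band limits are measured relative to the median curve, both at a common stage; the function names and numbers are hypothetical and are not read from the figures.

```python
def relative_bias(q_estimated, q_reference):
    """Percentage deviation of an estimated discharge from a reference value."""
    return 100.0 * (q_estimated - q_reference) / q_reference

def relative_band(q_lower, q_median, q_upper):
    """Lower and upper limits of a confidence band, expressed as
    percentage deviations from the median curve."""
    return relative_bias(q_lower, q_median), relative_bias(q_upper, q_median)

# Hypothetical values at a single stage (m^3 s^-1).
print(relative_bias(8400.0, 12000.0))           # prints -30.0
print(relative_band(9000.0, 10000.0, 11500.0))  # prints (-10.0, 15.0)
```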

Propagation of rating-curve uncertainty to calibrated Manning's coefficients
Concerning the Traditional approach, the left panel of Fig. 9 reports the streamflow hydrographs computed on the basis of the selected percentile rating-curves 5 TRC and 95 TRC (termed here the 5 TRC and 95 TRC hydrographs), which are used as upstream boundary conditions for the calibration of the Cremona-Pontelagoscuro quasi-2-D model (see Sect. 3.2), and compares them with the streamflow hydrograph simulated at Cremona for the 2000 flood event by the Piacenza-Pontelagoscuro quasi-2-D model. Both hydrographs are markedly lower than the simulated one, as expected due to extrapolation (see Figs. 5 and 6).
Table 1 shows the calibrated values of Manning's coefficients n_u and n_l for 5 TRC and 95 TRC, along with the calibrated values for the reference model (i.e. the Piacenza-Pontelagoscuro quasi-2-D model; Calibration Event, CE). The table shows variations relative to CE ranging from 10 % to 19 % for 95 TRC and from 46 % to 59 % for 5 TRC.
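A rough way to see why an underestimated inflow hydrograph pushes the calibrated roughness upwards is Manning's equation for steady uniform flow, Q = (1/n) A R^(2/3) S^(1/2): if the observed stage (and hence A and R) must be matched with a smaller discharge, n has to increase in inverse proportion. The sketch below is only this uniform-flow simplification, not the quasi-2-D calibration used in the study, and the geometry values are hypothetical.

```python
def manning_n(q, area, hydraulic_radius, slope):
    """Manning's n (m^-1/3 s) for steady uniform flow:
    n = A * R^(2/3) * S^(1/2) / Q."""
    return area * hydraulic_radius ** (2.0 / 3.0) * slope ** 0.5 / q

def n_change_percent(discharge_bias_percent):
    """Percentage change in n needed to keep the same stage when the
    discharge is biased by the given percentage (uniform-flow assumption)."""
    factor = 1.0 + discharge_bias_percent / 100.0
    return 100.0 * (1.0 / factor - 1.0)

# Hypothetical cross-section: A = 4000 m^2, R = 8 m, S = 0.0002.
n_ref = manning_n(10000.0, 4000.0, 8.0, 0.0002)
n_biased = manning_n(7000.0, 4000.0, 8.0, 0.0002)  # discharge underestimated by 30 %
print(f"{n_change_percent(-30.0):.1f} %")  # prints "42.9 %"
```

Under this simplification a discharge underestimated by 30 % calls for a roughness roughly 43 % higher, the same order of magnitude as the variations reported for 5 TRC.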
The same procedure was applied referring to the Constrained approach.5 CRC and 95 CRC stream-flow hydrographs (hydrographs retrieved from the 5th and 95th percentiles rating-curves estimated through the Constrained approach, respectively) are reported in the right panel of Fig. 9 and compared with the hydrograph simulated by the Piacenza-Pontelagoscuro model (red line).5 CRC and 95 CRC hydrographs were used as upstream boundary conditions for calibrating the Cremona-Pontelagoscuro quasi-2-D model.Results of calibration are reported in Table 1, and variations relative to CE range from −7 % to 3 % for 95 CRC and from 6 % to 13 % for 5 CRC.

Discussion
As expected, extrapolation error plays a dominant role in the overall uncertainty of rating-curves identified by means of the Traditional approach. Uncertainty in these cases is far from negligible (see Fig. 6, left panel, and Fig. 7). The Constrained approach reduces the overall uncertainty significantly, especially for stream-flow values in the extrapolation range (i.e. ≥ 6000 m³ s⁻¹ in our study), which is typically the case when design-flood events are investigated. Figures 7 and 8 quantify the reduction in bias and overall uncertainty obtained when moving from the Traditional approach to the so-called Constrained approach. The bias associated with the Traditional approach is remarkable, and it clearly increases as the magnitude of the events included in the discharge measurement campaigns decreases: the lower the measured maximum discharge, the greater the extrapolation error that may be made. The significant bias and overall uncertainty associated with rating-curves estimated through the Traditional approach therefore have a strong impact on practical applications of the curves, such as the calibration of roughness coefficients (see Table 1).
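The mechanism behind the extrapolation error can be sketched with a short Python example. It assumes a power-law rating-curve Q = a h^b fitted in log-log space, and it honours the capacity estimate by giving a heavily weighted (h_max, Q_max) pair to the regression. The functional form, the weighting strategy, the function names and the synthetic "true" curve are all assumptions made here for illustration; they are not necessarily the expressions or the constraint used in the study.

```python
import numpy as np

def fit_power_law(h, q, weights=None):
    """Least-squares fit of log Q = log a + b log h; returns (a, b)."""
    b, log_a = np.polyfit(np.log(h), np.log(q), 1, w=weights)
    return np.exp(log_a), b

def fit_traditional(h_obs, q_obs):
    """Fit only the measured stage-discharge pairs."""
    return fit_power_law(h_obs, q_obs)

def fit_constrained(h_obs, q_obs, h_max, q_max, weight=100.0):
    """Fit the measured pairs plus a heavily weighted capacity pair.

    The weight value is a hypothetical choice: the larger it is, the
    closer the fitted curve passes to (h_max, q_max).
    """
    h = np.append(h_obs, h_max)
    q = np.append(q_obs, q_max)
    w = np.append(np.ones(len(h_obs)), weight)
    return fit_power_law(h, q, w)

# Synthetic example: a hypothetical "true" curve that is not a pure
# power law, measured only at low stages and extrapolated to h_max.
def true_curve(h):
    return 50.0 * h**1.5 + 20.0 * h**3

h_obs = np.array([1.0, 2.0, 3.0, 4.0])
h_max = 8.0
a_t, b_t = fit_traditional(h_obs, true_curve(h_obs))
a_c, b_c = fit_constrained(h_obs, true_curve(h_obs), h_max, true_curve(h_max))

for label, a, b in [("traditional", a_t, b_t), ("constrained", a_c, b_c)]:
    error_pct = 100.0 * (a * h_max**b / true_curve(h_max) - 1.0)
    print(f"{label}: relative error at h_max = {error_pct:.1f} %")
```

In this synthetic case the curve fitted to low stages alone underestimates the discharge at h_max, while the constrained fit is anchored there by construction. With a different "true" curve the sign of the error could just as well be reversed.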
Reduced bias and small overall uncertainty characterize the empirical rating-curves estimated through the Constrained approach, which results in smaller uncertainty of the calibrated Manning's coefficients (see Table 1). Figure 9 shows rather clearly the better agreement between the optimal stream-flow hydrograph (i.e. simulated with the reference Piacenza-Pontelagoscuro quasi-2-D model) and the hydrographs retrieved from empirical rating-curves constructed through the Constrained approach.
Concerning the possible effects of rating-curve uncertainty on hydrodynamic model calibration, Table 2 reports reference values of Manning's roughness coefficient for large natural streams. A comparison of the values reported in Tables 1 and 2 may suggest three considerations: (Chow, 1959). As a concluding remark, it is worth highlighting that the sign of the bias associated with the approaches to the construction of rating-curves considered in this study (i.e. the Traditional and Constrained approaches) cannot be determined a priori. Underestimation prevails at the Cremona cross-section using the Traditional approach (see Figs. 5-7), but no general conclusion can be drawn, and the bias may have the opposite sign elsewhere when a Traditional approach is adopted (i.e. fitting a mathematical expression to the available set of measured data). Regardless of the sign of the expected bias associated with Traditional approaches (underestimation or overestimation), our study clearly points out that the bias and overall uncertainty associated with rating-curves can be dramatically reduced by constraining the identification of the rating-curve with information resulting from simplified hydraulic modelling, with significant advantages for practical applications (see calibration of roughness coefficients).

Conclusions
No measurement of a physical quantity is exact, or certain; hence it is always very important to quantify the deviation, or uncertainty, of the measured value relative to the unknown true value. Keeping this concept in mind, we focussed on the quantification of the overall uncertainty that normally affects river discharge measurements and stage-discharge relationships (i.e. rating-curves).
The European ISO rule 748:97 characterizes the expected error for discharge measurements when using the velocity-area method and assuming that the overall uncertainty depends on a number of component uncertainties that are all independent and normally distributed. Additional uncertainty comes into play when a rating-curve is identified from a set of observations of concurrent stage and discharge values.
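Under the assumption of independent, normally distributed components, the component uncertainties combine in quadrature, and a simulated discharge can be corrupted by a multiplicative normal error. The Python sketch below illustrates both steps. The function names and the numerical values in the usage lines are hypothetical, and the standard's own expression may weight individual components differently (for example by the number of verticals), which is not reproduced here.

```python
import numpy as np

def combined_uncertainty(components_pct):
    """Root-sum-of-squares of independent component uncertainties (%)."""
    return float(np.sqrt(np.sum(np.square(components_pct))))

def corrupt_discharge(q_true, rel_std, rng):
    """Return q_true * (1 + eps), with eps ~ N(0, rel_std).

    rel_std is the relative standard deviation of the measurement
    error (e.g. 0.05 for 5 %); rng is a numpy random Generator.
    """
    q_true = np.asarray(q_true, dtype=float)
    eps = rng.normal(0.0, rel_std, size=q_true.shape)
    return q_true * (1.0 + eps)

# Usage with hypothetical numbers (not taken from the study)
rng = np.random.default_rng(seed=1)
total_pct = combined_uncertainty([3.0, 4.0])  # two components, in %
q_measured = corrupt_discharge([2000.0, 4000.0, 6000.0], total_pct / 100.0, rng)
```

If an overall uncertainty is quoted at a given confidence level rather than as a standard deviation, it has to be converted to a standard deviation before being passed as `rel_std`.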
Rating-curves have a number of practical applications in hydrology, hydraulics and water resources management. For instance, hydrological rainfall-runoff models are usually parameterized on the basis of concurrent observations of rainfall and discharge; discharge observations in turn are generally derived from water-level observations by means of a rating-curve. Roughness coefficients of mathematical hydrodynamic models are calibrated by simulating historical events that are usually described in terms of boundary conditions, which include discharge hydrographs. Many studies point out that the uncertainty associated with discharge measurements and, more generally, rating-curves should not be neglected (e.g. Pelletier, 1987; Schmidt, 2002). Nevertheless, discharge time-series estimated from rating-curves are still treated deterministically by practitioners and researchers, and the literature presenting frameworks and procedures for quantitatively assessing this uncertainty is still sparse (e.g. Di Baldassarre and Montanari, 2009; Di Baldassarre and Claps, 2011; Pappenberger et al., 2006). We propose a general numerical procedure for quantifying rating-curve uncertainty by using numerical hydrodynamic models. The procedure enables one to quantify the global uncertainty of stage-discharge relationships on the basis of some common working hypotheses: instruments work in ideal conditions; systematic errors are neglected, as well as the presence of wind and sediment transport; the geometry of the gauge section is stable in time; unsteady effects (loop-ratings), seasonality of the riverbed roughness coefficient, and uncertainty in stage measurements are neglected.
We present an application of the proposed approach to the rating-curve of the Cremona streamgauge, located along the middle-lower reach of the largest Italian river, the River Po. The application enabled us to quantify the rating-curve uncertainty for the 90 % confidence interval at 5-8 % of the discharge value when the curve is estimated by fitting measured discharge and water-level pairs and by honouring an estimate of the cross-section maximum discharge capacity retrieved from a simplified steady-state numerical hydraulic model (referred to in our study as the Constrained approach to rating-curve estimation). The application also revealed that uncertainty can be much larger when the mathematical expression is identified by fitting the stage-discharge pairs in the range of measurable discharges, which are typically much lower than the discharges of interest for flood studies (referred to in our study as the Traditional approach to rating-curve estimation). In particular, as expected and as also pointed out in Di Baldassarre and Claps (2011), the analysis showed that the Traditional approach may be associated with a significant bias, which increases in absolute value as discharge increases beyond the measured data (extrapolation). Therefore, our analysis pointed out that rating-curve uncertainty is strongly controlled by the methodology selected to construct the curves themselves, regardless of the mathematical complexity of the expression used to fit the available observations.
The results highlight the significance of rating-curve uncertainty for practical applications, showing, as an example, the propagation of rating-curve uncertainty to calibrated roughness coefficients for hydrodynamic models. Again, limited reliability of rating-curves and streamflow hydrographs may result in calibrated roughness coefficients that are significantly different from values reported in the literature for natural streams. In other words, recent studies point out that roughness coefficients should not be regarded as physically based parameters but rather as statistical parameters that describe riverbed roughness conditions and concurrently compensate for the lack of accuracy in the description of riverbed geometry and other simplifying assumptions adopted in practical applications. This compensation may be responsible for unrealistic Manning's coefficients (Horritt and Bates, 2002; Pappenberger et al., 2005; Di Baldassarre et al., 2010). Nevertheless, our analysis showed through a numerical study that adopts as "truth" the output of the same quasi-2-D numerical model for which we then calibrate the roughness coefficients (i.e. no compensation of model errors and riverbed simplification is needed) that the propagation of rating-curve uncertainty alone may be responsible for calibrated Manning's coefficients that deviate significantly from values reported in the literature.
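A back-of-the-envelope check of this propagation can be made with Manning's formula for uniform flow, Q = (1/n) A R^(2/3) S^(1/2). This is a much cruder description than the quasi-2-D model used in the study, so the sketch below is only indicative. For a given observed stage (fixed A, R and S), the calibrated n scales as 1/Q, so a relative discharge bias δ changes n by 1/(1 + δ) − 1. The function names and the channel values in the usage lines are hypothetical.

```python
def manning_discharge(n, area, hydraulic_radius, slope):
    """Uniform-flow discharge from Manning's formula (SI units)."""
    return area * hydraulic_radius ** (2.0 / 3.0) * slope ** 0.5 / n

def roughness_change_pct(discharge_bias_pct):
    """Percent change in calibrated n caused by a percent bias in Q,
    for fixed stage, geometry and slope (uniform-flow assumption)."""
    delta = discharge_bias_pct / 100.0
    return 100.0 * (1.0 / (1.0 + delta) - 1.0)

# Usage with hypothetical channel properties (not taken from the study)
n_ref = 0.035
q_ref = manning_discharge(n_ref, area=3000.0, hydraulic_radius=8.0, slope=2e-4)
q_biased = 0.7 * q_ref                 # discharge underestimated by 30 %
n_cal = n_ref * q_ref / q_biased       # n that reproduces the same stage
print(f"change in n: {100.0 * (n_cal / n_ref - 1.0):.1f} %")
print(f"closed form: {roughness_change_pct(-30.0):.1f} %")
```

Under this simplification, a discharge underestimated by 30 % requires a roughness coefficient roughly 43 % larger to reproduce the same water levels, which shows how a biased upstream hydrograph alone can push calibrated coefficients away from literature values.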
This study is still preliminary, as it refers to a specific case study; further applications in different contexts are required to draw general conclusions and to relax some of the assumptions adopted in the study, such as the independence of the roughness coefficient from seasonality or flow depth. Nevertheless, the study provides practitioners with a general numerical procedure to evaluate the global rating-curve uncertainty, which can be easily implemented elsewhere. Finally, as a further asset for hydrological practice, it is worth emphasizing that the proposed procedure for estimating the overall rating-curve uncertainty can be directly applied to sets of h(t) − Q(t) pairs actually observed at a gauge section, thereby limiting possible bias, systematic errors or simplifications related to the application of numerical models.

Fig. 2. Boundary conditions for the hydraulic modelling. October 2000 flood event: flow hydrograph and stage hydrograph observed at Piacenza and Pontelagoscuro, respectively (left panel). Examples of historical flow hydrographs observed at Piacenza and used for generating synthetic measurement campaigns (right panel).

Fig. 3. Variation of calibrated Manning's roughness coefficients (n) at the Cremona cross-section in relation to different hydraulic conditions (river water discharge).

Fig. 6. Cremona cross-section: normal rating-curve (blue line); median rating-curve (red dashed line) for Traditional (left panel) and Constrained (right panel) approaches and corresponding 90 % confidence intervals (black lines); for the Constrained approach the diagram also reports the average h_max − Q_max pair (black point), range of simulated values (bands) and a detailed representation of all h_max − Q_max pairs (box-plot).

Table 1. Calibrated Manning's roughness coefficients for upper and lower reaches, n_u and n_l, and different models: quasi-2-D model from Piacenza to Pontelagoscuro (Calibration Event, CE); Cremona-Pontelagoscuro model calibrated referring respectively to Traditional rating-curves (5 TRC and 95 TRC), and Constrained rating-curves (5 CRC and 95 CRC).

Table 2. Manning's roughness coefficients for main channels in natural streams.