Information about rainfall–runoff processes is essential for hydrological analyses, modelling and water-management applications. A hydrological, or diagnostic, signature quantifies such information from observed data as an index value. Signatures are widely used, e.g. for catchment classification, model calibration and change detection. Uncertainties in the observed data – including measurement inaccuracy and representativeness as well as errors relating to data management – propagate to the signature values and reduce their information content. Subjective choices in the calculation method are a further source of uncertainty.

We review the uncertainties relevant to different signatures based on rainfall and flow data. We propose a generally applicable method to calculate these uncertainties based on Monte Carlo sampling and demonstrate it in two catchments for common signatures including rainfall–runoff thresholds, recession analysis and basic descriptive signatures of flow distribution and dynamics. Our intention is to contribute to awareness and knowledge of signature uncertainty, including typical sources, magnitude and methods for its assessment.

We found that the uncertainties were often large (i.e. typical intervals of

Information about rainfall–runoff processes in a catchment is essential for hydrological analyses, modelling and water-management applications. Such information derived as an index value from observed data series (rainfall, flow and/or other variables) is known as a hydrological or diagnostic signature and is widely used in both hydrology (Hrachowitz et al., 2013) and ecohydrology (Olden and Poff, 2003). The reliability of signature values depends on uncertainties in the data and calculation method, and some signatures may be particularly susceptible to uncertainty. Signature uncertainties have so far received little attention in the literature; therefore, guidance on how to assess uncertainty and typical uncertainty magnitudes would be valuable.

Signatures are used to identify dominant processes and to determine the strength, speed and spatiotemporal variability of the rainfall–runoff response. Common signatures describe the flow regime (e.g. flow duration curve, FDC, and recession characteristics) and the water balance (e.g. runoff ratio and catchment elasticity; Harman et al., 2011). Field studies have identified drivers of catchment function, such as a threshold response to antecedent wetness (Graham et al., 2010b; Penna et al., 2011; Tromp-van Meerveld and McDonnell, 2006a), which have been captured as signatures (McMillan et al., 2014). Signatures often incorporate multiple data types, including soft data (Seibert and McDonnell, 2002; Winsemius et al., 2009).

There is a long history of using flow signatures in ecohydrology to assess instream habitat including the seasonal streamflow pattern, and the timing, frequency and duration of extreme flows (e.g. Jowett and Duncan, 1990). Signatures are used to detect hydrological change, e.g. Archer and Newson (2002) used flow signatures to assess the impacts of upland afforestation and drainage. Signatures can define hydrological similarity between catchments (McDonnell and Woods, 2004; Sawicz et al., 2011; Wagener et al., 2007) and assist prediction in ungauged basins (Blöschl et al., 2013). Model calibration criteria using signatures are useful because they preserve information in measured data (Gupta et al., 2008; Refsgaard and Knudsen, 1996; Sugawara, 1979). Signatures used in calibration include the FDC (Westerberg et al., 2011), flow entropy (Pechlivanidis et al., 2012), the spectral density function (Montanari and Toth, 2007), and combinations of multiple signatures (Pokhrel et al., 2012). By using signatures that target individual modelling decisions, model components can be tested for compatibility with observed data (Clark et al., 2011; Coxon et al., 2013; Hrachowitz et al., 2014; Kavetski and Fenicia, 2011; Li and Sivapalan, 2011; McMillan et al., 2011). Hydrological signatures have been regionalised to ungauged basins and then used to constrain a model for the ungauged basin (Kapangaziwiri et al., 2012; Westerberg et al., 2014; Yadav et al., 2007).

Some authors have considered the effect of data uncertainty on hydrological signatures (Kauffeldt et al., 2013), particularly in model calibration. Blazkova and Beven (2009) incorporate uncertainties in signatures used as limits of acceptability to constrain hydrological models. Juston et al. (2014) investigate the impact of rating-curve uncertainty on FDCs and change detection for a Kenyan basin. They show that uncertainty in extrapolated high flows creates significant uncertainty in the FDC and the total annual flow. Kennard et al. (2010) discuss the uncertainties affecting ecohydrological flow signatures from measurement error, data retrieval and preprocessing, data quality, and the hydrologic metric estimation.

We present a short description of data uncertainties relevant to hydrological signatures (see McMillan et al., 2012, for a longer review). In general, data uncertainties stem from (1) measurement uncertainty (e.g. instrument inaccuracy or malfunction), (2) measurement representativeness for the variable under study (e.g. point rainfall compared to catchment average rainfall), and (3) data management uncertainty (e.g. data entry errors, filling of missing values or station coordinate errors). Errors arising from data management, equipment malfunction or human error can often be detected and corrected in quality control (Bengtsson and Milloti, 2010; Eischeid et al., 1995; Viney and Bates, 2004; Westerberg et al., 2010). However, some data errors (e.g. poorly calibrated or off-level rain gauges) are difficult to correct post hoc (Sieck et al., 2007). The calculation of some signatures requires subjective decisions that introduce extra uncertainty, for example storm identification criteria, data time step, and whether to split the data by month/season (e.g. Stoelzle et al., 2013).

Each uncertainty component requires an error model that specifies the error
distribution and dependencies (e.g. errors may be heteroscedastic and/or
autocorrelated). It is essential that the error model accurately reflects the
uncertainty, rather than simply adding random noise, as hydrological
uncertainties are typically highly structured. Some measurement uncertainties
can be estimated by repeated sampling, whereas representativeness errors are
difficult to estimate. The latter are often epistemic due to lack of
knowledge at unmeasured locations/time periods (e.g. rainfall distant from
rain gauges). The most appropriate method to assess data uncertainty depends
on the information available and the hydrologist's knowledge of the
catchment. For example, the choice of likelihood function may depend on
characteristics of the data errors and the measurement site. Uncertainty
estimation depends on the perceptual understanding of the uncertainty sources
as well as the studied system and there is potential for a false sense of
certainty about uncertainty where strong error model assumptions are made
(Brown, 2004). Juston et al. (2014) refer to

The objectives of this paper were (1) to contribute to the community's awareness and knowledge of observational uncertainty in hydrologic signatures, (2) to propose a general method for estimating signature uncertainty, and (3) to demonstrate how typical uncertainty estimates translate to magnitude and distribution of signature uncertainty in two example catchments.

We used two catchments: the Brue catchment in the UK, and the Mahurangi catchment in New Zealand. This enabled us to compare signature uncertainties in different locations and with different uncertainty sources. Both catchments have excellent rain-gauge networks that allowed us to quantify uncertainty in rainfall data, and there is some existing knowledge of the dominant hydrological processes.

The Mahurangi catchment in New Zealand and the location of the rain gauges and the outlet flow gauge.

The Mahurangi is a 50

The predominantly rural 135

The Lovington discharge station has a Crump profile weir for low flows and a rated section above 0.6 m. The whole stage range was gauged and the water was below bankfull level for the chosen period. The stage–discharge relationship is affected by downstream summer weed growth, resulting in scatter in the low-flow part of the rating curve.

The Brue catchment in south-west England, and the location of the precipitation and discharge stations. The percentage of missing values after quality control is given for each rain gauge.

Uncertainty sources and distributions are application specific, so a general analytic solution for the signature uncertainty is not available. We suggest that Monte Carlo simulation provides a generally applicable and flexible method, by sampling equally likely possible realisations of the true data values (e.g. rainfall or flow series), conditioned on the observed data. Where multiple data sources are needed (e.g. calculation of runoff ratio), paired samples are used. Each sampled data series is used to calculate the signature value, and the values collated to give the signature distribution. This technique has previously been used to determine uncertainty in discharge (McMillan et al., 2010; Pappenberger et al., 2006) and rainfall (Villarini and Krajewski, 2008).
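The sampling loop described above can be sketched as follows. This is a minimal illustration only: the function names, the single multiplicative lognormal error per series and the error magnitudes are hypothetical placeholders, not the error models used in this study, which are structured and catchment specific.

```python
import numpy as np

def sample_signature_distribution(rain_obs, flow_obs, signature, n_samples=1000, seed=0):
    """Monte Carlo distribution of a signature from paired rainfall/flow realisations."""
    rng = np.random.default_rng(seed)
    values = np.empty(n_samples)
    for i in range(n_samples):
        # Hypothetical error models: one multiplicative lognormal factor per series.
        # A real application needs structured, catchment-specific error models.
        rain_i = rain_obs * rng.lognormal(mean=0.0, sigma=0.05)
        flow_i = flow_obs * rng.lognormal(mean=0.0, sigma=0.10)
        # Paired samples: the same realisation index feeds both inputs.
        values[i] = signature(rain_i, flow_i)
    return values

def runoff_ratio(rain, flow):
    """Total runoff divided by total rainfall (both in mm per time step)."""
    return flow.sum() / rain.sum()

# Synthetic demonstration data (not from either study catchment).
demo_rng = np.random.default_rng(1)
rain_obs = demo_rng.gamma(shape=0.5, scale=2.0, size=500)
flow_obs = 0.4 * rain_obs
rr = sample_signature_distribution(rain_obs, flow_obs, runoff_ratio)
p5, p50, p95 = np.percentile(rr, [5, 50, 95])
```

Any signature that takes the sampled series as input can be passed in place of `runoff_ratio`, which is what makes the approach generally applicable.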

We applied the Monte Carlo (MC) approach to estimate uncertainty in signatures of different complexity. We used signatures that require rainfall and/or streamflow data only. Our method is described in Fig. 3 and has four steps: (1) identification of uncertainty sources in the data and from subjective decisions in signature calculation, (2) specification of uncertainty models for each uncertainty source either from the literature or catchment-specific analyses, (3) Monte Carlo sampling from the different uncertainty models and calculation of signature values for each sample, and (4) analyses of the estimated signature distributions, their dependence on individual uncertainty sources and comparisons between catchments. We analysed both the absolute and relative uncertainty distributions, where the relative uncertainties were defined using the signature value from the best-estimate discharge and precipitation.
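As a small illustration of the relative uncertainty definition, the sketch below expresses sampled signature values as a percentage deviation from the best-estimate value. The function name and the percentage-deviation form are assumptions made for illustration.

```python
import numpy as np

def relative_uncertainty(samples, best_estimate):
    """Signature samples as percentage deviation from the best-estimate value."""
    samples = np.asarray(samples, dtype=float)
    return 100.0 * (samples - best_estimate) / best_estimate

# Three hypothetical sampled runoff ratios around a best estimate of 0.40.
samples = np.array([0.36, 0.40, 0.44])
rel = relative_uncertainty(samples, best_estimate=0.40)
```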

Schematic description of the method used for estimation of signature uncertainty.

We first describe the error models for uncertainties relating to rainfall and flow. Further uncertainty sources that are specific to a particular signature are described separately in Sect. 3.2. Table 1 presents a summary of all uncertainty sources together with literature references for the uncertainty estimation methods.

We considered catchment average rainfall estimated from a network of rain gauges, with three main uncertainty sources: point measurement uncertainty, spatial interpolation uncertainty and equipment malfunction uncertainty (e.g. unrecognised blocked gauges). Point uncertainty includes random errors such as turbulent airflow around the gauge (Ciach, 2003) and is usually assessed using co-located gauges. Systematic point errors are also common (e.g. undercatch due to wind loss, wetting loss, splash-in/out). In theory, systematic errors can be corrected for, but this is difficult and the site-specific information required is not always available (Sieck et al., 2007). In this study, we considered random point uncertainty but not systematic components. Interpolation errors occur when estimating catchment average rainfall from the point measurements at the gauges and depend on rainfall spatial variability (affected by topography, rain rate and storm type), density of gauges and network design.

Sources of uncertainty considered in this study and the methods used for estimation.

Point uncertainty was calculated using the formula derived by Ciach (2003)
from a study of 15 co-located tipping bucket rain gauges over 12 weeks:

We considered discharge as estimated from a measured stage series and a
rating curve that relates stage to discharge. This is the most common method
and is used at both our case study sites. The following are the main uncertainty sources.

Uncertainty in the gaugings (i.e. the measurements of stage and discharge used to fit the rating curve). Discharge uncertainty is typically larger; however, during high-flow gaugings, stage can change rapidly and its average may be difficult to estimate.

Approximation of the true stage–discharge relation by the rating curve. This is usually the dominant uncertainty (McMillan et al., 2012), especially when the stage–discharge relation changes over time. In both catchments, low to medium flows are contained within a weir, which constrains the uncertainty. However, for Brue considerable low-flow uncertainty remains as a consequence of seasonal vegetation growth.

We used the voting point likelihood method to estimate discharge uncertainty
by sampling multiple feasible rating curves (McMillan and Westerberg, 2015).
In brief, discharge gauging uncertainty was approximated by logistic
distribution functions based on an analysis of 26 UK flow gauging stations
with stable rating sections (Coxon et al., 2015). This analysis gave 95 %
relative error bounds of 13–14 % for high flow and of 30–40 % for low
flow (noting that the logistic distribution is heavy-tailed). Stage gauging
uncertainty was approximated by a uniform distribution of

Rating-curve uncertainties, including extrapolation and temporal variability, were jointly estimated using Markov chain Monte Carlo (MCMC) sampling of the posterior distribution of rating curves consistent with the uncertain gaugings. The voting point likelihood draws on previous methods that account for multiple sources of discharge uncertainty (Juston et al., 2014; Krueger et al., 2010; McMillan et al., 2010; Pappenberger et al., 2006). The rating-curve forms were based on the official curves, where Mahurangi had a three-segment power law curve and Brue a two-segment power law curve (for the range of flows analysed here). The power law parameters and the breakpoints were treated as parameters for estimation.
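For readers unfamiliar with segmented rating curves, the sketch below evaluates a multi-segment power-law curve for a stage series. The functional form Q = a (h - h0)^b per segment is a common choice assumed here for illustration, and all parameter values are hypothetical; they are not the official or sampled curves for either catchment.

```python
import numpy as np

def rating_curve(stage, breakpoints, a, h0, b):
    """Piecewise power-law rating curve: Q = a[k] * (h - h0[k]) ** b[k] in segment k."""
    stage = np.asarray(stage, dtype=float)
    a, h0, b = (np.asarray(p, dtype=float) for p in (a, h0, b))
    # Segment index of each stage value: 0 below the first breakpoint, and so on.
    k = np.searchsorted(breakpoints, stage, side="right")
    # No flow below the cease-to-flow level h0 of the active segment.
    depth = np.clip(stage - h0[k], 0.0, None)
    return a[k] * depth ** b[k]

# Hypothetical two-segment curve, made continuous at a 0.5 m breakpoint.
breakpoints = [0.5]
a1, b1, b2 = 10.0, 1.5, 2.0
a2 = a1 * 0.5 ** (b1 - b2)  # matches the two segments at the breakpoint
stage = np.array([0.1, 0.3, 0.5, 0.9, 1.2])
discharge = rating_curve(stage, breakpoints, a=[a1, a2], h0=[0.0, 0.0], b=[b1, b2])
```

Applying the same stage series to each sampled parameter set gives one discharge realisation per rating curve.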

Basic rainfall–runoff signatures included in the study. All signatures are calculated on hourly data unless otherwise specified.

A set of signatures describing different aspects of the rainfall–runoff behaviour were calculated (Table 2). We used signatures describing flow distribution, event characteristics, flow dynamics and rainfall; flow timing would be less affected by the data uncertainties studied here. Only data uncertainty (i.e. no subjective decisions) was considered for the basic signatures.

Recession analysis is widely used to study the storage–discharge
relationship of a catchment (Hall, 1968; Tallaksen, 1995), which gives
insights into the size, heterogeneity and release characteristics of
catchment water stores (Clark et al., 2011; Staudinger et al., 2011). We used
the established method of characterising the relationship between flow and
its time derivative. In the theoretical case where flow

Subjective decisions in recession analysis include how recession periods are
defined, the delay after rainfall used to eliminate quickflow, the data time
step, and whether to extend time steps during low flows to improve flow
derivative accuracy (Rupp and Selker, 2006). A moving average can be used to
smooth diurnal flow fluctuations. Options to estimate

We assessed subjective uncertainty in recession analysis by comparing the
distributions of recession parameters

Threshold behaviour in the relationship between rainfall depth and flow contributes to hydrological complexity (Ali et al., 2013) and exerts a strong control on model predictions. Threshold identification depends on both rainfall and flow data, making it a good candidate to test the effect of multiple uncertainty sources. Rainfall–runoff thresholds have been found in many catchments (Graham et al., 2010b; Tromp-van Meerveld and McDonnell, 2006a, b), including the Mahurangi (McMillan et al., 2011, 2014). We only studied threshold signatures in Mahurangi, as Brue did not display any rainfall–runoff threshold.

The signatures that we used were threshold location (in millimetres of rain per event)
and threshold strength. We quantified threshold strength based on the method
of McMillan et al. (2014). Storm events were identified and event rainfall
was plotted against event runoff. Strong threshold behaviour was defined as
an abrupt increase in slope of the event rainfall–runoff relationship. This
attribute was tested by fitting each data set with two intersecting lines (a
“broken-stick” fit), using total least squares to optimise the slopes and
intersect. The corresponding null hypothesis was that the two lines have
equal slopes. This test returns a

We defined events based on McMillan et al. (2011), such that events require
at least 2

The standard deviation of the error in catchment average rainfall resulting from different numbers of subsampled stations was calculated. It was plotted as a function of hourly rain rate using the moving-average window method of Villarini and Krajewski (2008), with a bandwidth equal to 0.7 times the rain rate at the centre of the window (results for Brue in Fig. 4). The errors decreased with rain rate and there was a large initial decrease in the error when the number of subsampled stations increased from 1 to around 5. The point uncertainty only had a small effect on the error standard deviation.
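The windowed calculation can be sketched as below. This is a simplified illustration rather than the full method of Villarini and Krajewski (2008): it assumes that the bandwidth of 0.7 times the centre rain rate is the full window width, and it uses synthetic errors whose spread decreases with rain rate.

```python
import numpy as np

def error_sd_by_rain_rate(rate, rel_error, centres, bandwidth_factor=0.7):
    """Standard deviation of the relative error in windows centred on given rain rates."""
    rate = np.asarray(rate, dtype=float)
    rel_error = np.asarray(rel_error, dtype=float)
    sd = np.full(len(centres), np.nan)
    for j, c in enumerate(centres):
        # Assumption: bandwidth_factor * c is the full width of the window.
        half_width = 0.5 * bandwidth_factor * c
        in_window = np.abs(rate - c) <= half_width
        if in_window.sum() > 1:
            sd[j] = rel_error[in_window].std(ddof=1)
    return sd

# Synthetic example: relative error spread shrinks as rain rate increases.
rng = np.random.default_rng(0)
rate = rng.uniform(0.2, 20.0, size=20000)
rel_error = rng.normal(0.0, 1.0 / np.sqrt(rate))
centres = np.array([1.0, 5.0, 10.0])
sd = error_sd_by_rain_rate(rate, rel_error, centres)
```

Because the window widens in proportion to the rain rate, the sparse high rain rates still contribute enough samples for a stable estimate.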

Standard deviation of the relative rainfall error as a function of rain rate for different numbers of subsampled stations for 1000 Monte Carlo realisations for the Brue catchment, with and without point uncertainty.


The number of gauges had a large effect on the estimated mean annual
precipitation; if only one rain gauge was used, there was a range of
200–300

Estimated rating-curve uncertainty and uncertainty in flow
percentiles for the Mahurangi (

Discharge calculated using the optimal rating curve for 1998 for
Mahurangi

The estimated rating-curve uncertainty is shown in Fig. 6, with the
corresponding flow percentile uncertainty summarised using boxplots. The
5–95 percentile uncertainty bounds enclose almost all of the uncertain
gaugings, apart from a small number of outliers. Low-flow uncertainty is
larger in Brue where vegetation growth affects the stability of the
stage–discharge relation. High-flow uncertainty is larger in Mahurangi where
fewer, more scattered high-flow gaugings cause a wider range in the
extrapolated flows. Mahurangi has a fast rainfall–runoff response with
little base flow and peak-flow events that are infrequent but have large
magnitudes (up to 11

Flow percentile uncertainties mirrored those of the rating curves, with
larger uncertainties in high-flow percentiles for Mahurangi and larger
uncertainties in low-flow percentiles for Brue (Fig. 6). Uncertainty in mean
discharge was around

Relative uncertainty in basic signatures as a percentage of the
signature values calculated with the optimal rating curve from the MCMC. The
boxplot whiskers extend to the 5 and 95 percentiles, and the box covers the
interquartile range. The signature values for the optimal rating curves are
given at the bottom of the (

For the total runoff ratio, we tested the contribution of each uncertainty
source by including or excluding different sources. We calculated total
uncertainty (Fig. 8c, d, black bars) using different rain-gauge densities.
Total uncertainty was approximately

We tested the effect of data uncertainty on recession analysis results by
plotting histograms of the recession parameters

Uncertainty in the recession descriptors was typically (1) greater for Brue
than for Mahurangi, in particular for hourly flow data, and (2) greater for
hourly flow data than for daily flow data. Recessions are calculated from
flow derivatives and are therefore affected by relative changes in flow
(e.g. channel shape). The linear regression used to calculate the recession
parameters is particularly sensitive to uncertainties in extreme low or high
flows. The low-flow uncertainty at Brue resulting from summer weed growth
creates higher uncertainties at that site. Daily flow values are based on an
aggregation of measured values and are therefore more robust to data
uncertainty. However, using daily data in small catchments can mask details
of the recession shape, as the slope can change markedly during a single day.
In our case, this difference caused shifts in the parameter distributions
between hourly and daily data and would therefore affect our ability to
compare parameter values between catchments. For example,

Histograms of recession parameter distributions, where parameters are calculated using (1) daily flow data, (2) hourly flow data, and (3) hourly flow data where recession parameters are calculated per season and then averaged. Dashed lines show the parameter values from the optimal MCMC rating curve. Distributions are truncated at the 2.5 and 97.5 percentiles.
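The regression step behind such recession parameters can be sketched as follows, assuming the widely used power-law form -dQ/dt = a Q^b fitted by linear regression in log-log space. The symbols a and b, the midpoint flow and the simple finite difference are assumptions for illustration; the sketch omits recession period selection, the delay after rainfall and any smoothing.

```python
import numpy as np

def fit_recession(flow, dt=1.0):
    """Fit -dQ/dt = a * Q**b by linear regression in log-log space; returns (a, b)."""
    flow = np.asarray(flow, dtype=float)
    dq_dt = np.diff(flow) / dt               # finite-difference flow derivative
    q_mid = 0.5 * (flow[1:] + flow[:-1])     # flow at the middle of each time step
    receding = dq_dt < 0                     # keep receding time steps only
    slope, intercept = np.polyfit(
        np.log(q_mid[receding]), np.log(-dq_dt[receding]), 1
    )
    return np.exp(intercept), slope

# Synthetic linear-reservoir recession, Q = Q0 * exp(-k t), for which b = 1.
t = np.arange(0.0, 100.0, 1.0)
flow = 5.0 * np.exp(-0.05 * t)
a, b = fit_recession(flow)
```

Repeating the fit for each sampled discharge series gives the parameter distributions. Because the regression operates on flow differences, it inherits the sensitivity to low- and high-flow uncertainty discussed in the text.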

Recession parameters calculated per season were highly uncertain in Brue
for the

We tested for uncertainty in the estimated threshold in the event rainfall–runoff relationship in Mahurangi using boxplots of the threshold location and strength under different uncertainty scenarios (Fig. 10). The threshold broken-stick fit is illustrated in Fig. 10a for the best-estimate data (in blue) and for an example realisation with uncertainty (in grey).
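A broken-stick fit of this kind can be sketched as below. Note the simplifications: the sketch uses ordinary least squares with a grid search over candidate breakpoints, not the total least squares optimisation used in this study, and it omits the change-in-slope hypothesis test. The data are synthetic and the function name is hypothetical.

```python
import numpy as np

def broken_stick_fit(x, y, candidates):
    """Continuous two-segment linear fit; returns (breakpoint, slope1, slope2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    best = None
    for c in candidates:
        # Basis: intercept, slope below the breakpoint, extra slope above it.
        design = np.column_stack([np.ones_like(x), x, np.clip(x - c, 0.0, None)])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        sse = np.sum((design @ coef - y) ** 2)
        if best is None or sse < best[0]:
            best = (sse, c, coef[1], coef[1] + coef[2])
    return best[1:]

# Synthetic event data: runoff responds weakly below 50 mm and strongly above.
event_rain = np.linspace(0.0, 120.0, 61)
event_runoff = 0.05 * event_rain + 0.6 * np.clip(event_rain - 50.0, 0.0, None)
bp, slope1, slope2 = broken_stick_fit(event_rain, event_runoff, np.arange(20, 101, 5))
```

Running the fit on each Monte Carlo realisation of event rainfall and runoff gives a distribution of threshold locations and slope changes.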

The threshold was 65

Threshold strength was defined using a change-in-slope statistic where higher
values indicate a stronger threshold. Considering flow or rainfall
uncertainty weakened the calculated threshold. For flow uncertainty this was
due to the optimal rating curve having its first breakpoint and mid-section
slope above the median values of the sampled rating-curve distribution; both
of which were associated with a stronger threshold. As with the

To summarise our results, we tabulated examples of each signature type together with their dominant uncertainty sources and summary statistics of the total uncertainty distribution, for each catchment (Table 3). Our aim is to allow for an easy comparison of the signature uncertainties in our study with those of other studies. We therefore chose commonly used distribution statistics, i.e. the first three distribution moments (mean, standard deviation, skewness) and the half-width of the 5–95 percentile range, which is commonly quoted in uncertainty studies (e.g. McMillan et al., 2012). We hope that authors of future studies will consider using similar statistics, to enable the community to compile a generalised understanding of signature uncertainties across different catchments, scales and landscapes.
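For authors who wish to report comparable statistics, the sketch below computes them from a set of Monte Carlo signature values. The function name and the choice of estimators (sample standard deviation, moment-based skewness, linearly interpolated percentiles) are assumptions for illustration.

```python
import numpy as np

def summarise_uncertainty(samples):
    """Mean, standard deviation, skewness and half-width of the 5-95 percentile range."""
    samples = np.asarray(samples, dtype=float)
    mean = samples.mean()
    sd = samples.std(ddof=1)
    # Moment-based skewness: third central moment over the cubed population sd.
    skewness = np.mean((samples - mean) ** 3) / samples.std(ddof=0) ** 3
    p5, p95 = np.percentile(samples, [5, 95])
    return {
        "mean": mean,
        "sd": sd,
        "skewness": skewness,
        "half_width_5_95": 0.5 * (p95 - p5),
    }

# Symmetric toy sample: the integers 1 to 101.
stats = summarise_uncertainty(np.arange(1, 102))
```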

Dominant uncertainty sources and uncertainty characteristics.

Uncertainty distributions were highly variable between signatures and
therefore the impact of the uncertainty depends on which signatures are used
(Table 3). There was greater uncertainty in signatures that use
high-frequency responses (e.g. variations over short timescales, thresholds
based on event precipitation totals), subsets of data more prone to
measurement errors (e.g. extreme high and low flows,

Signatures can be designed to be robust to some data uncertainty sources. A clear example is signatures describing the frequency and duration of high- and low-flow events. If these events are defined using a threshold set as a multiple of the mean or median flow, they are highly sensitive to rating-curve uncertainty. If, instead, the events are defined directly using a flow percentile threshold, they are little affected by rating-curve uncertainty (see Sect. 4.2.1). This simple change in signature definition reduces sensitivity to data uncertainty. We found that any cut-offs imposed in signature calculation, such as event or recession definition criteria, could have a strong and unpredictable effect on signature uncertainty. For example, rainfall–runoff threshold strength calculations were particularly sensitive to large storm events, which control the gradient of the second line in the “broken stick”. If such events were conditionally excluded (e.g. classified as disinformative and removed when runoff exceeded rainfall, which depends on the rating curve and rain gauge(s) selected), the resulting uncertainty could overwhelm any other uncertainty sources. We suggest that signatures including cut-off type definitions should be carefully evaluated and the cut-offs removed if possible.
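The contrast between the two threshold definitions can be illustrated with a small synthetic experiment. The sketch uses the fraction of time steps above the threshold as a simple stand-in for high-flow frequency, and two hypothetical, time-invariant power-law rating curves applied to the same stage series. Under these assumptions a percentile threshold depends only on the rank order of the flows, which any monotonically increasing rating curve preserves.

```python
import numpy as np

def fraction_above_median_multiple(flow, multiplier=3.0):
    """Fraction of time steps with flow above a multiple of the median flow."""
    return np.mean(flow > multiplier * np.median(flow))

def fraction_above_percentile(flow, percentile=90.0):
    """Fraction of time steps with flow above a flow percentile threshold."""
    return np.mean(flow > np.percentile(flow, percentile))

# Synthetic stage series (m) converted with two hypothetical rating curves.
rng = np.random.default_rng(0)
stage = rng.gamma(shape=2.0, scale=0.2, size=5000)
flow_a = 10.0 * stage ** 1.5
flow_b = 10.0 * stage ** 2.0

median_based = (fraction_above_median_multiple(flow_a),
                fraction_above_median_multiple(flow_b))
percentile_based = (fraction_above_percentile(flow_a),
                    fraction_above_percentile(flow_b))
```

In this setting the percentile-based values are identical for both curves, whereas the median-based values differ because the rating-curve exponent changes how far the high flows sit above the median.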

The quality of signature uncertainty estimates relies on accurate assessment of data uncertainty and therefore on sufficient information. An example of insufficient uncertainty information would be for a gauge where out-of-bank flows occur, but there is no information on the out-of-bank rating. As discussed by Juston et al. (2014) for rating-curve uncertainty, it is essential to understand whether data errors are random or systematic, aleatory or epistemic. In our study, point rainfall errors were not important in signature uncertainty, but there is scope to improve their representation as systematic or random (e.g. systematic wind-related undercatch, or random turbulence effects). However, quantification of these errors is not straightforward (Sieck et al., 2007).

We recognise that the inferred distributions of signature uncertainty will be sensitive to the assumptions and methods used to estimate distributions of data uncertainty. This introduces some subjectivity into the uncertainty estimation and it is therefore important to make the assumptions explicit and motivate method choices by the perceptual understanding of the uncertainty sources. For example, the optimal methods for estimating rating-curve uncertainty under typical time-varying, poorly specified errors remain an active debate in the hydrological community. Using an informal likelihood, as we did, rather than a formal statistical likelihood can be more robust to multiple epistemic error sources but can also be criticised for not obeying a formal statistical framework (as discussed by McMillan and Westerberg, 2015, and Smith et al., 2008). Future progress in understanding how perceptual models and data jointly contribute to system identification may help to resolve this dichotomy (Gupta and Nearing, 2014). At present, we recognise that uncertainty distributions are more subjective in signatures that emphasise poorly described aspects of data uncertainty such as out-of-bank flows.

For signatures calculated over a long time period, it may be appropriate to incorporate nonstationary error characteristics, such as rating-curve shifts or the example explored by Hamilton and Moore (2012) where the best-practice method for infilling discharge values under ice changed over time. The time period used is important if signatures are used for catchment classification: an unusual event such as a large flood may shift the signature values (Casper et al., 2012). Additional uncertainty sources can be important in other catchments, such as catchment boundary uncertainty and flow bypassing the gauge (Graham et al., 2010a).

Our results are pertinent to any hydrological analysis that uses signatures
to assess catchment behaviour. Examples of applications whose reliability
could be affected by signature uncertainty include testing bias correction
of a climate model using signatures in a coupled hydrological model (Casper
et al., 2012), predicting signatures in ungauged catchments (Zhang et
al., 2014), classifying catchments using flow complexity signatures
(Sivakumar et al., 2013), and assessing spatial variability of hydrological
processes (McMillan et al., 2014). In some cases, absolute signature values
are not used; rather, it is the pattern or gradient over the landscape, or the
trend over time, that is important. Data uncertainties may obscure such
patterns depending on the magnitude of the uncertainty in relation to the
strength of the measured pattern. The range of signature values found by
McMillan et al. (2014) across Mahurangi was large compared to the uncertainty
magnitudes found in this study. This suggests that the conclusions regarding
the signature patterns would still hold, assuming that the uncertainty at the
catchment outlet is representative for the internal subcatchments. Some
subjective uncertainty sources may not be relevant in catchment comparisons,
as choices such as how to define recession periods or whether to do base-flow
separation can be chosen consistently. However, subjective uncertainties can
still change the conclusions drawn such as the cut-offs described above, and
as discussed in Sect. 4.2.3 where daily data suggested similar recession

When signatures are used as a performance measure in model calibration (e.g. Blazkova and Beven, 2009), reliable uncertainty estimates are crucial so that the model is not overfitted. Previous studies have quantified data and signature uncertainty using upper and lower bounds (e.g. fuzzy estimates used by Coxon et al., 2013; Hrachowitz et al., 2014; Westerberg et al., 2011). However, this does not allow for the straightforward estimation of uncertainty in all types of signatures that is made possible by our method of generating multiple feasible realisations of rainfall and discharge time series.

This study investigated the effect of uncertainties in data and
calculation methods on hydrological signatures. We present a
widely applicable method to evaluate signature uncertainty, and show results
for two example catchments. The uncertainties were often large (i.e. typical
intervals of

Although we show that significant uncertainty can exist in hydrological signatures, we do not intend this paper to carry a negative message. Consideration of uncertainty is equivalent to extracting the signal from noisy data and not overestimating the information content in the data. As argued by Pappenberger and Beven (2006) and Juston et al. (2013), ignorance is not bliss when it comes to hydrological uncertainty; incorporation of uncertainty analysis leads to many advantages including more reliable and robust conclusions, reduction in predictive bias, and improved understanding. In particular, we hope that this paper encourages others to estimate data uncertainty in their catchments, either individually or by reference to typical uncertainty magnitudes, to design diagnostic signatures and hypothesis testing techniques that are robust to data uncertainty, and to evaluate analysis results in the context of signature uncertainty.

The research leading to these results was funded by the People Programme (Marie Curie Actions) of the European Union's Seventh Framework Programme FP7/2007-2013/ under REA grant agreement no. 329762, by NIWA under the Hazards Research Programme 1 (2014/15 SCI) and by the Ministry of Business, Innovation and Employment, NZ, through contract C01X1006 Waterscape. We thank Alberto Viglione, Hoshin Gupta and Nataliya Le Vine for their constructive reviews, which helped to improve this paper. Edited by: N. Romano