Geostatistical prediction of flow-duration curves in an index-flow framework

Abstract. An empirical period-of-record flow–duration curve (FDC) describes the percentage of time (duration) in which a given streamflow was equalled or exceeded over a historical period of time. In many practical applications one has to construct FDCs in basins that are ungauged or where very few observations are available. We present an application strategy of top-kriging, which makes the geostatistical procedure capable of predicting FDCs in ungauged catchments. Previous applications of top-kriging mainly focused on the prediction of point streamflow indices (e.g. flood quantiles, low-flow indices, etc.); here the procedure is used to predict the entire curve in ungauged sites as a weighted average of standardised empirical FDCs through the traditional linear-weighting scheme of kriging methods. In particular, we propose to standardise empirical FDCs by a reference index-flow value (i.e. mean annual flow, or mean annual precipitation × the drainage area) and to compute the overall negative deviation of the curves from this reference value. We then propose to use these values, which we term total negative deviation (TND), for expressing the hydrological similarity between catchments and for deriving the geostatistical weights. We focus on the prediction of FDCs for 18 unregulated catchments located in central Italy, and we quantify the accuracy of the proposed technique under various operational conditions through an extensive cross-validation and sensitivity analysis. The cross-validation shows that top-kriging is a reliable approach for predicting FDCs, with Nash–Sutcliffe efficiency measures ranging from 0.85 to 0.96 (depending on the model settings), very low biases over the entire duration range, and an enhanced representation of the low-flow regime relative to other regionalisation models that were recently developed for the same study region.


Introduction
An empirical flow-duration curve (FDC) graphically represents the percentage of time (or duration) in which the streamflow has been equalled or exceeded over a historical period of time (see, e.g. Vogel and Fennessey, 1994). Empirical FDCs are often used to represent the streamflow regime of a given catchment when an adequate number of streamflow observations are available. A deterministic hydrologist would probably refer to an FDC as a key signature of the hydrological behaviour of a given basin, as it results from the interplay of climate, size, morphology, and permeability of the basin; a statistical hydrologist would refer to an FDC as the exceedance probability, or equivalently the complement of the cumulative distribution function (cdf) of streamflows (see, e.g. Castellarin et al., 2013).
Because of their ability to provide a simple and yet comprehensive graphical view of the overall historical variability of streamflows in a river basin, from floods to low flows, and their characteristic of being readily understandable by those who do not have a strong hydrological background, empirical FDCs are routinely used in several water-related studies and engineering applications such as hydropower generation, design of water supply systems, irrigation planning and management, waste-load allocation, sedimentation studies, habitat suitability, etc. (see, e.g. Vogel and Fennessey, 1995).
The literature reports two different representations of empirical flow-duration curves, depending on the reference period of time (see Vogel and Fennessey, 1994): (i) period-of-record flow-duration curves (POR-FDCs), constructed on the basis of the entire observation period, and (ii) annual flow-duration curves (AFDCs), constructed year-wise. The two representations are complementary to each other and should be selected by practitioners depending on the water problem at hand (Castellarin et al., 2004b). For instance, AFDCs are useful for quantifying the streamflow regime in a typical hydrological year, or in a particularly wet or dry year (see Vogel and Fennessey, 1994); POR-FDCs are a steady-state representation of the long-term streamflow regime and can be effectively used, for instance, for patching and extending streamflow data (Hughes and Smakhtin, 1996) or for assessing the long-term hydropower potential of a given site.

Published by Copernicus Publications on behalf of the European Geosciences Union.
A. Pugliese et al.: Geostatistical prediction of flow-duration curves

In many practical applications one has to predict FDCs at ungauged catchments or catchments for which the available hydrometric information is sparse (see, e.g. Castellarin et al., 2013). This task is often addressed by developing regional models of FDCs. The scientific literature proposes several models, adopting different approaches to the problem: some models regard the curves as the exceedance probability function of streamflows and regionalise the parameters of theoretical frequency distributions (see Fennessey and Vogel, 1990; LeBoutillier and Waylen, 1993; Castellarin et al., 2007; Mendicino and Senatore, 2013); similarly, some others adopt a suitable mathematical expression for representing the curves and regionalise the expression parameters (Franchini and Suppo, 1996; Mendicino and Senatore, 2013); finally, some others do not make any attempt to mathematically represent the curves but rather standardise empirical curves constructed for gauged catchments that are hydrologically similar to the target site (i.e. catchments that are characterised by similar physiographic, pedologic, and climatic conditions, also referred to as donor sites; see, e.g. Kjeldsen et al., 2000) by an index streamflow (e.g. mean annual flow), and then average the dimensionless curves to predict the standardised FDC for the study catchment. The averaging procedure may (see, e.g. Ganora et al., 2009), or may not (see, e.g. Smakhtin et al., 1997), adopt a weighting scheme, which gives more importance to donor sites that are more hydrologically similar to the target site. The literature commonly groups these regionalisation procedures into parametric (i.e. procedures that parameterise FDCs and then regionalise parameters, like the first two examples) and nonparametric (i.e. procedures that dispense with a parameterisation of the curves, like the third example; see, e.g. Castellarin et al., 2004a, 2013) procedures.
It is a common argument that an accurate representation of FDCs for daily streamflows requires probabilistic models (or mathematical expressions) with four or more parameters (LeBoutillier and Waylen, 1993; Castellarin et al., 2007), which control the position, scale, and shape of the distribution. This hampers the construction of reliable regional models, due to the large uncertainty that is commonly associated with regional relationships that express the shape parameters in terms of physiographic and climatic catchment descriptors (see Castellarin et al., 2007). As a result, Ganora et al. (2009) recently revisited the classical approach to FDC regionalisation based on averaging standardised curves constructed for neighbouring gauged sites (Smakhtin et al., 1997); they proposed a mathematical model that enables the user to quantify the dissimilarity between empirical FDCs and associate this dissimilarity with a distance in the multidimensional space of catchment descriptors. An innovative feature of this approach is the possibility to weight each empirical FDC according to the distance between each gauged basin and the target site in the space of catchment descriptors, therefore accounting for the hydrological similarity of the donor sites with the site of interest. Like many of the traditional approaches proposed in the literature, though, the approach proposed in Ganora et al. (2009) both (1) requires a preliminary subdivision of the study area into homogeneous pooling groups of sites (i.e. clustering) and (2) predicts a standardised (i.e. dimensionless) FDC for the target site, which then needs to be multiplied by a dimensional-scale index (e.g. an indirect estimate of mean annual streamflow) in order to be of practical use.
Both steps are critical phases of a regionalisation process. Concerning step (1) in particular, geostatistical regionalisation approaches have been shown to be particularly effective in dispensing with the preliminary identification of the homogeneous pooling group of sites while using regional hydrological information for predicting streamflow indices in ungauged catchments (e.g. flood quantiles, low-flow indices, etc.; see, e.g. Chokmani and Ouarda, 2004; Skøien et al., 2006; Castiglioni et al., 2009, 2011; Archfield et al., 2013; Laaha et al., 2013); nevertheless, no geostatistical procedure has been developed that specifically addresses the problem of FDC regionalisation, aside from an interpolation of the curves in the physiographic space through a three-dimensional kriging, which is not a geostatistical procedure in the strict sense (see Castellarin, 2014).
Our paper focuses on the derivation of a geostatistical technique that addresses both limitations mentioned above for the prediction of FDCs in ungauged sites. We adopt topological kriging or top-kriging, which is a block-kriging with variable support area that interpolates streamflow indices along stream networks (see, e.g. Skøien et al., 2006). Top-kriging has proved to be particularly successful in predicting point streamflow values (e.g. low flow and flood quantiles, mean annual flood, stream temperatures, etc.) in various geographical and climatic contexts (see, e.g. Merz et al., 2008; Castiglioni et al., 2011; Vormoor et al., 2011; Archfield et al., 2013; Laaha et al., 2013).
We adopt top-kriging as the core tool for predicting standardised (i.e. divided by mean annual flow) and dimensional long-term daily FDCs on the basis of empirical period-of-record curves (POR-FDCs, hereafter referred to as FDCs for the sake of brevity) constructed for neighbouring stream gauges.
The idea behind our study is (i) to identify a meaningful empirical point value (or index) that fully characterises the whole empirical FDC; (ii) to model the spatial correlation structure, or the spatial variability, of this index over the study region through top-kriging; and (iii) to assess the capability of this very spatial correlation model to predict FDCs in ungauged sites by weighting neighbouring empirical FDCs. We present two possible applications of the proposed procedure: the first one predicts standardised FDCs, that is FDCs divided by mean annual flow (MAF); the second one predicts FDCs divided by the product of mean annual precipitation (MAP) and drainage area. MAP is generally easier to predict than MAF in ungauged sites, due to the higher density of rain-gauging networks relative to stream-gauging ones. The second application can therefore be used to predict dimensional FDCs in ungauged sites.
The approach is developed and tested through a comprehensive leave-one-out cross-validation procedure for a rather wide geographical region located in eastern-central Italy including 18 unregulated river basins. Castellarin et al. (2007) propose regional models of long-term daily FDCs for this area, which we use in this study as benchmark models for comparing the accuracy and reliability of the proposed approach.

Top-kriging
Top-kriging is a powerful geostatistical procedure proposed by Skøien et al. (2006) which performs hydrological predictions at ungauged sites along stream networks on the basis of the empirical information collected at neighbouring gauging stations. As in other kriging techniques, the spatial interpolation is obtained in top-kriging by a linear combination of the empirical values; therefore, the unknown value of the streamflow index of interest at prediction location $x_0$, $\hat{Z}(x_0)$, can be estimated as a weighted average of the variable measured in the neighbourhood:
$$\hat{Z}(x_0) = \sum_{i=1}^{n} \lambda_i Z(x_i), \tag{1}$$
where $\lambda_i$ is the kriging weight for the empirical value $Z(x_i)$ at location $x_i$, and $n$ is the number of neighbouring stations used for interpolation. Kriging weights $\lambda_i$ can be found by solving the typical ordinary kriging linear system (Eq. 2), with the constraint of unbiased estimation (Eq. 2b):
$$\sum_{j=1}^{n} \gamma_{i,j}\,\lambda_j + \theta = \gamma_{0,i}, \qquad i = 1, \dots, n, \tag{2a}$$
$$\sum_{j=1}^{n} \lambda_j = 1, \tag{2b}$$
where $\theta$ is the Lagrange parameter, $\gamma_{i,j}$ is the semi-variance between catchments $i$ and $j$, and $\gamma_{0,i}$ is the semi-variance between the prediction location and catchment $i$. The semi-variance is also referred to as variogram in geostatistics and represents the spatial variability of the regionalised variable $Z$. A peculiar feature of top-kriging is to consider the variable defined over a non-zero support $S$ (i.e. the catchment drainage area) (Cressie, 1993; Skøien et al., 2006); this implies that the kriging system (Eq. 2) remains the same, but the semi-variance values between the measurements need to be obtained by regularisation, that is the smoothing effect of support area $S$ on the point variogram, which is computed by applying an integral average of the variable $Z$ over $S$. After this, the point variogram can be back-calculated by fitting aggregated variogram values to the sample variogram (details can be found in Skøien et al., 2006).
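As a purely illustrative sketch of the ordinary kriging system described above, the following Python code solves for the weights and the Lagrange parameter given semi-variance values that are assumed to be already regularised. The function names (`solve_linear`, `ordinary_kriging_weights`) and the toy semi-variances are hypothetical and are not part of the rtop package; the regularisation step that is specific to top-kriging is not reproduced here.

```python
from typing import List, Sequence, Tuple


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular kriging system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ordinary_kriging_weights(
    gamma: Sequence[Sequence[float]], gamma0: Sequence[float]
) -> Tuple[List[float], float]:
    """Return (weights, lagrange_parameter).

    gamma[i][j]: semi-variance between gauged catchments i and j (assumed given).
    gamma0[i]:   semi-variance between the target site and catchment i (assumed given).
    """
    n = len(gamma0)
    # n rows of the semi-variance equations plus the unbiasedness row.
    a = [list(gamma[i]) + [1.0] for i in range(n)] + [[1.0] * n + [0.0]]
    b = list(gamma0) + [1.0]
    sol = solve_linear(a, b)
    return sol[:n], sol[n]


# Toy example: three equally dissimilar donors, equally distant from the target.
weights, theta = ordinary_kriging_weights(
    [[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]], [0.5, 0.5, 0.5]
)
```

In this symmetric toy case each donor receives the same weight, and the weights sum to one as the unbiasedness constraint requires.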

Total negative deviation (TND)
Top-kriging could in principle be directly applied to interpolate single streamflow values associated with a given duration (i.e. streamflow quantiles). Therefore, similarly to what is proposed in Shu and Ouarda (2012), a regional prediction of FDCs could be obtained by repeatedly applying top-kriging $r$ times, where $r$ is the number of durations considered to provide an accurate representation of the curve (e.g. 15–20, see Shu and Ouarda, 2012), and then by interpolating the $r$ predicted streamflow quantiles to obtain an FDC. Nevertheless, each FDC is a continuum resulting from the complex interplay between climate conditions and geomorphologic catchment characteristics (see, e.g. Yaeger et al., 2012; Yokoo and Sivapalan, 2011; Beckers and Alila, 2004). This continuum would be lost, entirely or in part, by using the approach outlined above; moreover, this prediction strategy might not preserve a fundamental property of FDCs, that is the monotone (i.e. non-increasing in this paper) relationship between streamflow and duration.
Our main goal is to develop a top-kriging procedure that regionalises the whole curve seen as a single object. In geostatistical applications one should define a "regionalised variable" to produce a characterisation of the spatial variability of the investigated phenomenon. As mentioned above, top-kriging has been shown to be particularly reliable in predicting point (i.e. single values) streamflow indices in ungauged locations. Therefore, a viable strategy could be to identify a point index that effectively summarises the entire curve, and to compute the top-kriging $\lambda_i$ values of Eq. (2) relative to this index. These values could then be used for averaging neighbouring empirical FDCs and predicting the FDC at the (ungauged) site of interest. This prediction strategy would regard each curve as a single object, and the linear interpolation of the curves (see also Sect. 3) would preserve the monotone relationship between streamflow and duration.
Some studies in the literature suggest using the FDC slope as an overall index for the curve (see, e.g. Sawicz et al., 2011). We believe though that the definition of such an index is associated with some degree of subjectivity (e.g. which lower and upper durations to consider for the computation of the slope), and may be hard to define in some cases (e.g. ephemeral and intermittent streams).
Focusing on FDCs, Ganora et al. (2009) quantify the hydrological dissimilarity between a pair of catchments as the area between the corresponding empirical standardised (i.e. divided by mean annual flow) FDCs: two hydrologically similar catchments will show similar standardised curves, hence a small area between the curves, whereas two basins that are completely different in terms of hydrological behaviour will be characterised by highly different FDCs, and therefore the area between the curves will be large. Following this background idea, we propose to summarise the FDC through a point index which we term total negative deviation (TND) between a dimensionless (i.e. standardised by a reference streamflow value) FDC and 1,
$$\mathrm{TND} = \sum_{i=1}^{m} \left| q_i - 1 \right| \delta_i, \tag{3}$$
where $q_i$ represents the $i$th empirical dimensionless streamflow value, $\delta_i$ is half of the frequency interval between the $(i+1)$th and $(i-1)$th streamflow values, and the summation includes only the $i = 1, \dots, m$ dimensionless streamflow values that are lower than 1 (i.e. negative deviation); $m$ stands for the length of the dimensionless streamflow sample once values larger than 1 are excluded.
Empirical TND values are proportional to the filled areas in Fig. 1, where black thick curves represent the empirical FDCs. More specifically, Fig. 1 represents the dimensionless empirical FDCs that were constructed for three stream gauges (see Sect. 4 for a brief description of the study area) by using two standardisation methods: in one case the curve is standardised by the mean annual flow (standardisation by MAF; $\mathrm{TND}_1$; top panels of Fig. 1); in the other case the curve is standardised by $\mathrm{MAP}^*$, that is a reference streamflow equal to the catchment area $A$ times the mean annual precipitation MAP (standardisation by $\mathrm{MAP}^*$; $\mathrm{TND}_2$; bottom panels in Fig. 1) (see details on the standardisation procedure in Sect. 3.2).
Even though TND defined by Eq. (3) and illustrated in Fig. 1 does not describe the portion of the curve associated with low durations (high flows), it is very informative on the shape of the FDC, which, in turn, is controlled by climatic, physiographic, and geo-pedological characteristics of the catchment. Catchments that are dominated by rapidly responding near-surface runoff processes have steeper FDC slopes, and therefore larger TND, while FDCs are less steep where slower responding runoff generation processes prevail, and under these circumstances TND will be smaller. This is related to functional similarity: catchments that store and retain more water should have smaller TNDs. The magnitude of TND is related not only to the climate but also to how efficiently the catchment partitions water into runoff.

Construction of empirical FDCs
The construction of empirical FDCs for gauged sites is straightforward: (i) pooling all observed streamflows in one sample, (ii) ranking the observed streamflows in ascending order, and (iii) plotting each ordered observation vs. its corresponding duration. In our study we adopt as duration of the $i$th observation in the ordered sample the estimate of the exceedance probability of the observation, $1 - F_i$. If $F_i$ is estimated using a Weibull plotting position, the duration $d_i$ is
$$d_i = 1 - F_i = 1 - \frac{i}{N + 1}, \tag{4}$$
where $N$ is the number of daily streamflows observed at a gauged site and $i = 1, \dots, N$ is the $i$th position in the rearranged sample.
A common representation of FDCs reports log-flows on the y axis and the duration on the x axis (see Fig. 1). Another common representation adopts a log-normal space instead, in which log-transformed streamflows are still reported on the y axis, while the x axis reports duration expressed as a standard normal variate $z$,
$$z_i = \Phi^{-1}(d_i), \tag{5}$$
where $\Phi$ is the cdf of the standard normal distribution. The combination of the two transformations significantly improves the readability of the FDC (see Fig. 2): the log-transformation enhances the representation of observed streamflows, which usually span two or more orders of magnitude, while expressing the duration as a standard normal variate improves the visualisation of small and large durations, that is floods and low flows, respectively.
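The construction steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names are hypothetical, the Weibull plotting position is taken as $F_i = i/(N+1)$, and the normal variate is assumed to be the standard normal quantile of the duration itself.

```python
from statistics import NormalDist
from typing import List, Sequence, Tuple


def empirical_fdc(flows: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Return (ordered flows, durations).

    Flows are ranked in ascending order; the duration of the i-th value is
    the Weibull estimate of its exceedance probability, 1 - i / (N + 1).
    """
    q = sorted(flows)
    n = len(q)
    d = [1.0 - i / (n + 1) for i in range(1, n + 1)]
    return q, d


def normal_variate(durations: Sequence[float]) -> List[float]:
    """Express durations as standard normal variates (assumed z = Phi^-1(d))."""
    std = NormalDist()
    return [std.inv_cdf(d) for d in durations]


# Toy sample of three daily flows (hypothetical values, m^3 s^-1).
q_sorted, dur = empirical_fdc([3.0, 1.0, 2.0])
z = normal_variate(dur)
```

With three observations the durations are 0.75, 0.5 and 0.25, so the smallest flow is exceeded most often and the median observation maps to $z = 0$.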

Computation of empirical TND values
According to what we anticipated in Sect. 2.2, two different standardisation procedures are considered for computing TND values: $\mathrm{TND}_1$ and $\mathrm{TND}_2$.

TND 1
TND values are computed after standardisation by mean annual flow (MAF), that is the traditional way to standardise FDCs.

TND 2
TND values are computed for FDCs that are standardised by a rescaled mean annual precipitation ($\mathrm{MAP}^*$). The standardisation is performed by dividing each streamflow value by the empirical catchment-scale MAP value, rescaled to basin size as
$$\mathrm{MAP}^* = \mathrm{CF} \cdot \mathrm{MAP} \cdot A, \tag{6}$$
where $A$ is the catchment area and CF is a unit-conversion factor (e.g. if streamflows are in $\mathrm{m^3\,s^{-1}}$, MAP in $\mathrm{mm\,year^{-1}}$ and $A$ in $\mathrm{km^2}$, then $\mathrm{CF} = 3.171 \times 10^{-5}$). Once the dimensionless FDC is predicted for an ungauged site, a dimensional FDC can be obtained by multiplying the curve by a local catchment-scale estimate of $\mathrm{MAP}^*$.
The choice of two different standardisations of FDCs derives from two different purposes:
($\mathrm{TND}_1$) MAF standardisation is the traditional choice when an index-flow regionalisation approach, with MAF being the index flow, is used to regionalise FDCs (see Castellarin et al., 2004b; Ganora et al., 2009). Such an approach, as already mentioned, then needs an appropriate regional model for predicting the index flow in ungauged basins (e.g. a multiregression model); in fact, once a standardised FDC is predicted for an ungauged site, a dimensional FDC can be obtained by multiplying the dimensionless curve by an estimate of MAF for the site of interest. Setting up a regional model for predicting MAF is a critical and delicate step in the regionalisation procedure (see, e.g. Brath et al., 2001; Castellarin et al., 2004a).
($\mathrm{TND}_2$) $\mathrm{MAP}^*$ standardisation enables one to derive dimensionless FDCs to be used for regionalisation, and to predict a dimensional curve, which is ultimately what practitioners really need for addressing the water problem at hand, simply by multiplying the dimensionless FDC by MAP and catchment area. The uncertainty associated with predictions of MAP is generally significantly smaller than the uncertainty associated with predictions of MAF for ungauged sites, by virtue of the large availability of raingauges and the accuracy of geostatistical procedures for interpolating point observations (see, e.g. Brath et al., 2003; Castellarin et al., 2004a).
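The unit conversion behind the rescaled precipitation can be checked with simple arithmetic. The sketch below assumes a 365-day year and the units quoted in the text (MAP in mm per year, area in km², streamflow in m³ s⁻¹); the names `CF`, `map_star` and `to_dimensional` are illustrative only.

```python
from typing import List, Sequence

SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: 365-day year

# mm -> m (1e-3), km^2 -> m^2 (1e6), year -> s
CF = 1e-3 * 1e6 / SECONDS_PER_YEAR


def map_star(map_mm_per_year: float, area_km2: float) -> float:
    """Rescaled mean annual precipitation, in m^3 s^-1."""
    return CF * map_mm_per_year * area_km2


def to_dimensional(dimensionless_fdc: Sequence[float], reference: float) -> List[float]:
    """Multiply a dimensionless FDC by its reference streamflow."""
    return [reference * q for q in dimensionless_fdc]


# Hypothetical catchment: 1000 mm per year over 100 km^2.
ref = map_star(1000.0, 100.0)
curve = to_dimensional([2.0, 0.5, 0.1], ref)
```

Under these assumptions the factor evaluates to about 3.171e-5, so 1000 mm per year over 100 km² corresponds to a reference flow of roughly 3.17 m³ s⁻¹.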
Concerning the practical computation of empirical TND values, that is $\mathrm{TND}_1$ or $\mathrm{TND}_2$, the record length generally varies among the available stream gauges. Therefore, before applying Eq. (3) one needs to set a maximum duration $d_{\max}$ that can be used in order to compute the TND values consistently for all sites in the region. $d_{\max}$ should be set according to the minimum record length in the region (e.g. on the basis of a 5-year record if the minimum record length in the region is 5 years). Once a suitable reference streamflow is selected for performing the standardisation of the curves (i.e. MAF or $\mathrm{MAP}^*$), one can easily identify the number of durations $m$ for which the empirical dimensionless streamflow values are lower than 1 (i.e. streamflow values lower than MAF or $\mathrm{MAP}^*$) and compute TND according to Eq. (3). For instance, once the standard-normal duration $z_i$ associated with each standardised and log-transformed streamflow quantile $q_i$ has been computed, $\delta_i$ in Eq. (3) can be computed as
$$\delta_i = \tfrac{1}{2} \left| z_{i+1} - z_{i-1} \right|. \tag{7}$$
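A minimal sketch of the TND computation is given below. It rests on several assumptions that are not fixed by the text: durations are converted to standard normal variates before measuring the frequency intervals, the deviation is taken on untransformed dimensionless flows, the first and last values use a one-sided interval, and no truncation at a maximum duration is applied. The function name is hypothetical.

```python
from statistics import NormalDist
from typing import Sequence


def total_negative_deviation(flows: Sequence[float], reference: float) -> float:
    """Sum of |q_i - 1| * delta_i over dimensionless flows q_i lower than 1."""
    q = sorted(f / reference for f in flows)  # ascending dimensionless flows
    n = len(q)
    std = NormalDist()
    # Weibull durations 1 - i/(N+1), expressed as standard normal variates.
    z = [std.inv_cdf(1.0 - i / (n + 1)) for i in range(1, n + 1)]
    tnd = 0.0
    for i in range(n):
        if q[i] >= 1.0:
            break  # ascending order: every later value is also >= 1
        lower = z[i - 1] if i > 0 else z[i]      # one-sided at the first value
        upper = z[i + 1] if i < n - 1 else z[i]  # one-sided at the last value
        delta = 0.5 * abs(upper - lower)
        tnd += abs(q[i] - 1.0) * delta
    return tnd


# Two hypothetical samples with unit mean: a flashy regime and a damped one.
tnd_flashy = total_negative_deviation([0.1, 0.1, 0.1, 3.7], 1.0)
tnd_damped = total_negative_deviation([0.9, 0.9, 0.9, 1.3], 1.0)
```

The flashy sample, whose low flows sit far below the reference, yields the larger TND, in line with the interpretation of TND as an indicator of FDC steepness.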

Geostatistical interpolation of TND and FDCs
Empirical TND (i.e. $\mathrm{TND}_1$ and $\mathrm{TND}_2$) values are site specific and can be interpolated with geostatistical techniques. Top-kriging can be applied as illustrated in the stepwise description by Skøien (2013) and Skøien et al. (2014) through the suite of R functions included in the R package rtop, which can be accessed from the Comprehensive R Archive Network (CRAN, http://cran.r-project.org/). The application of top-kriging formally requires exactly the same steps in both cases (i.e. for empirical $\mathrm{TND}_1$ and $\mathrm{TND}_2$ values).
The point sample variogram for each standardisation (see Sect. 3.2) can be computed using the binned variogram technique (see Skøien et al., 2014, for details), for which sample points are aggregated in distance and area classes or bins under the hypothesis of isotropy, i.e. the variogram does not vary with direction. The sample point variogram can then be modelled through a suitable theoretical model (e.g. exponential, Gaussian, spherical, fractal, etc.). Skøien et al. (2006) recommend the use of the exponential variogram.
Once the empirical variogram is modelled, the number $n$ of neighbouring stations on which to base the spatial interpolation is set iteratively by the user on the basis of a first set of preliminary analyses, which aim at identifying the $n$ value that produces the most accurate predictions in cross-validation (i.e. for predicting TND values in ungauged locations). This means that the local prediction of TND values, i.e. the solution of the ordinary kriging linear system in Eq. (2), yields $n$ kriging weights.
We assume in our study that the $n$ kriging weights that are computed for predicting TND in ungauged locations can also be adopted for predicting the flow-duration curve in the same locations as a weighted average of $n$ standardised empirical curves,
$$\hat{\psi}(x_0, d) = \sum_{i=1}^{n} \lambda_i\, \psi(x_i, d), \tag{8}$$
where $\lambda_i$ are the top-kriging weights resulting from TND interpolation; $\psi(x_i, d)$ indicates the standardised empirical FDC for site $x_i$, that is a flow-duration curve in which streamflow quantiles are divided either by MAF or by $\mathrm{MAP}^*$; $\hat{\psi}(x_0, d)$ stands for the standardised FDC predicted for site $x_0$ over the entire duration domain $d$; and $n$ is the number of neighbouring sites in the vicinity of the site of interest.
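The weighted average of donor curves can be illustrated with the short sketch below. It assumes that all donor curves have already been resampled at the same set of durations and that the weights are supplied externally (for instance from a kriging of TND); the function name `predict_fdc` and the numbers are hypothetical.

```python
from typing import List, Sequence


def predict_fdc(
    weights: Sequence[float], donor_curves: Sequence[Sequence[float]]
) -> List[float]:
    """Weighted average of standardised donor FDCs, duration by duration."""
    if len(weights) != len(donor_curves):
        raise ValueError("one weight per donor curve is required")
    n_dur = len(donor_curves[0])
    if any(len(c) != n_dur for c in donor_curves):
        raise ValueError("donor curves must share the same durations")
    return [
        sum(w * c[k] for w, c in zip(weights, donor_curves))
        for k in range(n_dur)
    ]


# Three hypothetical donors sampled at four common durations.
donors = [
    [4.0, 1.5, 0.6, 0.2],
    [3.0, 1.2, 0.8, 0.4],
    [5.0, 1.0, 0.5, 0.1],
]
predicted = predict_fdc([0.5, 0.3, 0.2], donors)
```

When the weights are non-negative, the average of non-increasing donor curves is itself non-increasing, so the predicted curve keeps the monotone relationship between streamflow and duration.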
It is worth noting that while FDC predictions are performed by using empirical standardised FDCs as a whole (i.e. the prediction is performed over the entire duration interval), the computation of empirical TND values does not consider lower durations (see details in Sect. 2.2). Therefore, it will be particularly interesting to analyse the performance of the proposed procedure for predicting high flows. We will assess our assumption relative to a study area which was extensively analysed in previous studies in the context of regionalisation of FDCs (see, e.g. Castellarin et al., 2004a, 2007).

Study area and data
The study region includes 18 unregulated catchments, which previous studies describe as a rather heterogeneous group of sites in terms of physiographic and climatic characteristics (see, e.g. Castellarin et al., 2004a, 2007). Daily streamflow series were obtained for all basins from the stream gauges belonging to the former National Hydrographic Service of Italy (SIMN) over the time period 1920-2000. The length of the observed series ranges from 5 to 40 years (average record length: 18 years). Also, the empirical MAP value relative to each of the 18 catchments was estimated using data collected from a rather dense raingauge network (i.e. approximately 1 raingauge per 50 $\mathrm{km^2}$) during the same time interval of daily streamflow observations. Empirical FDCs were constructed from the daily streamflow series for the 18 catchments as described in Sect. 3.1. Empirical $\mathrm{TND}_1$ and $\mathrm{TND}_2$ values were computed for each catchment according to the standardisations described in Sect. 3.2, and are illustrated in Fig. 3. As shown in the left panel of Fig. 3, empirical $\mathrm{TND}_1$ values increase moving from south-east to north-west. This outcome reflects the lower perviousness of the northern catchments, which are therefore less capable of storing water volumes and consequently are characterised by steeper empirical FDCs. Moving from south-east to north-west, one can note for $\mathrm{TND}_2$ (right panel of Fig. 3) similar patterns to those observed for $\mathrm{TND}_1$ values, i.e. TND values tend to increase along the SE-NW direction. On the one hand, this general behaviour suggests that in our case study mean annual flow (MAF) is largely controlled by precipitation; on the other hand, karst phenomena associated with the presence of fractured limestones result in an increase of $\mathrm{TND}_2$ for the southern catchments, i.e. sites 3006, 3003, and 3002, for which subsurface flows play a significant role.
Table 1 illustrates the variability over the study region of catchment area $A$ ($\mathrm{km^2}$), mean annual flow MAF ($\mathrm{m^3\,s^{-1}}$), mean annual precipitation MAP (mm), $\mathrm{MAP}^*$ ($\mathrm{m^3\,s^{-1}}$), and empirical $\mathrm{TND}_1$ (-) and $\mathrm{TND}_2$ (-) values, by reporting the minimum, mean, and maximum values together with the first, second, and third quartiles of each index. For detailed information on the study area please refer to the Supplement.

Prediction of FDCs in cross-validation
We will refer to the proposed approach as TNDTK (i.e. total negative deviation top-kriging) in the remainder of the paper. This section illustrates in detail the application of TNDTK in cross-validation, describing the accuracy of the procedure when applied in ungauged basins.
Table 1. Study catchments: variability of drainage area ($A$), mean annual flow (MAF), mean annual precipitation (MAP), rescaled mean annual precipitation ($\mathrm{MAP}^*$), empirical $\mathrm{TND}_1$ and $\mathrm{TND}_2$, and length of the observed streamflow series ($Y$); minimum, maximum, mean, first, second (median), and third quartiles of the sample distributions.

Standardisation by MAF
The application of TNDTK to the prediction of FDCs standardised by MAF requires the preliminary application of top-kriging to $\mathrm{TND}_1$ values, which we performed by first calculating the binned sample variogram, and then by modelling the binned empirical data with a 5-parameter "modified" exponential theoretical variogram (a combination of an exponential and a fractal model, see details in Skøien et al., 2006).
As an example, Fig. 4 illustrates the comparison between some selected bins of the sample variogram and the regularised semi-variance for those bins (see also Skøien et al., 2014, and Fig. 4 therein). The numbers in the legend refer to different combinations of catchment areas, so that, e.g., 300 vs. 75 means the regularised variogram as a function of distance for two catchments of size ∼ 300 and ∼ 75 $\mathrm{km^2}$, respectively, while the solid lines represent a regularised variogram of equally sized catchments (∼ 300 $\mathrm{km^2}$). In the same figure, the black solid line represents the fitted theoretical point variogram, and its five parameters were obtained through the weighted least squares (WLS) regression method from Cressie (1985), by fitting simultaneously all regularised binned variograms that were computed for various area classes (see Skøien et al., 2014). Top-kriging was then iteratively applied to the study catchments in cross-validation to identify the most suitable number of neighbours $n$. Preliminary iterations indicated $n = 6$ as a good candidate for the study area (see Sect. 5.5.2).
We then used the kriging weights obtained for predicting $\mathrm{TND}_1$ in cross-validation at each site to estimate dimensionless FDCs. In order to assess the prediction accuracy and to compare the performances of different models, we chose to resample each curve using $p = 20$ points equally spaced in the log-normal representation (see Sect. 2.2 and Fig. 2), adopting as duration extremes $d_1 = 0.00135$ (lower bound) and $d_{20} = 0.9986$ (upper bound), where the $d_1$ and $d_{20}$ values are selected by referring to the minimum record length in the regional sample, i.e. 5 years. Predictions were performed through a weighted average, as expressed in Eq. (8), using the optimal top-kriging cross-validation weighting scheme, i.e. $\lambda_i$ with $i = 1, \dots, n$, where $n = 6$.
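The resampling step can be sketched as follows. This is only an illustration: it assumes that "equally spaced in the log-normal representation" means equal steps in the standard normal variate of the duration, and that curves are interpolated linearly between log-flows and that variate; both function names are hypothetical, and flows must be strictly positive for the logarithm to exist.

```python
import math
from statistics import NormalDist
from typing import List, Sequence


def resampling_durations(
    d_low: float = 0.00135, d_high: float = 0.9986, p: int = 20
) -> List[float]:
    """Durations whose standard normal variates are equally spaced."""
    std = NormalDist()
    z_low, z_high = std.inv_cdf(d_low), std.inv_cdf(d_high)
    step = (z_high - z_low) / (p - 1)
    return [std.cdf(z_low + k * step) for k in range(p)]


def resample_curve(
    durations: Sequence[float], flows: Sequence[float], targets: Sequence[float]
) -> List[float]:
    """Interpolate an FDC at target durations, linearly in (z, log-flow)."""
    std = NormalDist()
    pts = sorted((std.inv_cdf(d), math.log(q)) for d, q in zip(durations, flows))
    out = []
    for d in targets:
        z = std.inv_cdf(d)
        if not pts[0][0] <= z <= pts[-1][0]:
            raise ValueError("target duration outside the empirical range")
        for (z0, y0), (z1, y1) in zip(pts, pts[1:]):
            if z0 <= z <= z1:
                t = 0.0 if z1 == z0 else (z - z0) / (z1 - z0)
                out.append(math.exp(y0 + t * (y1 - y0)))
                break
    return out


grid = resampling_durations()
# Hypothetical three-point curve, evaluated at the median duration.
mid_flow = resample_curve([0.001, 0.5, 0.999], [100.0, 10.0, 1.0], [0.5])
```

The default bounds correspond to variates of roughly −3 and +3, so the 20 points cover both tails of the curve with the same resolution.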
As mentioned in Sect. 1, a leave-one-out cross-validation procedure (LOOCV) was performed in order to simulate ungauged conditions at each gauged site in the study area and to quantitatively test the reliability and robustness of TNDTK for predicting FDCs in ungauged basins (see examples in Kroll and Song, 2013; Salinas et al., 2013; Wan Jaafar et al., 2011; Srinivas et al., 2008).
The LOOCV can be summarised by the following steps:
1. empirical and theoretical variograms are computed using the entire data set of $\mathrm{TND}_1$ values;
2. one of the gauging stations, say $s_i$, is removed from the set of available stations;
3. a top-kriging regional model for predicting $\mathrm{TND}_1$ values is developed using the remaining $N_\mathrm{site} - 1$ sites;
4. $\mathrm{TND}_1$ is predicted for site $s_i$ as a weighted average of the empirical values computed for $n = 6$ neighbouring stations (see, e.g. Fig. 5);
5. the weighting scheme computed in step 4 is then used to predict a standardised FDC for site $s_i$ through Eq. (8);
6. steps from 2 to 5 are repeated $N_\mathrm{site} - 1$ times.
The accuracy of the cross-validated standardised FDCs was thoroughly assessed by means of several performance indices and diagrams, which are illustrated in detail in Sect. 5.3. The algorithm described above is tailored for the proposed procedure, TNDTK, but one can implement and apply similar resampling procedures to any regional model for simulating ungauged conditions.

Standardisation by MAP *
Top-kriging was also applied to predict empirical TND 2 values as well as FDCs standardised by MAP * . The number of neighbouring stations n, theoretical variogram, and fitting procedure were the same as for standardisation based on MAF. We used a LOOCV analogous to the one described above (i.e. standardisation by MAF) in order to identify the weighting scheme to be used for simulating ungauged conditions for all of the study basins.
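Furthermore, in order to obtain dimensional predictions, each estimated curve $\hat{\psi}(x_0, d)$ was then transformed into a dimensional FDC (here written as $\hat{Q}(x_0, d)$) as

$$\hat{Q}(x_0, d) = \hat{\psi}(x_0, d) \cdot \mathrm{MAP}^*(x_0),$$

where $\mathrm{MAP}^*(x_0)$ indicates the local MAP * value.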

Reference regional models of FDCs
The same gauged stations and data considered herein were analysed in previous studies that developed regional models of FDCs (see Castellarin et al., 2004a, 2007). This enabled us to identify for both TNDTK applications two different reference regional models for comparing the performance of the approaches. Below we give a brief description of such regional models.

Standardisation by MAF
TNDTK predictions of dimensionless FDCs were compared against the dimensionless curves predicted by two reference regional models, which we also applied in cross-validation through a LOOCV procedure: KMOD and MEAN.

KMOD
KMOD adopts a unit-mean kappa distribution as parent distribution for representing standardised FDCs (see, e.g. Hosking and Wallis, 1997). Three parameters, namely the location parameter and the two shape parameters, were estimated by applying an ordinary least squares (OLS) regression algorithm. The scale parameter is derived as a function of the previous three under the hypothesis that the mean of the distribution is equal to one. Castellarin et al. (2007) regressed the parameter estimates against a suitable set of catchment descriptors through a stepwise-regression procedure in order to enable the estimation of the kappa distribution in ungauged sites. KMOD is therefore a traditional parametric regional model, which we adopted as the benchmark regional model for predicting standardised FDCs (for details see Castellarin et al., 2007).

MEAN
MEAN is a simple approach to regionalisation, which neglects the physiographic and climatic heterogeneities of the study area, and predicts the standardised FDC for any ungauged site in the region as the average of all available standardised FDCs. We adopted MEAN as a baseline model due to its crude assumption and the resulting low-level accuracy.

Standardisation by MAP *
TNDTK predictions of dimensional FDC were compared with the predictions resulting from two benchmark models, both applied in cross-validation: LLK and KMOD.

LLK
This model, based on an index-flow approach (see Castellarin et al., 2004b), adopts a two-parameter log-logistic (LL) distribution as a suitable distribution for describing the empirical frequency of the annual flow series (i.e. index-flow) and a four-parameter kappa (K) as the parent distribution for dimensionless daily streamflow frequency. Parameters of both distributions were estimated using the routine based on L moments developed by Hosking and Wallis (see Hosking and Wallis, 1997), re-estimated through a constrained sequential quadratic programming optimisation procedure aimed at minimising the squared differences between theoretical and empirical non-exceedance probabilities, and then regressed against a suitable set of catchment descriptors through a stepwise-regression procedure. More details can be found in Castellarin et al. (2007).

KMOD
Same as KMOD for dimensionless FDCs prediction, but using a multiregression regional model to predict MAF as a function of a suitable set of catchment descriptors in ungauged basins (see, e.g.Castellarin et al., 2007 for details).

Performance indices
TNDTK performance in cross-validation is analysed for both standardisation methods (MAF and MAP * ) and compared with the results of reference regional models through several performance indices and diagrams. A detailed analysis of model performances in terms of relative prediction residuals, i.e. signed relative errors between modelled and empirical values, is presented through error-duration curves. The curves show relative residuals against duration arranged in grey nested bands containing 50, 80, and 90 % of relative residuals, while a solid line illustrates the progression with duration of the median residual. We also use as performance descriptors the scatter diagrams between cross-validated and empirical streamflow quantiles associated with the same duration. On the basis of the same information, NSE (Nash-Sutcliffe efficiency) indices for each model are computed, both for natural and log-transformed streamflows. Such diagrams and indices provide a complete representation of the performance of each model in cross-validation for the entire streamflow regime, from low durations (high flows and floods) to high durations (droughts).
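For reference, the two efficiency indices mentioned above can be computed as in the following minimal Python sketch. The function names `nse` and `lnse` are illustrative; the sketch assumes strictly positive flows for the log-transformed variant and uses the standard Nash-Sutcliffe definition (one minus the ratio between the sum of squared errors and the sum of squared deviations of the observations from their mean).

```python
import math


def nse(observed, predicted):
    """Nash-Sutcliffe efficiency between observed and predicted values."""
    mean_obs = sum(observed) / len(observed)
    squared_errors = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    squared_deviations = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - squared_errors / squared_deviations


def lnse(observed, predicted):
    """NSE computed on log-transformed values (requires positive flows)."""
    return nse([math.log(o) for o in observed],
               [math.log(p) for p in predicted])
```

Because the logarithm compresses high flows, LNSE gives more weight to the low-flow end of the curve than NSE does. The base of the logarithm does not affect the result, since a change of base rescales numerator and denominator by the same factor.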
Concerning the performances of the model at each site, and in particular the assessment of the number of sites for which TNDTK is more reliable than the selected reference regional models, we adopt an error index that summarises the prediction performance over the entire duration range by deriving the distance between predicted and empirical FDCs, as proposed in Ganora et al. (2009):

$$\delta_{\mathrm{mod}} = \sum_{k=1}^{p} \left| q_{k,\mathrm{emp}} - \hat{q}_{k,\mathrm{mod}} \right|,$$

where p = 20 is the number of resampled points, while $q_{k,\mathrm{emp}}$ and $\hat{q}_{k,\mathrm{mod}}$ stand for the empirical and predicted streamflow quantiles (dimensionless or dimensional, depending on the application) ranked at the kth duration.
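A minimal Python sketch of this error index and of the site-by-site comparison it supports is given below. It assumes the distance is the sum of absolute differences over the resampled points; the helper names `curve_distance` and `count_better`, and the example site labels and values, are hypothetical.

```python
def curve_distance(q_emp, q_mod):
    """Sum of absolute differences between empirical and predicted quantiles."""
    if len(q_emp) != len(q_mod):
        raise ValueError("curves must be resampled on the same durations")
    return sum(abs(e - m) for e, m in zip(q_emp, q_mod))


def count_better(delta_new, delta_ref):
    """Number of sites where the first model has a smaller error index."""
    return sum(1 for site in delta_new if delta_new[site] < delta_ref[site])


# Illustrative error indices for three hypothetical sites.
delta_a = {"s1": 0.2, "s2": 0.9, "s3": 0.4}
delta_b = {"s1": 0.5, "s2": 0.6, "s3": 0.7}
n_better = count_better(delta_a, delta_b)
```

Counting sites in this way mirrors reading a scatter diagram of the two error indices against the 1 : 1 line.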

Standardisation by MAF: dimensionless FDCs
Figure 5 (left) reports empirical TND 1 values against their top-kriging predictions in cross-validation. The overall NSE is 0.81. In the same figure one can observe a poor prediction (i.e. significant underprediction) for site 3701, which can be interpreted as a result of the very high empirical TND value obtained for that site (site 3701, TND 1 = 9.8 [-], A = 605 [km²]), the largest in the study region.
Concerning the predictions of standardised FDCs, the error-duration curves of Fig. 6 clearly show that TNDTK significantly outperforms KMOD and MEAN: the distribution of relative residuals plotted against duration is characterised by narrower bands (50, 80, and 90 % of the relative errors) for the entire duration interval, even though this behaviour is more marked for lower durations. The progression with duration of the median residual (black thick line) in the same figure highlights unbiasedness, being close to zero for the entire duration interval. Scatter diagrams between predicted and observed standardised flows indicate high accuracy of TNDTK, with NSE = 0.958 and LNSE = 0.96, the latter computed for log-flows. MEAN and KMOD are associated with lower NSE and LNSE values.
Finally, Fig. 7 presents the overall absolute error for each site. In particular, Fig. 7 illustrates scatter diagrams of δ mod in two panels, where the x axes report errors computed for the proposed model (TNDTK), while the y axes report in turn errors from the reference models. In this representation an equivalence between model performances is represented by the solid bisecting line; therefore, a point falling above the 1 : 1 line (top-left) indicates that TNDTK provides a better prediction than the reference model, whereas a point falling below the line indicates the opposite. Figure 7 clearly shows that KMOD is less accurate than TNDTK for 14 out of 18 sites, while MEAN performs the poorest, with 16 out of 18 sites characterised by higher δ values relative to TNDTK.

Standardisation by MAP * : dimensional FDCs
The right panel of Fig. 5 highlights a satisfactory performance of top-kriging for predicting TND 2 values in ungauged basins: the NSE value is approximately 0.6, and site 3701 still presents an outlying behaviour for the same reason explained above.
Although the cross-validated predictions of TND 2 are less accurate than those of TND 1 , TNDTK performance for predicting dimensional FDCs is good. Comparing TNDTK with the LLK model, Fig. 8 shows for LLK narrower bands for d < 0.8, particularly the band illustrating 90 % of residuals, while in the low-flow range (i.e. 0.8 < d < 1) TNDTK shows slightly better performances, resulting in narrower error bands. The bottom panels in the same figure report the scatter diagrams of predicted vs. observed dimensional flows, expressing the goodness and reliability of TNDTK when used for predicting dimensional FDCs on the basis of MAP * . Even though TNDTK shows an NSE = 0.914, which is lower than the NSE value associated with LLK and equal to the KMOD NSE value, TNDTK is associated with the highest LNSE value (i.e. 0.922), highlighting a good performance of TNDTK for low flows. Figure 9 confirms the good performance of TNDTK against LLK and KMOD, showing in both cases better accuracy for 10 out of 18 catchments. Also, among the 8 catchments for which LLK and KMOD perform better than TNDTK, it is worth noting that performances are practically coincident with TNDTK in 2 cases for LLK (i.e. sites 3006 and 2201) and 3 cases for KMOD (i.e. sites 1004, 2101 and 3006).

Consistency of the kriging weighting scheme
Figure 8. Predictions of dimensional FDCs (standardisation by MAP * ). Cross-validation of regional models: KMOD (right), LLK (centre), TNDTK (proposed approach, left); error-duration bands reporting the profile of the median relative error (thick black line) and the bands containing 50, 80, and 90 % of the relative errors (grey nested bands) as a function of duration (top); empirical vs. predicted dimensional streamflows (bottom).

The core assumption of the proposed method is that the top-kriging weights λ identified for predicting TND values can be used to weight empirical FDCs. In order to test and validate this assumption, we analysed the relationship between such weights and the degree of dissimilarity between empirical FDCs. In particular, we computed for each pair of catchments i and j a dissimilarity metric, β i,j , proposed by Ganora et al. (2009), which can be expressed as follows:

$$\beta_{i,j} = \sum_{k=1}^{365} \left| q_{i,k} - q_{j,k} \right|, \qquad (11)$$

where 365 is the number of points used for the resampling and $q_{i,k}$ and $q_{j,k}$ are the streamflow values associated with duration $d_k = k/(365 + 1)$ for sites i and j, respectively. If our assumption is correct, large β values (i.e. dissimilar curves) should be associated with small λ values, and vice versa. Top-kriging takes into account the nested structure of catchments; therefore, where upstream-downstream correlation occurs (i.e. similar curves with small β), a relatively high λ value is expected.
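The pairwise computation of the dissimilarity metric β can be sketched in Python as follows. The sketch assumes the absolute-difference form over 365 resampled points at durations k/(365 + 1); the function name `pairwise_dissimilarity` and the synthetic linear curves are hypothetical and serve only to show the bookkeeping over ordered pairs with i ≠ j.

```python
from itertools import permutations

N_POINTS = 365
# Durations d_k = k / (365 + 1), k = 1, ..., 365.
durations = [k / (N_POINTS + 1) for k in range(1, N_POINTS + 1)]


def pairwise_dissimilarity(curves):
    """Return {(i, j): beta_ij} for all ordered pairs of distinct sites."""
    return {
        (i, j): sum(abs(a - b) for a, b in zip(curves[i], curves[j]))
        for i, j in permutations(curves, 2)
    }


# Synthetic, non-increasing curves for 18 hypothetical sites (not study data).
curves = {f"s{m}": [m * (1.0 - d) for d in durations] for m in range(1, 19)}
beta = pairwise_dissimilarity(curves)
```

With 18 sites the dictionary holds 18 × 17 = 306 ordered pairs, each of which can then be plotted against the corresponding kriging weight.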
Figure 11 (right panel) plots β i,j values computed with Eq. (11) for each pair of basins in the study area, with i, j = 1, . . ., 18 and i ≠ j (i.e. 306 points), against the corresponding λ i,j weights obtained by running a TNDTK session with TND = TND 1 and, necessarily, a number of neighbours n = 17 (i.e. all stations need to be considered if we have to compare β i,j with λ i,j for i, j = 1, . . ., 18 and i ≠ j ). The figure also highlights the differences between nested (large black dots) and un-nested (grey circles) catchment pairs. The figure clearly shows that the hypotheses are satisfied: (1) weights λ i,j show a descending pattern as β i,j values increase and (2) none of the nested pairs of catchments is associated with a high or very high β value (i.e. all nested catchments are on the left-hand side with small β values).

Sensitivity to the number of neighbours n
As mentioned in Sects. 5.1.1 and 5.1.2, we set the number of neighbours n = 6 in Eq. (8) for performing the prediction of FDCs. We identified this value through a sensitivity analysis, which was carried out by running multiple top-kriging sessions, each one referring to a different n value. The main outcome of our sensitivity analysis is that the performance of the approach depends only weakly on n. Figure 10 shows the results of the sensitivity analysis for both standardisations (i.e. MAF and MAP * ) obtained in each session in terms of NSE and LNSE for n ranging from 3 to 17 (18 being the total number of catchments in the study area). The left panel refers to dimensionless FDCs (i.e. standardisation by MAF) and shows for n = 6 the best trade-off between NSE and LNSE. Nevertheless, NSE and LNSE are rather high for all n values. Likewise, the right panel refers to the prediction of dimensional FDCs (i.e. standardisation by MAP * ) and shows that the sensitivity of NSE values to n is rather low for the study area, while in terms of LNSE we obtain slightly better performances for n ≤ 6. As a result of the analysis we selected n = 6 for all applications for the sake of consistency, even though selecting a different value for n does not impact the results significantly.

Figure 9. Comparison between TNDTK, KMOD and LLK models in terms of distances between empirical and predicted dimensional FDCs, δ mod (where mod stands for TNDTK, KMOD or LLK); values of δ TNDTK are reported against values of δ LLK (left) or δ KMOD (right) for each study basin; the solid line represents the ratio 1 : 1 between the errors, while the dashed lines delimit the areas where errors for the TNDTK model are twice as large as the LLK or KMOD ones, or vice versa. Points above the solid line represent curves that are better estimated by TNDTK; points above the top dashed line represent curves much better estimated by TNDTK (see also Ganora et al., 2009, Fig. 8).

Sensitivity to the degree of nesting of the study catchments
From an operational viewpoint it is important to understand whether the degree of nesting of the study catchments impacts the performance of the approach. Better performances are to be expected in all those cases in which empirical FDCs can be constructed upstream or downstream of the (ungauged) site of interest. In order to quantify this impact we validated TNDTK by removing all catchments that are nested with the catchment of interest. Figure 11 (left panel) shows all nested pairs through a graphical matrix where nested pairs are highlighted with large black dots (catchment IDs are also indicated). First, we identified all nested pairs of catchments (i.e. basin-subbasin relationships). Second, we used a cross-validation procedure similar to the procedure described in Sect. 5.1.1, in which, at step 2, we neglected all information collected not only for the site of interest, but also upstream or downstream from that site. We termed this procedure leave nested out cross-validation (LNOCV). It is worth noting that LNOCV estimates empirical and theoretical variograms at each step of the validation procedure, differently from LOOCV, where they are estimated once beforehand (see step 1 in Sect. 5.1.1).
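The donor selection that distinguishes LNOCV from LOOCV can be sketched in Python as follows. The function name `lnocv_donors`, the site labels and the list of nested pairs are hypothetical; the sketch only shows how the target site and every site nested with it would be excluded before the variogram is re-estimated and the weights are computed.

```python
def lnocv_donors(target, sites, nested_pairs):
    """Sites left for predicting `target` once nested sites are dropped.

    nested_pairs: iterable of (site_a, site_b) tuples marking
    basin-subbasin relationships, in either order.
    """
    excluded = {target}
    for a, b in nested_pairs:
        if a == target:
            excluded.add(b)
        elif b == target:
            excluded.add(a)
    return [s for s in sites if s not in excluded]


# Illustrative network: A is nested with B and with C; D is independent.
sites = ["A", "B", "C", "D"]
nested_pairs = [("A", "B"), ("C", "A")]
donors_for_a = lnocv_donors("A", sites, nested_pairs)
donors_for_d = lnocv_donors("D", sites, nested_pairs)
```

Because the donor set changes with each target, any quantity fitted on the donors (here, the variograms) has to be recomputed inside the loop, which is consistent with the description above.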
We report here only the results referring to the prediction of dimensionless FDCs (i.e. standardisation by MAF). Results obtained relative to dimensional FDCs (i.e. standardisation by MAP * ) are analogous. The results, shown in Fig. 12, highlight a slight reduction of performances, with NSE and LNSE indices equal to 0.95 and 0.92, respectively (central panel). In particular, looking at the error-duration bands (left panel in the same figure), the distribution of relative residuals presents slightly wider bands and a larger bias for the median line, especially relative to the high durations (low flows). Moreover, comparing the overall error index for each site produced by the two cross-validations (i.e. LOOCV and LNOCV) (right panel), most of the points (14 out of 18) fall above the solid bisecting line, confirming an impoverished prediction capability of the latter approach. Nevertheless, the detriment of performances obtained with LNOCV appears to be limited and associated in particular with the low-flow regime (high-duration values). This was to be expected as this portion of the FDC is the hardest to predict (see, e.g. Figs. 6, 8 and 12 and Castellarin et al., 2004a), and therefore not considering catchments having their outlet located upstream or downstream of the target site has the strongest effects due to the strong hydrological affinity of these catchments with the target catchment (i.e. they share local climate as well as physiographic and geological characteristics; see, e.g. Laaha et al., 2014; Castiglioni et al., 2011).

The results of the cross-validation show that top-kriging can be effectively applied for predicting standardised FDCs (i.e. flow-duration curves divided by the mean annual flow, MAF) in the study region. In particular, the interpolation strategy applied in this study (termed total negative deviation top-kriging, TNDTK), that is (1) the computation of the streamflow index total negative deviation (TND) for empirical standardised FDCs, (2) the modelling of spatial correlation of empirical TND values along the stream network, and (3) the identification of a linear-weighting scheme for averaging empirical dimensionless FDCs on the basis of the correlation model identified at step (2), results in reliable predictions of standardised FDCs in ungauged sites.
It is worth highlighting that the application of the procedure may produce negative weights (see Fig. 11 for the case in which the number of neighbours is set to n = 17). Negative weights are often the result of the so-called screening effect (i.e. remote data points are screened by a set of closer data locations in front of them; see, e.g. Deutsch, 1996), which can be accentuated by a zero-nugget variogram model, as it is in our case. We did not experience adverse effects associated with negative weights in our analysis, but, in case the presence of negative weights results in non-physical estimates (e.g. negative streamflow values), one may set all weights to be positive through the rtop routine options (see Skøien et al., 2014).
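One simple way to remove negative weights after the fact is sketched below in Python: negative weights are set to zero and the remaining ones are rescaled to sum to one. This is an illustrative assumption only; it is not claimed to reproduce the correction implemented in rtop, and the function name `clip_and_renormalise` is hypothetical.

```python
def clip_and_renormalise(weights):
    """Set negative weights to zero and rescale the rest to sum to one."""
    clipped = [max(w, 0.0) for w in weights]
    total = sum(clipped)
    if total == 0.0:
        raise ValueError("no positive weight left to renormalise")
    return [w / total for w in clipped]


# Illustrative weights summing to one, with one negative entry.
adjusted = clip_and_renormalise([0.6, 0.5, -0.1])
```

With non-negative weights, a weighted average of non-increasing curves is itself non-increasing and cannot fall below the smallest donor value, so negative streamflow estimates are avoided. The price is that the adjusted weights are no longer the solution of the kriging system.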
The curves predicted in cross-validation are unbiased for the entire duration range (i.e. from high to low flows) and the prediction residuals are as small as, or smaller than, the residuals resulting from the application of traditional regionalisation schemes. Analysing the results in detail, Fig. 7 indicates that TNDTK performed significantly worse than the baseline and benchmark regional models in three cases only. The benchmark model (i.e. KMOD) better predicts the FDC for site 3701 (left panel of Fig. 7). As illustrated in the right panel in Fig. 2, site 3701 is associated with the steepest empirical flow-duration curve of the study region and therefore the highest empirical TND value (see Table 1 and Figs. 1 and 5).
The core assumption of top-kriging is that hydrological similarity is mainly controlled by spatial proximity, and this may represent an important limitation in some regions where geology and/or morphology have a large impact on streamflows, such that the hydrological regime of nearby catchments may be quite different. This could in principle explain the poor prediction obtained in the study for site 3701, which is characterised by a very limited permeability (i.e. can be regarded as impervious) relative to the surrounding catchments and, consequently, a much steeper empirical FDC than the neighbouring sites. Conversely, information on permeability is explicitly incorporated in the multiregression models included in KMOD (see, e.g. Castellarin et al., 2007). Furthermore, the baseline model MEAN significantly outperforms TNDTK for sites 2502 and 801, and this result can be explained by noting that both sites are associated with empirical standardised curves that are well represented by the average standardised FDC for the study region (see right panel in Fig. 2 and Castellarin, 2014), that is the curve associated with the baseline regional model (MEAN) in cross-validation.
Aside from the peculiar cases highlighted above, TNDTK shows a high performance in cross-validation that is likely to result from several advantages of the proposed procedure. TNDTK dispenses with the critical phase of delineating hydrologically homogeneous pooling groups of sites (see Castellarin et al., 2004a) by exploiting the spatial correlation structure of the streamflow regime (see Archfield and Vogel, 2010). Moreover, the approach does not require setting up multiregression models for estimating the parameters of a mathematical expression (e.g. a theoretical frequency distribution) controlling the shape of the curve, which are often associated with a large uncertainty and limited robustness (see Castellarin et al., 2007); TNDTK predicts the shape of the curve for an ungauged basin through a non-parametric procedure as a weighted average of empirical standardised FDCs (e.g. Smakhtin et al., 1997; Ganora et al., 2009). The weighting scheme also ensures for the predicted curve a non-increasing (i.e. monotone) relationship between streamflow and duration, which is one of the main properties of flow-duration curves.
The study also points out that TNDTK can be used for predicting dimensional FDCs in ungauged sites on the basis of a minimal set of hydrological information, that is (a) empirical FDCs for a group of gauged basins and (b) an estimate of mean annual precipitation (MAP) for all gauged basins in the region, as well as for the target ungauged basin. By comparing Figs. 6 and 7 with Figs. 8 and 9 one may get the impression that a standardisation of streamflows by MAP * reduces TNDTK performance relative to a standardisation by MAF. It is worth pointing out that one cannot directly compare the results for these two cases since Figs. 6 and 7 (standardisation by MAF) refer to the prediction of dimensionless FDCs, while Figs. 8 and 9 (standardisation by MAP * ) refer to the prediction of dimensional FDCs. Moreover, concerning the prediction of dimensional FDCs (standardisation by MAP * ), the similar performances between TNDTK and the benchmark regional models are rather surprising; while the benchmark regional models incorporate a regionalisation of empirical mean annual flows, TNDTK uses only local information on precipitation for predicting a dimensional FDC in the target site. Even though TNDTK does not show a clear supremacy relative to more traditional approaches, it has to be highlighted that its application is rather straightforward and does not require any subjective choice, which, together with the fact that the procedure can be implemented with a limited amount of input data, makes TNDTK a very interesting alternative for predicting dimensional FDCs.

Future analyses
Our study is evidently a preliminary analysis, which tackles the exploration of geostatistical approaches for predicting FDCs. Therefore, the results of our study open up several possible research avenues. In particular, we focus on the prediction of long-term steady-state FDCs, on the basis of period-of-record (POR) empirical FDCs. Applicability of TNDTK to the prediction of annual FDCs for typical hydrologic years, as well as for particularly wet or dry years (see, e.g. Vogel and Fennessey, 1994; Castellarin et al., 2004b), is an open problem that needs to be specifically and quantitatively addressed. Evidently, the proposed approach needs to be further investigated in other geographical contexts. In particular, the application of TNDTK for predicting dimensional FDCs on the basis of catchment-scale MAP values deserves some additional tests that aim at verifying its suitability for significantly different climatic conditions (e.g. arid regions, alpine catchments, etc.), in which the streamflow regime is not as heavily controlled by the rainfall regime as it is in the considered case study.
Also, future analyses will focus on a comparison between TNDTK and other methods that use weighted combinations from dynamic pooling groups of sites, such as the region of influence (RoI) approach (e.g. Burn, 1990; Holmes et al., 2002). This will enable a better understanding of the potential of geostatistical techniques and the informativeness of the spatial structure of signatures of the streamflow regime, such as TND, relative to approaches that incorporate other information than spatial proximity when it comes to the prediction of FDCs in ungauged sites (see, e.g. Merz and Blöschl, 2005, for the prediction of flood quantiles).
Finally, we propose to summarise empirical flow-duration curves through the index TND, which expresses the total negative deviation of the curve from a reference streamflow value. We are aware that the proposed procedure needs to be further tested in different geographical and climatic contexts before its general validity can be acknowledged. Also, we believe that the TND index identified in this study incorporates a wealth of hydrological information and has the potential to be extremely useful in a number of hydrological problems other than the prediction of FDCs, such as catchment classification (see Wagener et al., 2007; Di Prinzio et al., 2011) or regionalisation studies (Laaha and Blöschl, 2006; Gaál et al., 2012). Future analyses will specifically address these points. Moreover, future analyses will focus on the identification of a global indicator of the similarity between FDCs to be used to analyse and model geographical correlation between the empirical curves themselves; this would enable one to base the definition of the linear-weighting scheme on a more comprehensive and descriptive indicator of the streamflow regime.

Conclusions
This study explores the possibility to extend the application of top-kriging, which is generally used for spatial interpolation of point streamflow indices (e.g. estimated flood quantiles, low-flow indices, temperature, etc.), to the prediction of period-of-record flow-duration curves (FDCs) in ungauged basins. Top-kriging is used in this study to geostatistically interpolate standardised FDCs along the stream network of a broad geographical area in central-eastern Italy. We identify the linear weighting typical of any kriging procedure by modelling the spatial correlation structure of an empirical streamflow index, which was shown in the study to be particularly useful in describing the daily streamflow regime of a given catchment. In particular, we define the index, which we term total negative deviation (TND), as the overall negative deviation of an empirical FDC relative to a reference streamflow value used for the standardisation of the curve itself. We consider two different reference streamflow values, that is the mean annual flow (MAF) and catchment-scale mean annual precipitation × the drainage area of the catchment (MAP * ), and we use these streamflow values for standardisation of the empirical FDCs prior to regionalisation. The standardisation based on MAF enables us to develop a top-kriging-based regional model of dimensionless FDCs, while the standardisation based on MAP * enables us to predict dimensional flow-duration curves in ungauged basins via top-kriging. The two regional estimators were cross-validated and compared in terms of prediction performances with other regional models of dimensionless and dimensional flow-duration curves that were previously developed for the study area. The comparison highlights good performances of the proposed procedure, which we termed total negative deviation top-kriging (TNDTK), relative to traditional regional models. TNDTK is unbiased throughout the entire duration interval and characterised by particularly small residuals for high durations (i.e. improved predictions of low flows). Moreover, the prediction accuracy of TNDTK is similar to, or higher than, that of more complex regionalisation approaches that use multiregression models incorporating information on the permeability, morphology, climate, etc. of the catchment. This result seems to confirm the value of spatial proximity relative to catchment attributes (see, e.g. Merz and Blöschl, 2005) when hydrological predictions in ungauged basins are concerned.

Hydrol. Earth Syst. Sci., 18, 3801-3816, 2014, www.hydrol-earth-syst-sci.net/18/3801/2014/
The Supplement related to this article is available online at doi:10.5194/hess-18-3801-2014-supplement.

Figure 2. FDC representations: log-natural scale (left), log-normal scale (centre); the panels also show a resampling of the empirical curve (circles) which employs 20 equally spaced points in the standard-normal space; standardised empirical FDCs for the study region (right), FDCs for sites 3701, 801, 2502 and the regional mean FDC are highlighted.

Figure 3. Empirical TND 1 and TND 2 values for the study catchments.

Figure 4. Sample variogram (points) and regularized variograms as a function of distance and area. The black solid line represents the fitted point variogram, the blue line represents the regularized variogram of equally sized catchments (∼ 300 km²), and dotted lines show the effect of combinations of different catchment sizes in square kilometres (see also Fig. 4 in Skøien et al., 2014).

Figure 5. Top-kriging predictions of TND 1 and TND 2 values in cross-validation; predictions for site 3701 are highlighted.

Figure 6. Cross-validation of regional models: MEAN (right), KMOD (centre), TNDTK (proposed approach, left); error-duration bands reporting the profile of the median relative error (thick black line) and bands containing 50, 80, and 90 % of the relative errors (grey nested bands) as a function of duration (top); empirical vs. predicted standardised streamflows (bottom).

Figure 11. Nested structure of the study area: (left) black dots identify nested pairs (i.e. basin-subbasin relationships); (right) top-kriging weights λ i,j obtained for predicting TND 1 vs. the corresponding degree of dissimilarity between empirical FDCs for sites i and j , β i,j ; nested pairs are highlighted.

Figure 12. Results of leave nested out cross-validation (LNOCV): error-duration bands reporting the profile of the median relative error (thick black line) and the bands containing 50, 80, and 90 % of the relative errors (grey nested bands) as a function of duration (left); empirical vs. predicted standardised streamflows (centre); comparison of overall errors between empirical and predicted dimensionless FDCs, values of δ TNDTK (Sect. 5.1.1) are reported against values of δ TNDTK-no nesting (right).

A. Pugliese et al.: Geostatistical prediction of flow-duration curves
Figure 10. Nash-Sutcliffe efficiency for natural (NSE, filled lines) and log-transformed (LNSE, solid lines) streamflows plotted against the number n of neighbouring stations used for the interpolation. The left panel shows the prediction results for dimensionless FDCs (i.e. MAF standardisation), while the right panel reports the results for dimensional FDCs (i.e. MAP * standardisation).