Comparison of catchment grouping methods for flow duration curve estimation at ungauged sites in France

The study aims at estimating flow duration curves (FDC) at ungauged sites in France and quantifying the associated uncertainties using a large dataset of 1080 FDCs. The interpolation procedure focuses here on 15 percentiles standardised by the mean annual flow, which is assumed to be known at each site. In particular, this paper discusses the impact of different catchment grouping procedures on the estimation of percentiles by regional regression models. In a first step, five parsimonious FDC parametric models are tested to approximate FDCs at gauged sites. The results show that the model based on the expansion of Empirical Orthogonal Functions (EOF) outperforms the other tested models. In the EOF model, each FDC is interpreted as a linear combination of regional amplitude functions with spatially variable weighting factors corresponding to the parameters of the model. In this approach, only one amplitude function is required to obtain a satisfactory fit with most of the observed curves. Thus, the considered model requires only two parameters to be applicable at ungauged locations. Secondly, homogeneous regions are derived according to hydrological response, on the one hand, and geological, climatic and topographic characteristics on the other hand. Hydrological similarity is assessed through two simple indicators: the concavity index (IC) representing the shape of the dimensionless FDC and the seasonality ratio (SR), which is the ratio of summer and winter median flows. These variables are used as homogeneity criteria in three different methods for grouping catchments: (i) according to an a priori classification of French Hydro-EcoRegions (HERs), (ii) by applying regression tree clustering and (iii) by using neighbourhoods obtained by canonical correlation analysis. Correspondence to: E. Sauquet (eric.sauquet@cemagref.fr) Finally, considering all the data, and subsequently for each group obtained through the tested grouping techniques, we derive regression models between physiographic and/or climatic variables and the two parameters of the EOF model. Results on percentile estimation in cross validation show that a significant benefit is obtained by defining homogeneous regions before developing regressions, particularly when grouping methods make use of hydrogeological information.


Introduction
A Flow Duration Curve (FDC) is the cumulative frequency distribution of observed flows during a period of interest (month, season, year, or entire period of record).It plots specified flows against their corresponding probability of exceedance, which can be also interpreted as the percent of time these specified values are equalled or exceeded.FDC is a commonly used tool in water management applications, since it displays the full range of flows, including low flows and flood events (Vogel and Fennessey, 1995;Smakhtin, 2001).In the present study, long-term flow duration curves are considered and derived from observed daily flows available at each site.
There have been numerous approaches proposed for estimating FDC characteristics at ungauged locations, particularly low-flow percentiles, using regression equations under different climates (see Castellarin et al., 2007 for a recent review).Despite their relevance for water management issues, FDCs have until now received very little attention in France.To our knowledge, the present study is the first attempt to develop regional flow duration models in this country.Previous studies have concentrated on mapping mean Published by Copernicus Publications on behalf of the European Geosciences Union.river flow statistics, including long-term mean annual and monthly flows (Sauquet, 2006;Sauquet et al., 2008).A straightforward method using a knowledge of the mean annual flow Q A is to consider percentiles expressed as proportions of the long-term mean flow of the corresponding catchment as variables of interest.In this way, regionalization allows us to focus on the shape of the FDC.The dimensionless FDC and the mean annual flow Q A are estimated separately, and their combination provides the expected percentiles.
This method, known as the "index flow approach", has been previously adopted by numerous authors (e.g.Holmes et al., 2002;Singh et al., 2001;Castellarin et al., 2004;Ganora et al., 2009) leading to various procedures to estimate normalised percentiles.The simplest model assumes that the shapes of the FDC at all sites within the study area show a low variability.In practice, dimensionless FDCs from monitored catchments within the same region are pooled and averaged to create a representative shape.Since the hypothesis of similarity may be too restrictive, an alternative approach is chosen here: a reliable mathematical model with few parameters, which vary in space and which are estimated at gauging stations, is used to approximate the dimensionless FDC.The main advantages of the adopted approach can be listed as follows: -The choice of the index value ensures percentiles that are consistent with the mean annual runoff at ungauged sites, i.e. estimates are expected to be in the range of Q A .
-The regionalization procedure involves only a small number of parameters, which are easy to interpret on a physical basis, therefore reducing the computational effort.
-It enables a distinction to be made between the waterbalance related component (i.e.Q A ) and the characteristic response of the catchment to climate (i.e. the parameters of the shape of the dimensionless FDC), and thus leads to a better identification of the most important sources of spatial variability of FDC properties.
The last step of the procedure involves establishing empirical relationships between the variables of interest and the basin descriptors.Indeed, this approach is by far the most often employed in regionalisation.In practice, empirical formulas, usually established by multiple regression, may perform poorly when applied at a large scale due to the high variability of hydrological behaviours, yielding estimates with large errors.One way to improve the performance is to delineate homogeneous subregions assuming that pooled river catchments with similar hydrological, physiographical and meteorological characteristics will behave in a similar manner, before going on to develop separate regional regressions (Smakhtin, 2001).
The identification of homogeneous regions -both in theory and practice -has received much attention in hydrology, but no general methodology has emerged.Hence, different methods of defining homogeneous regions can be found in the literature, leading to geographically fixed regions (either spatially contiguous or not) or neighbourhoods around each target site.In the neighbourhood approach, each site is assumed to have its own homogeneous region formed by gauging stations.Singh et al. (2001) has provided examples of contiguous regions defined for estimating regional FDCs in the Himalayan region of India based on a pre-existing division into hydrometeorological subregions, while Laaha and Blöschl (2006a) have tested grouping according to seasonality indices in Austria.Geographically non-contiguous regions are usually identified using multivariate analysis such as multiple regression, principal component analysis or classification procedures, all of which incorporate catchment characteristics as well as flow statistics (e.g.Isik and Singh, 2008 in Turkey; Nathan and MacMahon, 1990 in Australia;Laaha andBlöschl, 2006a,b andLaaha et al., 2010 in Austria;Vezza et al., 2010 in Italy andGanora et al., 2009 in northwestern Italy andSwitzerland).Two main neighbourhood methods are commonly employed.Both use auxiliary variables to define a catchment descriptor space where distances are computed: these methods make use of the region of influence, as developed by Burn (1990a,b) (e.g. Holmes et al., 2002 in the UK) and canonical correlation analysis (CCA) promoted by Ouarda et al. (2001).
Since the a priori efficiency of grouping methods for regionalizing FDC characteristics is unknown, we here assess the relative performance of three approaches: (i) contiguous regions obtained manually from expertise; (ii) regions obtained through Classification and Regression Trees algorithm (CART) and (iii) neighbourhoods based on canonical correlation analysis (CCA).The choice of these methods was motivated (i) by a pre-existing partitioning established in France to tackle some basic questions related to the European Water Framework directive, (ii) by published studies demonstrating the potential of CART models for the regionalisation of river flow regimes in France (Snelder et al., 2009) and (iii) by the wish to test a well-established method formerly developed to address issues in flood estimation.
In this paper, we successively investigate two main issues, the first being related to the choice of the most suitable parametric model to fit observed dimensionless FDCs at gauged sites, while the second is linked to the way of defining homogeneous regions irrespective of the interpolation procedure used to estimate FDC characteristics.Regarding the last point, this study is in line with previous benchmark studies on the performance of different grouping techniques for estimating low-flow percentiles (Laaha and Blöschl, 2006b;Vezza et al., 2010).The present paper has the following structure.The study area and data used are first presented in Sect. 2.Then, Sect. 3 compares the various mathematical models tested to approximate FDCs at gauged sites.Following selection of the best performing parametric model, Sect. 4 introduces the variable used for testing homogeneity.

Study area and data
Climate and geology are quite diverse in France (approx.area: 550 000 km 2 ), with a maritime temperate influence in the north and west, and a Mediterranean climate with hot and dry summers prevailing in the south.In these latter regions, rainfall and evaporation drive the seasonal variations of runoff, in contrast to mountainous areas (high-altitude rivers in the Pyrenees and the Alps) where snowmelt-fed regimes are observed.From a geological standpoint, France is composed broadly of two major types of terrain: an impermeable Hercynian crystalline substratum principally located in the north-west (Brittany) as well as in certain mountainous areas (Alps, Pyrenees and Massif Central), and more or less permeable sedimentary rocks (limestone and clay) in flat plain areas (e.g. in the northern part of France, where large aquifers sustain the water flow).
The dataset studied here (Fig. 1) consists of 1080 gauging stations from among more than 3500 stations available in the French database HYDRO (http://www.hydro.eaufrance.fr/).The following selection criteria are imposed to select these gauging stations: (i) no significant human influence on flow, (ii) high quality of measurements, (iii) record covering at least 18 years during the period 1970-2008.To guide the selection process, we gather together and interpret qualitative metadata concerning the degree of human influence on the river flow regime and the uncertainty in discharge observations provided by the monitoring authorities.In addition, we investigate the influence of major reservoirs and water diversions upstream from the gauging stations.Time-series are also examined to detect abnormal temporal patterns or suspicious values in the data.
The final selection corresponds to an average density of about 2 gauging stations per 1000 km 2 .However, the distribution of gauging stations across the country is not uniform, with two notable areas of low station density, in northern France and south Brittany.A total of 40 % of the selected catchments have a record length of between 35 and 45 years in most cases.Continuous observations during the period 1983-2000 are available for 90 % of all selected stations, which ensures the temporal consistency of runoff statistics in terms of climatic variability.The drainage areas vary in size between 1.4 and 109 930 km 2 .Most of the gauged catchments (44 %) have areas ranging from 100 to 500 km 2 .Less than 9 % of the dataset, i.e. 92 of the 1030 gauging stations, are subject to intermittency with a proportion of zero-flow values higher than 1 %.Only five basins display more than The catchment characteristics selected for use in the delineation of hydrological regions and in the development of regression equations are derived from GIS, combining the SAFRAN high-resolution atmospheric reanalysis (Quintana-Seguí et al., 2008;Vidal et al., 2010), a 1-km grid digital elevation model and the associated drainage pattern (Sauquet, 2006).18 catchment characteristics are selected for their possible influence on the shape of the standardised flow duration curve.The variables considered in this study include the drainage area (A), the coordinates of the centre of gravity (X G , Y G ), the mean catchment slope (Slp), the three quartiles of the hypsometric curve (Z 25 , Z 50 and Z 75 ), the mean annual catchment air temperature (T A ), the mean summer catchment potential evapotranspiration (ET summer ) using the formulation suggested by Oudin et al. (2005), the mean annual catchment actual evapotranspiration (AET A ) according to Turc's (1954) formulation, the mean annual catchment precipitation (P A ), the variance of the twelve mean monthly catchment precipitation values (VarP A ), the mean seasonal precipitation (P winter , P spring , P summer and P autumn ), the catchment yield (CY) defined by the ratio (P A -AET A )/Q A and the fraction of the drainage catchment underlain by impermeable substratum (% Imp).
In addition, we use the Hydro-EcoRegion classification (HER) developed by Wasson et al. (2002).The HER delineation was performed by experts incorporating different aspects of the geology, climate, physiography, drainage density, vegetation and topography of France.In particular, the HER classification is the result of the interpretation in  , 1996).The HER was not specifically developed to discriminate river flow regimes.In the absence of quantitative information on hydrogeology, HERs are considered to be the most reliable surrogate.This classification divides France into 22 main regions (HER1) that are subdivided into 112 subregions (HER2).We also compute the dominant class in terms of fraction of the drainage catchment underlain by each HER.

A parametric model for estimating the flow duration curve
As suggested in Sect. 1, the identification of parsimonious models for summarizing FDCs is advantageous in reducing the computational effort involved in the regionalisation procedure (only few parameters are required at ungauged sites to estimate dimensionless FDCs).
Numerous formulas have been suggested to approximate FDCs (e.g.Quimpo et al., 1983;Franchini and Suppo, 1996;Yu et al., 2002;Castellarin et al., 2004;Li et al., 2010).Four parametric functions were tested on the dataset in this study, using the exponential model (Eq.1), the logarithm model (Eq.2), the power law model (Eq.3) and the model suggested by Franchini and Suppo (Eq. 4) .These models approximate FDC at each site i, i = 1, ..., N: (1) where Q p is the p-th dimensionless flow percentile and a(i),b(i) and c(i) are the parameters at location i.
In addition to these four analytical functions, we test a different approach based on a discrete decomposition into the Empirical Orthogonal Function expansion (Holmström, 1963).This mathematical technique, also known as the Karhunen-Loeve transform, aims at extracting common patterns that represent a large fraction of the variability contained in a sample of N time series.EOF analysis has already been used for several purposes in hydrology (e.g.Hisdal and Tveito, 1991;Braud and Obled, 1991;Krasovskaia et al., 1999).In this application, EOF analysis expresses logarithmically transformed FDC as a linear combination of shape functions: where M is the number of flow percentiles describing the FDCs, N is the number of gauging stations, β m is the mth shape function and α m is the weight associated with each m-th shape function.By definition β m , m = 1, ..., M are M orthogonal functions with zero mean.This constraint leads us to introduce an additional term: α m , m = 1, ..., M and γ (i) are the parameters of the EOF model depending on the location of the site i and have to be estimated at ungauged sites.Note that the raw data is logarithmically transformed to avoid negative unrealistic estimates.The advantage in applying this method is that it keeps most of the dataset variance in a limited number of shape functions.It is thus possible to truncate the series expansion to a subset of L < M functions to limit the number of model parameters without significant loss of information.
All models are calibrated using 15 dimensionless percentiles Q p , with respective exceedance probabilities of p = 1, 2,5,10,20,30,40,50,60,70,80,90,95,98 and 99 % of the observed FDCs.Analytical model parameters were optimised on observations by applying ordinary leastsquare procedures on logarithmically transformed data to reduce the influence of the largest observed values.Prior to optimization, dimensionless percentiles equal to zero are replaced by 0.001 to apply the logarithmic transformation.
The EOF decomposition applied on the dataset provides fourteen shape functions characterized by different patterns.The first shape function, contributing 97.2 % to the total variance, represents the most common pattern of French FDCs.The other shape functions account for a negligible part of the total variance and allow readjustment for very particular FDCs patterns.Considering these results, we keep only the first shape function.Thus, the number of parameters for the EOF model is limited to two: the mean of the logtransformed dimensionless percentiles γ and the weight associated with the first shape function α 1 .
The performance of each model is measured by the deviations from the 15 dimensionless percentiles Q p on which the five models are fitted.Unrealistic values (negative) are also replaced by 0.001.The boxplots in Fig. 2 give a graphical overview of the performance of each model.The median and the whiskers of the boxplots give a measure of the bias and the accuracy of the model, respectively.In addition, Fig. 3 shows the fitted curves for four gauged catchments representative of the diversity of FDC patterns within the reference dataset.The lower right panel displays results for one intermittent karstic basin, the Coulon River at Saint-Martinde-Castillon, located in southern France.This basin has a proportion of zero-flows above 20 %.To allow calculations, observed Q p with exceedance probabilities of p > 70 % have been fixed to 0.001.Both logarithm and Franchini and Suppo models provide negative values that are also replaced by default by 0.001.Note that we decide not to model the frequency of zero-flows since highly intermittent rivers are not dominant in the dataset (Sect.2).  to predict values close to zero, and thus to approximate reasonably well FDC at intermittent sites although the parameterizations are not specifically adapted to ephemeral rivers.
Results, considering all the dataset, show that: -None of the models are perfect; in particular, none of the models correctly reproduce low-flow percentiles (relative errors may exceed 150 % for some catchments).We should note that this criterion is very selective for low values (relative errors may reach large values when estimates are divided by a reference value close to zero).
-The biases appear most pronounced for the power-law model (Eq.3); low-flow percentiles as well as high-flow percentiles tend to be largely overestimated.
-Comparable biases are found for the exponential model (Eq. 1) and the Franchini and Suppo model (Eq.4): dimensionless percentiles Q p are underestimated for p ≤ 0.02 and for 0.7 ≤ p ≤ 0.9, whereas Q p is overestimated for p ≥ 0.98 and for 0.1 ≤ p ≤ 0.4.-The relative error range is smaller for the exponential model (Eq. 1) and the Franchini and Suppo model (Eq.4) as regards the two dimensionless percentiles (p = 0.01, 0.02).However, there is a systematic negative bias in the estimation of high-flow dimensionless percentiles.
-Results for the logarithm model (Eq.2) follow a very similar pattern to those for the EOF model (Eq.5): on average, they both overestimate dimensionless percentiles with 0.4 ≤ p ≤ 0.8, while high-flow and lowflow percentiles are underestimated.
The degree of bias differs substantially depending on the fitted model.The power law model (Eq.3) yields the worst estimates in terms of relative error (bias and spread are the largest among the models).Comparable biases are found for the exponential model (Eq. 1) and the Franchini and Suppo model (Eq.4).The EOF model (Eq.5) appears to outperform the other models tested despite poor performance for highflow percentiles.Although it performs nearly as well as the logarithm model (Eq.2), it also produces globally less biased estimates (median relative errors are the closest to zero and most of the interquartile ranges include zero for all the exceedance probabilities).The advantage of the EOF model is probably an improved flexibility (the other models are insufficiently flexible to reproduce possible inflexion points in the observations), since it results from an empirical modelling of the FDC shapes.In view of these results, we finally adopt the EOF model in the following steps.As an illustration, Fig. 4 displays the spatial pattern of the weight coefficient α 1 .The right panel shows how the shape of the FDC approximated by the EOF model evolves as α 1 changes with γ fixed to zero.High values of α 1 correspond to steep slopes of the FDC as observed mainly along the Mediterranean and North-Atlantic coasts, whereas small values correspond to flat slopes of the FDC observed in northern France where the river flow regime is governed by groundwater dynamics.

Variables for testing hydrological homogeneity
The application of grouping methods is conditioned by the prior definition of variables to measure the degree of similarity between catchment behaviour and the level of homogeneity within the region.The most obvious option would be to derive groups based on the two variables γ and α 1 to be interpolated.Nevertheless, this choice is not optimal since these values result from an approximation of FDCs and, on this basis any grouping procedure may misestimate the true hydrological similarity between catchments.In our opinion it is more relevant to handle parameters free from any analytical model.In addition, our choice is preferable for subsequent applications of the obtained clusters.Several possible characteristics directly derived from river flow time series are tested and two variables are finally selected based on their correlation with the shape of the FDC and their interpretation in terms of underlying hydrological processes.The first variable is directly related to empirical properties observed on FDCs.The analysis of observed FDCs Hydrol.Earth Syst.Sci., 15,[2421][2422][2423][2424][2425][2426][2427][2428][2429][2430][2431][2432][2433][2434][2435]2011 www.hydrol-earth-syst-sci.net/15/2421/2011/  suggests that the 10th percentile is a breakpoint delineating two parts of the curve: gradient tends to be higher in the upper branch (10 % < p < 99 %) than in the lower branch (1 % < p < 10 %).On this basis, a concavity index is computed as follows: This descriptor is a measure of the contrast between lowflow and high-flow regimes.experiencing hot and dry summers and intense short rainy events in autumn) and also to catchments with no storage capacity (e.g. on impermeable substratum) resulting in severe low-flow and quick runoff in response to rainfall events.It is noteworthy that IC is well correlated with the parameters of the analytical FDC models (Fig. 4), as well as with the average base flow index (not shown here).
The second variable is the seasonality index.Laaha and Blöschl (2006a) demonstrated the value of such a variable for regionalizing the low-flow percentile Q 95 in Austria.Indeed, grouping based on seasonality indices performs better than alternative groupings, since these indices allow a good discrimination of low flow processes at the regional scale when seasonal variability of runoff is high.Laaha and Blöschl (2006a) have used the ratio of the 95th percentile of the winter (December to March) FDC divided by the 95th percentile of the summer (April to November) FDC.Since our objective encompasses low flows, we use a Seasonality Ratio (SR) based on the medians instead: SR ≈ 1 relates to catchments with nearly uniform flow throughout the year, often when significant groundwater contributions filter out seasonal climatic variability.Catchments influenced by snowmelt-fed processes display SR < 1, whereas this variable is above 1 for typical rainfall-fed catchments with low flow in summer and high flow in winter.SR is used here in conjunction with IC to improve identification of the causes of low seasonal variability in runoff (snow or groundwater storage).The variation in SR is governed by geology and air temperature, and is consequently subject to topographic influences in France.
The two variables IC and SR are the flow characteristics used to delineate homogeneous groups.Methods and results are presented in the next section.

Visual grouping (VG)
Non-overlapping regions of approximately homogeneous low-flow indices SR and IC are first visually identified.The starting point is the division of France into 112 Hydro-EcoRegions (HER2s) at the finest level of resolution (Wasson et al., 2002).The HER2s introduced in Sect. 2 are pooled based on hydrological expert knowledge.
The boundaries of HER2s are first superimposed to the map displayed in Fig. 5.The most similar neighbouring HER2s are progressively pooled by respecting contiguity, while minimizing the dispersion within each cluster and maximizing the dissimilarity between the clusters based on visual inspection.The pooling process is far from straightforward.In particular, due to the uneven density of the reference network, some of the HER2s contain too few stations to relate them unequivocally to other neighbouring HER2s.Hence, we use additional information such as rough descriptions of hydrogeology to merge the ungauged HER2s with one of the adjacent clusters.Lastly, inspection of SR values leads to a partitioning of the preliminary groups into sub-groups of HER2s that are homogenous in terms of seasonality.Figure 6 presents the division of France into the 18 different regions so obtained.Mixed regions may persist due to the heterogeneity at the HER2 scale or due to the merging of HER2s containing a small number of gauged sites to large clusters.The identified regions include from 21 to 138 gauged sites and the average size is 57 (5 % of the dataset).

Regression Tree (RT)
The aim of the analyses via tree-building algorithms is to predict dependent variables from a set of factor effects.Classification and Regression Tree approaches perform successive binary splittings of a given dataset according to decision variables.One advantage of this method is its ability to handle qualitative data (e.g.membership of a specific class).In general, RT leads to a set of "if-then" logical conditions as a basis for classification.The algorithm identifies the best possible predictors, starting from the most discriminating factors and proceeding to the less important controls, to divide the clusters (nodes) into two successive parts.The optimal choices are determined recursively by increasing the homogeneity within the two resulting clusters.For this application, Hydrol.Earth Syst.Sci., 15, 2421Sci., 15, -2435Sci., 15, , 2011 www.hydrol-earth-syst-sci.net/15/2421/2011/ we use the R software package rpart (Therneau and Atkinson, 2010).The decision variables are selected automatically by the algorithm among the 19 catchment descriptors (i.e.including the dominant HER2) to ensure an optimal homogeneity of IC chosen as the dependent variable, in the successive clusters.The only constraint consists of including at least 30 gauging stations in each region.At the end, 22 hydrological regions are identified with a mean number of 54 gauging stations per region (Fig. 7).

Canonical Correlation Analysis (CCA)
Canonical Correlation Analysis (Hotelling, 1936) is a multivariate statistical method suited to study interrelations between two sets of variables.CCA has been previously proposed by Ouarda et al. (2001) as a neighbourhood definition method.CCA provides two sets of canonical variables V j , j = 1, ..., k and W j , j = 1, ..., k obtained as follows: -V j , j = 1, ..., k are linear combinations of k standardized hydrological variables X j , j = 1, ..., k.
-(V j , W j ) have maximum correlation.
-(V i , V j ), (V i , W j ) and (W i , W j ) (i = j ) are uncorrelated.
Theoretical developments show that the weight for V j (resp.for W j ) is the i-th eigenvector −1 , where XY is the k × r covariance matrix and ′ XY the transpose of XY .Canonical variables V j , j = 1, ..., k and W j , j = 1, ..., k can be interpreted as coordinates in hydrological and catchment-related physical spaces, respectively.Knowing Y j , j = 1, ..., r at ungauged location, it is then possible to compute W j , j = 1, ..., k and, by calculating correlation coefficients between canonical variables (V i , W j ), their possible proximity -according to Mahalanobis distance -to the gauged stations in the hydrological space, which delineates the neighbourhood around each site.
CCA has been formerly applied to estimate regional flood frequency (e.g.Ouarda et al., 2001;Chokmani and Ouarda, 2004;Shu and Ouarda, 2007).The present article is probably one of the first published studies on an application of CCA to predict FDCs at ungauged locations.Here, CCA is carried out between the two indicators IC and SR and all the catchment descriptors (except for dominant HER2, since conventional CCA cannot handle qualitative variables).The geological description is thus reduced to the percentage of impervious areas.All combinations of 2 to 18 variables among the 18 catchment characteristics, as detailed in Sect.2, are tested, leading us to adopt a combination of six characteristics which yield the best correlations between the first two pairs of canonical variables, i.e. (V 1 , W 1 ) and (V 2 , W 2 ).These catchment characteristics relate to location (the coordinates of the centre of gravity), climate (the mean annual catchment actual evapotranspiration and the variance of the twelve values of mean monthly catchment precipitation), geology (fraction of the drainage catchment with impermeable substratum) and elevation (the third quartile of the hypsometric curve).
In addition to the variables involved in CCA, we need to define the boundaries of the neighbourhood to exclude gauging stations located too far from the target site.Ouarda et al. (2001) suggested a distance threshold depending on a given confidence level and on target site.Preliminary tests show the difficulty of defining a satisfactory confidence level for our dataset, in particular for highly atypical sites from which too few similar sites can be selected to derive reliable regional regressions.Consequently, we choose here to fix the number of stations contributing to a neighbourhood to 50, i.e. the 50 closest gauging stations to the target site, to allow objective comparisons with the results of the two other grouping methods.

Results
Figures 6 and 7 present maps obtained by VG and RT, respectively.One colour is assigned to each reach of the main river network (i.e.all locations draining more than 50 km 2 ).It is not possible to display results from CCA on a map since each site has its own neighbourhood.A comparison between the two maps suggests that:  -The two procedures based on the same auxiliary variables lead to different divisions.The spatial pattern provided by RT is patchier than the one obtained by VG: small tributaries may belong to different classes than the main stream they flow into.The relative influence of location on class allocation is naturally moderate since mountainous basins in the Alps and Pyrenees are pooled together.This result is in direct agreement with the previous studies dedicated to flood quantile estimation (Merz and Blöschl, 2005;Ouarda et al., 2001), which concluded that geographical proximity does not involve hydrological similarity.
-Common geographical groupings can be found e.g. in the north of France (in brown in Fig. 6 and in cyan in Fig. 7) and in the west of France (in orange in Fig. 6 and in dark blue in Fig. 7), supporting visually the fact that the two divisions are not totally inconsistent.
To supplement this analysis, we examine the empirical distributions of both SR and IC per region (identified by a letter on the x-axis).Figure 8 shows the box plots.There is no clear difference between the spread of SR and IC.The absence of any significant improvement in terms of homogeneity within each group (regarding the interquartile provided by the empirical distribution of each variable) or discrimination between groups (regarding the differences between the medians of each groups for each variable) is due to the valuable information contained in the Hydro-EcoRegions.Both methods lead to two very distinct regions with high values for IC.To provide proof, the membership with respect to HER clusters is chosen to define the first splitting variable (Fig. 9).
Regarding CCA, we compare our results with published studies in terms of correlation structure.Figure 10 indicates moderate correlations between the canonical variables: r 1 = 0.71 between W 1 and V 1 and r 2 = 0.57 between W 2 and V 2 .
These values are lower than those obtained in regional flood quantile estimation by Ouarda et al. (2001) in the Province of Ontario (Canada) (r 1 between 0.959 and 0.960 and r 2 between 0.279 and 0.422), by Haché et al. (2002) in the Saint-Maurice River region (Canada) (r 1 = 0.986 et r 2 = 0.842) and by Ouarda et al. (2008) in Mexico (r 1 = 0.966 and r 2 = 0.247).
In these studies, an analysis of the weights associated with the hydrological variables X and the catchment descriptors Y in the linear combinations shows that the high correlation coefficient r 1 depends mainly on the strong link between one T -year flood quantile expressed in m 3 s −1 and the drainage area A. This reflects the dependence of the productivity of the basin in terms of volume to the catchment size.On the contrary, the correlation coefficient r 2 is very weak in most cases, which illustrates the difficulty of identifying relevant basin descriptors to explain the residual spatial variability.As a result, the identification of neighbouring catchments using the Mahalanobis distance leads to cluster catchments of equivalent size (the weight of the second pair of canonical variable (= r 2 2 ) is practically negligible in the calculation of the distance) which is evidently the most important factor of similarity between catchments (but probably not sufficient to ensure homogeneity).
Here, two dimensionless variables (SR and IC) largely free of the scale effect are considered as the set of hydrological descriptors X.Even if we can expect a small influence due to the size of the catchment on the flatness of the FDC, i.e. on IC, results show that the correlation between IC and the drainage area in the dataset is very weak and that the Hydrol.Earth Syst.Sci., 15,[2421][2422][2423][2424][2425][2426][2427][2428][2429][2430][2431][2432][2433][2434][2435]2011 www.hydrol-earth-syst-sci.net/15/2421/2011/ introduction of A among the basin descriptors Y does not significantly improve the correlations between the two first canonical variables.The highest correlation coefficient observed is no more than 0.34 between SR and the first quartile of the hypsometric curve.The context for the definition of the first pair of canonical variable in our application is thus close to the conditions met by Ouarda et al. (2001Ouarda et al. ( , 2008) ) and Haché et al. (2002) concerning the second pair of canonical variables.We fail to find any combination of catchment descriptors that is strongly correlated with the two parameters SR and IC.As a consequence, there is no longer any guaranteed correspondence between the hydrological space and the catchment-related physical space defined by CCA.
In this study, we do not use statistical tests (e.g.ANOVA, Laaha and Blöschl, 2006a) on the clusters of gauged basins to check homogeneity in terms of FDC characteristics.Contrary to other applications (e.g. in Regional Flood Frequency Analysis, where a measure of regional heterogeneity is used to validate the derivation of a representative pooled growth curve), we consider that statistical homogeneity (i.e.low variability around the mean values) is not a necessary condition for ensuring accurate quantile estimates.Indeed, an efficient interpolation technique (e.g. an empirical formula) to predict the river flow characteristics of interest could compensate for the effect of heterogeneity within the groups.Here, clustering is a way of removing the large-scale variability due to dominant factors that are possibly difficult to identify (e.g.hydrogeological properties).Then, interpolation procedures are aimed at modelling the unexplained residual spatial variability at a finer scale whatever the homogeneity.The following section presents the proposed method to develop regional regressions for each grouping procedure.We compare the relative performances of each grouping technique in terms of prediction of dimensionless FDCs.

Method
The homogeneous regions are now identified.Multiple regressions can be developed that model the relations between the EOF model parameters and catchment descriptors.We also investigate both linear and power-form models: λ ′ j Y j (10) Parameters λ j , j ∈ [0, 18] and λ ′ j , j ∈ [0, 18] are fitted to observations for each homogeneous group by the ordinary least-squares method (using log-transformed data to fit power-form models).
To define the most appropriate model for each region, we test all possible combinations including one to four variables among the 18 quantitative variables, selecting the 10 best regression models in terms of the adjusted correlation coefficient.These models are then refined/filtered using an interactive procedure: (i) outliers using Cook's distance are first removed, (ii) the statistical properties of residuals (including normality and homoscedasticity) are checked by visual inspection (only for the first two grouping methods) and (iii) the robustness of each empirical formula is finally assessed by leave-one-out cross-validation.The final models are selected in view of the best value of the correlation coefficient obtained by leave-one-out cross-validation.

Results
To assess the performance of the a priori region delineation, we develop a global regression using the whole dataset of available gauging stations and the procedure described in Sect.6.1.The descriptors involved are the elevation exceeded in 25 % of the catchment, the mean annual catchment air temperature, the catchment yield and the fraction of the drainage catchment with impermeable substratum.Note that two of these descriptors reflect the importance of geological properties in accounting for the variability of the EOF model parameters on the large scale.An analysis of the predictive models derived from the VG and RT approaches demonstrates that: -Linear and power-form models are found to be suitable in a similar proportion of cases to fit the data.-The performance of the regression as well as the set of relevant descriptors may vary substantially from one region to another.R 2 ranges from 0 to 0.86, with the median equal to 0.41.Most of the regressions involve four relevant basin descriptors.
-The four most important explanatory variables regarding α 1 are the catchment yield CY, the drainage area A, the y-coordinate of the centre of gravity YG and the percentage of A with impermeable substratum % Imp.On average, all three variables are involved in three empirical formulas out of ten.Their presence is partly justified: YG may reflect the progressive influence of the Mediterranean climate on flow variability from North to South; CY and % Imp characterize more or less directly the effect of geology (all things considered, the higher the fraction of impervious area, the sharper should be the FDC); lastly, the relevance of A can be justified if we assume that the flatness of the FDC probably increases with the size of the basin due to larger storage capacities and combinations of different river flow patterns originating from upstream tributaries.
The global predictive performance of each method in crossvalidation (i.e. for all the sample set) is assessed using the root mean square error (RMSE) and the correlation coefficient R 2 of the regression between observed and predicted values for the EOF model parameters, γ and α 1 .In addition to these statistics, scatter plots are drawn and visually inspected to compare the spread of the predictions.on the left and α 1 on the right).Each point is related to one gauging station.A one-to-one line (in red) is added to each graph.Absolute relative errors are also computed for each of the 15 selected dimensionless percentiles Q p , and their empirical statistical distributions are summarised by box plots in the lower panel.
Figure 11 presents the cross validation results for the national regression.As expected, the scores are unsatisfactory: dispersion is high around the one-to-one line (R 2 < 0.20 for both EOF model parameters) and the low-flow percentiles are poorly predicted.By comparison, the next three figures (Figs. 12 to 14) illustrate the performance of the three tested grouping methods, suggesting that: -The regional regression based on the three grouping approaches is better than global regressions as in Laaha and Blöschl (2006b) and in Vezza et al. (2010); results for all models follow a similar pattern in terms of relative error on dimensionless percentiles: the highest errors are obtained for the lowest values.
-VG performs nearly as well as RT, with comparable R 2 and RMSE; however, we should note that estimations by the VG approach are probably more heteroscedastic (the spread of errors increases along with α 1 ).RT yields slightly more accurate quantile estimates than VG when comparing the spread of the relative absolute errors, i.e. the length of the whiskers of the box plots in Figs.11  and 12.
Absolute relative error (%) Exceedance probability Fig. 13.Results for the regional regression model applied to groups derived from RT.
-CCA only slightly outperforms global regression.This finding is astonishing since CCA is known as a very efficient regional estimation method.
To understand the unexpected performance of CCA, additional computations are carried out to compare the expected neighbourhoods -which are ideally defined in the hydrological space -with those defined by CCA.We first verify that the regional regressions obtained with the expected neighbourhoods are suited to estimate the EOF model parameters.
The results show very satisfactory performances (R 2 reaches 0.63 and 0.69 for γ and α 1 , respectively).This great difference between performances is probably due to the fact that the neighbours selected by CCA are almost never those expected: for the 50 closest gauged basins, there is only a weak concordance between the theoretical neighbourhoods and those predicted by CCA.This confirms that the correlation between canonical variables is not strong enough to guarantee the correspondence between the physiographical and hydrological spaces and thus ensure the efficiency of CCA.As mentioned above, this probably reflects the lack of efficient catchment characteristics needed to strengthen the link between the two spaces.Clearly, these characteristics are explicitly linked to hydrogeology, since the application of the two other methods differs only by the introduction of such a variable (i.e.dominant HER2

Conclusions
In this study, a regionalisation method is proposed to estimate flow duration characteristics.The developed approach assumes that the mean annual flow Q a is known before estimating FDCs at ungauged sites.Efforts are therefore concentrated on estimating the shape of the normalised FDC using a large data set of FDCs derived from 1080 gauging stations.First, a parametric and parcimonious model based on EOF decomposition is developed to fit the observed shapes of the FDC.A comparison with other models cited in the literature demonstrates that the EOF model leads to the best estimates at gauging stations.One reason could be that, conversely to the empirical approach, analytical formulas are not flexible enough to accommodate the full range of observed shapes.Thus, it would be unrealistic to support the idea of a single parametric model adapted to all hydrological conditions.
In a second step, different grouping techniques are compared for identifying homogeneous regions and developing separate regression models.Two of the grouping procedures, VG and RT, which show comparable performances, demonstrate a significant advantage for the development of regional regressions.It is noteworthy that the RT classification procedure has the advantage of being automatic and objective, whereas heterogeneity may persist in the VG groups, which could explain its lower ranking (2nd).Nevertheless, a large part of the variance remains unexplained.Further research could be devoted to the interpolation of the residuals.We could apply techniques such as adapted kriging (Sauquet, 2006), Top-Kriging (Skøien et al., 2006) or physiographical space-based interpolation (Castiglioni et al., 2009) for this purpose.
The third and last grouping method, CCA, performed poorly, despite the great flexibility in neighbourhood selection, i.e. a neighbourhood is defined individually for each target site.These unexpected poor scores for CCA may result from the difficulty of obtaining a sufficient correlation link between hydrological and physiographical spaces in the absence of relevant characteristics to describe the hydrogeological properties within the catchments.Indeed, for the other two grouping techniques, hydrogeology is summarized by a single qualitative variable, i.e. the class of the dominant HER2, which provides sufficient information to increase homogeneity within regions and ensure more efficient regional regressions.As a result, the application of CCA in predefined regions with homogeneous hydrogeological properties should be investigated to compare equitably CCA to other methods on the same basis.

Fig. 1 .
Fig. 1.Study area and gauging stations identified by their respective centres of gravity (black squares).

Fig. 2 .
Fig. 2. Empirical distribution of relative error for each percentile and each model.The boxplots are defined by the first quartile, the median and the third quartile.The whiskers extend to 1.5 of the interquartile range; open circles indicate outliers.

Fig. 4 .
Fig. 4. Spatial distribution of weight α 1 observed at gauged catchments identified by the location of their centre of gravity.

Fig. 5 .
Fig. 5. Spatial distribution of concavity index IC observed at gauged catchments identified by the location of their centre of gravity.
Figure 5 presents a map of the concavity index in France including the location of the selected stations.The parameter takes values between 0 and 1.Values close to 1 are observed where large aquifers (e.g. in northern France) and storage in snow packs (e.g. in mountainous areas) moderate the variability of daily flow.Values close to 0 are related to catchments exposed to contrasted climate (e.g.small catchments in the Mediterranean area www.hydrol-earth-syst-sci.net/15/2421/2011/ Hydrol.Earth Syst.Sci., 15, 2421-2435, 2011

Fig. 8 .
Fig. 8. Empirical distributions of the two hydrological indicators for each cluster according to VG (a) and RT (b).

Fig. 9 .
Fig. 9. Regression tree model (the numbers at each node of the tree and the name of the first splitting variables are reported in the boxes and in the middle of the branches, respectively).

Fig. 10 .
Fig. 10.Correspondence between position of the gauged sites in the hydrological space and the catchment descriptors space -V 1 and V 2 (resp.W 1 and W 2 ) are the two first canonical variables of hydrological space (resp. of basin descriptors space).

Fig. 12 .
Fig. 12. Results for the regional regression model applied to visual grouping.

Fig. 14 .
Fig. 14. Results for the regional regression model applied to neighbourhoods derived from CCA.