Forest cover influence on regional flood frequency assessment in Mediterranean catchments

This paper evaluates to what extent forest cover can explain a component of the runoff coefficient, as defined in a regional flood frequency analysis based on the rational formula coupled with a regional model of the annual maximum rainfall depths. The analysis targets the component of the runoff coefficient that cannot be captured by catchment lithology alone. Data mining is performed on 75 catchments distributed from Southern to Central Italy. Cluster and correlation structure analyses are conducted to distinguish forest cover effects within catchments characterized by hydro-morphological similarities. We propose to improve the prediction of the runoff coefficient with a linear regression model that uses the ratio of the forest cover to the catchment critical rainfall depth as the independent variable. The proposed regression enables a significant bias correction of the runoff coefficient, particularly for small mountainous catchments characterised by a larger forest cover fraction and a lower critical rainfall depth.


Correspondence to: F. Preti (federico.preti@unifi.it)
Published by Copernicus Publications on behalf of the European Geosciences Union.

Introduction
Flood peak assessment is fundamental for planning and designing structural and non-structural flood risk mitigation actions. It is achieved by flood frequency analysis, which estimates the probability distribution of flood peaks so that the flood magnitude for any design return period can be determined. A direct assessment of flood frequency and magnitude is only possible for the limited number of catchments where stream flow gauges have been operating for a significant number of years. For ungauged catchments, the approach commonly employed in engineering hydrology is regional analysis, which exploits the hydrological similarities among catchments and the scaling properties of flood statistics to transfer the information available in gauged catchments to ungauged ones (e.g. Cunnane, 1988; Gupta and Waymire, 1990; Stedinger et al., 1993).
The method of regional flood frequency analysis most widely applied by hydrologists and engineers is the index flood method, originally introduced by Dalrymple (1960). This method is based on the identification of homogeneous regions, where the probability distributions of the annual maximum floods are assumed invariant except for a site-specific scale parameter known as the index flood. For any site, the flood peak discharge with an assigned return period is defined as the product of two terms: a dimensionless probabilistic growth factor and a site-specific index flood. The statistics of the normalized annual maximum flood series within the homogeneous region are pooled to define the dimensionless probabilistic growth curve, which is assumed invariant within the homogeneous region. The index flood is generally taken as the mean of the annual maximum flood peaks, while a few flood estimation procedures adopt the median (e.g. IH, 1999).
The literature contains numerous studies on the identification of homogeneous regions and the estimation of the dimensionless probabilistic growth curve, while relatively few studies address the estimation of the index flood, particularly in ungauged catchments. In fact, while a simple arithmetic mean of the available observations provides a direct estimate of the index flood in gauged catchments, indirect methods are required for estimating the index flood at ungauged sites. Many indirect methods are based on multiregression models linking the index flood to a selected set of catchment descriptors (e.g. Kjeldsen and Jones, 2010), representing morphological, climatic and land use catchment features. These regression methods generally take little account of the physical phenomena underlying the transformation of rainfall into runoff and of the dominant flood generating mechanisms, although more recent studies have shown that flood generating mechanisms can be relevant in regional flood analysis (Iacobellis and Fiorentino, 2000; Merz and Blöschl, 2003). Other indirect methods estimate the index flood from a conceptual description of the hydrological response of the basin to intense rainfall events. Within the flood assessment procedures employed in Italy, a widely applied indirect estimation method is based on a conceptual model structured according to the well-known rational formula (Rossi and Villani, 1992). Other conceptual approaches have also been proposed, based on an analytical derivation of the probability distribution of floods (e.g. Becciu et al., 1993; Brath et al., 2001; Bocchiola et al., 2003).
The index flood estimation method based on the rational formula implicitly assumes that the average value of the annual maximum peak discharges is related to the average value of the annual maximum rainfall depth within a critical time interval, which is assumed equal to a characteristic time scale of the catchment response. A key parameter is the runoff coefficient, which conceptually represents the fraction of the total rainfall contributing to the flood peak response. Runoff coefficient values are derived by regional models from selected catchment descriptors that can be easily distinguished at regional scale. A general tendency is to employ the catchment lithology as the principal catchment descriptor of the runoff coefficient, while assuming that the additional information carried by land-cover patterns is negligible for the assessment of flood extreme values, at least in rural catchments (e.g. Iacobellis and Fiorentino, 2000; Brath et al., 2001).
As a result of the index flood approach, the flood frequency curves within homogeneous regions are shifted according to the value of the index flood while keeping the same shape.
Another regionalisation procedure is based on the use of rainfall-runoff models (e.g. Maidment, 1993). In this case, annual maximum rainfall depths of various durations are treated through a regionalisation procedure, such as the index value approach, whereas the discharge with an assigned return period is estimated by applying a rainfall-runoff model. This second type of regionalisation procedure is more effective in representing the flood frequency distribution in regions that, although homogeneous in terms of annual maximum rainfall depths, include catchments with significantly different flood frequency curves as a result of the different rainfall-runoff processes controlling the flood response. In engineering applications, the most widely used rainfall-runoff model is still the rational formula, often applied with the runoff coefficient expressed as a function of the return period (Schaake et al., 1967; French et al., 1974; Pilgrim, 1989; Cannarozzo et al., 1995). In this case, the runoff coefficient does not simply represent a runoff-rainfall ratio; rather, it assumes the role of a probabilistic factor controlling not only the position but also the slope and the curvature of the flood frequency curve, by means of a (generally non-linear) transformation of the catchment rainfall frequency curve for a characteristic catchment time scale. In some studies, the dependency of the runoff coefficient on the return period has been explored with a specific probabilistic model, considering the runoff coefficient independent of the rainfall depth (Gottschalk and Weingartner, 1998; Franchini et al., 2005).
The dependency of the runoff coefficient on the return period is consistent with the observation that, particularly in rural catchments, there is no perfect correspondence between annual maximum peak discharges and annual maximum rainfall depths (e.g. Hiemstra and Reich, 1967; Franchini et al., 2000; Hashemi et al., 2000). The maximum annual flood peak is in fact controlled by catchment antecedent conditions and can therefore also be generated by rainfall events whose catchment average depth is below the annual maximum. This aspect cannot be represented with a constant runoff coefficient, as used in traditional applications of the rational formula.
Given that vegetation patterns can have a significant influence on the catchment antecedent conditions, as well as on other rainfall-runoff processes in rural catchments, in this study we explore to what extent forest cover can be employed to predict the runoff coefficient, in the framework of a regional flood frequency analysis based on the rational formula coupled with a regional analysis of annual maximum rainfall depths.
The paper is structured as follows: the next section presents the regional flood frequency analysis; the third section illustrates the available data set; the fourth section explores the dependency of the runoff coefficient on the forest cover; the fifth section presents a new regression model of the runoff coefficient accounting for the forest cover; the sixth section is devoted to the discussion and the last section to the conclusions.

Regional flood frequency analysis based on the rational formula
The flood peak for a given return period $T$, $Q_T$, can be defined as follows:
$$Q_T = K_{Q,T}\,\mu_Q \qquad (1)$$
where $K_{Q,T}$ is the dimensionless probabilistic growth factor of the floods for a return period equal to $T$ and $\mu_Q$ is the index flood.
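As a minimal illustration of Eq. (1), the following Python sketch computes an index flood as the mean of a gauged series and scales it by a growth factor. The flow values, the growth factor and the function names are hypothetical and are not taken from the study.

```python
def index_flood_from_series(annual_maxima):
    """Index flood of a gauged site, taken as the mean of annual maximum peaks."""
    return sum(annual_maxima) / len(annual_maxima)


def flood_quantile(growth_factor, index_flood):
    """Eq. (1): flood peak Q_T as growth factor K_{Q,T} times index flood mu_Q."""
    return growth_factor * index_flood


# Hypothetical annual maximum flood peaks (m^3/s), for illustration only.
peaks = [120.0, 95.0, 210.0, 160.0, 115.0]
mu_q = index_flood_from_series(peaks)

# Assumed regional growth factor for T = 20 years (not a value from the paper).
q_20 = flood_quantile(2.0, mu_q)
```

Within a homogeneous region the growth factor would be shared by all sites, so only `mu_q` changes from one catchment to another.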
A regional flood frequency analysis based on the index flood assumes that the dimensionless probabilistic growth factor, controlling the slope and the curvature of the flood frequency distribution, is invariant within the homogeneous region, while the index flood, controlling the position of the flood frequency distribution, is variable and can be predicted by catchment-specific descriptors. A common approach for estimating the index flood $\mu_Q$ in Italian ungauged catchments is based on a conceptual model structured according to the well-known rational formula:
$$\mu_Q = \varphi\,\frac{\mu[h_A(t_c)]}{t_c}\,A \qquad (2)$$
where $A$ is the catchment area, while $\varphi$ is the runoff coefficient for the index flood ($0 < \varphi \le 1$), i.e. the ratio of the mean flood runoff to the mean rainfall depth, assumed to be independent of rainfall intensity and duration; $\mu[h_A(t_c)]$ is the catchment areal rainfall depth within a critical duration $t_c$, obtained by multiplying the point depth-duration-frequency curve $\mu[h(t_c)]$ (referred to the centre of the storm) by the area reduction factor $\kappa_A$, which is expressed as a function of the catchment area and the critical duration $t_c$ (e.g. Brath et al., 2001).
Hydrol. Earth Syst. Sci., 15, 3077-3090, 2011, www.hydrol-earth-syst-sci.net/15/3077/2011/
The runoff coefficient $\varphi$ is estimated by regression models against selected catchment descriptors, with invariant parameters within homogeneous regions. These regional models are calibrated with data from gauged catchments, where runoff coefficients ($\varphi_{obs}$) are computed by inversion of Eq. (2), with index flood values ($\hat{\mu}_Q$) assessed as the arithmetic average of the observed annual flood peaks:
$$\varphi_{obs} = \frac{\hat{\mu}_Q\,t_c}{\mu[h_A(t_c)]\,A} \qquad (3)$$
It is important to observe that $\varphi$ conceptually describes both the fraction of rainfall volume retained by the soil and the vegetation (i.e. the transformation of the total rainfall into net rainfall as a result of processes such as infiltration, canopy interception and surface detention) and the dampening effect of the catchment, which implies the reduction of the flood peak as compared to the net rainfall intensity (for example, if the basin is modeled as a linear reservoir with lag time equal to $t_L$, the flood peak reduction factor with respect to a constant rainfall intensity is equal to $1 - e^{-t_c/t_L}$).
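The rational-formula estimate of the index flood, its inversion for the observed runoff coefficient, and the linear-reservoir peak factor mentioned above can be sketched in Python as follows. The units (mm, km², h, m³/s) and the resulting conversion constant 3.6 are our assumption, since the source does not state them; all numerical inputs are hypothetical.

```python
import math


def index_flood_rational(phi, point_depth_mm, arf, area_km2, tc_hours):
    """Index flood (m^3/s) from the rational formula.

    Assumed units: depth in mm, area in km^2, duration in hours;
    1 mm over 1 km^2 in 1 h corresponds to 1/3.6 m^3/s.
    """
    areal_depth_mm = arf * point_depth_mm  # areal depth = ARF * point depth
    return phi * areal_depth_mm * area_km2 / (3.6 * tc_hours)


def observed_phi(index_flood_m3s, point_depth_mm, arf, area_km2, tc_hours):
    """Runoff coefficient obtained by inverting the rational formula."""
    areal_depth_mm = arf * point_depth_mm
    return index_flood_m3s * 3.6 * tc_hours / (areal_depth_mm * area_km2)


def linear_reservoir_peak_factor(tc_hours, lag_hours):
    """Peak factor 1 - exp(-t_c / t_L) for a constant rainfall of duration t_c."""
    return 1.0 - math.exp(-tc_hours / lag_hours)


# Hypothetical catchment: phi = 0.4, 50 mm point depth, ARF = 0.9, 100 km^2, t_c = 5 h.
mu_q = index_flood_rational(0.4, 50.0, 0.9, 100.0, 5.0)
phi_back = observed_phi(mu_q, 50.0, 0.9, 100.0, 5.0)
factor = linear_reservoir_peak_factor(5.0, 5.0)
```

Inverting the formula on its own output returns the runoff coefficient that was used, which is the consistency that the calibration on gauged catchments relies on.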
As anticipated above, regional regression models of the runoff coefficient generally exploit only classes of catchment lithology as predictor variables. At regional scale, the lithological features of the catchments can be grouped into classes corresponding to different degrees of permeability, such as (e.g. Fiorentino and Iacobellis, 2001): (1) highly permeable lithoid complexes, constituted by sediments and rocks with porosity-based permeability, rocks with fissure-based permeability, and those with mixed permeability; (2) lithoid complexes with medium permeability, constituted by permeable lithologies that outcrop on a steep surface or by lithologies that are more or less fractured and filled with clayey material; and (3) impermeable lithoid complexes, represented by clayey formations.
In the case of a regional flood frequency analysis coupling a regional model of the annual maximum rainfall depths with a rainfall-runoff model structured according to the rational formula, the flood peak $Q_T$ for a given return period is estimated in ungauged catchments as follows:
$$Q_T = C_T\,\frac{h_{A,T}(t_c)}{t_c}\,A = C_T\,\frac{K_{R,T}\,\mu[h_A(t_c)]}{t_c}\,A \qquad (4)$$
where $h_{A,T}(t_c)$ is the catchment areal rainfall with return period $T$, expressed as the product of the dimensionless probabilistic growth factor of the rainfall for a return period equal to $T$, $K_{R,T}$, and the index value $\mu[h_A(t_c)]$; $C_T$ is the runoff coefficient for a return period equal to $T$. Sample values of the runoff coefficient ($C_{T,obs}$) can be derived by inverting Eq. (4) applied to gauged catchments, where enough data are available for a direct assessment of the flood frequency distribution, defined by $K_{Q,T}$ and $\hat{\mu}_Q$. In particular, by combining Eqs. (1) and (4), $C_{T,obs}$ can be expressed as follows:
$$C_{T,obs} = \frac{\hat{\mu}_Q\,t_c}{\mu[h_A(t_c)]\,A}\,\frac{K_{Q,T}}{K_{R,T}} = \varphi_{obs}\,\frac{K_{Q,T}}{K_{R,T}} \qquad (5)$$
Equation (5) shows that $C_T$ is the product of two terms: the first is the runoff coefficient for the index flood; the second is the ratio of the flow to the rainfall probabilistic growth factors, i.e. it describes the transformation of the slope and curvature of the flood frequency distribution with respect to the annual maximum rainfall depth frequency distribution.
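The decomposition in Eq. (5) can be checked numerically with the Python sketch below, which computes the runoff coefficient for return period T both from the growth-factor ratio and by direct inversion of the rational formula. The units (and hence the constant 3.6), the function names and all numbers are illustrative assumptions.

```python
def runoff_coefficient_from_growth(phi_obs, k_q, k_r):
    """Eq. (5): index-flood runoff coefficient times the growth factor ratio."""
    return phi_obs * k_q / k_r


def runoff_coefficient_direct(q_t_m3s, areal_depth_t_mm, area_km2, tc_hours):
    """Inversion of the rational formula for the T-year flood and rainfall.

    Assumed units: m^3/s, mm, km^2, hours (hence the factor 3.6).
    """
    return q_t_m3s * 3.6 * tc_hours / (areal_depth_t_mm * area_km2)


# Hypothetical gauged catchment (illustrative values only).
mu_q, mu_h, area, tc = 100.0, 45.0, 100.0, 5.0  # m^3/s, mm, km^2, h
k_q, k_r = 3.3, 1.2                              # assumed growth factors

phi_obs = mu_q * 3.6 * tc / (mu_h * area)
c_growth = runoff_coefficient_from_growth(phi_obs, k_q, k_r)
c_direct = runoff_coefficient_direct(k_q * mu_q, k_r * mu_h, area, tc)
```

With these assumed growth factors the two routes agree and the coefficient exceeds 1, which illustrates why it should be read as a probabilistic factor rather than as a runoff-rainfall ratio.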
It is important to note that $C_T$ is not restricted to the values commonly attached to the traditional runoff coefficient $\varphi$ ($0 < \varphi \le 1$). According to Eq. (5), $C_T$ is a probabilistic factor, which in principle could assume values even greater than 1, for example when the flood frequency curves exhibit a significantly greater curvature than the corresponding rainfall frequency curves (Franchini et al., 2005).
We call $C_L$ the value of the $\varphi$ coefficient estimated by regression models accounting for the catchment lithology only, calibrated at regional scale against the observed values $\varphi_{obs}$, as defined by Eq. (3). Then we express the component of the runoff coefficient not explained by the catchment lithology, for a generic return period $T$, as follows: [Eq. (6)] Under the hypothesis that $\varphi$ is fully described by $C_L$, $C_T$ is only representative of the ratio of the flow to the rainfall probabilistic growth factors.

Available data
We explore the possibility of explaining $C_T$ as a function of the forest cover, in combination with other catchment descriptors that are already employed within the flood assessment procedure based on the rational formula, for a reference return period ($T$) of 20 years. We perform this analysis by examining the available data for 75 gauged catchments in Central and Southern Italy (Fig. 1): 34 catchments in Toscana, 17 in Lazio, 12 in Campania and 12 in Sicilia. These catchments belong to different rainfall and flood homogeneous regions, as delineated by regional frequency analyses. The climate is quite variable among the study catchments, owing to the significant variation in geographic latitude. Sicilia is characterized by a semiarid or dry subhumid climate, with mild, not very rainy winters and warm, very dry summers. Proceeding north (toward Campania, Lazio and Toscana), the climate turns to wet subhumid and humid, with a marked seasonality characterized by very wet winters and dry summers.
We selected the catchment mean elevation ($Z_m$) above the catchment outlet because it is one of the three catchment descriptors appearing in the Giandotti empirical formula employed for computing the catchment concentration time (e.g. Brath et al., 2001):
$$t_c = \frac{4\sqrt{A} + 1.5\,L}{0.8\,\sqrt{Z_m}} \qquad (7)$$
where $L$ (km) is the main river length. $C_L$ is computed by multiregression equations against the catchment fractions with different degrees of permeability, as derived from lithological maps at regional scale. Within the sample catchment set, the surface fraction with high permeability, $S_p$, explains almost 80 % of the overall variability of $C_L$ among the examined catchments. For this reason we selected only $S_p$ among the parameters describing the lithological classes within the study catchments.
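For reference, the Giandotti formula can be evaluated with a few lines of Python. The sketch assumes the customary units of the formula (area in km², river length in km, mean elevation above the outlet in m, concentration time in hours); the catchment values are hypothetical.

```python
import math


def giandotti_tc(area_km2, river_length_km, mean_elevation_m):
    """Giandotti concentration time (hours).

    Assumed units: area in km^2, main river length in km,
    mean elevation above the outlet in m.
    """
    numerator = 4.0 * math.sqrt(area_km2) + 1.5 * river_length_km
    return numerator / (0.8 * math.sqrt(mean_elevation_m))


# Hypothetical catchment: 100 km^2, 20 km main river, 400 m mean elevation.
tc = giandotti_tc(100.0, 20.0, 400.0)  # (40 + 30) / (0.8 * 20) = 4.375 h
```

Smaller and steeper catchments give shorter concentration times, which is the link between $t_c$, catchment size and relief used in the following analyses.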
The forest cover fraction $S_b$ is evaluated as the average value from land use maps of the same period analysed for assessing the observed runoff coefficient, mostly collected between 1960 and 1990 (e.g. Ferro and Porto, 2006). The spontaneous vegetation and land cover in Central and Southern Italy are quite consistent with the climatic features and morphological characteristics of the territory. Arid and semiarid zones are characterized by scarce vegetation, which gradually turns into subhumid Mediterranean undergrowth and pasture land, and finally into the mountain woods of humid and hyperhumid areas (Fiorentino and Iacobellis, 2001).
In the following section we examine the data set to disclose the dependency of $C = C_{T=20}$ on the forest cover fraction and the selected set of hydro-morphological parameters.

Data mining
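The investigation of forest cover influence on $C$ is based on three sequential steps: (1) preliminary data analysis, to explore the hydro-morphological data distributions; (2) cluster analysis, to identify groups of catchments with hydro-morphological similarities; (3) correlation structure analysis, to assess the dependence between $S_b$ and $C$ within each cluster.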

Preliminary data analysis
The following preliminary analyses have been conducted:
- histogram analysis;
- non-parametric correlation analysis of the hydro-morphological variables;
- dependence analysis of $C$ on each of the hydro-morphological variables and the forest cover fraction.
Histograms show the large variability of the hydro-morphological features of the examined catchments (see Table 1 and Fig. 2). A large number of catchments have an extent smaller than 1000 km² and a surface fraction with high permeability ($S_p$) smaller than 20 %. The correlation analysis has been performed with the Spearman rank coefficient instead of the standard Pearson coefficient to better capture possible non-linear dependence (Wilks, 1995).
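The difference between the two coefficients can be illustrated with a short, dependency-free Python sketch: the Spearman coefficient is the Pearson coefficient computed on ranks, so it reaches 1 for any monotonic relation. The sample values are hypothetical and the p-value computation is omitted.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with ties receiving the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson linear correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical catchment areas (km^2) and concentration times (h):
# monotonic but non-linear relation.
area = [10.0, 50.0, 200.0, 800.0, 3000.0]
tc = [1.0, 2.5, 6.0, 14.0, 40.0]
rho = spearman(area, tc)
r = pearson(area, tc)
```

For this monotonic sample the rank correlation equals 1, while the Pearson coefficient stays below 1 because the relation is not linear.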
Many of the analyzed hydro-morphological variables exhibit significant cross-correlation (see p-values in Table 2), while none of these variables appears significantly correlated with $C$. As expected, there is a significant positive correlation between $A$, $t_c$ and $h_c$. In fact, smaller catchments correspond to lower-order basins located in mountain areas, characterised by higher slope, smaller concentration time and smaller critical rainfall depth.
We also conducted an exploratory analysis on triplets of variables $C - Y - S_b$, where $Y$ represents one of the selected hydro-morphological variables $A$, $Z_m$, $t_c$, $h_c$ and $S_p$, taken in turn. In Fig. 3, the contour maps represent the variability of each hydro-morphological variable with respect to the $S_b$ values along the x-axis and the variable $C$ along the y-axis.
The contour maps show that a general dependence applicable to the entire data set cannot be found. These configurations indicate the need to explore the dependence between $S_b$ and $C$ within groups of basins exhibiting some hydrological similarities.

Cluster analysis
The k-means method has been chosen in this study as a simple solution, characterized by short computation times, for the characterization of possible hydrological similarities (MacQueen, 1967). The number of clusters was set to the value above which further clustering does not add much to the catchment classification. We identified 2 clusters for the catchment extent $A$ and three clusters for each of the remaining 5 parameters. When we analyzed all parameters together (HP), we selected 2 clusters. Tables 4 and 5 show the mean, standard deviation, sample size and range of the parameters within each cluster. The differences between clusters have also been verified by a statistical test on the mean value of the hydro-morphological parameters belonging to each cluster, at the 0.01 significance level. All clusters, except in the HP case, are significantly different in mean, i.e. the null hypothesis that they belong to the same population can be rejected at a significance level of 0.01. The value range of each parameter within each cluster is identified by the 0.05 and 0.95 quantiles of the corresponding sample distributions. These ranges do not overlap when the clusters are identified by analyzing one parameter at a time, except for the clusters identified with the parameter $A$ only. When all parameters are considered in the clustering process (HP), the distinction of each cluster is more difficult, since value ranges overlap, as we might expect by examining the corresponding mean and standard deviation values (see Table 5).
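Clustering on a single parameter can be sketched with a plain-Python version of Lloyd's k-means iteration. The deterministic initialisation, the function name and the concentration times below are our assumptions for illustration; the source does not report the implementation details.

```python
def kmeans_1d(values, k, max_iter=100):
    """Lloyd's k-means for scalar data with a deterministic start."""
    data = sorted(values)
    n = len(data)
    # Assumed initialisation: k evenly spaced order statistics.
    centroids = [data[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    for _ in range(max_iter):
        # Assignment step: each value joins the nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            clusters[nearest].append(v)
        # Update step: centroids move to the mean of their members.
        updated = [sum(c) / len(c) if c else centroids[j]
                   for j, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids, clusters


# Hypothetical concentration times (h) forming three groups.
tc_hours = [1.0, 1.5, 2.0, 6.0, 7.0, 8.0, 20.0, 22.0, 25.0]
centroids, clusters = kmeans_1d(tc_hours, 3)
```

Repeating the run for increasing `k` and comparing the within-cluster spread is one simple way to judge when further clustering stops adding information.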

Correlation structure
We calculated the Spearman rank correlation and the corresponding significance level (p-value) between $S_b$ and $C$ for each combination of parameters and for each cluster identified (Table 6). Significant correlation occurs for the cluster with the largest number of samples for each fixed parameter. This suggests that, for clusters with a limited number of samples, the correlation might be underestimated. For the HP case no significant correlation has been identified. We also assessed the dependence between $C$ and $S_b$ by linear regression, in order to evaluate the component of the runoff coefficient that could be explained by $S_b$. The goodness of fit of the linear models is measured by the coefficient of determination ($R^2$). The regression analysis is applied to three different sets: (i) all ($S_b$, $C$) pairs belonging to the examined cluster (LRtot); (ii) the ($S_b$, $C$) pairs whose corresponding hydro-morphological parameter is below its intra-cluster average (LRinf); (iii) the ($S_b$, $C$) pairs whose corresponding hydro-morphological parameter is above its intra-cluster average (LRsup). Results are listed in Table 7.
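The three regression sets can be reproduced with the following Python sketch, which fits $C = m\,S_b + q$ by ordinary least squares and splits a cluster at the intra-cluster mean of the clustering parameter. The data are hypothetical, and assigning values equal to the mean to the upper subset is our choice.

```python
def linear_fit(x, y):
    """Ordinary least squares y = m * x + q, returning (m, q, R^2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    m = sxy / sxx
    q = my - m * mx
    ss_res = sum((b - (m * a + q)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return m, q, 1.0 - ss_res / ss_tot


def split_at_cluster_mean(param, s_b, c):
    """Split (S_b, C) pairs into those below (LRinf) and not below (LRsup) the mean."""
    mean = sum(param) / len(param)
    lower = [(s, v) for p, s, v in zip(param, s_b, c) if p < mean]
    upper = [(s, v) for p, s, v in zip(param, s_b, c) if p >= mean]
    return lower, upper


# Hypothetical cluster: critical rainfall depth (mm), forest fraction, C.
h_c = [30.0, 35.0, 40.0, 60.0, 70.0, 80.0]
s_b = [0.2, 0.5, 0.8, 0.3, 0.6, 0.9]
c = [0.9, 0.6, 0.3, 0.7, 0.6, 0.5]

lower, upper = split_at_cluster_mean(h_c, s_b, c)
m_tot, q_tot, r2_tot = linear_fit(s_b, c)                      # LRtot
m_inf, q_inf, r2_inf = linear_fit(*map(list, zip(*lower)))     # LRinf
m_sup, q_sup, r2_sup = linear_fit(*map(list, zip(*upper)))     # LRsup
```

In this made-up sample all three slopes are negative, and the fit on each subset is tighter than the fit on the whole cluster.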
Figure 5 shows the computed regression lines for the selected clusters.
The possibility of explaining $C$ with $S_b$ is highly variable, particularly when all catchments belonging to a cluster are included in the regression analysis (LRtot). Higher $R^2$ values are obtained if only those catchments in the lower range of the corresponding parameter values are included in the regression analysis (LRinf). The best fit is obtained for LRinf within CL-3 fixed ($Z_m$), CL-2 fixed ($h_c$) and CL-3 fixed ($S_p$). The regression lines show that $C$ always decreases with increasing forest cover fraction, consistent with the expected result that forest cover contributes to the decrease of flood peaks. The highest linear correlation is observed within CL-2 fixed ($h_c$). It is also interesting to observe that there is a high linear correlation for all three subsets of CL-2 fixed ($h_c$), with consistent regression coefficients $m$ and $q$. Moreover, within CL-2 fixed ($h_c$) it is also possible to observe the highest non-parametric correlation with $C$ (see Table 6).
These results suggest the possibility of exploring new regression models for predicting the runoff coefficient $C_{T=20}$, accounting for the forest cover $S_b$, at least within catchment classes that can be considered homogeneous with respect to $h_{A,T=20}(t_c)$.

A new simple conceptual model for runoff coefficient assessment
Values of the runoff coefficient predicted from the catchment lithology ($C_L$) are generally biased with respect to the observed runoff coefficient $C_{obs}$, the difference $C_{obs} - C_L$ being generally negative. Moreover, the absolute difference $C_{obs} - C_L$ tends to be higher for catchments with smaller concentration time, as illustrated in Fig. 6. This suggests that there are larger margins for correcting the prediction of the runoff coefficients in catchments with smaller concentration time, which are generally the basins characterized by larger forest cover fractions, as these catchments are of lower order and higher slope, mostly located in mountainous areas.
Following the results of the previous cluster analysis, we explore the possibility of correcting the bias $C_{obs} - C_L$ with a runoff coefficient $C_{T=20}$ expressed by a simple regression model of the forest cover $S_b$ scaled by the critical rainfall depth $h_{A,T=20}(t_c)$:
$$C_{T=20} = C_L - V_{b,T=20}\,\frac{S_b}{h_{A,T=20}(t_c)} \qquad (8)$$
According to Eq. (8), the runoff coefficient estimated from the catchment lithology only ($C_L$) is reduced by the term $V_{b,T=20}\,S_b/h_c$, which conceptually could be interpreted as an additional abstraction loss of the total rainfall due to the storage capacity of the forested fraction of the catchment, with a specific volume equal to $V_{b,T=20}$. Moreover, recalling Eq. (5), the term $1 - V_{b,T=20}\,S_b/(C_L\,h_c)$ describes the ratio $K_{Q,T=20}/K_{R,T=20}$, if we assume that $C_L$ represents the ratio between the corresponding index values.
The optimal value of $V_{b,T=20}$ can be assessed by the least squares method applied to a linear regression model with the ratio $S_b/h_{A,T=20}(t_c)$ as independent variable:
$$C_{T=20,obs} - C_L = -V_{b,T=20}\,\frac{S_b}{h_{A,T=20}(t_c)}$$
Figure 7a shows the fitted linear regression model with the corresponding scatter plot. The standardized residuals appear normally distributed, as depicted in Fig. 7b and c. The optimal $V_{b,T=20}$ is equal to 13.9 mm, with a coefficient of determination ($R^2$) equal to 0.357 (see Table 8). The null hypothesis of no linear correlation is rejected with a p-value below 0.0001. The 95 % confidence interval spans 6 mm around the expected value.
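A minimal Python sketch of this calibration is given below. It assumes a regression through the origin, which is our reading of a model with a single empirical parameter, and it uses synthetic data generated with a known storage value so that the estimate can be checked; none of the numbers come from the study.

```python
def corrected_coefficient(c_l, s_b, h_c_mm, v_b_mm):
    """Eq. (8): lithology-based coefficient reduced by the forest term."""
    return c_l - v_b_mm * s_b / h_c_mm


def fit_forest_storage(s_b, h_c_mm, c_obs, c_l):
    """Least squares estimate of V_b (mm) for a regression through the origin.

    Model (assumed, no intercept): C_obs - C_L = -V_b * S_b / h_c.
    """
    x = [s / h for s, h in zip(s_b, h_c_mm)]
    d = [o - l for o, l in zip(c_obs, c_l)]
    return -sum(a * b for a, b in zip(x, d)) / sum(a * a for a in x)


# Synthetic catchments generated with a known V_b of 14 mm (illustrative only).
s_b = [0.2, 0.5, 0.8, 0.4]
h_c = [40.0, 50.0, 80.0, 100.0]
c_l = [0.6, 0.6, 0.5, 0.7]
c_obs = [corrected_coefficient(l, s, h, 14.0) for l, s, h in zip(c_l, s_b, h_c)]

v_b = fit_forest_storage(s_b, h_c, c_obs, c_l)
c_pred = [corrected_coefficient(l, s, h, v_b) for l, s, h in zip(c_l, s_b, h_c)]
```

Because the synthetic observations contain no noise, the fit recovers the storage value used to generate them; with real data the residual scatter would determine the confidence interval of the estimate.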
If we restrict the analysis to catchment clusters selected with respect to $t_c$, as illustrated in the previous section, the prediction performance improves with small changes in the optimal $V_{b,T=20}$. For example, as illustrated in Table 8, limiting the analysis to CL-1 fixed ($t_c$), the optimal $V_{b,T=20}$ is equal to 15.3 mm with an $R^2$ equal to 0.447.
The prediction performance also improves if we restrict the analysis to smaller catchments, again with slight changes in the estimated $V_{b,T=20}$ (see Table 8): for catchment areas smaller than 500 km², we get $V_{b,T=20}$ = 15.7 mm and $R^2$ = 0.392; for catchment areas smaller than 100 km², we get $V_{b,T=20}$ = 15.2 mm and $R^2$ = 0.565.
Figure 8 compares the observed runoff coefficients ($C_{T=20,obs}$) with the predicted ones, $C_L$ and $C_{T=20}$. Prediction performance is evaluated with three statistics of the prediction errors, where $N$ is the total number of catchment data and $err_i$ represents the deviation between the observed and the predicted runoff coefficients, $err_i = (C_{obs} - C_{T=20})$ or $err_i = (C_{obs} - C_L)$. As illustrated in Table 8, all three performance statistics indicate that the prediction of the runoff coefficient improves with $C_{T=20}$. The bias correction is particularly high in catchments with higher $S_b/h_{A,T}(t_c)$ ratios. Figure 9 shows how the (negative) bias correction tends to zero as the catchment concentration time increases. Lower catchment concentration times correspond to higher $S_b/h_{A,T}(t_c)$ ratios, which belong to small mountainous catchments with higher slope and higher forest cover, characterized by small critical rainfall depth.
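The definitions of the three performance statistics are not preserved in this text. As an assumption, the Python sketch below uses three common choices (mean error, mean absolute error and root mean square error) computed from the deviations $err_i$; the coefficient values are hypothetical.

```python
import math


def error_statistics(observed, predicted):
    """Mean error, mean absolute error and RMSE of err_i = observed - predicted.

    These three statistics are assumed here for illustration; they are not
    necessarily the ones adopted in the study.
    """
    errors = [o - p for o, p in zip(observed, predicted)]
    n = len(errors)
    return {
        "mean_error": sum(errors) / n,
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
    }


# Hypothetical observed coefficients and two competing predictions.
c_obs = [0.5, 0.4, 0.3]
c_l = [0.6, 0.6, 0.6]        # lithology only
c_t20 = [0.52, 0.43, 0.33]   # with a forest cover correction

stats_l = error_statistics(c_obs, c_l)
stats_t20 = error_statistics(c_obs, c_t20)
```

In this made-up example the corrected prediction has a smaller mean error, absolute error and RMSE than the lithology-only prediction.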

Discussion
The effect of forest cover on flood regime has been widely studied, but it is still a controversial topic (e.g. Sorriso-Valvo et al., 1995; Robinson, 1989; Robinson et al., 2003; Cognard-Plancq et al., 2001; Cosandey et al., 2005; Bathurst et al., 2011). Despite the public perception that forests reduce flood hazard, a large part of the scientific community asserts that forest cover, although relevant within the hydrological cycle and in the catchment response to small storms, does not significantly mitigate floods during extreme rainfall events (e.g. Calder, 2007; van Dijk and Keenan, 2007). This opinion is also supported by influential United Nations policy documents (e.g. FAO and CIFOR, 2005; Hamilton, 2008), which regard the public perception as a misconception held by those who are not directly involved in studying hydrological extreme events, including environmentalists and conservation agencies. Experimental studies show that forest cover reduces the annual catchment discharge as a result of increased rainfall interception, increased transpiration and a lower soil moisture regime during interstorm periods, and higher permeability of forest soil (e.g. Bosch and Hewlett, 1982; Cornish, 1993; Rowe and Pearce, 1994; Stednick, 1996; Fahey and Jackson, 1997; Bruijnzeel, 2004). On the other hand, it is more difficult to assess the impact of forests on the catchment response to low-frequency rainfall events, owing to the lack of experimental data for quantifying this effect (Nelson and Chomitz, 2007). Moreover, the effect of forest cover on flood peaks is difficult to isolate, since the flood discharge is influenced by other factors, such as initial catchment conditions, forestry and agricultural activities, road construction, etc. (Moore and Wondzell, 2005).
Alila et al. (2009) pointed out that some contrasting conclusions about the relation between forests and floods are the result of paired catchment studies, which do not account for the effect of forest cover on the non-linear dependency between magnitude and frequency of floods. They challenged decades of peer-reviewed paired watershed literature, arguing that the experimental design and statistical methods used in paired watershed studies have overlooked a fundamental part of the physics of the relation between forests and floods, namely that forest affects not only the magnitude but, most importantly, the frequency of floods. Moreover, because of the strong non-linearity of the flood frequency distribution, small changes in the magnitude of extreme floods can translate into large changes in the corresponding return periods (Alila et al., 2009, 2010).
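The sensitivity of the return period to the flood magnitude can be illustrated with a simple numerical example. The Python sketch below assumes a Gumbel distribution of annual maximum floods with hypothetical parameters; it is only an illustration of the non-linearity, not the distribution or the data used in the cited studies.

```python
import math


def gumbel_quantile(return_period, location, scale):
    """Flood magnitude with the given return period under a Gumbel distribution."""
    p = 1.0 - 1.0 / return_period
    return location - scale * math.log(-math.log(p))


def gumbel_return_period(magnitude, location, scale):
    """Return period (years) of a flood magnitude under a Gumbel distribution."""
    p = math.exp(-math.exp(-(magnitude - location) / scale))
    return 1.0 / (1.0 - p)


# Hypothetical Gumbel parameters (m^3/s), for illustration only.
location, scale = 100.0, 30.0

q50 = gumbel_quantile(50.0, location, scale)
# Return period of a flood 10 % smaller than the 50-year flood.
t_reduced = gumbel_return_period(0.9 * q50, location, scale)
```

With these assumed parameters, a 10 % reduction in magnitude lowers the return period from 50 years to roughly 25 years, so a modest change in magnitude corresponds to a halving of the return period.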
Previous studies paid much attention to the effect of forest cover changes. In fact, forest patterns have undergone significant changes worldwide, with different trends depending on local socio-economic and environmental factors. In some areas of the world, forest cover has been declining as a result of logging and land claims for agriculture or urban infrastructure. In other areas, such as Mediterranean landscapes, forests have expanded significantly in recent decades as a consequence of the abandonment of agricultural land in marginal areas, mostly hilly and mountainous, which provides space for the natural development of forest (Mazzoleni et al., 2004).
It is important to point out that forest cover change is a source of non-stationarity and its effect on catchment response cannot be analyzed in studies such as the present one.In fact, herein we explore the role of the forest cover on the runoff coefficient as defined in a regional flood frequency analysis, based on the application of the rational formula coupled with a regional model of the annual maximum rainfall depths, which inherently assumes that catchments are in stationary conditions.
Within this framework, the runoff coefficient looses its original meaning of output-input ratio of a rainfall-runoff model, as it occurs in the original interpretation of the rational formula, while it assumes the role of a probabilistic factor, which describes the ratio of the flood peak to the maximum annual rainfall depth for a given return period, i.e. the ratio between two random variables paired by the same frequency as dictated by the corresponding cumulative probability distributions.Thus, the runoff coefficient affects not only the first moment of the flood frequency distribution, as it occurs within the regional flood frequency analysis according to the index flood method, but also the higher moments with respect to those of the maximum annual rainfall depth.
We explored the possibility of exploiting the forest cover fraction (S_b) to explain the part of the runoff coefficient that is not already described by C_L, the runoff coefficient defined on the basis of the catchment lithology only and employed for identifying the index flood. The k-means cluster analysis showed that an additional component of the runoff coefficient (C_obs − C_L) can be explained by the forest cover fraction S_b, scaled with the corresponding critical areal rainfall depth, h_{A,T}(t_c). In this analysis, we avoided considering other potential catchment descriptors, besides those already employed for computing the index flood, in order to keep the overall flood assessment procedure as simple as possible. We proposed a linear function of the S_b / h_{A,T}(t_c) ratio for correcting C_L, with just one additional empirical parameter, V_{b,T}. Although the interpretation of V_{b,T} as an additional abstraction loss attached to the forest fraction for different return periods might be suggestive from a conceptual point of view, it is important to keep in mind that V_{b,T} is an empirical parameter whose values are strictly valid only for the region where they have been estimated, based on the local rainfall and discharge frequency distributions. Moreover, besides the ratio S_b / h_{A,T}(t_c), other factors might contribute to the observed difference (C_obs − C_L), as this is subject to several error sources: (i) the model structure, resembling the rational formula, adopted for describing the correspondence between the annual maximum rainfall depth and the peak flow for a given return period, as depicted by Eq. (5); (ii) the assumption that a catchment critical storm duration can be identified from catchment morphological parameters as in Eq. (7), assumed representative of the catchment concentration time; (iii) the regional model for representing the maximum annual rainfall; (iv) the approach employed for defining the areal rainfall reduction factor κ_A.
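As a minimal illustration of the correction discussed above, the sketch below assumes that the linear function takes the form C = C_L − V_{b,T} · S_b / h_{A,T}(t_c). This form is consistent with reading V_{b,T} as an additional abstraction loss, but it is an assumption here and is not copied from the paper's equations. The function name and all numerical values are hypothetical and serve only to show how the correction scales with forest cover and critical rainfall depth.

```python
def corrected_runoff_coefficient(c_l, s_b, h_crit_mm, v_b_mm):
    """Forest-cover correction of the lithology-based runoff coefficient.

    Assumed form (an illustration, not taken verbatim from the paper):
        C = C_L - V_bT * S_b / h_AT(t_c)
    c_l       : runoff coefficient from lithology only (dimensionless)
    s_b       : forest cover fraction, between 0 and 1
    h_crit_mm : critical areal rainfall depth h_AT(t_c), in mm
    v_b_mm    : empirical parameter V_bT, in mm
    """
    if h_crit_mm <= 0.0:
        raise ValueError("critical rainfall depth must be positive")
    if not 0.0 <= s_b <= 1.0:
        raise ValueError("forest cover fraction must lie in [0, 1]")
    corrected = c_l - v_b_mm * s_b / h_crit_mm
    # A runoff coefficient cannot be negative, so clip at zero.
    return max(corrected, 0.0)


# Hypothetical values: same lithology and forest cover, different rainfall depth.
small_catchment = corrected_runoff_coefficient(0.50, 0.60, 40.0, 10.0)
large_catchment = corrected_runoff_coefficient(0.50, 0.60, 120.0, 10.0)
print(f"small catchment: {small_catchment:.2f}")
print(f"large catchment: {large_catchment:.2f}")
```

Under this assumed form, the same forest cover fraction produces a larger correction where the critical rainfall depth is lower, which is the behaviour the discussion attributes to small mountainous catchments.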
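The regression model suggests that the effect of forest cover decreases with catchment extent. In fact, for a given forest cover fraction, as the catchment size increases, the critical rainfall depth h_{A,T}(t_c) increases and therefore the relative contribution of the forest cover to the runoff coefficient decreases. This result is consistent with the observation that the effect of forest cover decreases in larger catchments, i.e. in catchments characterized by larger concentration times, where forest cover is more fragmented and the flood response is dominated by other hydrological and hydraulic processes (Blöschl et al., 2007).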
It is interesting to observe that the prediction performance of the runoff coefficient also improves, without a significant change in the estimated V_{b,T=20}, if we exclude from the analysis those catchments with an extent larger than a limit value (e.g. 500 km²) above which a model structure such as the rational formula is not considered appropriate.
From a flood risk perspective, according to the proposed model for predicting the runoff coefficient, neglecting the effect of forest cover can correspond to a significant change in the return period attached to a predicted flood peak value. Figure 10 shows the relative changes in the estimated return period of a 20-year return period flood peak when the contribution of the forest cover is neglected, as a function of the forest cover fraction S_b scaled by the dimensionless number C_L h_{A,T}(t_c)/V_{b,T}, for a generic catchment with a dimensionless flood growth factor as defined for the Campania region (Rossi and Villani, 1992). For example, neglecting the effect of a 30 % (scaled) forest cover fraction can lead to underestimating the corresponding return period by a factor of almost 3, i.e. attributing a return period of 7 years to a flood peak with a 20-year return period. This figure shows that relatively small changes in the estimated flood peak magnitude can translate into surprisingly large changes in the return period, as a consequence of the strong non-linearity of the flood frequency curve (Alila et al., 2010).
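The non-linearity described above can be sketched numerically. The example below does not use the Campania growth curve of Rossi and Villani (1992); as a stand-in, it assumes a Gumbel growth curve with a hypothetical coefficient of variation, and it assumes that the forest correction lowers the flood peak by a factor (1 − s), where s is the scaled forest cover fraction. All function names and values are illustrative, so the resulting return periods will differ from those in Fig. 10.

```python
import math

EULER_GAMMA = 0.5772156649015329


def gumbel_growth_factor(return_period, cv):
    """Growth factor x_T / mean for a Gumbel variable with coefficient of variation cv."""
    non_exceedance = 1.0 - 1.0 / return_period
    reduced_variate = -math.log(-math.log(non_exceedance))
    return 1.0 + cv * math.sqrt(6.0) / math.pi * (reduced_variate - EULER_GAMMA)


def gumbel_return_period(growth_factor, cv):
    """Inverse of gumbel_growth_factor: return period of a given growth factor."""
    reduced_variate = EULER_GAMMA + (growth_factor - 1.0) * math.pi / (cv * math.sqrt(6.0))
    non_exceedance = math.exp(-math.exp(-reduced_variate))
    return 1.0 / (1.0 - non_exceedance)


def return_period_if_forest_neglected(scaled_forest_fraction, cv, true_return_period=20.0):
    """Return period that a forest-blind frequency curve assigns to the true T-year peak.

    Assumption: the true peak equals (1 - scaled_forest_fraction) times the
    forest-blind peak of the same return period.
    """
    blind_growth = gumbel_growth_factor(true_return_period, cv)
    true_growth = (1.0 - scaled_forest_fraction) * blind_growth
    return gumbel_return_period(true_growth, cv)


CV = 0.5  # hypothetical coefficient of variation of annual maximum floods
t_without_forest = return_period_if_forest_neglected(0.0, CV)
t_with_30_percent = return_period_if_forest_neglected(0.3, CV)
print(f"scaled forest fraction 0.0 -> {t_without_forest:.1f} years")
print(f"scaled forest fraction 0.3 -> {t_with_30_percent:.1f} years")
```

With no forest correction the sketch recovers the 20-year return period. With a non-zero scaled forest fraction, the forest-blind curve assigns a shorter return period to the same peak, with the size of the shift depending on the assumed growth curve.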
The range of the uncertainty bounds attached to the estimated V_{b,T=20} value is quite small (6 mm) compared to the uncertainty attached to the critical areal rainfall depth h_{A,T}(t_c). The prediction performance for the observed runoff coefficient improves if V_{b,T=20} is calibrated for catchment clusters, such as those identified by the k-means procedure with respect to the catchment concentration time t_c. This suggests the possibility of regionalizing the parameter V_{b,T} with respect to selected catchment descriptors, also by examining the spatial distribution of the selected catchment clusters.

Conclusions
In a regional flood frequency analysis based on the application of the rational formula coupled with a regional model of the annual maximum rainfall depths, the runoff coefficient assumes the role of a probabilistic factor defined by the product of two components: the first is the runoff coefficient of the index flood; the second is the ratio of the rainfall and flow probabilistic growth factors, which depends on the return period.
In this paper we evaluated the effect of forest cover on the second component, provided that the first component is assessed by the catchment lithology only.
The results of a k-means cluster analysis applied to a data set of 75 catchments distributed from South to Central Italy showed that the second component of the runoff coefficient can be partly explained by the forest cover fraction, scaled with the corresponding critical areal rainfall depth. Thus, we proposed a linear regression model to improve the prediction of the runoff coefficient, exploiting the ratio of the forest cover to the catchment critical rainfall depth as explanatory variable, with just one additional empirical parameter. The proposed regression enables a significant bias correction of the runoff coefficient, particularly for small mountainous catchments, which are characterised by a larger forest cover fraction and a lower critical rainfall depth.
In this study we restricted our analysis to a reference return period of 20 years. Preliminary investigations limited to the Toscana region show that the parameter of the regression model is not sensitive to the return period, compared with the uncertainty attached to the estimated parameter. In this case, for higher return periods, the forest cover fraction, as introduced in the proposed regression model, does not provide any additional information on the ratio between the probabilistic growth factors of the flood peak and the maximum annual rainfall depth, respectively. This does not necessarily mean that the effect of forest cover on flood peaks is not relevant for higher return periods. In fact, owing to the correlation between the forest cover fraction and the other hydro-morphological parameters employed in the examined flood regionalization procedure, the forest cover can also indirectly influence the ratio of the corresponding index values and thus the position of the entire flood frequency distribution (Alila et al., 2010). Further studies will aim to verify, over a larger data set, both the efficiency of the proposed regression model for different return periods and the sensitivity of the parameter V_{b,T}, in order to develop a regional model of its spatial variability to be integrated into the regional flood assessment procedure.

Hydrol. Earth Syst. Sci., 15, 3077-3090, 2011, www.hydrol-earth-syst-sci.net/15/3077/2011/

- catchment mean elevation (Z_m) above catchment outlet;
- catchment critical storm duration, assumed equal to the concentration time (t_c);
- catchment critical rainfall depth (h_c) for a reference return period of 20 years (h_c = h_{A,T=20}(t_c));
- catchment fraction with highly permeable lithoid complexes (S_p);
- catchment fraction with forest cover (S_b);
- runoff coefficient of the index flood estimated by regional regression models accounting for the catchment lithology only (C_L);

Fig. 2. Histograms of the hydro-morphological parameters: catchment extent (A); elevation above catchment outlet (Z_m); concentration time (t_c); maximum annual rainfall depth within a time interval equal to t_c and a return period of 20 years (h_c); catchment fraction with highly permeable lithoid complexes (S_p).

Figure 3. Contour maps of S_b with respect to ∆C for different hydro-morphological variables: catchment extent (A), elevation (B), concentration time (C), reference rainfall intensity (D) and permeable lithoid fraction (E). White dots indicate the observed values.

Figure 4. Example of silhouette plots for cluster analysis.

The method performs an unsupervised classification based on the frequency distribution of the hydro-morphological variables. Catchments are grouped according to the k-means cluster analysis following two different procedures: (i) clustering based on individual hydro-morphological variables, in order to assess the role of each parameter in the S_b − ∆C relation; (ii) clustering including all parameters (hereafter indicated as the HP case), to explore the effect of the reciprocal interaction among different parameters on the S_b − ∆C relation. Catchment clusters are identified by maximizing the mean of the silhouette plot (Sh), which is a distance metric based on the squared Euclidean distance. Sh indicates the distance of each catchment value within a given cluster from the catchment values belonging to other clusters and ranges from +1 to −1. Sh provides an indirect measure of the inter-cluster separability (Rousseeuw, 1987): +1 suggests a correct catchment clustering, whereas −1 indicates a possible misclassification. Examples of silhouette plots are depicted in Fig. 4. Table 3 shows the Sh for a number of clusters between 2 and 5, for each of the parameters examined and for the HP case. The values in bold indicate the Sh threshold values

Figure 5. Regression lines between ∆C and forest cover fraction for selected clusters of the study catchments: (i) including all (S_b, ∆C) pairs belonging to the examined cluster (LRtot); (ii) including only S_inf pairs (LRinf); (iii) including only S_sup pairs (LRsup). S_inf represents those (S_b, ∆C) pairs for which the values of the corresponding hydro-morphological parameters are below the intra-cluster average. S_sup represents those (S_b, ∆C) pairs for which the values of the corresponding hydro-morphological parameters are above the intra-cluster average.

Figure 10. Changes in the return period of a 20-year return period flood as a function of the forest cover fraction S_b scaled by the dimensionless number C_L h_{A,T}(t_c)/V_{b,T}, for a generic catchment with a flood probabilistic growth factor as defined for the Campania region.

Table 1. Mean (µ) and standard deviation (std) of the hydro-morphological variables for the examined catchments.

Table 2. Correlation analysis among the hydro-morphological variables for the examined catchments: Spearman rank correlation (r_k) in the lower triangular matrix; corresponding p-values (p) in the upper triangular matrix.

Table 3. Sh values for different clustering levels. In bold, the Sh threshold value above which further clustering is not acceptable.

Table 4. Catchment clustering based on the analysis of single parameters. Statistics of the parameter values in each cluster: mean (µ), standard deviation (std), number of samples (n), and value ranges corresponding to the quantiles 0.05 and 0.95 (Min and Max, respectively).

Table 5. Catchment clustering based on the analysis of the entire set of parameters (HP). Statistics of the parameter values in each cluster: mean (µ), standard deviation (std), number of samples (n), and value ranges corresponding to the quantiles 0.05 and 0.95 (Min and Max, respectively).

Table 6. Spearman's rank correlation S_b − ∆C and corresponding p-values, within each cluster for a given parameter set. In bold, those with p-values below 0.05.

Table 7. Results of linear regression analysis: slope (m) and intercept (q) of the regression lines; coefficient of determination (R²).

Table 8. Results of the regression analysis with four different catchment data sets: all catchments; catchments included in the cluster CL-1 fixed (t_c); catchments with an area A < 500 km²; catchments with an area A < 100 km².

Table 9. Performance statistics of the predicted runoff coefficients by exploiting: the catchment lithology only (C_L); both the catchment lithology and the forest cover (C_{T=20}).