ON ACCURACY OF UPPER QUANTILES ESTIMATION

Flood frequency analysis (FFA) entails estimation of the upper tail of a probability density function (PDF) of annual peak flows obtained from either the annual maximum series or the partial duration series. In hydrological practice, the properties of the various methods for estimating upper quantiles are identified for the case of a known population distribution function. In reality, the assumed hypothetical model differs from the true one, and one cannot assess the magnitude of the error caused by model misspecification with respect to any estimated statistic. Nevertheless, the opinion about the accuracy of upper quantile estimation methods formed from the case of a known population distribution function is upheld. This issue is the subject of the paper. The accuracy of large quantile assessments obtained from four estimation methods is compared for the two-parameter log-normal and log-Gumbel distributions and their three-parameter counterparts, i.e., the three-parameter log-normal and GEV distributions. The cases of true and false hypothetical models are considered. The accuracy of flood quantile estimates depends on the sample size and on the distribution type, both true and hypothetical, and depends strongly on the estimation method. In particular, the maximum likelihood method loses its advantageous properties in the case of model misspecification.


Introduction
Flood frequency analysis (FFA) provides information about the probable size of flood flows. The estimates of the quantiles of maximum flows obtained in this way have many practical applications. This information is required for designing hydraulic structures, determining the limits of flood zones with varying degrees of flood risk, estimating the risk of exploitation of floodplains, as well as for the valuation of the contributions of many branches of the insurance market. FFA provides support for the governing bodies of water resources in decision-making processes and plays a very important role in reducing the flood risk.
Correspondence to: I. Markiewicz (iwonamar@igf.edu.pl)
Flood frequency analysis boils down to the estimation of the upper tail, i.e., the upper quantiles of the probability density function of the annual (or partial duration) maximum flows, and the distribution function assumed is the statistical hypothesis. The problem of flood frequency modelling refers to the choice of the probability distribution describing the annual peak flows along with the method of estimating the parameters and, thus, the quantiles of this distribution. This choice is called the distribution and estimation (D/E) procedure. The accuracy of a quantile estimate is measured by the mean square error (MSE) and the bias (B). In the classical hydrological approach, the properties of the estimation methods are analysed under the assumption that the hypothetical distribution adopted is true. In the literature, there are several papers analysing the accuracy of the estimates of large quantiles for a selected probability distribution (e.g., Landwehr et al., 1980; Kuczera, 1982; Hoshi et al., 1984). The properties of an estimation method observed for one distribution are often automatically generalized to other distributions. Three estimation methods have usually been compared in the literature: the method of conventional moments (MOM), the method of linear moments (LMM) and the maximum likelihood method (MLM). In this paper, another method is included in the comparative analysis: the method built on the mean deviation (MDM). Due to the analytical intractability of the mean deviation in statistics, this method has not yet been widely applied in FFA. However, simulation techniques can overcome this inconvenience. The application of the MDM to the estimation of flood quantiles has been proposed in Markiewicz et al. (2006) and Markiewicz and Strupczewski (2009).
The objective in selecting the probability density function should be the best fit of the distribution to the empirical data, primarily in the range of the upper quantiles, making allowance for the low quality of the largest sample data. Moreover, no simple statistical model can reproduce the dataset in its entire range of variability. This would require the use of too many parameters, which cannot be estimated reliably and efficiently from a data series that is usually of relatively small size. The probability of correctly identifying the density function on the basis of short hydrological samples is very low, even in the ideal case when the set of alternative distributions contains the true density function (e.g., Mitosek et al., 2006). Therefore, the traditional approach based on the knowledge of the theoretical distribution is not acceptable. In papers by Strupczewski et al. (2002a,b) and Weglarczyk et al. (2002), the asymptotic bias of a quantile in the case of assuming the wrong distribution has been derived for various estimation methods and for selected pairs of probability functions. If the hypothetical distribution is the true one, then for a given estimation method the bias of the quantile estimate results from the finite random sample on the basis of which the quantile is assessed; when the hypothetical distribution differs from the true one, the total bias of the quantile estimator also includes the error resulting from the model.
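The split of the total bias into a sampling part and a model part can be written as a simple identity. The notation below is introduced here for illustration only and is an assumption, not the paper's: T is the true distribution, H the hypothetical one, N the sample size, and x_F^{(H,∞)} the value to which the estimator tends as N → ∞ when H is fitted to data from T.

```latex
\underbrace{E\!\left[\hat{x}_F^{(H,N)}\right] - x_F^{(T)}}_{\text{total bias}}
=
\underbrace{E\!\left[\hat{x}_F^{(H,N)}\right] - x_F^{(H,\infty)}}_{\text{finite-sample part}}
+
\underbrace{x_F^{(H,\infty)} - x_F^{(T)}}_{\text{model (asymptotic) part}}
```

When H coincides with T and the estimator is consistent, the second term vanishes and only the finite-sample part remains.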
The aim of the study is to show that the theoretical properties of various estimation methods vary significantly when the choice of a hypothetical distribution is incorrect, which is very likely in the realities of hydrology. The paper is organized as follows. After this introduction to the topic, the four estimation methods and the probability distributions analysed in the paper are presented in Sects. 2 and 3, respectively. The next section studies the accuracy of upper quantile estimates for these two- and three-parameter distributions under the assumption of a true hypothetical distribution. A similar discussion, for the case of a false hypothetical distribution, is presented in Sect. 5. The paper is concluded in the final section.

Estimation methods
Several systems of summary statistics describing the properties of a random sample have been developed. Based on different principles, they provide, in particular, measures of location, dispersion, skewness and kurtosis, which in turn serve for identifying and fitting PDFs. It is convenient to use dimensionless versions of the summary statistic sets in the form of summary statistic ratios. They measure the shape of a distribution independently of its scale of measurement. Among the systems of summary statistics, the most popular are the system of conventional moments and that of linear moments (L-moments). The L-moments form an attractive system because their estimators, in contrast to the estimators of classical moments, are unbiased, and the sample L-moment ratios have very small biases for moderate and large samples (e.g., Hosking and Wallis, 1997). For both the system of conventional moments and that of linear moments, the measure of location is the mean (µ ≡ λ_1), and the measures of dispersion and skewness are presented in Table 1.
For the estimation of statistical characteristics, the method of moments (MOM) (e.g., Kendall and Stuart, 1969) and the method of linear moments (LMM) (e.g., Hosking and Wallis, 1997) have been used alternatively. The MDM is an innovative method that applies the mean deviation δ_µ about the mean value (µ) as a measure of dispersion (see Table 1), with the mean as a measure of location and δ_S as a measure of skewness (Markiewicz et al., 2006; Markiewicz and Strupczewski, 2009). The complement to the estimation methods based on distribution characteristics is the MLM (e.g., Kendall and Stuart, 1973), which is based on the main probability mass. The MLM is sometimes regarded as the most appropriate method because it yields the asymptotically most efficient estimators. However, the MLM involves relatively large computational difficulties, and maximum likelihood estimators do not always exist.
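As a rough illustration of the three systems of summary statistics, the sketch below computes sample measures of location and dispersion for each of them. Table 1 is not reproduced in this text, so the skewness measures (including δ_S) are left out, and the plain sample average used for the mean deviation is an assumption that may differ from the estimator used by the authors. All function names are hypothetical.

```python
import numpy as np


def conventional_moments(x):
    """Return sample mean, standard deviation and C_V = sigma / mu."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    std = x.std(ddof=1)  # ddof=1: the usual (n - 1) variance estimator
    return mean, std, std / mean


def l_moments(x):
    """Return lambda_1, lambda_2 and L-CV = lambda_2 / lambda_1.

    Uses the unbiased probability-weighted-moment estimators
    b0 = mean(x) and b1 = (1/n) * sum((i - 1) / (n - 1) * x_(i)).
    """
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    b0 = x.mean()
    b1 = np.sum(np.arange(n) * x) / (n * (n - 1))
    lam1 = b0
    lam2 = 2.0 * b1 - b0
    return lam1, lam2, lam2 / lam1


def mean_deviation(x):
    """Return the mean absolute deviation about the sample mean."""
    x = np.asarray(x, dtype=float)
    return np.mean(np.abs(x - x.mean()))


if __name__ == "__main__":
    sample = [1.0, 2.0, 3.0, 4.0]
    print(conventional_moments(sample))
    print(l_moments(sample))
    print(mean_deviation(sample))
```

All three dispersion measures are expressed in the units of the data, so dividing each by the mean gives the dimensionless ratios mentioned above.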

Probability distributions
The true probability distribution, which reflects the time series of extreme flows for a given gauging station, is not known. The form of a distribution that would describe the observed data series is the subject of many papers, such as Jenkinson (1969) or NERC (1975). The hydrological report of the World Meteorological Organization from 1989 (Cunnane, 1989) shows that the most commonly used and recommended distributions were the Gumbel and log-normal distributions. Nowadays, researchers of hydrological extreme events recommend the use of heavy-tailed distributions for modelling the annual maximum flows (e.g., FEH, 1999; Rao and Hamed, 2000; Katz et al., 2002). However, the evidence for a heavy tail of hydrological variables is not yet sufficiently convincing (e.g., Rowinski et al., 2002; Weglarczyk et al., 2002). The heavy-tailed distributions have conventional moments only in a certain range of shape parameter values, and the range shrinks with growing moment order. Since hydrological samples of peak flows are usually too small to estimate many parameters reliably and efficiently, both two- and three-parameter distributions are used in FFA, where in the three-parameter distributions the third parameter (ε) serves as the lower bound (e.g., Rao and Hamed, 2000). In this paper, to assess the accuracy of the estimates of high quantiles, two two-parameter distributions have been selected, i.e., log-normal (LN2) and log-Gumbel (LG), together with their three-parameter counterparts, LN3 and GEV. The density functions of the distributions are shown in Table 2. Both the two- and three-parameter log-normal distributions represent the classical (albeit borderline) type of distribution, while the LG and GEV are heavy-tailed.

True hypothetical distribution
Since the true probability distribution of an observed peak flow series is not known, it would seem that the choice of a hypothetical distribution is the key to the accurate estimation of high quantiles. However, as discussed in this section, the case where the assumed distribution is consistent with the real one shows that the ranking of the methods with respect to the accuracy of large quantile estimates strongly depends on the type of the true distribution and its shape.
The issue is analysed using the example of the quantile x_F for F = 0.99, denoted x_0.99 and otherwise known as the 1% quantile. This is likely the most commonly estimated design value for the dimensioning of hydrological structures, and it defines the flow value which is exceeded, on average, once every 100 years.
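The link between the non-exceedance probability F and the "once every 100 years" reading can be made explicit with a short worked calculation. The second line additionally assumes independent, identically distributed annual maxima, which is an assumption made here for illustration.

```latex
T = \frac{1}{1 - F} = \frac{1}{1 - 0.99} = 100 \ \text{years},
\qquad
P(\text{at least one exceedance in } n \text{ years}) = 1 - F^{\,n},
\quad
1 - 0.99^{100} \approx 0.634 .
```

In other words, under that assumption the 1% quantile is more likely than not to be exceeded at least once during any 100-year period.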

Simulation experiment
Simulation experiments are performed for the two-parameter distributions, LN2 and LG, with the coefficient of variation C_V (C_V = σ/µ) varying from 0.2 to 1.0 and with any mean µ > 0. Samples of N elements are considered for N = 20(10)100, i.e., from 20 to 100 in steps of 10. In each case, 20 000 random samples are generated. The value of x_0.99 is calculated using the four estimation methods under the correct assumption that the population is log-normal or log-Gumbel distributed, respectively. The accuracy of the x_0.99 estimates is expressed by the relative root mean square error (δRMSE) and the relative bias (δB), both reported in percent. The results of the experiment are presented in Tables 3 and 4 for the LN2 and LG distributions, respectively. For the sake of brevity, only selected sample sizes are shown in the tables, i.e., N equal to 20, 60 and 100. In all tables, the best values of the relative RMSE and B in each row are in bold. In the asymptotic case, i.e., for N → ∞, δRMSE(x_0.99) and δB(x_0.99) converge to zero. The quantile value in the first column is the true value.
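A minimal sketch of such an experiment for the LN2 case is given below, restricted to MOM and MLM. The defining equations of δRMSE and δB did not survive in this text, so the sketch assumes the usual definitions, δRMSE = 100·sqrt(E[(x̂ − x)²])/x and δB = 100·(E[x̂] − x)/x, with x the true quantile. The function names, the seed and the choice of the (n − 1) variance estimator in MOM are likewise assumptions, so the numbers it produces need not match Table 3 exactly.

```python
import numpy as np
from statistics import NormalDist

Z99 = NormalDist().inv_cdf(0.99)  # standard normal quantile for F = 0.99


def ln2_quantile(mu_y, sigma_y, z=Z99):
    """Quantile of LN2 with log-space parameters mu_y and sigma_y."""
    return np.exp(mu_y + z * sigma_y)


def fit_mom(x):
    """MOM for LN2: match the sample mean and variance of each row."""
    m = x.mean(axis=1)
    cv2 = x.var(axis=1, ddof=1) / m**2
    sigma_y = np.sqrt(np.log1p(cv2))
    mu_y = np.log(m) - 0.5 * sigma_y**2
    return mu_y, sigma_y


def fit_mlm(x):
    """MLM for LN2: mean and (1/n) standard deviation of the logarithms."""
    y = np.log(x)
    return y.mean(axis=1), y.std(axis=1, ddof=0)


def relative_errors(estimates, true_value):
    """Return (relative RMSE, relative bias), both in percent."""
    rel = (estimates - true_value) / true_value
    return 100.0 * np.sqrt(np.mean(rel**2)), 100.0 * np.mean(rel)


def simulate_ln2(cv, n, reps=20_000, mean=1.0, seed=0):
    """Monte Carlo accuracy of x_0.99 when both T and H are LN2."""
    rng = np.random.default_rng(seed)
    sigma_y = np.sqrt(np.log1p(cv**2))
    mu_y = np.log(mean) - 0.5 * sigma_y**2
    x_true = ln2_quantile(mu_y, sigma_y)
    x = rng.lognormal(mu_y, sigma_y, size=(reps, n))
    results = {}
    for name, fit in (("MOM", fit_mom), ("MLM", fit_mlm)):
        results[name] = relative_errors(ln2_quantile(*fit(x)), x_true)
    return results


if __name__ == "__main__":
    for name, (rmse, bias) in simulate_ln2(cv=1.0, n=20).items():
        print(f"{name}: dRMSE = {rmse:.2f}%, dB = {bias:.2f}%")
```

Each row of the simulated array is one sample of size N, so all 20 000 replicates are fitted in one vectorised pass; adding LMM or MDM would only require another fitting function with the same signature.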
For the three-parameter distributions, LN3 and GEV, the mean equals zero, the standard deviation equals one, and various values of the skewness coefficient C_S (C_S = µ_3/µ_2^(3/2)) are assumed for the Monte Carlo experiment. The results are shown in Tables 5 and 6. The range of C_S values considered here is conditioned by the existence of the skewness coefficient for the GEV distribution, which takes values greater than 1.1396 (e.g., Markiewicz et al., 2006, p. 394). Moreover, the maximum likelihood estimation of the GEV distribution is not always satisfactory, and for some samples the likelihood function appears to have no local maximum (Hosking et al., 1985). In our simulations of the GEV distribution, this non-regularity of the likelihood function causes occasional non-convergence of the modified Powell hybrid algorithm (More et al., 1980; IMSL, 1997) that is used to maximize the log-likelihood. The last column in Table 6 shows the reliability of MLM for the GEV distribution.
Hydrol. Earth Syst. Sci., 14, 2167-2175, 2010, www.hydrol-earth-syst-sci.net/14/2167/2010/

Accuracy of upper quantile estimates for two-parameter distributions
For both distributions, LN2 and LG, and for any value of the coefficient of variation, the method of moments gives the greatest bias, and the higher the C_V value, the greater the bias.
For small samples of 20 elements from the LN2 distribution, the relative bias of the quantile x_0.99 estimated by MOM grows in magnitude from −1.57% for C_V = 0.2 to −11.41% for C_V = 1.0 (Table 3), while for the LG distribution these values are −3.41% and −22.27%, respectively (Table 4). For small values of C_V (C_V = 0.2), the difference between δB(x_0.99) from the MOM and from the method with the second-highest bias, i.e., the MLM, is not large, but the gap widens with increasing C_V. For the two distributions, the results of the maximum likelihood method converge to those of the MDM and LMM, which are the best of the four estimation methods studied; in most cases they give a relative bias below 1% in absolute value. For LG, the MOM bias is clearly detached, on the negative side, from that of the other methods. It remains large even for statistically large samples, particularly for a large population C_V. It is also worth noting that the MDM produces a bias competitive with that of the LMM. The relative root mean square error of the x_0.99 estimate is the smallest for the MLM for both the LN2 and LG distributions, except for LN2 with C_V = 0.2 and N = 100, where it is the largest. Among the methods built on summary statistics, i.e., MOM, LMM and MDM, the MOM produces the smallest δRMSE(x_0.99) for small samples (N = 20) from the LN2 distribution, regardless of the value of C_V, and for the log-Gumbel distribution with C_V ≥ 0.6, regardless of the sample size. In the other cases considered, the MOM yields the highest root mean square error of the four estimation methods studied.
It is worth noting that for heavy-tailed distributions with a large C_V value (which here also means large skewness), the bias of the MOM estimator of the standard deviation, and consequently of the 1% quantile, decreases very slowly with increasing sample size. This is clearly evident in Table 4.

Accuracy of upper quantile estimates for three-parameter distributions
The strong inferiority of MOM with respect to the relative bias of the x_0.99 quantile assessment, which has been observed for the LN2 and LG distributions, does not occur for the three-parameter LN3 and GEV distributions. For 20-element samples, the absolute value of δB(x_0.99) obtained from MOM is similar to the analogous value obtained from MLM, both for the LN3 distribution (Table 5) and for GEV (Table 6). Then, with increasing sample size, the absolute δB(x_0.99) decreases significantly in the case of MLM and only slightly in the case of MOM. For LN3 and the analysed range of C_S, the LMM and MDM yield significantly smaller δB(x_0.99) than MOM and MLM, regardless of the sample size, while in the case of GEV this regularity is not observed. The δRMSE(x_0.99) of MLM deserves special attention. Compared with the two-parameter distributions, the addition of the location parameter degrades the position of MLM in the δRMSE ranking for both the LN3 and GEV distributions. The MLM loses its first place in all cases except the large samples (N = 100) from LN3. However, even then the superiority of the MLM over the three other methods is very small. For both distributions and small samples (N = 20), the ML estimates of x_0.99 have the highest δRMSE of all four estimation methods considered, and the differences between the quantile assessments obtained from the MLM and the other three methods are considerable. For example, for the LN3 distribution with C_S = 2.0 and N = 20, the relative root mean square error of x_0.99 obtained from the MLM is 21.13%, while δRMSE(x_0.99) from the MOM, LMM and MDM is only 11.68%, 14.18% and 14.12%, respectively (Table 5). For the GEV distribution, the analogous values of δRMSE(x_0.99) are 82.61%, 36.23%, 53.69% and 45.23% for the MLM, MOM, LMM and MDM, respectively (Table 6). The first place of MOM in the δRMSE ranking is observed for LN3 and GEV with population C_S = 2.0, while for C_S = 4.0 the method based on the mean deviation is the best with respect to δRMSE of the x_0.99 estimate.

False hypothetical distribution
The case of a false hypothetical distribution seems to be more realistic.
The error of the x_0.99 estimate differs significantly between the particular combinations of true and hypothetical distributions assumed, which is evidence of the strong influence of the distribution type, both true and hypothetical, on the accuracy of the estimators of large quantiles.

Simulation experiment
The Monte Carlo experiment is carried out in the same way as for the true hypothetical distribution, except that the hypothetical distribution (H) is now incorrectly assumed and differs from the true one (T). Two options are considered for the two-parameter distributions, i.e., T = LN2, H = LG (Table 7) and T = LG, H = LN2 (Table 8), and two options for the three-parameter PDFs, i.e., T = LN3, H = GEV (Table 9) and T = GEV, H = LN3 (Table 10). Note that the bias for the asymptotic case (N → ∞) can be obtained analytically for the two-parameter distributions, see Strupczewski et al. (2002a,b) and Weglarczyk et al. (2002), while for the three-parameter PDFs analogous values have not been derived yet. For a false hypothetical distribution, as the sample size tends to infinity the bias becomes the total error of the quantile estimate, see Tables 7 and 8.
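The misspecified case can be sketched in the same spirit for the option T = LG, H = LN2, again restricted to MOM and MLM. Table 2 is not reproduced in this text, so the sketch assumes that LG is the distribution whose logarithm is Gumbel distributed (a Fréchet-type law with shape α), for which C_V² = Γ(1 − 2/α)/Γ(1 − 1/α)² − 1 when α > 2. The function names, the seed and the bisection bounds are hypothetical, and the resulting numbers are illustrative rather than a reproduction of Table 8.

```python
import math
import numpy as np
from statistics import NormalDist

Z99 = NormalDist().inv_cdf(0.99)  # standard normal quantile for F = 0.99


def lg_shape_from_cv(cv, lo=2.0 + 1e-6, hi=200.0):
    """Solve Gamma(1 - 2/a) / Gamma(1 - 1/a)**2 = 1 + cv**2 for a by bisection."""
    target = 1.0 + cv**2

    def ratio(a):
        return math.gamma(1.0 - 2.0 / a) / math.gamma(1.0 - 1.0 / a) ** 2

    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if ratio(mid) > target:  # ratio decreases with a, so move right
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def simulate_lg_as_ln2(cv, n, reps=20_000, seed=0):
    """Relative bias [%] of x_0.99 when T = LG but LN2 is fitted (H = LN2)."""
    rng = np.random.default_rng(seed)
    alpha = lg_shape_from_cv(cv)
    log_scale = -math.log(math.gamma(1.0 - 1.0 / alpha))  # population mean = 1
    # True LG quantile: ln x_F = ln(scale) - ln(-ln F) / alpha
    x_true = math.exp(log_scale - math.log(-math.log(0.99)) / alpha)
    # ln X is Gumbel(loc=ln(scale), scale=1/alpha), hence X = exp(Gumbel)
    x = np.exp(rng.gumbel(loc=log_scale, scale=1.0 / alpha, size=(reps, n)))

    # MOM under H = LN2: match sample mean and variance
    m = x.mean(axis=1)
    s_mom = np.sqrt(np.log1p(x.var(axis=1, ddof=1) / m**2))
    q_mom = np.exp(np.log(m) - 0.5 * s_mom**2 + Z99 * s_mom)

    # MLM under H = LN2: mean and (1/n) standard deviation of the logarithms
    y = np.log(x)
    q_mlm = np.exp(y.mean(axis=1) + Z99 * y.std(axis=1, ddof=0))

    def rel_bias(q):
        return 100.0 * (np.mean(q) - x_true) / x_true

    return {"MOM": rel_bias(q_mom), "MLM": rel_bias(q_mlm)}


if __name__ == "__main__":
    for n in (20, 60, 100):
        print(n, simulate_lg_as_ln2(cv=0.6, n=n))
```

Running the loop over several sample sizes shows how much of the bias persists as N grows, which is the part attributable to the wrong model rather than to the finite sample.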
For the option T = GEV, H = LN3, the relative bias of the x_0.99 estimate is the largest for MOM, followed by MLM, and then by MDM and LMM (Table 10). The rank of an estimation method with respect to the δRMSE value strongly depends on the sample size. For the considered range of C_S and N = 20, the sequence of methods from the one that gives the smallest δRMSE(x_0.99) to the one that gives the highest is as follows: MOM, MDM, LMM and MLM, while for N > 60 the order is reversed.

Conclusions
Since the upper quantiles are design values for the dimensioning of hydrological structures, the accuracy of their estimates is an extremely important issue for flood frequency analysis. The studies presented in this paper show that the accuracy of the estimates of flood quantiles depends on the sample size and on the type of distribution, both real and hypothetical, and depends strongly on the method of estimation. Therefore, the properties of estimation methods cannot be generalized with respect to distribution type or sample size, even if the hypothetical distribution is true. However, it is worth noting that for two-parameter distributions, in the case of model misspecification, the MLM yields the highest bias of upper quantile estimates regardless of the sample size, while the MOM yields the smallest. The correct identification of the distribution on the basis of short data series is not possible in hydrological reality. This finding essentially diminishes the practical usefulness of MLM in the analysis of hydrological extremes, because its efficiency may not compensate for the (frequently) huge bias produced by the assumption of a false PDF in the region of quantiles with high non-exceedance probability, which the user is often interested in. It marks a departure of hydrological extreme value analysis from the classical statistical theory of extremes, whose core is the maximum likelihood method. The person choosing the distribution and estimation (D/E) procedure, e.g., a researcher, hydrologist or designer, should be aware of the impact of the procedure selected on the value of the desired estimate. The comparative analysis presented in this paper of large quantile estimates obtained by various estimation methods, under the assumption of a true distribution or of a false one that is close to the true type, can be a source of information about the properties of selected D/E procedures. The studies on the estimation methods of flood quantiles when the hypothetical model is untrue should be continued. Despite a century of research, the problem of modelling flood flows is still open.

Table 2. Probability density functions of log-normal and GEV distributions.

Table 4. Relative accuracy [%] of x_0.99 for a sample from LG, assuming the LG model.
to the other three methods, and the method based on the mean deviation ranks very well. For 100-element samples, the relative bias of x_0.99 obtained from MLM is 35.48%, while δB(x_0.99) from the MOM, LMM and MDM is only −12.36%, 3.257% and −0.470%, respectively. The analogous values of δRMSE(x_0.99) are 55.85%, 28.67%, 27.29% and 23.75% for MLM, MOM, LMM and MDM, in turn.