HESS

Hydrology and Earth System Sciences

Hydrol. Earth Syst. Sci.

1607-7938

Copernicus Publications

Göttingen, Germany

10.5194/hess-14-1909-2010

Exploiting the information content of hydrological "outliers" for goodness-of-fit testing

F. Laio, P. Allamano, and P. Claps

DITIC, Politecnico di Torino, Corso Duca degli Abruzzi 24, 10129 Torino, Italy

Published: 12 October 2010

Volume 14, issue 10, pages 1909–1917

2010

This work is licensed under the Creative Commons Attribution 3.0 Unported License. To view a copy of this licence, visit https://creativecommons.org/licenses/by/3.0/

This article is available from https://hess.copernicus.org/articles/14/1909/2010/hess-14-1909-2010.html

The full text article is available as a PDF file from https://hess.copernicus.org/articles/14/1909/2010/hess-14-1909-2010.pdf

Validation of probabilistic models based on goodness-of-fit tests is an essential step in the frequency analysis of extreme events. The outcome of standard testing techniques, however, is mainly determined by the behavior of the hypothetical model, F_X(x), in the central part of the distribution, while the behavior in the tails, which is the most relevant in hydrological applications, has relatively little influence on the results of the tests. The maximum-value test, originally proposed as a technique for outlier detection, is a suitable, but seldom applied, technique that addresses this problem. The test is specifically designed to verify whether the maximum (or minimum) values in the sample are consistent with the hypothesis that F_X(x) is the true parent distribution. The application of this test is hindered by the fact that its critical values must be obtained numerically when the parameters of F_X(x) are estimated on the same sample used for verification, which is the standard situation in hydrological applications. We propose here a simple, analytically explicit technique to account for this effect, based on the application of censored L-moments estimators of the parameters. We demonstrate, with an application that uses artificially generated samples, the superiority of this modified maximum-value test over the standard version of the test. We also show that the test has comparable or larger power than other goodness-of-fit tests (e.g., chi-squared test, Anderson-Darling test, Fung and Paul test), in particular when dealing with small samples (sample sizes below 20–25) and when the parent distribution is similar to the distribution being tested.
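The idea behind the standard maximum-value test can be illustrated with a minimal Python sketch. It assumes that the parameters of F_X(x) are known a priori (not estimated from the tested sample), so it does not reproduce the modified test based on censored L-moments proposed in the article. The Gumbel distribution, the two-sided p-value convention, and the names `gumbel_cdf` and `max_value_test` are illustrative assumptions, not taken from the article. The sketch relies on the fact that, for n independent observations from F_X, the sample maximum has distribution function F_X(x)^n.

```python
import math


def gumbel_cdf(x, loc, scale):
    """CDF of the Gumbel (EV1) distribution, used here as an example F_X(x)."""
    return math.exp(-math.exp(-(x - loc) / scale))


def max_value_test(sample, cdf, alpha=0.05):
    """Maximum-value test with known parameters (hypothetical helper).

    Under the null hypothesis the sample maximum has CDF F_X(x)^n,
    so u = F_X(x_max)^n is uniformly distributed on (0, 1).
    A two-sided p-value is assumed here: the maximum may be
    suspiciously large (u near 1) or suspiciously small (u near 0).
    """
    n = len(sample)
    u = cdf(max(sample)) ** n
    p_value = 2.0 * min(u, 1.0 - u)
    return u, p_value, p_value < alpha


# Illustrative data (made up): one value far above the rest of the sample.
sample = [1.0, 2.0, 3.0, 20.0]
u, p_value, reject = max_value_test(
    sample, lambda x: gumbel_cdf(x, loc=2.0, scale=1.0)
)
print(f"u = {u:.6f}, p-value = {p_value:.3g}, reject H0: {reject}")
```

When the parameters are instead estimated from the same sample, the statistic u is no longer uniformly distributed under the null hypothesis, which is the difficulty the abstract refers to.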

References

Ahmad, M., Sinclair, C., and Spurr, B.: Assessment of flood frequency models using empirical distribution function statistics, Water Resour. Res., 24, 1323–1328, 1988.

Barnett, V. and Lewis, T.: Outliers in statistical data, Springer Series in Statistics, John Wiley and Sons, 1994.

Bayliss, A. and Reed, D.: The use of historical data in flood frequency estimation, Tech. rep., Centre for Ecology and Hydrology, 2001.

Bryson, M.: Heavy-tailed distributions: properties and tests, Technometrics, 16, 61–68, 1974.

Chowdhury, J., Stedinger, J., and Lu, L.: Goodness-of-fit tests for regional generalized extreme value flood distributions, Water Resour. Res., 27, 1765–1776, 1991.

D'Agostino, R. and Stephens, M.: Goodness-of-Fit Techniques, Marcel Dekker Inc, New York, 1986.

Di Baldassarre, G., Laio, F., and Montanari, A.: Design flood estimation using model selection criteria, Phys. Chem. Earth, 34(10–12), 606–611, https://doi.org/10.1016/j.pce.2008.10.066, 2008.

Falk, M. and Reiss, R.: Independence of Order Statistics, Annals of Probability, 16, 854–862, 1988.

Fill, H. and Stedinger, J.: L-moment and probability plot correlation coefficient goodness-of-fit tests for the Gumbel distribution and impact of autocorrelation, Water Resour. Res., 31, 225–229, 1995.

Fiorentino, M., Versace, P., and Rossi, F.: Regional flood frequency estimation using the two-component extreme value distribution, Hydrolog. Sci. J., 30, 51–63, 1985.

Frances, F.: Using the TCEV distribution function with systematic and non-systematic data in a regional flood frequency analysis, Stoch. Hydrol. Hydraul., 12, 267–283, 1998.

Fung, K. and Paul, S.: Comparison of outlier detection procedures in Weibull or Extreme-Value distribution, Commun. Statist. Simula. Computa, 14, 895–917, 1985.

Grubbs, F.: Procedures for detecting outlying observations in samples, Technometrics, 11, 1–21, 1969.

Gumbel, E.: Discussion of the Papers of Messrs. Anscombe and Daniel, Technometrics, 2, 165–166, 1960.

Hershfield, D.: Estimating the probable maximum precipitation, J. Hydraul. Div. ASCE, 87(HY5), 99–106, 1961.

Hershfield, D.: Method for estimating probable maximum precipitation, J. Am. Water Works Assoc., 57, 965–972, 1965.

Hosking, J. and Wallis, J.: Regional Frequency Analysis: An Approach Based on L-Moments, Cambridge University Press, 1997.

Hosking, J., Wallis, J., and Wood, E.: Estimation of the Generalized Extreme Value distribution by the method of the probability weighted moments, Technometrics, 27, 251–261, 1985.

Kendall, M. and Stuart, A.: The Advanced Theory of Statistics, Charles Griffin and Company Limited, 1979.

Kottegoda, N. and Rosso, R.: Statistics, probability, and reliability for civil and environmental engineers, McGraw-Hill, International Edition, 1998.

Koutsoyiannis, D.: Probable maximum precipitation, http://www.itia.ntua.gr/getfile/116/5/documents/2000HydrometP% MP.pdf, 2000.

Laio, F.: Cramer-von Mises and Anderson-Darling goodness of fit tests for extreme value distributions with unknown parameters, Water Resour. Res., 40, W09308, https://doi.org/10.1029/2004WR003204, 2004.

Laio, F., Di Baldassarre, G., and Montanari, A.: Model selection techniques for the frequency analysis of hydrological extremes, Water Resour. Res., 45, W07416, https://doi.org/10.1029/2007WR006666, 2009.

Laio, F., Allamano, P., and Claps, P.: Interactive comment on "Exploiting the information content of hydrological 'outliers' for goodness-of-fit testing" by F. Laio et al., Hydrol. Earth Syst. Sci. Discuss., 7, C2227–C2230, 2010.

Mitosek, H., Strupczewski, W., and Singh, V.: Three procedures for selection of annual flood peak distribution, J. Hydrol., 323(1–4), 57–73, 2006.

Moore, D.: Goodness-of-Fit Techniques, chap. Tests of the chi-squared type, Marcel Dekker, New York, 1986.

Rossi, F., Fiorentino, M., and Versace, P.: Two-component extreme value distribution for flood frequency analysis, Water Resour. Res., 20, 847–856, 1984.

Stedinger, J., Vogel, R., and Foufoula-Georgiou, E.: Handbook of Hydrology, chap. 8: Frequency analysis of extreme events, McGraw-Hill, New York, 1992.

Strupczewski, W., Singh, V., and Weglarczyk, S.: Asymptotic bias of estimation methods caused by the assumption of false probability distributions, J. Hydrol., 258, 122–148, 2002.

Vogel, R.: The probability plot correlation coefficient test for the normal, lognormal, and Gumbel distributional hypotheses, Water Resour. Res., 22, 587–590, 1986.

Vogel, R. and McMartin, D.: Probability plot goodness-of-fit and skewness estimation procedures for the Pearson type 3 distribution, Water Resour. Res., 27, 3149–3158, 1991.

Wang, Q.: Unbiased estimation of probability weighted moments and partial probability weighted moments from systematic and historical flood information and their application to estimating the GEV distribution, J. Hydrol., 120, 115–124, 1990.

Wang, Q.: Approximate goodness-of-fit tests of fitted generalized extreme value distributions using LH moments, Water Resour. Res., 34, 3497–3502, 1998.