Review of “Accounting for Hydroclimatic Properties in Flood Frequency Analysis Procedures” by Joeri B. Reinders and Samuel E. Munoz, submitted to Hydrology and Earth System Sciences.
Review date: 2023-09-02.
This is an interesting and generally well-written paper on a topic of continuing practical interest and need, based on research that appears to be mostly well done. Another reviewer has stated that the regions are rather large and likely heterogeneous; I agree that may be a concern, but my major concerns are prior to that issue, since they relate to station selection and handling of the peak-flow data from those stations. Some of my concerns may turn out to be void when further details on how these issues were handled are added to the paper, but the lack of such detail is a problem in itself.
My two primary concerns are as follows:
1. Were the selected stations filtered for the effects of regulation (for example by flood-control reservoirs) and urbanization? The effects of urbanization are addressed in the introduction, but then the topic disappears from the paper and no mention of filtering for such effects is made. Reservoirs, especially those designed for flood control and any reservoir with large storage capacity, are well known to have substantial effects on peak-flow distributions (see, for example, FitzHugh and Vogel, 2011), and stations with substantial reservoir effects thus should also be filtered out of the dataset. (A related concern is that such stations often have trends due to changes in urbanization over their period-of-record or because one or more reservoirs were built during the period-of-record.) The dataset at the Zenodo link under the Data availability statement includes several stations with which I am familiar that have substantial effects from urbanization and/or regulation, but that dataset has 4202 stations in it, while the authors say they used a dataset of 1538 stations, so maybe they did filtering that is not discussed. (The Data availability statement itself says the file containing the 1538 stations used is available at the Zenodo link; that is as it should be, but the statement appears to be false.)
An important practical issue is how to do such filtering. The simplest suggestion I have is to use the GAGES-II dataset (Falcone, 2011). It includes basin characteristics and basin boundaries for more than 9000 gauging stations in the United States, including information on dams and other regulation, land development and impervious surfaces. More than 2000 of the stations are designated as being of “reference” condition, meaning that they have the least effects from human disturbance. However, the authors might also use the given characteristics to select stations using a somewhat different criterion.
2. How did the authors handle the various “peak streamflow qualification codes” that are associated with USGS peak-flow data? Those of concern for this research are code 1, which indicates that the discharge value is a maximum daily average (which implies the true instantaneous peak is likely higher); code 4, which indicates that the discharge is less than the indicated value, which is the minimum recordable discharge at the site; code 8, which indicates that the discharge is actually greater than the indicated value; code 7, which indicates that the peak is an historic peak; and code O, which indicates an “opportunistic” value not from systematic data collection. The code 7 and code O peaks can be simply removed (code O because they are non-systematic by definition and code 7 because they are typically very large peaks whose values were inferred from historic records and are thus also non-systematic); for the other codes, the authors should consider the sensitivity of their results to the censoring that is indicated, and act accordingly. For example, it would bias the record to remove the code 4 peaks since they are generally the smallest peaks, but the values provided are biased upwards; in this case, one relevant question is how sensitive the results are to biases in such small peaks.
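To make the two screening steps above concrete, here is a minimal sketch of what I have in mind. It is not a description of what the authors did. The field names `STAID` and `CLASS`, the value `"Ref"` for reference-condition basins, the `Peak` record, and the labels `"lower_bound"` and `"upper_bound"` are assumptions made for illustration; the authors would need to map them onto the actual GAGES-II and USGS peak-flow file layouts.

```python
from dataclasses import dataclass

# Assumed value of the (assumed) GAGES-II classification field for
# reference-condition basins.
REFERENCE_CLASS = "Ref"

# Qualification codes discussed in this review.
DROP_CODES = {"7", "O"}  # historic and opportunistic peaks: non-systematic
CENSOR_CODES = {
    "1": "lower_bound",  # maximum daily average; true peak likely higher
    "8": "lower_bound",  # discharge actually greater than indicated value
    "4": "upper_bound",  # discharge less than minimum recordable value
}


@dataclass(frozen=True)
class Peak:
    """Hypothetical record for one annual peak."""
    water_year: int
    discharge: float
    codes: frozenset  # qualification codes attached to this peak


def select_reference_stations(basins):
    """Return IDs of stations flagged as reference condition.

    `basins` is an iterable of dicts with the assumed keys 'STAID' and 'CLASS'.
    """
    return {b["STAID"] for b in basins if b["CLASS"] == REFERENCE_CLASS}


def screen_peaks(peaks):
    """Drop non-systematic peaks and label the rest by censoring type.

    Returns a list of (peak, label) pairs, where label is 'exact',
    'lower_bound', 'upper_bound', or 'ambiguous' (conflicting codes).
    """
    kept = []
    for peak in peaks:
        if peak.codes & DROP_CODES:
            continue  # code 7 or O: remove from the systematic record
        labels = {CENSOR_CODES[c] for c in peak.codes if c in CENSOR_CODES}
        if not labels:
            label = "exact"
        elif len(labels) == 1:
            label = next(iter(labels))
        else:
            label = "ambiguous"
        kept.append((peak, label))
    return kept
```

With the peaks labeled this way rather than deleted, the sensitivity question can be posed directly: refit with the labeled values perturbed in the direction of the known bias and see whether the preferred distribution changes.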
A few additional methodological suggestions are:
1. The major impediment to making these results actionable is that no attempt is made to determine whether it is preferable to log-transform the flood peaks or not. To address this question, among other possible considerations, it seems that a primary need is the calculation of a goodness-of-fit measure; for example, chapter 5 of Hosking and Wallis (1997) discusses one such measure. I think this matter should at least be acknowledged in the paper.
2. In response to reviewer 1, the authors suggested they would add an analysis of the effect of catchment area as an appendix, but apparently they did not. Based on the discussion of this basin property in the introduction, which agrees with my thinking, I think these results should indeed be added.
3. Regarding the issue of regions that are rather large and likely heterogeneous, I don’t feel that refining regions is a crucial issue, but I would point out that Hosking and Wallis (1997, ch. 4) provides L-moment-based methods for testing the homogeneity of regions. A related comment is that I’m not sure the inclusion of Alaska and Hawaii is helpful: they don’t have many gauges and have rather different climates than areas of the conterminous United States.
4. As pointed out by Reviewer 2, consideration of PILFs is a powerful tool for focusing on the upper tail of flood distributions and could presumably be applied to other distributions in addition to LP3 (though it never has been to my knowledge). With the current focus on the complete flood distribution, avoiding its use is acceptable in my opinion. (But on the other hand, for non-extreme floods, fitting to a parametric distribution isn’t needed for at-site flood frequency as interpolation could be used.)
5. Determination of L-moments and L-moment ratios implies determination of distribution parameters. Are these sensible? (For example, for non-log-transformed data, are location coefficients non-negative?)
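For reference, the sample quantities I am asking about can be computed as in the following sketch. It assumes the unbiased probability-weighted-moment estimators and the usual ratio definitions (L-CV = l2/l1, L-skewness = l3/l2, L-kurtosis = l4/l2); I do not know which estimator the authors actually used, and the function name is my own.

```python
def sample_l_moments(values):
    """Sample L-moments and L-moment ratios from unbiased PWM estimators.

    Returns a dict with l1, l2, l_cv, l_skew, l_kurt.
    """
    x = sorted(values)
    n = len(x)
    if n < 4:
        raise ValueError("need at least 4 observations for the 4th L-moment")

    # b[r] = (1/n) * sum_i w_r(i) * x[i], with i the 0-based rank and
    # w_r(i) = i(i-1)...(i-r+1) / ((n-1)(n-2)...(n-r)).
    b = [0.0, 0.0, 0.0, 0.0]
    for i, xi in enumerate(x):
        weight = 1.0
        b[0] += xi
        for r in range(1, 4):
            weight *= (i - r + 1) / (n - r)
            b[r] += weight * xi
    b = [total / n for total in b]

    l1 = b[0]
    l2 = 2 * b[1] - b[0]
    l3 = 6 * b[2] - 6 * b[1] + b[0]
    l4 = 20 * b[3] - 30 * b[2] + 12 * b[1] - b[0]
    if l2 == 0:
        raise ValueError("all observations are equal; ratios are undefined")

    return {
        "l1": l1,
        "l2": l2,
        "l_cv": l2 / l1 if l1 != 0 else float("nan"),
        "l_skew": l3 / l2,
        "l_kurt": l4 / l2,
    }
```

For an evenly spaced sample such as 1, 2, 3, 4, 5 this gives l1 = 3, l2 = 1 and zero L-skewness, which is a quick sanity check on any implementation. The fitted parameters of each candidate distribution can then be checked for physical plausibility.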
Beyond these methodological concerns and suggestions, I have several concerns regarding the presentation, mostly due to incompleteness:
1. Dataset table in data archive:
a. Provide definitions / units for the columns of the dataset.
b. Limit the table to stations actually used in the analysis.
2. In introduction:
a. Address quantile-dependent effects of urbanization on peaks (see, for example, Konrad, 2003, and Over et al., 2016, and references therein).
b. Address effects of reservoirs (see, for example, the FitzHugh and Vogel paper cited above and references therein).
c. Here or in the discussion, address the issue of possible trends in the flood-peak data used.
3. Data section
a. Using the longest record for “each independent USGS hydrologic unit” to avoid bias toward heavily sampled rivers (lines 115-6) sounds reasonable, but what is an “independent USGS hydrologic unit”?
b. Suggest comparing the Köppen climatology to the flood climatology of Hayden (1988).
c. The Köppen climate and Psc values for each record were said to have been determined “by proximity” (line 126), which is quite vague. The proximity of what to what? Please explain more completely. If based on the location of the streamgauge, the authors should consider that the climate at the streamgauge location may be significantly different from the climate experienced by the watershed as a whole, depending on the size and other properties of the watershed, such as its elevation range. One easy modification would be to use the properties at the basin centroid, the location of which is given in the GAGES-II dataset cited above.
4. Presentation of L-moment analysis (Section 2.2):
a. Define L-moment ratios L-Cs and L-Ks in terms of underlying L-moments.
b. Variables in Table 1 are undefined or poorly defined (the notes at the bottom have several typos).
c. How were sample L-moments determined? (Note that there are small-sample biases in the “simple” versions: Hosking and Wallis, 1997, section 2.7.)
d. For the fits to log-transformed data, what did you do about the real-space values less than 1 whose logs are negative? And what was done with zero-valued peaks?
5. Accuracy of claims made:
In two places there are statements about showing that (or whether) hydroclimatic data can improve extreme flood probability estimates: lines 271-2, where this is stated as the main objective of the study, and lines 354-5, where it is stated that “probability model selection can be improved when it is based on the hydroclimatic properties of the basin”. However, strictly, I don’t think this was done, as no improvement in goodness of fit was provided (compare methodological suggestion 1). It would be more accurate to say that it was shown that model selection can be guided by the use of hydroclimatic classification, or similar language.
I also added some editorial comments to a copy of the manuscript, which is attached.
References cited:
Falcone, J., 2011, GAGES-II: Geospatial Attributes of Gages for Evaluating Streamflow, https://water.usgs.gov/lookup/getspatial?gagesII_Sept2011, https://doi.org/10.3133/70046617.
FitzHugh, T.W., and Vogel, R.M., 2011, The impact of dams on flood flows in the United States: River Res. Applic., 27, 1192-1215, https://doi.org/10.1002/rra.1417.
Hayden, B.P., 1988, Flood climates, in Flood Geomorphology, edited by V.R. Baker, R.C. Kochel and P.C. Patton, pp. 13-26, John Wiley and Sons, New York.
Konrad, C.P., 2003, Effects of urban development on floods: U.S. Geological Survey Fact Sheet FS-076–03, 4 p., https://pubs.usgs.gov/fs/fs07603/.
Over, T.M., Saito, R.J., and Soong, D.T., 2016, Adjusting annual maximum peak discharges at selected stations in northeastern Illinois for changes in land-use conditions: U.S. Geological Survey Scientific Investigations Report 2016–5049, 33 p., https://doi.org/10.3133/sir20165049.