General comments
The authors have improved the article and incorporated some suggestions from reviewers. However, further improvements described below are needed before this article can be accepted for publication. I have also attached a Tracked Changes document to this round of revisions for writing suggestions and minor technical comments. I’ve incorporated my feedback on the authors’ responses to my initial comments below, including some places where I agreed with their responses.
Effects of nonstationarity on flood series moments ---------------------------------------------------
The authors described nonstationarity as being outside the scope of their paper in their response to my initial review. While I agree that it is not the goal of their paper, I still think a quick investigation of whether trends might affect estimates of sample moments is imperative, especially given that snowmelt is a major control on flood generation in three of the five regions they examine.
This could be limited to an analysis of the effects of trends in the MAF on estimates of the CV following the conditional moments framework of Serago and Vogel (2018). If a trend is not accounted for, the CV can be overestimated since the overall variance of the peak flows will also include the variance explained by the trend.
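A minimal sketch of the effect described above, using a synthetic record rather than any of the authors' data. It is not a reproduction of the Serago and Vogel (2018) framework: the linear-in-time trend, the record length, the noise level and the function names are all assumptions made for illustration. It compares the ordinary product moment CV with a CV whose variance is taken about a fitted trend line.

```python
import math
import random
import statistics


def cv(values):
    """Ordinary product moment CV: sample standard deviation over sample mean."""
    return statistics.stdev(values) / statistics.mean(values)


def detrended_cv(values):
    """CV with the variance taken about an OLS linear trend in time.

    Assumption for illustration: the trend is linear in the year index.
    The residual standard deviation (n - 2 degrees of freedom) is divided
    by the overall mean of the record.
    """
    n = len(values)
    t = list(range(n))
    t_bar = statistics.mean(t)
    q_bar = statistics.mean(values)
    sxy = sum((ti - t_bar) * (qi - q_bar) for ti, qi in zip(t, values))
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    slope = sxy / sxx
    residuals = [qi - (q_bar + slope * (ti - t_bar)) for ti, qi in zip(t, values)]
    resid_sd = math.sqrt(sum(r * r for r in residuals) / (n - 2))
    return resid_sd / q_bar


# Synthetic 60-year annual peak series (hypothetical values): trend plus noise.
rng = random.Random(0)
series = [100.0 + 1.5 * year + rng.gauss(0.0, 20.0) for year in range(60)]

cv_naive = cv(series)
cv_detrended = detrended_cv(series)
print(f"CV ignoring trend: {cv_naive:.3f}")
print(f"CV about trend:    {cv_detrended:.3f}")
```

The gap between the two values gives a rough sense of how much of the apparent variability is attributable to the trend; for real records the trend model itself would need to be justified.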
The effects of trends in the mean and variance on CS can be mathematically derived but given general estimation challenges with at-site skewness arising from sampling variability, it seems like a less essential endeavor for this study.
I strongly recommend that any choice to retain the assumption of stationarity for any region should be supported with at-site trend analyses of these site records as well as any prior literature, including studies that examine trends over periods of record of more than 50 years (to avoid confounding apparent trends with artifacts of inter-decadal variability) and studies specifically focused on snow trends.
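One common choice for such an at-site trend analysis is the Mann-Kendall test; this is my suggestion of a candidate method, not something the authors report using. The sketch below is a bare-bones version under stated assumptions: no correction for tied values, no adjustment for serial correlation, and a normal approximation for the test statistic. The function name and the example record are hypothetical.

```python
import math


def mann_kendall(values):
    """Two-sided Mann-Kendall trend test (no tie or persistence correction).

    Returns (S, Z, p) where S is the Mann-Kendall statistic, Z the
    continuity-corrected standard normal score, and p the two-sided p-value.
    """
    n = len(values)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)  # sign of the pairwise difference
    var_s = n * (n - 1) * (2 * n + 5) / 18.0  # variance of S without ties
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return s, z, p


# Hypothetical annual peak-flow record, for illustration only.
record = [112.0, 98.0, 130.0, 121.0, 140.0, 135.0, 150.0, 149.0, 162.0, 171.0]
s_stat, z_score, p_value = mann_kendall(record)
print(f"S = {s_stat}, Z = {z_score:.2f}, p = {p_value:.4f}")
```

Records with many ties or with appreciable persistence would call for the corresponding corrected variants of the test.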
Finally, I agree with the authors that the autocorrelation of the annual flood series is often weak and, consequently, that adjusting significance inferences for persistence is of second-order importance for their continental-scale investigation. However, this limitation should be noted briefly in the manuscript to avoid any inappropriate uptake of this work.
Sample moment estimation biases under stationarity ------------------------------------------------
While the authors point out some good reasons for not pursuing further efforts to correct common product moment estimators for bias, I think that the authors should pay a little more attention to this. In their revised manuscript, the authors write “while the estimation uncertainty of the mean is small, the uncertainty and bias of the estimators of CV and CS (equations 3 and 4) can be substantial. Ye et al. (2020) illustrate the uncertainty and bias in the estimation of CV”. This does not provide readers with an idea of the magnitude of this bias nor a sense of how to determine it should they be concerned about it for a practical application.
The authors are correct in observing that the specific methods for bias correcting CV estimates that Ye et al. (2020) employ (recommended in the prior round of review) assume distributions other than the GEV, a distribution whose prominence in many parts of Europe has been previously established. However, Ye et al. (2020) provide these methods as examples and make it clear that the bias of the common product moment CV estimator is not specific to any theoretical probability distribution.
Ye et al. (2020) cite the following general relation between the bias of the product moment estimator of the CV and the population CV, population CS and record length from Breunig (2011):
Bias(CV_est) = CV_true^(3/2)/N * [3*sqrt(CV_true) – 2*CS_true]
Indeed, this equation is difficult to apply for CV bias correction without knowing the true value of the CV unless Monte Carlo experiments requiring distribution assumptions are simulated. Yet, it is possible to use this equation to assess the general magnitude of CV estimation bias by examining ranges of plausible values of the true CV and true CS based on a priori knowledge of sites in a region.
For instance, using the 75% values of the estimated CV (0.61) and CS (1.69) as true values, one obtains the following correction factor for a 50-year peak-flow series:
(0.61)^(3/2)/50 * [3*sqrt(0.61) – 2*(1.69)] ≈ -0.0099 ≈ -1.0%
For the 25% estimated CS (0.62), this rises to just 1.1%. Unfortunately, I cannot compute the bias over the full range of estimated CV and CS values (since only 25%/50%/75% are reported in Table 1). The authors may also want to expand this range given that CS and CV informing this range are estimated values, not true ones.
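The same calculation can be repeated over a grid of plausible values. The sketch below evaluates the relation exactly as quoted above; the function name is mine, and the grid values other than CV = 0.61, CS = 0.62 and 1.69, and N = 50 are hypothetical placeholders that the authors would replace with the actual range across their sites.

```python
import math


def cv_bias(cv_true, cs_true, n):
    """Bias of the product moment CV estimator, using the relation quoted above:

    Bias = CV^(3/2) / N * (3 * sqrt(CV) - 2 * CS)
    """
    return cv_true ** 1.5 / n * (3.0 * math.sqrt(cv_true) - 2.0 * cs_true)


# Grid of assumed "true" values. 0.61, 0.62, 1.69 and N = 50 come from the
# text above; the remaining entries are hypothetical.
cv_grid = (0.30, 0.61, 1.00)
cs_grid = (0.62, 1.69, 3.00)
record_length = 50

for cv_true in cv_grid:
    for cs_true in cs_grid:
        bias = cv_bias(cv_true, cs_true, record_length)
        relative = bias / cv_true  # bias as a fraction of the assumed true CV
        print(f"CV={cv_true:.2f} CS={cs_true:.2f} "
              f"bias={bias:+.4f} relative={relative:+.2%}")
```

Reporting the bias relative to the assumed true CV, as in the last column, makes it easier to compare against a tolerance such as a fixed percentage.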
However, it could turn out that the bias in the CV is relatively minor for the range of CV and CS in the study. In this case, the authors could state that after testing plausible CV_true and CS_true values reflecting the range of sites in their study, the adjustments to the CV generally did not exceed a low percentage (e.g. 10%), therefore making it reasonable to use common product moment estimators of the CV for the sake of comparing their work with the body of literature that uses this estimator.
It would also be nice to mention that future work should involve the generation of GEV-based bias correction factors using an approach similar to the one that Ye et al. (2020) undertook with the lognormal, kappa, and Wakeby distributions.
With regard to skewness estimation bias, the authors could explore using the GEV-based bias correction method from Carney (2016), although this is really a second-order issue given the pronounced effects that sampling variability can have on skewness coefficient estimates.
OLS regression model assumptions -------------------------------------------------------------------
In their response to initial feedback, the authors are correct in stating that regression coefficient estimates are unbiased even when the assumptions of normality and homoscedasticity are violated. I viewed the evaluation of these assumptions as requisite for making hypothesis testing-based inferences using p-values and other standard error-based criteria. If the goal is instead to understand the range of coefficient magnitudes without making hypothesis-oriented inferences, then evaluating these assumptions is less critical. Even so, the authors should state explicitly whether this is a scope limitation that they would like to establish. If they take this approach, it is yet another reason for them to apply an all subsets modeling strategy in lieu of the stepwise one that they reported, which presumably uses a statistical significance-based criterion in adding and removing variables from the multivariate regression models (see below).
I appreciate the information in the appendices, as well as the inclusion of variance inflation factors to guard against excessive multicollinearity among explanatory variables.
Model building and variable selection ------------------------------------------------------------------
The authors used a stepwise [forward] selection process to build their multivariate regression models. The leaps R package has an all subsets routine that they could use to check if they missed any strongly performing models by using a stepwise selection process, which does not evaluate all possible combinations of explanatory/predictor variables.
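To illustrate what an all subsets search does, here is a brute-force sketch in plain Python. It is not the leaps routine and does not reproduce its algorithm or output; the ranking criterion (adjusted R^2), the function names, the predictor names and the data values are all assumptions chosen for illustration.

```python
import itertools


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def adjusted_r2(columns, y):
    """Fit OLS with an intercept via the normal equations; return adjusted R^2."""
    n, p = len(y), len(columns)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = p + 1
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(k)] for a in range(k)]
    xty = [sum(row[a] * yi for row, yi in zip(design, y)) for a in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in design]
    y_bar = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - (sse / (n - p - 1)) / (sst / (n - 1))


def all_subsets(predictors, y):
    """Evaluate every non-empty predictor subset; best adjusted R^2 first."""
    names = sorted(predictors)
    results = []
    for size in range(1, len(names) + 1):
        for subset in itertools.combinations(names, size):
            score = adjusted_r2([predictors[v] for v in subset], y)
            results.append((score, subset))
    return sorted(results, reverse=True)


# Made-up data: the response depends on "precip" and "soil_moisture" only.
predictors = {
    "precip": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0],
    "soil_moisture": [2.0, 7.0, 1.0, 8.0, 2.0, 8.0, 1.0, 8.0, 2.0, 8.0, 4.0, 5.0],
    "min_winter_temp": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0, 3.0, 5.0, 8.0],
}
noise = [0.5, -0.3, 0.2, -0.4, 0.1, 0.3, -0.2, 0.4, -0.1, -0.5, 0.3, -0.3]
response = [2.0 * p + 3.0 * s + e
            for p, s, e in zip(predictors["precip"], predictors["soil_moisture"], noise)]

ranking = all_subsets(predictors, response)
for score, subset in ranking:
    print(f"{score:.4f}  {', '.join(subset)}")
```

With k candidate predictors the search fits 2^k - 1 models, so exhaustive enumeration is only practical for a modest number of candidates; dedicated routines use more efficient search strategies.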
Temperature as a proxy for snowmelt -----------------------------------------------------------------
The authors include two temperature variables (min winter temp and min spring temp) as proxies for snowmelt impacts on annual peak flows. Interpreting negative coefficients on these variables as a snowmelt signal assumes that colder winters and springs lead to greater snowmelt contributions to flooding. However, how well correlated are winter/spring temperatures with both seasonal snowpack and the rate at which it melts?
At a minimum, the authors should describe the constraints on collecting consistent snowpack depth and snow cover data in Europe, the limiting assumptions of using temperature variables as proxies, and any efforts to circumvent those limitations. Other examples of regional regression studies using temperature variables as snowmelt proxies would also be helpful.
Soil moisture data biases ------------------------------------------------------------------------------
The authors write that “Soil moisture (SM) was taken from the CPC Soil moisture database, which contains model-calculated soil moisture values. Fan and Van Den Dool (2004) discuss some biases of the soil moisture data set, which may distort some of the findings here.” However, the authors do not comment further on any of these biases/distortions and their implications regarding inferences on process controls of annual peak flows.
Seasonality analysis not well integrated into manuscript narrative ----------------------------------
The seasonality analysis is interesting in its own right, but it could be better integrated into the manuscript. A phrase or sentence in the abstract should mention its purpose, considering the amount of text devoted to this component of the study.
Writing --------------------------------------------------------------------------------------------------
The authors should aim to reduce the length of the article to roughly 10,000 words by combining sentences, getting rid of unnecessary phrases and wordy language and possibly reducing the discussion of CS given the challenges that sample variability poses to at-site skewness estimates. See the Tracked Changes document for additional writing suggestions.
A minor comment on reproducibility -------------------------------------------------------------------
I still think it is valuable from a reproducibility perspective to identify the 22 catchments you omitted from your study due to insufficient covariate data. This does not have to be overemphasized, as you could add a quick list to your supplemental material or data repository.