2nd Review of the manuscript ‘Defining Flood Seasons Globally using Temporal Streamflow Patterns’ (hess-2015-140) by D. Lee, P. Ward, and P. Block
General Comments:
The manuscript has been revised thoroughly by the authors, and some of the reviewers’ previous comments have been addressed. This has improved the clarity of what the authors actually did, but has also raised some further points for comment.
Overall, there are still some key aspects that need to be addressed before the paper can be published (together with some other minor aspects mentioned below).
1. The title of the paper has been changed to ‘Defining Flood Seasons Globally using Temporal Streamflow Patterns’.
However, with the clarifications and explanations on the methodology provided by the authors after the first review, this title does not seem appropriate!
The methodology presented enables the identification of the ‘high flow’ season and NOT the ‘flood season’! So using ‘flood seasons’ in the title is misleading.
For many regions in the world, the ‘real floods’ do not actually occur during the high flow season (called the ‘flood season’ in the manuscript).
This is also corroborated by several of their results, e.g. Figure 9, in which (although it is difficult to tell from the colour code used in that plot) more than 1/3 to 1/2 of all pixels fall in the evaluation classes ‘low’ to ‘poor’ (based on their own ‘subjective’ classification scheme; P7 L17-18).
Therefore, the paper is far from ‘defining flood seasons globally’!
Additionally, I would suggest that the title should indicate that the ‘global’ scale of the paper comes from model output, as the actual data available for checking the results have strong spatial biases.
2. The authors highlight their new technique to define ‘flood seasons’. However, no formal testing of the new technique is performed, in which the technique would be compared to other already well established techniques that aim to define the flood season.
Generally, the method is similar to a ‘peak over threshold’ approach, but instead of considering independent peaks, all daily flows above a volume-based threshold are used to define the ‘flood season’ (which is actually the ‘high flow season’, or the ‘high flow spell’).
Therefore, the method simply identifies the month with the highest number of days above a flow quantile (here the 95th percentile). It is not clear why the authors call it a new ‘volume and magnitude’ based technique. Additionally, proper testing of the sensitivity of the results to the selected threshold is essential when presenting a ‘new method’.
3. Additionally, the editor’s questions ‘What are the advantages to defining the new measure PM (and FS), in relation to existing published measures of flood seasonality? What shortcomings did the previous existing published measures have, and to what extent do your new measures overcome those shortcomings?’ have not been addressed!
The authors only compare their method to other approaches by cross-correlating the identified peak months with those from other techniques.
Although the correlations obtained are similar (Table 1) (which would be expected when using the same dataset and techniques that aim to identify similar features of the flow regime), a correlation between the different classification techniques cannot be used to justify the superior performance of their new method!
Correlation should never be confused with causation!
Instead, differences in the obtained correlation with any of the other classification techniques could actually mean that these techniques are superior in their performance in capturing the peaks!
I urge the authors to follow the well-established research approach of first testing all these techniques and then selecting the most appropriate one for the rest of the analyses (which might be the new method, but the older techniques may perform better; one cannot tell from the current manuscript), rather than coming up with a new method first and then not thoroughly evaluating whether it is actually better!
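To make the reading of the method in general comment 2 concrete, the following is a minimal sketch of how I understand it (count the days above a flow quantile per calendar month and take the month with the most exceedances), together with the kind of threshold sensitivity check requested. This is not the authors’ code: the function names, the nearest-rank quantile, the alternative quantiles 0.90 and 0.99, and the synthetic flow series are all my own assumptions for illustration.

```python
import math
from collections import Counter

def flow_threshold(flows, quantile=0.95):
    """Empirical flow quantile (nearest-rank definition, an assumption of this sketch)."""
    ordered = sorted(flows)
    rank = min(len(ordered), max(1, math.ceil(quantile * len(ordered))))
    return ordered[rank - 1]

def peak_month(flows, months, quantile=0.95):
    """Calendar month with the most daily flows above the quantile threshold."""
    threshold = flow_threshold(flows, quantile)
    counts = Counter(m for q, m in zip(flows, months) if q > threshold)
    if not counts:
        return None  # no day exceeds the threshold (e.g. constant flow)
    return counts.most_common(1)[0][0]

def threshold_sensitivity(flows, months, quantiles=(0.90, 0.95, 0.99)):
    """Peak month obtained for each candidate threshold (hypothetical set of quantiles)."""
    return {q: peak_month(flows, months, q) for q in quantiles}

# Synthetic illustration only: 10 years of 360 days (12 months of 30 days),
# with a smooth seasonal cycle peaking on day 195, i.e. in month 7.
days = [d for _ in range(10) for d in range(360)]
months = [d // 30 + 1 for d in days]
flows = [10.0 + 5.0 * math.cos(2.0 * math.pi * (d - 195) / 360.0) for d in days]

result = threshold_sensitivity(flows, months)
```

For a regime with one clean seasonal peak, as in this synthetic series, the identified month does not change with the threshold. The informative cases are the basins where it does change, and reporting those is what a sensitivity analysis of the ‘new method’ should provide.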
Specific Comments:
Section Abstract:
P2 L3-4: Please rephrase, as the sentence currently gives the idea that only the new approach of defining flood seasons is ‘objective’ and not the other methods. I think the current approach is as ‘objective’ as the other methods, so I would refrain from using the word ‘objective’ here.
P2 L9: I disagree that the defined flood seasons represent well the actual flood records from DFO. This is only achieved when the minor secondary flood seasons are included later in the manuscript! Please rephrase.
P2 L12-15: This is a false claim. The identified seasons (which are the peak month ± 1 month) certainly do not help to improve the understanding of flood frequency, trends and interannual variability. Please remove.
P4 L3-12: In this paragraph, the shortcomings of the previous studies are presented. High emphasis is placed on the issue of clustering. However, for most of the studies this is not the main aim; instead, they also show very distinct seasonality patterns. In addition, the other studies are criticised for ‘not being representative of local scale conditions’, and it is stated that the current study is addressing ‘basin and even grid cells’. However, the analysis of this study is not ‘local’ either, and in the approach used to define a sub-basin’s months of flood peak (P9 L10-17), local conditions are lumped and lost as well. Therefore, I suggest rephrasing the entire paragraph and instead discussing the differences in the methods (how the flood seasons are defined), with a focus on the outcomes of the flood season definition and not on how the results are applied to cluster regions in previous studies (which is only the second research step in most of the studies).
P4 L27: Please specify what ‘relaxing the criteria’ means. What options have been assessed?
P 6 Section 3.1: The first paragraph, which briefly reviews existing methods and explains the similarities and differences with the new approach, is still very confusing.
Please rewrite the section in a more structured manner. In particular, please clarify what is meant by streamflow volume and magnitude (water level?).
P7 L 12: After reading the previous section it is still not clear why the PAMF ‘inherently contains magnitude and volume properties’.
P7 L 20- P8 L2: The description of other methods should go in a paragraph before the ‘new method’ is presented.
P8 L3-17: See ‘General comment Number 3’. Cross correlation between different techniques can never ‘indicate some success’ of one method compared to the other! Correlation is not causation!
P10 L 22-25: I suggest moving the discussion of Figure 7 further down in the document and discussing Figures 4, 5 and 6 first. Additionally, change ‘United States and Canada’ to ‘North America’ to be consistent with the labels in Figure 7.
P 10 L27-30: It is mentioned that low PAMF values are computed for the US and Europe and this is attributed to be due ‘at least in part, to reservoirs and dams along the Mississippi, Missouri and Danube’. This might be true for the observed flows; however, with the modelled streamflow this should not be the case.
However, from Figure 5, for example, one can see that the model obtains even lower PAMF values in Europe! So human impact is certainly not what causes the PAMF values to be low!
As the analysis has a global focus, an in-depth discussion (with more focus on spatial location) of the obtained differences in PM and PAMF is needed!
Additionally, for Europe and the US, only a few stations are actually located on the strongly anthropogenically impacted rivers mentioned before. Therefore, the poor performance shown by the PAMF might also be a shortcoming of the method and needs to be discussed further!
P 11 L1-20: The discussion focusses too much on the areas where performance of the flood season can be considered acceptable. Areas with stronger differences such as Australia and South America are currently ignored and have to be discussed as well!
Additionally, it is highlighted that 40% of the models and the data share the same peak months. This is a quite low performance; however, the authors are not critical about this low outcome at all.
Overall, I think that in discussing the results, the authors should aim for a more balanced assessment of the good and less successful outcomes of their method!
P 12 L1: I strongly object to the authors’ claim that there is a ‘striking similarity’ between the DFO and the modelled season and that this ‘further supports the model’s ability to appropriately identify the PM spatially’. Please rephrase.
P13 L27: Please rephrase ‘streamflow magnitude and volume characteristics of floods’ to something that explains the method better.
P14 L 31-P15 L 2: Please rephrase to be more realistic about the outcomes of the study: only 40% of the peak months are correctly identified, which is not an ‘indication of strong agreement between model and observed flood season’, and the flood records of the DFO are also not ‘well represented’.
P15 L 24-25: Please rephrase, as the model does not ‘enable the complete flood season identification globally’. There are many locations on the globe where there are problems and low performance, as indicated by low PAMF values. Therefore, the method is not globally applicable. It would be better to highlight where the flood season can be expected to be well represented by the model!
Figures:
Figure 8: I suggest merging Figure 8 and Figure 9 into one figure with 2 panels to allow a better interpretation of the results (having Figures 8 and 9 together helps to interpret the reliability of the months defined in Figure 8).
Figure 9: From the current way of plotting, it is difficult to distinguish the different reliability classes as defined on page 7. For example, based on the classes defined beforehand, central Europe and most of Australia have a poor reliability. I suggest showing the PAMF not as gradual colours but with colours according to the reliability classes defined beforehand, so the results can be interpreted accordingly. This should also be better discussed in the text.
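As an illustration of the suggestion for Figure 9, the sketch below bins PAMF values into discrete classes before plotting and reports the share of grid cells per class. The class boundaries and labels used here are placeholders of my own (PAMF is assumed to be given in percent); they would need to be replaced by the classes the authors define on page 7, and the example values are made up.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical boundaries (PAMF assumed to be in percent) and hypothetical labels;
# replace both with the classes actually defined on page 7 of the manuscript.
CLASS_BOUNDS = [40.0, 60.0, 80.0]
CLASS_LABELS = ["poor", "low", "fair", "high"]

def reliability_class(pamf):
    """Map one PAMF value to a discrete class; a value on a boundary goes to the upper class."""
    return CLASS_LABELS[bisect_right(CLASS_BOUNDS, pamf)]

def class_shares(pamf_values):
    """Fraction of grid cells falling in each class."""
    counts = Counter(reliability_class(v) for v in pamf_values)
    total = len(pamf_values)
    return {label: counts[label] / total for label in CLASS_LABELS}

# Made-up PAMF values for eight grid cells (illustration only, not manuscript data)
example = [35.0, 45.0, 55.0, 62.0, 70.0, 85.0, 90.0, 20.0]
shares = class_shares(example)
```

Plotting one colour per class, and stating the share of cells in each class in the text, would let readers judge directly how much of the globe falls in the less reliable categories.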