Comment on hess-2021-67

This paper presents a new count-based method to identify episodes of clustered extreme precipitation events and quantify their contribution to large precipitation accumulations. The approach has several potential benefits relative to existing ones, including i) no need to make assumptions about the underlying statistical distribution, ii) an ability to identify individual clustered episodes, iii) a framework that allows for quantifying the contribution of clustering to total precipitation, iv) its global applicability, and v) ready extension to other extreme phenomena. The work is therefore scientifically significant and should be widely read by the community.

The methods are valid, though I do have requests to elaborate further on the data and methods (see specific comments below).
The presentation quality is good. The manuscript is well written and the figures are clear. The abstract can be understood without reading the main paper. The work is well motivated with strong reference to prior studies concerning the clustering of climate extremes. I particularly appreciate the section comparing this new method to the more traditional dispersion metric. I also appreciate that the code is publicly available and easily accessible.
The subject matter is appropriate for HESS and is worth being published after my comments below have been addressed.

Specific comments
How adequate are the daily ERA5 precipitation data in capturing the extremes for this study? This is mentioned in passing in the main text, but I think the use of ERA5 needs further justification, citations to the literature, and a statement about whether the authors expect the results to change if gridded observations or surface station data were used.
Line 74: Please explain what you mean by 'timing'. Are you referring to the time of day or the time of year? I think more explanation is needed of why timing errors are critical enough for this study to justify excluding the tropics.
The runs declustering step needs more justification. I can understand its purpose for the case of slow-moving synoptic cyclones. But a multi-day sequence of afternoon severe convective storms consists of multiple events that are clustered rather than a single event.
I appreciate the discussion on lines 277 to 287 of whether seasonality affects your Sf metric. But I still don't understand how seasonality is not a problem with your method. Using an annual percentile means that in cases with strong seasonality your episodes will mostly occur in the wet season. Does this mean that the method can't say anything about the role of clustering in drier seasons? Why can't a seasonally varying percentile be used?
Figure 8, showing the intersections between clustering and large precipitation accumulations, is perhaps the key results figure of the paper. The regional differences are intriguing, and the reasons for this regional variability likely depend on the regional climate processes. Can you suggest a few?
Did you see any 40-year trends in extreme event counts or large precipitation accumulations? Do the dates of the episodes mostly fall in the latter half of the 40-year period? It could be interesting to map the ratio of the number of episodes in the first 20 years to that in the final 20 years, and to check whether the contribution of clustering changes across the two periods.
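As an illustration of the seasonally varying percentile raised in the seasonality comment above, the following is a minimal sketch only. It is not the authors' implementation: the function name, the use of calendar months as strata, the 90th percentile, and the toy two-month series are all assumptions chosen for illustration.

```python
import numpy as np


def monthly_percentile_exceedances(precip, months, q=90.0):
    """Flag days that exceed the q-th percentile of their own calendar month.

    precip : 1-D array of daily precipitation totals
    months : 1-D integer array (1..12) giving the month of each day
    q      : percentile in [0, 100]; the default is illustrative only
    """
    precip = np.asarray(precip, dtype=float)
    months = np.asarray(months)
    extreme = np.zeros(precip.shape, dtype=bool)
    for m in np.unique(months):
        in_month = months == m
        # Threshold computed from this month's days only
        threshold = np.percentile(precip[in_month], q)
        extreme[in_month] = precip[in_month] > threshold
    return extreme


# Toy series (hypothetical): one wet month and one much drier month
months = np.repeat([1, 7], 100)
precip = np.concatenate([np.arange(100) * 1.0, np.arange(100) * 0.01])

# Single annual threshold versus month-specific thresholds
annual_extreme = precip > np.percentile(precip, 90.0)
monthly_extreme = monthly_percentile_exceedances(precip, months, q=90.0)
```

In this toy series the annual threshold places every flagged day in the wet month, whereas the month-specific threshold flags days in both months, which is the behaviour the seasonality question is asking about.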
This is a suggestion for additional analysis and is not required in the revision. My understanding is that the method converts the precipitation data to binary, and therefore loses information on the magnitude of the individual extreme precipitation events. If this is correct, then I don't think it would take much additional data processing to retain the magnitude information. In doing so, many other scientific questions could be pursued. For example, you could look at sequencing and explore statistically significant differences in the magnitudes of the 1st, 2nd, and 3rd events within an episode, and how this varies regionally. I'm not suggesting you add this to the paper, but maybe note this as a potential extension in the Discussion.
I think the paper would be stronger with a more in-depth discussion of how this method can aid physical process understanding of the clustering mechanisms. You go some way down this route in Fig. 11, but there is a lot more that could be done. For example, you could look at scalings between clustering and temperature or other environmental variables. Or you could map out the time window length of the strongest clustering to get clues about contributing processes. Again, I'm not suggesting you do these analyses for this paper, but some further discussion about the ways the method aids process understanding is needed.
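As an illustration of the magnitude-retention suggestion above, the following minimal sketch keeps the precipitation totals alongside the binary extreme-event flags, so that the magnitudes of the 1st, 2nd, and 3rd events in an episode can be compared. It is not the authors' code: the function name, the representation of an episode as a half-open index range, and the threshold of 20 are hypothetical.

```python
import numpy as np


def event_magnitudes_in_order(precip, is_extreme, start, end):
    """Return the magnitudes of the extreme events in [start, end), in time order.

    precip     : 1-D array of daily precipitation totals
    is_extreme : 1-D boolean array flagging extreme-event days
    start, end : index range of one episode (hypothetical representation)
    """
    precip = np.asarray(precip, dtype=float)
    is_extreme = np.asarray(is_extreme, dtype=bool)
    window = slice(start, end)
    # Boolean mask keeps only the flagged days; order in time is preserved
    return precip[window][is_extreme[window]]


# Toy episode (hypothetical values): three extreme days within a 7-day window
precip = np.array([0.0, 30.0, 2.0, 45.0, 1.0, 25.0, 0.0])
is_extreme = precip > 20.0  # illustrative fixed threshold
magnitudes = event_magnitudes_in_order(precip, is_extreme, 0, 7)
```

Collecting such ordered magnitudes over all episodes at a grid point would give the samples needed to test for differences between the 1st, 2nd, and 3rd events.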

Technical Corrections
Figs. 3 and 4 could be merged into a single figure.
I didn't see Table 3 referenced anywhere in the main text.