Submit to statistical journal

Thomas Wutzler (Referee)
Referee comment on "Technical note: A procedure to clean, decompose and aggregate time series" by François Ritter, Hydrol. Earth Syst. Sci. Discuss., https://doi.org/10.5194/hess-2021-609-RC2, 2022

The technical note presents two studies: (1) the LogBox outlier detection and (2) the 'past' data aggregation scheme.

While (1) is not within the target scope of HESS and should be submitted to a statistical journal, like all the methods it is compared against (L130), (2) presents a method that is of interest to many readers. Hence, I will not comment here on (1) but only on (2). Although the authors claim generality for their 'past' scheme, I see several reasons why it is difficult to apply to the data I am working with.

I have several major concerns:
(a) Clarify and discuss the assumptions on the data series.

(b) Discuss generality versus expert knowledge.
Considering that the method is not as generally applicable as claimed, I doubt that HESS is the proper place to publish the "past" method. If a technical note on (2) is prepared for HESS, it should be submitted as a new manuscript rather than as a revision.
Thanks for making the source code and data available. I could reproduce the results and the plots.
(a) The basic assumption is that the dataset is an additive signal of a long-term trend, a periodic anomaly (termed "cycle" in the manuscript) that does not change with time, observational noise, and outliers. As demonstrated, the method is already useful for such cases. However, to be even more useful, the authors should think about extending the method to infer or take into account changes of the anomaly with time. At the least, they need to allow the user to supply a mask that slices the time series into chunks within which the anomaly can be assumed constant, e.g., stacking winter, spring, and summer into different stacks.
In the first of the three examples, the daily cycle of temperature (luckily) does not vary. But what about synoptic cycles? During clear-sky weeks, the daily temperature cycle will be larger than during cloudy weeks. For signals influenced by vegetation, the cycles will differ with phenology, etc.
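To illustrate the kind of mask I have in mind, here is a minimal Python sketch. It is not the authors' implementation: the function name `stacked_cycle`, its arguments, and the synthetic series are my own assumptions, and the input is assumed to be already detrended and free of outliers. It estimates the periodic anomaly as the mean per phase bin, once for a single stack and once separately for each user-supplied chunk.

```python
import numpy as np

def stacked_cycle(values, phase, chunk, n_phase):
    """Mean anomaly per phase bin, estimated separately for each chunk label.

    values : 1-D array of detrended observations (hypothetical input)
    phase  : integer position within the period (0 .. n_phase - 1)
    chunk  : integer label per observation (e.g. 0 = winter, 1 = summer)
    Returns a dict mapping chunk label -> array of length n_phase.
    """
    cycles = {}
    for label in np.unique(chunk):
        in_chunk = chunk == label
        cycle = np.full(n_phase, np.nan)  # NaN where a phase bin has no data
        for p in range(n_phase):
            sel = in_chunk & (phase == p)
            if sel.any():
                cycle[p] = values[sel].mean()
        cycles[int(label)] = cycle
    return cycles

# Synthetic hourly series: daily amplitude 1 in "winter", 3 in "summer".
hours = np.arange(24 * 60)
phase = hours % 24
chunk = (hours >= 24 * 30).astype(int)  # first 30 days winter, last 30 summer
amplitude = np.where(chunk == 0, 1.0, 3.0)
values = amplitude * np.sin(2 * np.pi * phase / 24)

# One stack over the whole series versus one stack per chunk.
single = stacked_cycle(values, phase, np.zeros_like(chunk), 24)[0]
masked = stacked_cycle(values, phase, chunk, 24)
```

In this synthetic case, the single stack yields a cycle with an amplitude of about 2, which matches neither regime, whereas the masked stacks recover the two amplitudes separately.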
I tried applying the method to several soil respiration time series of the publicly available COSORE dataset. I got it to work technically, but was not able to properly detect outliers and aggregate to annual values. For some series, there were probably too few data within each period (Vern series: 4-hourly measurements within a daily period); for others, the properties of the signal changed too strongly with season (Migliavacca series).
The authors need to better clarify the assumptions and limitations of the method. The method is not as general as claimed in the first version of the manuscript.
(b) The application of case-specific outlier detection and aggregation is discussed as something one wants to avoid. However, researchers usually know their data quite well, including their distributions, stability over time, problematic periods, changes in measurement equipment, etc. The manuscript needs a more balanced discussion of the value of consistency for meta-analysis versus the use of expert knowledge.
Outlook: Many observational time series come in replicates. Can you think of ways to extend the method to use information across the replicates?