Reply on RC1

L21: Mixing past and present. Missing word: ... parameters [that ?] reasonably represent ...
Reply: This sentence will be revised.
L22: The average KGE seems by far higher than 0.7 (see figure 3)
Reply: The average KGE is actually >0.9; we will replace 0.7 with the appropriate value.
L25-27: On the recommendation for reducing uncertainties, see SP5 (Same remarks for the last sentence of the conclusion L525-526).
Reply: We will make the appropriate revision to L25-27.


General comments
The authors propose an automatic approach to infer and classify the dynamic characteristics of karst systems based on the analysis of the recession hydrograph. The approach introduces variability through multiple recession extraction methods, fitting individual recession segments and optimization approaches with various degrees of freedom for a fast and slow flow recession model. Recognition of the duality of porosity and processes in karst systems leads to a vision of a generally accepted two-reservoir conceptual model. The desire to develop a method of recession analysis that is consistent with this view is well justified. The automated and multi-angle traits of the approach are indeed essential to cope with the known biases of single and visually supervised approaches to recession analysis. In my opinion, this work will improve the robustness of the comparison of the dynamic characteristics of karst hydrological systems. Furthermore, I believe that tools for comparing hydrological systems are an area of research that will continue to develop and contribute to a better understanding of hydrological systems. For these reasons, I consider that the article is following the scope of HESS.
In general, the paper is intelligible. The results are interpretable and make sense. The introduction correctly sets the context, and the discussion provides elements that facilitate the interpretation of the results. Nevertheless, I find that these two sections do not motivate well enough the choices that have been adopted and do not give a good account of the strengths and weaknesses of the approach. As a result, the modeling philosophy and epistemological approach are a bit blurry and sometimes inconsistent (see specific issues). Clearly stating these choices would allow the discussion section to be more structured, sharpened, and provide better perspectives and research avenues. Currently, I found the latter to be relatively weak or absent. Still, I enjoyed the approach and the content, particularly the idea of varying the degrees of freedom in the model optimization. I feel that this is a reliable piece of work that I could reuse myself and will certainly be useful to other researchers. Therefore, I consider the scientific quality to be good. Since it is a paper that proposes a method, I also regret the lack of additional material that would facilitate re-implementation (input/output and/or code recipes). I would also have appreciated having access to a bit more details about the results, not especially in the paper itself but in the appendix or the supplementary materials, about the seasonality of parameters and the statistics of recession segments.
The figures are clear and good. The introduction and discussion sections are ok but could probably be improved and sharpened. But in my opinion, this is more a question of substance than form (see above and specific issues). However, I regret the long and complex sentence constructions, plus a large number of grammatical errors. I think they are below acceptable standards and could be easily removed with a good grammar/spellchecker.

Reply to General Comments
We thank reviewer 1 for her/his useful and valuable comments, which will help to improve the manuscript, in particular its readability. We will carefully rework the entire text to improve its structure and remove typographical errors. We will improve the introduction and discussion sections by giving a more detailed account of the core strengths and weaknesses of the recession analysis approaches. We will make the R code for recession analysis as well as the input data (spring hydrographs) publicly available via our GitHub repository. We will provide further details of the results of our analysis in the appendices. All suggestions will be taken into account. In response to her/his comments, we will make the following changes.

Specific issues
As stated in my general comment, I have very few concerns about the approach itself (except perhaps SI-2), and I have a favorable opinion about the paper. Still, I am concerned about how the methodological choices are introduced and motivated and how the results are interpreted. It lacks coherence and proper motivation, which in some respects affects the clarity of the reading, might suggest that the method is inappropriate, and prevents the development of clear research perspectives. I present below the nature of my concerns and my suggestions on how they can be addressed in a motivated and coherent rationale as rooms for improvement through possible minor revisions of the introduction and discussion sections.
There is a first inconsistency in the motivation for fitting individual recession events. It is said that fitting individual recession events would allow capturing the variability for better inference of the structural and dynamic traits of a system (L48-51) and would be more consistent because the parameters obtained from the entire set of recession points could not be actually representative of individual cases (L216-217). Yet, in the other section, the authors tend to interpret the mean of all parameters and indicators as representative of the karst system. In the meantime, the spread of the parameters is deemed both necessary and valuable as it reflects dynamic properties but, paradoxically, also considered as unwanted uncertainty that we shall reduce for having a set of "representative" parameters (L25-27, L450-460, L525-526). I discuss the authors' suggestion for reducing uncertainties in SI-5. For now, I continue with the value of the individual segments approach.
Historically, recession analysis methods applied on all points pursue the same goal of inferring the dynamic and structural properties of the system. It is also possible to have uncertainty bounds, analytical or bootstrapped ones, for estimated parameters. So, why go for individual recession fitting if you can already depict the average and the variability of the estimated parameters? The answer is that those who have adopted this approach have learned something about the dynamics of the system. The parameters were projected on the temporal/seasonal axis, and interesting patterns related to the seasonality or the antecedent conditions of hydrological variables appeared [See 1-5]. In doing so, the individual segment approaches proved themselves useful and showed that the low dimensional models that we commonly use (Mangin is no exception) are actually underfitting recession dynamics [6].

Reply to SI-1 (Paragraph 1 and 2)
We thank the reviewer for the useful and critical review. We agree that the premise given for individual recession segment (IRS) analysis seems inconsistent with our suggestion for reducing parameter uncertainties. However, L48-51 and L216-217 were meant to justify our methodological choice of the IRS analysis approach. Our focus is to explore the possibility of identifying intrinsic karst aquifer dynamics with IRS analysis. Although karst systems are very heterogeneous, their geometry is essentially fixed: dissolution of carbonate rocks does alter the geometry, but only on a time scale of centuries to millennia. We therefore expected a systematic pattern in parameter variability, which should be linked to the system's dynamics. We found that much of the variability resulted from the varying length of recessions, which is why we recommended (L450-460) grouping recession events by length in order to identify the actual variability resulting from the system's heterogeneity. Our main reason for this suggestion is that users (e.g. water resource managers) would ideally rely on specific values for resource planning and management.
Therefore, I would encourage the authors to report the seasonal variation of the estimated parameters because it is a valuable insight gained through the individual segments approach and, moreover, a legitimate motivation in favor of the approach. They should capitalize on it and promote this feature as an additional strength of the approach. Furthermore, no paper, to my knowledge, reports the seasonal variability of η and ε for karst systems, same for i and K. For now, the importance of seasonality is barely mentioned even though it is recognized as a source of variability (L435-437), and its contribution is not pictured or quantified. I understand that the paper is focused on proposing a method, but a graph about the seasonal dynamics of parameters or the classification is not expensive and could be placed in the supplementary materials.

Reply to SI-1 (Paragraph 3)
The length of a recession is directly related to the duration and intensity of the precipitation event preceding it. This means longer recessions in summer and shorter ones in winter; the differing lengths are, in fact, an effect of seasonal variation. Hence, we completely agree with the reviewer's recommendation to explore parameter variability along the seasonal dimension. We will therefore provide an additional analysis of parameter and seasonal variability, and we will revise L25-27, L450-460, and L525-526. Additional findings from this analysis will be included in the appendices as well.
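As a rough illustration of what exploring parameters along the seasonal dimension could look like, the sketch below groups per-event parameter estimates by the season in which each recession starts and reports the median per season. It is written in Python for illustration only and is not the analysis code; the month-to-season mapping (northern-hemisphere meteorological seasons), the function name `seasonal_medians`, and the parameter values are all hypothetical.

```python
from collections import defaultdict
from statistics import median

# Assumed mapping: northern-hemisphere meteorological seasons.
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}

def seasonal_medians(events):
    """Group (start_month, parameter_value) pairs by season and return the median per season."""
    grouped = defaultdict(list)
    for start_month, value in events:
        grouped[SEASON_BY_MONTH[start_month]].append(value)
    return {season: median(values) for season, values in grouped.items()}

# Hypothetical per-event estimates of a recession coefficient (1/day).
events = [(1, 0.030), (2, 0.034), (7, 0.010), (8, 0.012), (7, 0.014)]
result = seasonal_medians(events)
```

The same grouping could be applied to any of the fitted parameters or to the classification indicators, and the spread within each season (not only the median) could be reported alongside it.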

SI-2: Motivation for the Mangin model and framework.
The hypotheses behind the Mangin model are not discussed enough, and the classification framework is not sufficiently described in section 2.2. The hypothesis of the duality of porosity and processes in karst systems is well described. However, the hypothesis of the matrix flow following the linear recession equation of Maillet (L170, Eq. 3) is not. Nonlinear equations are common to describe baseflow, including in karst systems [7]. Some comments on why the Maillet approximation can be used would have been appreciated, if not theoretical, with an empirical justification of its application showing that Mangin's framework is widely used and will benefit from more robustness.

Reply to SI-2 (Paragraph 1)
We will expand the discussion of the Mangin model in section 2.2 and add a new subsection that specifically introduces Mangin's classification framework.
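For readers who do not have the manuscript equations at hand, the following is a minimal sketch of a two-component recession of the kind discussed here, assuming the commonly cited form of Mangin's model: a Maillet exponential for the slow flow plus a homographic fast-flow term that vanishes after t = 1/η. The symbols η and ε follow the review text; the names `q_r0`, `alpha`, `q0` and all numerical values are illustrative assumptions, and the manuscript's own notation may differ.

```python
import math

def mangin_recession(t, q_r0, alpha, q0, eta, eps):
    """Discharge at time t (days since the peak) for a two-component recession.

    Slow flow: Maillet exponential q_r0 * exp(-alpha * t).
    Fast flow: q0 * (1 - eta * t) / (1 + eps * t), active only while t < 1 / eta.
    """
    slow = q_r0 * math.exp(-alpha * t)
    fast = q0 * (1.0 - eta * t) / (1.0 + eps * t) if t < 1.0 / eta else 0.0
    return slow + fast

# Hypothetical parameter values, chosen only for illustration.
params = dict(q_r0=1.0, alpha=0.01, q0=4.0, eta=0.1, eps=0.5)

q_peak = mangin_recession(0.0, **params)    # both components contribute
q_late = mangin_recession(30.0, **params)   # t > 1/eta: slow flow only
```

Although the slow-flow term is linear in the storage sense, the sum of the two components is a nonlinear recession, which is the point the reviewer raises in the following paragraph.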
Besides, the authors paradoxically rely on the Aksoy & Wittenberg (A&W) REM that explicitly states that baseflow is nonlinear (L141, Eq. 1). At first sight, it appears as a rationale problem. However, if the importance of the classification Mangin framework is more emphasized in the objective of the paper and better described in section 2.2 (or best, in its own section), using the linear model that is part of this classification framework of the authors' choice makes sense. Also, even if the Mangin baseflow model is linear, extracting recession with the nonlinear method of A&W could be justifiable, because Mangin's model with both components and sufficient degrees of freedom, is in fact, a nonlinear recession model. Still, it would have been more consistent, in my opinion, to use A&W approach with the linear Maillet Eq.3, eventually, by choosing a higher CV to compensate the lack of degrees of freedom and the fear that the outcome would be too restrictive on the number of recession points.

Reply to SI-2 (Paragraph 2)
Once again, we thank the reviewer for pointing this out. It is correct that A&W extracts baseflow recession points by fitting a nonlinear model. However, Eq. 1 (L141) becomes linear if b = 1 (Aksoy and Wittenberg, 2011). In fact, we set b to 1 to ensure consistency with the linear approach of Mangin. We will add one or two sentences to paragraph 3 of section 2.1 to clarify this.
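The reduction to the linear case can be checked numerically. The sketch below assumes that Eq. 1 has the usual nonlinear-reservoir form derived from the storage-discharge relation S = aQ^b, for which b = 1 gives the Maillet exponential with recession coefficient 1/a. The function names and parameter values are hypothetical, and the exact notation of Eq. 1 in the manuscript may differ.

```python
import math

def nonlinear_reservoir_recession(t, q0, a, b):
    """Recession of a reservoir with storage S = a * Q**b (assumed form of Eq. 1)."""
    if abs(b - 1.0) < 1e-12:
        # Linear reservoir: the closed form is the Maillet exponential.
        return q0 * math.exp(-t / a)
    base = 1.0 + (1.0 - b) * q0 ** (1.0 - b) * t / (a * b)
    return q0 * base ** (1.0 / (b - 1.0))

def maillet_recession(t, q0, alpha):
    """Linear (Maillet) recession Q(t) = q0 * exp(-alpha * t)."""
    return q0 * math.exp(-alpha * t)

# Hypothetical values: initial discharge, storage coefficient (days), elapsed time (days).
q0, a, t = 2.0, 50.0, 10.0

# Gap between the nonlinear and the linear recession as b approaches 1.
gaps = [
    abs(nonlinear_reservoir_recession(t, q0, a, b) - maillet_recession(t, q0, 1.0 / a))
    for b in (0.5, 0.9, 0.999)
]
```

Under these assumptions the gap shrinks monotonically as b approaches 1, which is consistent with fixing b = 1 to keep the extraction step consistent with the linear slow-flow model.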

SI-3: Motivation/Discussion of the REMs
In the discussion (L411-412), the authors mention that: "Overall, the adapted REMs and the introduced three POAs provide range of results that adequately represents the karst systems. This suggests that the modified REMs are well suited for application to karst spring recession analysis". I found the conclusion a little too optimistic and leaving little room for improvement. Plus, I feel that it is contradictory with the suggestion of focusing on different segment lengths to reduce uncertainty (L455-56). Why not say that REMs are unsuited then?
Three REMs are selected on the honorable basis of what previous authors have suggested (L160, Table 1). Despite my concerns about the A&W REMs (SI-2), it is quite common to refer to and implement these former REMs in recent papers [6, and 8-9 as referred by the authors]. Thus, it is justifiable. However, this is not a strong rationale. None of the authors of REMs had the primary objective of providing a method for recession extraction. They had to, but their focus was on recession analysis, not extraction. Also, they were not aware of the recently discovered inference problems associated with the extraction method (e.g., [8]). In fact, they look at recession from the narrow and specific scope of the criteria set in Table 1. Using them broadens the scope of the analysis, but still, these are three specific angles of approach to recession extraction, and it is delicate to affirm that they adequately represent the karst systems.
My point is that a noticeable perspective for improvement would be to rely on one or more generic method that allows varying the criteria of selections (beginning/end of recession and tolerance to anomalies). It would allow inferring the statistics for ranges of criteria instead of a few predetermined ones. Coupling a generic method to the POAs could certainly improve the framework in the future.

Reply to SI-3
We thank the reviewer for the valuable suggestion. It is true that the REMs were developed to suit catchments with specific properties. We also accept that there is potential for improvement and for further development of karst-specific recession extraction methods. We will revise the statements in L411-412 accordingly and discuss aspects of recession extraction that could benefit from further improvement. However, we would like to mention that the recession selection criteria used by the REMs can still be modified by a user to suit specific requirements.

SI-4: Degrees of freedom and data points
There is a concern about the parsimony of the approach and the limited number of points in the recession segments [6]. Instead of three models, one for each study site, more than two thousand models were fitted, in which the degrees of freedom is very close to the number of data points. So far, I believe that the results and their interpretation make some sense, and so, I am willing to believe that what is captured is meaningful.
I could have recommended relying on a nonlinear two parameters equation to add another degree of freedom to the late recession (the nonlinear exponent, SI-2), but should we? The natural question is how far we could go in terms of degrees of freedom and when equifinality will produce unreliable meaningless parameters? I understand that this is a separate question that would deserve another paper, but it is worth mentioning it.

Reply to SI-4 (Paragraph 1)
We thank the reviewer for providing a very useful and intriguing insight. In terms of recession data points, we selected recession events lasting 7 days or longer (L284-286), so we have enough data points for the parameter optimization. The whole analysis (recession extraction and parameter optimization) is not computationally expensive; it took a couple of minutes to run in R.
We also considered introducing a fourth Parameter Optimization Approach (POA) by using a nonlinear model for the slow flow recession. However, this would result in too many combinations of REMs and POAs; it would also be inconsistent with the Mangin classification framework we used. Nevertheless, we will explore this in a separate study.
Also, I miss some more statistics in Table 4 (L292), such as the average number of recession points per event, which would help appreciate this issue. Also, perhaps statistics about the flow Q and dQ/dt captured by the REM methods would also be interesting.

Reply to SI-4 (Paragraph 2)
We thank you for pointing this out. We will expand Table 4 by providing the suggested statistics of the extracted recession events.
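The statistics requested by the reviewer are simple to derive from the extracted segments. As a minimal sketch (in Python for illustration, not the R implementation), the snippet below computes the number of points, the mean discharge Q, and the mean finite-difference dQ/dt for each segment, and then the average number of points per event. The function name `segment_stats`, the daily time step, and the discharge values are hypothetical.

```python
from statistics import mean

def segment_stats(q, dt=1.0):
    """Summary statistics for one recession segment.

    q  : discharge values at a regular time step
    dt : time step in days (assumed daily data)
    """
    dqdt = [(q[k + 1] - q[k]) / dt for k in range(len(q) - 1)]
    return {"n_points": len(q), "mean_q": mean(q), "mean_dqdt": mean(dqdt)}

# Hypothetical recession segments (daily discharge), for illustration only.
segments = [
    [5.0, 4.0, 3.5, 3.0, 2.8, 2.6, 2.5],
    [3.0, 2.5, 2.2, 2.0, 1.9, 1.8, 1.7, 1.6],
]

stats = [segment_stats(s) for s in segments]
avg_points_per_event = mean(s["n_points"] for s in stats)
```

Reporting these values per REM would make it easier to judge how close the number of fitted parameters is to the number of data points in each segment.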

SI-5: Uncertainties and the subsequent suggestions
I have experienced contradictory opinions about the uncertainties. On the one hand, they are considered essential because they reflect the actual dynamics of karst. On the other hand, the authors wish to reduce them and offer suggestions for doing so. Also, attention should be paid to using "reliable parameters" when we know that recession models are underfitting recession dynamics (SI-1). The term "robust parameters" seems more appropriate.

Reply to SI-5 (Paragraph 1)
Our motivation for using IRS analysis is to capture the karst system dynamics reflected by spring discharge (Jachens et al., 2020; Kovács et al., 2005). As discussed in our reply to specific comment 1 (Paragraphs 1 and 2), we found the variability of the parameters estimated with the IRS analysis to be too large, and we believe it can be systematically reduced or better quantified. We address the issues relating to our suggestion for reducing the uncertainty in our replies to the subsequent paragraphs, where the reviewer discusses them further. We will replace "reliable parameters" with "robust parameters" in the manuscript.
The authors refer to suggestions in the abstract and conclusion, but nothing is said explicitly about them in these sections. The suggestions are two, not very well evidenced and hidden in lines L455-460 : (i) focusing on segments of different recession lengths, (ii) using the master recession curve (MRC). I found none of them to be very relevant nor developed. With one (i), it is basically said that REMs are inappropriate (SI-3), and, by focusing on one type of segment length, some processes are dismissed, and you reduce uncertainty by occulting the natural variability that you aim at capturing. Perhaps, presenting the results per season would be more appropriate (SI-1) to reduce the uncertainty of sensitivity to initial conditions.

Reply to SI-5 (Paragraph 2)
As discussed in our reply to Paragraph 3 of specific comment 1, our suggestion to group recession events by length is encompassed by considering seasonal variability. We agree that exploring parameter variability with respect to seasonality will provide a better way to explain and account for parameter uncertainty. Therefore, we will implement this change, as already mentioned in our reply to specific comment 1.
Regarding the use of MRC (ii), I am not sure I understand what the authors mean. I understand that since the MRC is an average behavior, fitting this average will produce a single estimate of the parameters, which will naturally reduce the uncertainty. Well, this is not reducing, it is still occulting, which is why I believe the authors ultimately reject their own suggestion in the last sentence (L459-60). Note that the MRC approaches with uncertainty quantification do exist (e.g. [10]), so what is said in L99-101 and L459-60 is not correct. It is entirely possible to consider an approach to MRC that takes variability into account. That said, I would not recommend it. I don't see the point of building an MRC, which is a delicate process when you have a model that you can calibrate directly on the whole hydrograph. I understand that some of the references cited do, but it doesn't make much sense to me. The MRC, in my opinion, is an empirical method that is applied when one wants to abstain from model fitting because of their too restrictive and inappropriate hypotheses on the dynamic (see the introduction of [11]). The authors' approach is hypothetical-deductive and relies on intelligible parameters to classify the dynamics. The references to the empirical MRC are, in my opinion, inappropriate and confusing.

Reply to SI-5 (Paragraph 3)
Thank you for pointing out that there are MRC approaches with uncertainty quantification. However, we do not know of any MRC analysis approach for multisegment (quick and slow flow) recession events that quantifies uncertainty. Having mentioned this, we will revise the statements made in L99-101 and L459-60. Our suggestion of MRC in L457-458 reads: "Another way of coping with this problem is to consider master recession curve analysis which is often criticize for its inability to adequately represent storage variability (Gregor and Malík, 2012;Kovacs, 2021;Kresic and Bonacci, 2010)". With this statement, we do not intend to suggest to the reader that the MRC approach is better. We only offer it as an alternative for those who do not want to deal with the parameter variability arising from IRS analysis. However, we understand the concerns of the reviewer and we will rework L457-458 to reflect our main intention in making the suggestion.
In L459-460, we state: "However, since per event analysis is useful for better understanding of the system's dynamic, defining a systematic approach to quantify parameter uncertainty will help to increase the confidence of individual recession segment analysis". With the improved modeling approaches and computational capacity available nowadays, we believe that systematic uncertainty quantification can be incorporated into IRS analysis. We are presently working on a separate study that integrates parameter uncertainty with IRS and, if possible, we will be glad to share it with the reviewer in the future.
I think better suggestions would most likely come from criticizing the approach, the Mangin model, and the classification framework. If one projects data from a complex, high dimensional reality onto a lower-dimensional model and classification framework, there will be uncertainties that are both natural variability and projection artifacts. If they are too large, the message is that the model or framework is not helpful, and another projection must be found. Mangin developed his framework without taking uncertainties into account. Now that the authors have done so, relevant questions arise: should the dispersion of the parameter distribution, which reflects the sensitivity of the watershed, be taken into account in the watershed classification framework? How can this be done? Furthermore, the i indicator was found to be a poor discriminant for the classification of the three study sites. Is this a poor indicator that should be removed from a classification framework, or can we expect that with other study sites, i may show more distinct and clear patterns? Should a nonlinear recession be considered in place of the Maillet equation? These are just a few interesting questions that arise when the authors allow themselves to critique the analytical framework.
Questioning it is not shooting oneself in the foot. I think the Mangin model and framework is a recognized and intelligible way of framing the dynamics of the karst system, and the authors have succeeded in providing a more robust way of doing this. The authors could stress this success while recognizing that they have also highlighted potential limits. Even so, the proposed automated framework could also be used more widely to assess the relevance of the Mangin model and classification framework in the future to propose relevant improvements. Finally, their approach is also transferable to other models and classification frameworks that will be developed in the future.

Reply to SI-5 (Paragraph 4 and 5)
We thank the reviewer for the interesting and intriguing questions raised about the Mangin classification framework. In fact, a few authors (e.g. El-Hakim and Bakalowicz, 2007) have also identified some limitations of the Mangin framework and modified the classification scheme. In almost every study where the Mangin classification has been used, the estimation of K and i was based on a few recession events, usually the longest summer event(s). So the robustness of the classification framework has not really been tested with a more dynamic sample of recession events, as we have done in this study. Therefore, following the reviewer's suggestion, we will add discussion to highlight the questions raised about the classification and the possibilities for improvement.

SI-7: Reproducibility
This issue is not related to the introduction or the discussion section but to the lack of supplementary materials. The authors propose a technique. This technique is itself motivated by the idea of reducing user error and bias. Although the mathematics remains clear and straightforward, providing input/output files, code recipes, or examples will significantly help future users ensure that the technique is correctly applied and, therefore, be more in phase with the paper's purpose.

Reply to SI-7
As previously mentioned in our reply to the general comments, we will provide the R-code and input datasets with usage instructions via our GitHub repository.

Comment on the title
The title is straightforward. However, "efficient" and "accurate" are not the best terms.
Efficient can mean many things, for instance, "computationally efficient" or well organized. Accurate is not the best given that the point is to depict and leverage the parameters' variability. The fact that the method is automated is stressed in the introduction and is probably more appropriate. The central classification task is missing in the title. I would suggest, for instance: "Karst spring recession analysis and classification: an automated method considering both fast and slow flow components."

Reply to comment on the title
We thank the reviewer for the suggestion, which we have considered. However, we would like to retain "efficient" because the methods are fast and computationally inexpensive. Therefore, we will change the title to "Karst spring recession analysis and classification: efficient, automated methods for both fast and slow components".