the Creative Commons Attribution 4.0 License.
Using LSTM to monitor continuous discharge indirectly with electrical conductivity observations
Abstract. Because electrical conductivity (EC) is easy to record and correlates strongly with discharge in certain catchments, EC is a potential predictor of discharge. This potential has not yet been widely addressed. In this paper, we investigate the feasibility of using EC as a proxy for long-term discharge monitoring in a small karst catchment where EC always shows a negative correlation with the spring discharge. Given their complex relationship, a special machine learning architecture, LSTM (Long Short Term Memory), was used to handle the mapping from EC to discharge. LSTM results indicate that the spring discharge can be predicted well with EC, particularly in storms when the dilution dominates the EC dynamic; however, the prediction may have relatively large uncertainties in the small or middle recharge events. A small number of discharge observations are sufficient to obtain a robust LSTM for the long-term discharge prediction from EC, indicating the practicality of recording EC in ungauged catchments for indirect discharge monitoring. Our study also highlights that the random or fixed-interval discharge measurement strategy, which covers various climate conditions, is more informative for LSTM to give robust predictions than other strategies. While our study is implemented in a karst catchment, the method may also be suitable for non-karst catchments where there is a strong correlation between EC and discharge.
Status: closed
RC1: 'Comment on hess-2022-77', Anonymous Referee #1, 16 Apr 2022
Chang et al. present the application of a statistical approach (LSTM) to determine discharge of rainfall event runoff from instream EC measurements.
MAJOR
Title/Premise: The authors present the application of a statistical approach (LSTM) to calculate discharge during rainfall events from EC observations. The title (Using LSTM to monitor continuous discharge indirectly with electrical conductivity observations) might perhaps mislead the reader, as certain time periods (low flow, initial runoff) are clearly excluded from the analysis. A more fitting title would be: Using LSTM to monitor STORMFLOW DISCHARGE indirectly with EC observations.
The performance of a model using EC only is compared to models using both EC and P and only P. It might be interesting to compare the selected model to a more simple approach, to really highlight the added value of a more complex model.
20: In your abstract in line 20 you write that in your spring EC always has a negative correlation with spring discharge. However, in line 126-130 you mention that there is occasionally a positive correlation (EC peak at the initial runoff).
23-25: “LSTM results indicate that the spring discharge can be predicted well with EC, particularly in storms when the dilution dominates the EC dynamic; however, the prediction may have relatively large uncertainties in the small or middle recharge events.” It seems the findings of your study do not support this conclusion at all. As I understood, spring discharge could ONLY be predicted well for large storm events; there are large uncertainties when it comes to intermediate and small events and it was not possible at all to use EC for the estimation of baseflow/low flow. So, one might conclude that overall spring discharge can actually not be predicted well.
130: It is unclear why there is a need to correct the maximum EC values in 2017 to match them with 2018 and 2019. Please elaborate why the maximum EC should be the same in all years.
130: You corrected for drift of the sensor by subtracting 23µS/cm. Please elaborate why you choose this specific value. Also: A simple subtraction of measured EC does not adequately account for gradual drift.
424: You elaborate that the EC dynamics of the investigated spring are relatively simple without temporal EC peaks at the beginning of storms. However, in line 126-130 you describe that you found indeed initial EC peaks at the beginning of storm events in your 2018 and 2019 data and you state that you excluded these observations from your analysis.
426: To my knowledge, the cited paper of Hess and White (1993) does not give any reference to “piston flow”, it doesn’t mention the words ‘piston flow’
MINOR
83 –geographical coordinates of the spring might be useful
83-91 citation might be useful
Figure 1a: labels in map are too small to read
120 -121: “the spring`s EC dynamic is MAINLY controlled by the rock dissolution and the dilution from the low-EC event water during storms.” – what other minor influencing factors are there?
133: wrong unit: 23us/cm -> 23µs/cm
170: “LSTM belongs to a special kind of recurrent neural network” – I suggest different wording
253: “The performances of MP and MECP deteriorate obviously probably due to …” – obviously or probably, which one is it?
Figure 3e: red line in legend is missing
285: wording: middle -> intermediate
Citation: https://doi.org/10.5194/hess-2022-77-RC1
AC1: 'Reply on RC1', Yong Chang, 16 Jul 2022
Thank you very much for the valuable comments.
MAJOR
- Title/Premise: The authors present the application of a statistical approach (LSTM) to calculate discharge during rainfall events from EC observations. The title (Using LSTM to monitor continuous discharge indirectly with electrical conductivity observations) might perhaps mislead the reader, as certain time periods (low flow, initial runoff) are clearly excluded from the analysis. A more fitting title would be: Using LSTM to monitor STORMFLOW DISCHARGE indirectly with EC observations.
Response: Thank you. This is an excellent suggestion. We will change the paper title accordingly.
- The performance of a model using EC only is compared to models using both EC and P and only P. It might be interesting to compare the selected model to a more simple approach, to really highlight the added value of a more complex model.
Response: In the manuscript, we have compared the LSTM model to a simple linear regression model (see lines 189-191 and figure 3b). The regression model shows much worse performance than MEC.
- 20: In your abstract in line 20 you write that in your spring EC always has a negative correlation with spring discharge. However, in line 126-130 you mention that there is occasionally a positive correlation (EC peak at the initial runoff).
Response: Thanks. We will revise line 20 to state that the spring EC has a negative correlation with spring discharge most of the time.
- 23-25: “LSTM results indicate that the spring discharge can be predicted well with EC, particularly in storms when the dilution dominates the EC dynamic; however, the prediction may have relatively large uncertainties in the small or middle recharge events.” It seems the findings of your study do not support this conclusion at all. As I understood, spring discharge could ONLY be predicted well for large storm events; there are large uncertainties when it comes to intermediate and small events and it was not possible at all to use EC for the estimation of baseflow/low flow. So, one might conclude that overall spring discharge can actually not be predicted well.
Response: We will further revise the sentences. In fact, most spring discharge under large storms, including the baseflow, can be predicted well from EC. This can be seen in Fig. 4, where MEC has a large NSE value in storm events. In the manuscript, we define the discharge in a storm event as the period from the end of the last recharge event to the beginning of the next recharge event. It is also possible to predict the discharge under intermediate events, but with a larger uncertainty than for storm events. It is true that EC could not be used to estimate low flow, because the correlation between EC and discharge is low in small recharge events. Despite this drawback, our approach is still promising because the discharge dynamic under storms or intermediate recharge events is the key information for flood management or, combined with other hydrochemical indices, for understanding the behavior of hydrological systems.
- 130: It is unclear why there is a need to correct the maximum EC values in 2017 to match them with 2018 and 2019. Please elaborate why the maximum EC should be the same in all years.
Response: The hydrochemistry of the studied spring is mainly controlled by the dissolution of carbonate rocks. The maximum EC of the spring water corresponds to the calcium carbonate equilibrium. Moreover, the spring is located in the phreatic zone, and according to previous monitoring most of its hydrochemical indices, such as temperature, pH and EC, show no obvious seasonal variation. Therefore, the maximum EC of this spring is relatively stable from year to year. Given that different data loggers were used to monitor EC in 2017 and in the other two years, it is reasonable to assert that the discrepancy in maximum EC between 2017 and the other two years is mainly caused by instrumental drift.
We will add these clarifications in the revised version.
- 130: You corrected for drift of the sensor by subtracting 23µS/cm. Please elaborate why you choose this specific value. Also: A simple subtraction of measured EC does not adequately account for gradual drift.
Response: The selection of 23 µS/cm is based on the assumption that the maximum EC value in 2017 is the same as that in the other two years. Although the maximum EC of the spring water is relatively stable without an obvious seasonal variation at the study site, it may still vary slightly. In the revised manuscript, we will add another plot showing how the simulation result of MEC varies with different EC adjustment values in test period 1, to further illustrate the uncertainty caused by this drift adjustment.
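For illustration, the constant adjustment described in this response can be sketched in Python as follows. This is a minimal sketch, not the authors' code: it assumes the offset is simply the difference between the maximum EC values of the two records, and the function name and EC values are hypothetical.

```python
import numpy as np

def constant_offset(ec_target, ec_reference):
    """Offset that makes max(ec_target - offset) equal max(ec_reference)."""
    return float(np.max(ec_target) - np.max(ec_reference))

# Made-up EC series in µS/cm, for illustration only (not the study data).
ec_2017 = np.array([423.0, 410.0, 395.0, 418.0])
ec_2018_2019 = np.array([400.0, 385.0, 372.0, 398.0])

offset = constant_offset(ec_2017, ec_2018_2019)
ec_2017_adjusted = ec_2017 - offset  # every value is shifted by the same amount
```

A single offset shifts the whole record equally, so it cannot represent a drift that grows over time, which is the referee's concern. Repeating the prediction for a range of offsets would show how sensitive the result is to this choice.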
- 424: You elaborate that the EC dynamics of the investigated spring are relatively simple without temporal EC peaks at the beginning of storms. However, in line 126-130 you describe that you found indeed initial EC peaks at the beginning of storm events in your 2018 and 2019 data and you state that you excluded these observations from your analysis.
Response: I will further revise the sentence.
- 426: To my knowledge, the cited paper of Hess and White (1993) does not give any reference to “piston flow”, it doesn’t mention the words ‘piston flow’
Response: Thanks. There is a mistake here. The reference should be an earlier paper by Hess and White (1988), in which they found that the spring EC may first rise before beginning to drop during a storm. The ‘piston effect’ was named by Goldscheider and Drew (2007) in the book Methods in Karst Hydrogeology. We will add this reference in the revised version.
MINOR
83 –geographical coordinates of the spring might be useful
Response: We will add the geographical coordinate.
83-91 citation might be useful
Response: We will add the relevant references.
Figure 1a: labels in map are too small to read
Response: We will increase the label size accordingly.
120 -121: “the spring`s EC dynamic is MAINLY controlled by the rock dissolution and the dilution from the low-EC event water during storms.” – what other minor influencing factors are there?
Response: The spring EC may also be slightly influenced by variations in the concentrations of some other ions during storms, such as K+, Na+, Cl-, and SO4^2-.
133: wrong unit: 23us/cm -> 23µs/cm
Response: Thanks, we will revise the unit.
170: “LSTM belongs to a special kind of recurrent neural network” – I suggest different wording
Response: Revise to ‘LSTM is a special recurrent neural network’.
253: “The performances of MP and MECP deteriorate obviously probably due to …” – obviously or probably, which one is it?
Response: Revise the sentence to ‘The performances of MP and MECP show an obvious deterioration which is probably due to…’.
Figure 3e: red line in legend is missing
Response: Thanks, we will update the legend.
285: wording: middle -> intermediate
Response: Thank you. We will change ‘middle’ to ‘intermediate’ in the revised version.
Reference:
Hess, J. W., & White, W. B. (1988). Storm response of the karstic carbonate aquifer of southcentral Kentucky. Journal of Hydrology, 99(3), 235–252. https://doi.org/10.1016/0022-1694(88)90051-0
Goldscheider, N., & Drew, D. (2007). Methods in karst hydrogeology: IAH: International contributions to hydrogeology (Vol. 26). CRC Press.
Citation: https://doi.org/10.5194/hess-2022-77-AC1
RC2: 'Comment on hess-2022-77', Anonymous Referee #2, 18 Jun 2022
Yong Chang et al. present a study on estimating hourly discharge in a small 1 km2 karst catchment from precipitation and EC measurements using a LSTM. They set up three different LSTMs based on EC, precipitation and both signals together. Moreover, they explore the performance of other versions of these models with a reduced amount of provided training data. The topic of the study is an interesting contribution to the field, since the added value of EC measurements with gauge levels is indeed underexplored. Also the question about gauging strategies to build a rating curve is of interest. However, I see a couple of severe issues with the study which are in conflict with the strong claims raised and which require to be resolved before final publication.
Major Points
If I understand correctly, the only true estimate of discharge from EC is done with the M_EC model. Given the claims of the title, abstract and introduction, I would not expect precipitation as further variable. L77ff. again precipitation is not mentioned but the use of EC as a proxy. I think the rest of the paper does not really follow this line.
The models including precipitation input are directly predicting discharge. Hence a simple hydrological model and not a linear regression should be the benchmark for these models. Given the situation that the models including precipitation input perform worst in the 2nd evaluation period and otherwise in training and the 1st evaluation period, this raises concerns about what the LSTM actually learned during training. Apparently the temporal patterns of discharge in 2017 and 2019 are more similar than in 2018. What would happen if the model was trained in a different period? Why do the authors expect that the LSTM got sufficient data, when it obviously fails for the test period 2?
Why do the authors use a mean squared error as objective function (L200) instead of a more specific or several complementary evaluation functions?
Using the NSE for evaluation has the known shortcomings and tendency to high values with seasonal climate (Schaefli and Gupta 2007). Given the monsoon climate in the study region, a NSE >0.5 in the evaluation period should not at all be surprising or convincing. Given the adaptability of a LSTM a NSE near 1 should be expected during training. A NSE<0 refers to predictions worse than the mean value. Hence I would expect that the authors would not show arbitrary y-axes limits but to give clear guidance that the performance is not really impressive. Moreover, I would expect further performance measures like KGE, Spearman rank correlation etc.
If I understood correctly, the LSTM is allowed to receive forecasted EC values. I wonder if this is a fair comparison if P is only given in hindcast. If P and EC measurements could be used as proxy measurements, why should I bother about not using forecasted P too? How did the authors assess the chosen time window? I was also unable to identify the m-parameter defining this window. Moreover, I did not really understand the selection of a 7 h time delay factor (L192) since the LSTM should well be capable to learn this.
The authors rightfully expose discharge as central hydrological variable (L36f). But if I would replace this measurement with a model, why should I still be at least somewhat confident about my water balance to be met? Why should I use precipitation as a further explanatory variable to predict discharge if I then would use discharge and precipitation to estimate further characteristics? This fundamentally opens the gates for spurious correlation ill-posing the matter of measuring discharge in the first place.
Given these questions, I am under the impression that the second part of the analyses with different subsets of training data is actually highly case specific. This does not only relate to the selected arrangement of training period, objective function and evaluation procedure. It also refers to the system under study: 1) The authors already modified the EC data (L128ff.). 2) A Karst system should rather directly relate to fill-and-spill dynamics (McDonnell et al. 2020), which are a perfect learning case for LSTMs rarely met in other hydrological systems. 3) The catchment is very small (1 km2). Hence, I would be very cautious about the capabilities to perform this kind of analysis and the strong claims interpreted from the results. In the current form, I would not really agree that the findings are sufficiently supported.
Minor Points (only points in addition to the major ones are listed)
Title: I find the title not really in line with the content of the paper.
L21: What complex relationship? What special ML architecture? This is far too fuzzy.
L25: I did not spot any assessment of uncertainties. I guess you refer to the overall model performance evaluation.
L39f: depth? water level!; defined relationship? rating curve! Why omitting the established terminology?
Fig 1: I do not really get anything from the maps a and b. Map c is difficult to interpret.
L106: what is a combination of rectangular weirs? Do you have a rating curve for the weirs or is the discharge merely calculated with an empirical weir function? How is the gauge measured? Which uncertainty would you expect?
L108f: I suspect a Onset U24? Why do you report 15 min resolution if later on hourly data is used?
L124: What is unsaturated fast flow?
Fig 2b: Why are the side panels in reverse order and without annotated marks in the main panel? Why is the linear model used as reference not plotted? Why is (again) a different correlation measure used?
L148f: I guess you refer to discharge events (not rain events)?
L155f: A strong relationship? I would not claim a correlation of -0.51 to be specifically strong. Hence the relationship might be somewhat tangible there and is not found when plotting EC to Q for lower discharge.
Sec 3.1: Why don't you clarify your strategy with the three models M_EC, M_P and M_ECP upfront?
L192: What is really meant with the 7h forward shifting?
L206: Why do you report the NSE equation. Not needed. Better add further evaluation estimates.
L263: The benchmark is the linear regression which is slightly better than a pure mean value…
L265: See major points about the NSE and the expectations for an LSTM. Avoid normative claims. Certainly they do not expose excellent capability…
Fig 3: Caption reports Fig 2 instead of 3.
L276: Again, how do you support the claim? Test period 2 obviously fails and it is not analysed if this is due to the lack of precip data. Actually I do not expect that this is the case if the evaluation without OBGD remains that low.
Fig 5: Why do you show Nash values below -1?
[I have not recorded further minor points after L303 since I expect this to require substantial workover anyways.]
Code and data availability: Come on! We are in 2022! I find it absolutely necessary that we do not have to beg for seeing what is under the hood. HESS data and code policy is rather clear about this. I find it as an obligation for the authors to provide their data and code - especially for a study like yours which is merely applying a Keras LSTM to a very limited data set.
——
McDonnell, J. J., Spence, C., Karran, D. J., Meerveld, H. J. (Ilja) van, and Harman, C. J.: Fill-and-Spill: A Process Description of Runoff Generation at the Scale of the Beholder, Water Resour Res, 57, https://doi.org/10.1029/2020wr027514, 2021.
Schaefli, B. and Gupta, H. V.: Do Nash values have value?, Hydrol Process, 21, 2075–2080, https://doi.org/10.1002/hyp.6825, 2007.
Citation: https://doi.org/10.5194/hess-2022-77-RC2
AC2: 'Reply on RC2', Yong Chang, 16 Jul 2022
Thank you very much for the valuable comments.
Yong Chang et al. present a study on estimating hourly discharge in a small 1 km2 karst catchment from precipitation and EC measurements using a LSTM. They set up three different LSTMs based on EC, precipitation and both signals together. Moreover, they explore the performance of other versions of these models with a reduced amount of provided training data. The topic of the study is an interesting contribution to the field, since the added value of EC measurements with gauge levels is indeed underexplored. Also the question about gauging strategies to build a rating curve is of interest. However, I see a couple of severe issues with the study which are in conflict with the strong claims raised and which require to be resolved before final publication.
Major Points
1) If I understand correctly, the only true estimate of discharge from EC is done with the M_EC model. Given the claims of the title, abstract and introduction, I would not expect precipitation as further variable. L77ff. again precipitation is not mentioned but the use of EC as a proxy. I think the rest of the paper does not really follow this line.
Response: Thank you for this remark. The prediction of discharge by precipitation was just used as a comparison to the prediction result by EC. To avoid confusion, the simulation result of model MECP (using precipitation and EC to predict discharge) will be deleted in the revised manuscript, since it does not provide any effective information in the paper.
2) The models including precipitation input are directly predicting discharge. Hence a simple hydrological model and not a linear regression should be the benchmark for these models. Given the situation that the models including precipitation input perform worst in the 2nd evaluation period and otherwise in training and the 1st evaluation period, this raises concerns about what the LSTM actually learned during training. Apparently the temporal patterns of discharge in 2017 and 2019 are more similar than in 2018. What would happen if the model was trained in a different period? Why do the authors expect that the LSTM got sufficient data, when it obviously fails for the test period 2?
Response: The linear regression model was used as a benchmark for MEC because there is currently no hydrological model that can predict discharge from EC. For the model Mp, which uses precipitation to predict discharge, we do not apply any benchmark because this model is only used as a comparison to MEC. We will revise the sentence in lines 189-191.
The models that include precipitation, Mp and MECP (the latter will be removed in the revised manuscript), perform worse in the second test period because of a large error in the precipitation data (OBGD). In contrast, the performance of model MEC is not severely affected by the OBGD (see Fig. 3b) because this model uses only EC to predict discharge. That is, MEC does not fail to predict discharge in test period 2. This also indicates the advantage of MEC over Mp for predicting discharge in mountainous catchments where precipitation has a strong spatial variability. A sparse rain gauge network would bring large precipitation uncertainty and poor discharge predictions by Mp.
Since the LSTM is a purely data-driven model, it may have a weak extrapolation ability. Therefore, when the LSTM is used to predict discharge from EC, EC-discharge data should be collected under a variety of rainfall conditions.
3) Why do the authors use a mean squared error as objective function (L200) instead of a more specific or several complementary evaluation functions?
Response: We used the mean squared error as it is a widely-used objective function in many machine learning works, see (Campolo et al., 1999; Gao et al., 2020; Kratzert et al., 2018). The aim of this paper is to explore the feasibility of predicting discharge with EC using a standard LSTM including a typical objective function, i.e. the MSE. Whether the selection of different objective functions affects the final simulation result is beyond the scope of this paper.
4) Using the NSE for evaluation has the known shortcomings and tendency to high values with seasonal climate (Schaefli and Gupta 2007). Given the monsoon climate in the study region, a NSE >0.5 in the evaluation period should not at all be surprising or convincing. Given the adaptability of a LSTM a NSE near 1 should be expected during training. A NSE<0 refers to predictions worse than the mean value. Hence I would expect that the authors would not show arbitrary y-axes limits but to give clear guidance that the performance is not really impressive. Moreover, I would expect further performance measures like KGE, Spearman rank correlation etc.
Response: We will revise the y-axes limits in Fig. 3. In addition, we will provide the KGE and r values of the calibration and validation periods in the revised manuscript. The mean KGE values of MEC are 0.86, 0.70 and 0.38 in the calibration period and the two test periods, respectively. The corresponding mean correlation coefficients of MEC are 0.96, 0.82 and 0.73. The low KGE in test period 2 is due to the poor performance of MEC on low flows, which occupy most of this period.
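For readers comparing the measures mentioned in this exchange, a minimal Python sketch of NSE and KGE follows. It assumes the original three-component formulation of KGE (correlation, variability ratio, bias ratio); the function names and the small test series are hypothetical and unrelated to the study data.

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus squared error over variance of obs."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)

def kge(obs, sim):
    """Kling-Gupta efficiency from correlation, variability and bias ratios."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    r = np.corrcoef(obs, sim)[0, 1]      # linear correlation
    alpha = sim.std() / obs.std()        # variability ratio
    beta = sim.mean() / obs.mean()       # bias ratio
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

obs = np.array([1.0, 2.0, 4.0, 8.0, 3.0])
mean_only = np.full_like(obs, obs.mean())  # predicting the mean gives NSE = 0
doubled = 2.0 * obs                        # perfect timing, but biased
```

In this toy case the doubled series keeps a correlation of 1 yet is penalised by KGE through the variability and bias terms, which is one reason a correlation coefficient alone can look better than NSE or KGE for the same prediction.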
5) If I understood correctly, the LSTM is allowed to receive forecasted EC values. I wonder if this is a fair comparison if P is only given in hindcast. If P and EC measurements could be used as proxy measurements, why should I bother about not using forecasted P too? How did the authors assess the chosen time window? I was also unable to identify the m-parameter defining this window. Moreover, I did not really understand the selection of a 7 h time delay factor (L192) since the LSTM should well be capable to learn this.
Response: We only use the previous and current precipitation to predict the current discharge because the observed spring discharge is the catchment response to previous precipitation. The performance of Mp would not improve even if precipitation data after the prediction time were used in the model. For MEC, in contrast, the EC dynamic always lags behind discharge, so it is necessary to include EC data after the prediction time to predict discharge. The procedure to determine the input length (m) is shown in the appendix.
The 7 hours delay was only used in the simple regression benchmark model to account for delay between discharge and EC, not in the LSTM model.
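The input construction described in this response, in which each discharge value is paired with EC values from both before and after the prediction time, can be sketched in Python as follows. This is only an illustration under stated assumptions: `make_windows`, `n_past` and `n_future` are hypothetical names, and the window lengths are arbitrary rather than the input length (m) determined in the appendix.

```python
import numpy as np

def make_windows(ec, q, n_past, n_future):
    """Pair each q[t] with the EC window ec[t - n_past : t + n_future + 1]."""
    ec, q = np.asarray(ec, float), np.asarray(q, float)
    length = n_past + n_future + 1
    # One row per prediction time that has a complete window around it.
    windows = np.lib.stride_tricks.sliding_window_view(ec, length)
    targets = q[n_past : len(q) - n_future]
    # Trailing axis of size 1: (samples, time steps, features) for an LSTM.
    return windows[..., np.newaxis], targets

# Synthetic hourly series, for illustration only.
ec = np.arange(10.0)
q = 10.0 * np.arange(10.0)
X, y = make_windows(ec, q, n_past=3, n_future=2)
```

Setting `n_future` to zero gives the hindcast-only arrangement that the response describes for precipitation.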
6) The authors rightfully expose discharge as central hydrological variable (L36f). But if I would replace this measurement with a model, why should I still be at least somewhat confident about my water balance to be met? Why should I use precipitation as a further explanatory variable to predict discharge if I then would use discharge and precipitation to estimate further characteristics? This fundamentally opens the gates for spurious correlation ill-posing the matter of measuring discharge in the first place.
Response: The model MECP that uses the precipitation and EC to predict discharge will be deleted in the revised manuscript.
7) Given these questions, I am under the impression that the second part of the analyses with different subsets of training data is actually highly case specific. This does not only relate to the selected arrangement of training period, objective function and evaluation procedure. It also refers to the system under study: 1) The authors already modified the EC data (L128ff.). 2) A Karst system should rather directly relate to fill-and-spill dynamics (McDonnell et al. 2020), which are a perfect learning case for LSTMs rarely met in other hydrological systems. 3) The catchment is very small (1 km2). Hence, I would be very cautious about the capabilities to perform this kind of analysis and the strong claims interpreted from the results. In the current form, I would not really agree that the findings are sufficiently supported.
Response: Firstly, we would like to clarify that the aim of this paper is to explore, for the first time, the ability to predict discharge from EC using a standard LSTM. Exploring the impact of different objective functions on LSTM training is therefore beyond the scope of this paper. The longest data series, from March 1 to August 1 in 2019, was selected as the training period because the LSTM is a purely data-driven model and requires abundant data to give a stable simulation result. For the model evaluation, the performance of MEC is essentially not influenced by the precipitation error in test period 2, since this model uses only EC as input.
Secondly, the adjustment of the EC values in test period 1 is based on two facts: according to previous monitoring, the maximum EC of this spring is relatively stable from year to year, and different data loggers were used to monitor EC in 2017 and in the other two years. To further illustrate the possible uncertainty caused by this adjustment, we will add another figure to the revised manuscript that shows how the model performance varies with different EC adjustment values in test period 1.
Finally, this paper is the first to apply an LSTM model to predict discharge from EC. Although the study catchment is small, the observed spring discharge and EC dynamics are similar to those of many other karst springs (Olarinoye et al., 2020). Therefore, we think the catchment area should not prevent our approach from being applied elsewhere. Whether our approach can also be used in other hydrological systems requires further work, which is our next step.
Minor Points (only points in addition to the major ones are listed)
Title: I find the title not really in line with the content of the paper.
Response: The title will be revised to ‘Using LSTM to monitor stormflow discharge indirectly with EC observations’ according to the comment from reviewer 1.
L21: What complex relationship? What special ML architecture? This is far too fuzzy.
Response: We will further revise the sentence.
L25: I did not spot any assessment of uncertainties. I guess you refer to the overall model performance evaluation.
Response: Change the word ‘uncertainties’ to model performance.
L39f: depth? water level!; defined relationship? rating curve! Why omitting the established terminology?
Response: Accept. We will change the words.
Fig 1: I do not really get anything from the maps a and b. Map c is difficult to interpret.
Response: Map a shows the location of study catchment in China. Map b displays the locations of two climatic stations and their observations were used to fill two recording gaps. Map c just shows the catchment area of the karst spring.
L106: what is a combination of rectangular weirs? Do you have a rating curve for the weirs or is the discharge merely calculated with an empirical weir function? How is the gauge measured? Which uncertainty would you expect?
Response: The discharge is calculated by the empirical weir function. The water level was measured by a HOBO data Logger U20 with precision of 0.3cm.
L108f: I suspect a Onset U24? Why do you report 15 min resolution if later on hourly data is used?
Response: Yes, the Onset U24 was used for the EC monitoring. The hourly data was used because the resolution of discharge in some periods is one hour.
L124: What is unsaturated fast flow?
Response: We will change it to ‘low-EC event water’.
Fig 2b: Why are the side panels in reverse order and without annotated marks in the main panel? Why is the linear model used as reference not plotted? Why is (again) a different correlation measure used?
Response: We will add the annotation in the main panel. Figure 2b only displays the overall relationship between observed discharge and EC. The different correlation coefficients in the right panel of Fig. 2b reflect the different relationships between discharge and EC under different recharge events. Figure 2 shows only the observation data, without any simulation results from the models.
L148f: I guess you refer to discharge events (not rain events)?
Response: Thanks, we will change the words.
L155f: A strong relationship? I would not claim a correlation of -0.51 to be specifically strong. Hence the relationship might be somewhat tangible there and is not found when plotting EC to Q for lower discharge.
Response: We will further revise the sentence.
Sec 3.1: Why don’t you clarify your strategy with the three models M_EC, M_P and M_ECP upfront?
Response: MECP will be deleted in the revised manuscript. Mp was used only as a comparison for the prediction result obtained with EC. We will add this description in the revised version.
L192: What is really meant with the 7h forward shifting?
Response: The 7 h forward shift was used only in the simple regression model, since this model cannot learn the delay between EC and discharge by itself.
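A minimal sketch of such a lagged linear benchmark is given below. It assumes hourly series and reads ‘forward shifting’ as pairing the discharge at hour t with the EC recorded at hour t + 7, on the assumption that the EC response follows the discharge response. The function names and the synthetic series are hypothetical; only the 7 h lag comes from the response above.

```python
def shift_pairs(ec, q, lag=7):
    """Pair discharge q[t] with EC ec[t + lag]; drop the unmatched ends."""
    n = len(q) - lag
    return ec[lag:lag + n], q[:n]


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return a, b


# Synthetic hourly data (assumed): a discharge pulse, with the EC dip
# arriving exactly 7 h later.
q = [1.0] * 10 + [5.0, 8.0, 6.0, 4.0, 3.0, 2.0] + [1.0] * 14
ec = [400.0 - 20.0 * (q[t - 7] if t >= 7 else 1.0) for t in range(len(q))]

ec_shifted, q_aligned = shift_pairs(ec, q, lag=7)
a, b = fit_line(ec_shifted, q_aligned)
print(f"Q = {a:.2f} + ({b:.3f}) * EC(t + 7 h)")
```

With real data the lag would be chosen from the observed delay between the two series; the LSTM needs no such manual alignment because it receives a window of EC values.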
L206: Why do you report the NSE equation. Not needed. Better add further evaluation estimates.
Response: We will delete the NSE equation.
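For reference, the sketch below computes the NSE alongside the kind of additional measures the reviewer suggests (KGE and the Pearson correlation coefficient), using their standard definitions. The function names and sample values are illustrative and are not taken from the manuscript.

```python
import math


def mean(values):
    return sum(values) / len(values)


def std(values):
    """Population standard deviation."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the observed mean."""
    mo = mean(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mo) ** 2 for o in obs)
    return 1.0 - num / den


def pearson_r(obs, sim):
    """Pearson correlation coefficient between observations and simulations."""
    mo, ms = mean(obs), mean(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim)) / len(obs)
    return cov / (std(obs) * std(sim))


def kge(obs, sim):
    """Kling-Gupta efficiency from correlation, variability and bias ratios."""
    r = pearson_r(obs, sim)
    alpha = std(sim) / std(obs)    # variability ratio
    beta = mean(sim) / mean(obs)   # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


obs = [1.0, 2.0, 4.0, 8.0, 3.0]   # illustrative discharge values
sim = [1.2, 1.8, 3.5, 7.0, 3.3]
print(f"NSE = {nse(obs, sim):.3f}, KGE = {kge(obs, sim):.3f}, "
      f"r = {pearson_r(obs, sim):.3f}")
```

Reporting the three measures together separates timing errors (r) from errors in variability and bias, which a single NSE value does not distinguish.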
L263: The benchmark is the linear regression which is slightly better than a pure mean value…
Response: The poor performance of the benchmark model is due to the weak linear relationship between discharge and EC. Nevertheless, the model MEC still performs better than the linear regression model.
L265: See major points about the NSE and the expectations for an LSTM. Avoid normative claims. Certainly they do not expose excellent capability…
Response: We will change ‘excellent’ to ‘good’.
Fig 3: Caption reports Fig 2 instead of 3.
Response: Thanks.
L276: Again, how do you support the claim? Test period 2 obviously fails and it is not analysed if this is due to the lack of precip data. Actually I do not expect that this is the case if the evaluation without OBGD remains that low.
Response: The poor performance of Mp in test period 2, even without OBGD, is mainly caused by its poor simulation of low flow: low flow occupies most of this period, which contains only two storm events. We will add this explanation in the revised manuscript.
Fig 5: Why do you show Nash values below -1?
Response: We will cap the y-axis at -1.
[I have not recorded further minor points after L303 since I expect this to require substantial workover anyways.]
Response: We will carefully revise the following sections in line with the reviewer’s earlier comments.
Code and data availability: Come on! We are in 2022! I find it absolutely necessary that we do not have to beg for seeing what is under the hood. HESS data and code policy is rather clear about this. I find it an obligation for the authors to provide their data and code - especially for a study like yours which is merely applying a Keras LSTM to a very limited data set.
Response: The code and data will be uploaded to a public repository.
Reference:
Campolo, M., Andreussi, P., Soldati, A., 1999. River flood forecasting with a neural network model. Water Resour. Res. 35, 1191–1197.
Gao, S., Huang, Y., Zhang, S., Han, J., Wang, G., Zhang, M., Lin, Q., 2020. Short-term runoff prediction with GRU and LSTM networks without requiring time step optimization during sample generation. J. Hydrol. 589, 125188. https://doi.org/10.1016/j.jhydrol.2020.125188
Kratzert, F., Klotz, D., Brenner, C., Schulz, K., Herrnegger, M., 2018. Rainfall–runoff modelling using Long Short-Term Memory (LSTM) networks. Hydrol. Earth Syst. Sci. 22, 6005–6022.
Olarinoye, T., Gleeson, T., Marx, V., Seeger, S., Adinehvand, R., Allocca, V., Andreo, B., Apaéstegui, J., Apolit, C., Arfib, B., Auler, A., Bailly-Comte, V., Barberá, J. A., Batiot-Guilhe, C., Bechtel, T., Binet, S., Bittner, D., Blatnik, M., Bolger, T., Hartmann, A. (2020). Global karst springs hydrograph dataset for research and management of the world’s fastest-flowing groundwater. Nature Scientific Data, 7(1). https://doi.org/10.1038/s41597-019-0346-5
Citation: https://doi.org/10.5194/hess-2022-77-AC2
-
AC2: 'Reply on RC2', Yong Chang, 16 Jul 2022
Status: closed
-
RC1: 'Comment on hess-2022-77', Anonymous Referee #1, 16 Apr 2022
Chang bet al. present the application of a statistical approach (LSTM) to determine discharge of rainfall event runoff from instream EC measurements.
MAJOR
Title/Premise: The authors present the application of a statistical approach (LSTM) to calculate discharge during rainfall events from EC observations. The title (Using LSTM to monitor continuous discharge indirectly with electrical conductivity observations) might perhaps mislead the reader, as certain time periods (low flow, initial runoff) are clearly excluded from the analysis. A more fitting title would be: Using LSTM to monitor STORMFLOW DISCHARGE indirectly with EC observations.
The performance of a model using EC only is compared to models using both EC and P and only P. It might be interesting to compare the selected model to a more simple approach, to really highlight the added value of a more complex model.
20: In your abstract in line 20 you write that in your spring EC always has a negative correlation with spring discharge. However, in line 126-130 you mention that there is occasionally a positive correlation (EC peak at the initial runoff).
23-25: “LSTM results indicate that the spring discharge can be predicted well with EC, particularly in storms when the dilution dominates the EC dynamic; however, the prediction may have relatively large uncertainties in the small or middle recharge events.” It seems the findings of your study do not support this conclusion at all. As I understood, spring discharge could ONLY be predicted well for large storm events; there are large uncertainties when it comes to intermediate and small events and it was not possible at all to use EC for the estimation of baseflow/low flow. So, one might conclude that overall spring discharge can actually not be predicted well.
130: It is unclear why a there is a need to correct the maximum EC values in 2017 to match them with 2018 and 2019. Please elaborate why the maximum EC should be the same in all years.
130: You corrected for drift of the sensor by subtracting 23µS/cm. Please elaborate why you choose this specific value. Also: A simple subtraction of measured EC does not adequately account for gradual drift.
424: You elaborate that the EC dynamics of the investigated spring are relatively simple without temporal EC peaks at the beginning of storms. However, in line 126-130 you describe that you found indeed initial EC peaks at the beginning of storm events in your 2018 and 2019 data and you state that you excluded these observations from your analysis.
426: To my knowledge, the cited paper of Hess and White (1993) does not give any reference to “piston flow”, it doesn’t mention the words ‘piston flow’
MINOR
83 –geographical coordinates of the spring might be useful
83-91 citation might be useful
Figure 1a: labels in map are too small to read
120 -121: “the spring`s EC dynamic is MAINLY controlled by the rock dissolution and the dilution from the low-EC event water during storms.” – what other minor influencing factors are there?
133: wrong unit: 23us/cm -> 23µs/cm
170: “LSTM belongs to a special kind of recurrent neural network” – I suggest different wording
253: “The performances of MP and MECP deteriorate obviously probably due to …” – obviously or probably, which one is it?
Figure 3e: red line in legend is missing
285: wording: middle -> intermediate
Citation: https://doi.org/10.5194/hess-2022-77-RC1 -
AC1: 'Reply on RC1', Yong Chang, 16 Jul 2022
Thank you very much for the valuable comments.
MAJOR
- Title/Premise: The authors present the application of a statistical approach (LSTM) to calculate discharge during rainfall events from EC observations. The title (Using LSTM to monitor continuous discharge indirectly with electrical conductivity observations) might perhaps mislead the reader, as certain time periods (low flow, initial runoff) are clearly excluded from the analysis. A more fitting title would be: Using LSTM to monitor STORMFLOW DISCHARGE indirectly with EC observations.
Response: Thank you. This is an excellent suggestion. We will change the paper title accordingly.
- The performance of a model using EC only is compared to models using both EC and P and only P. It might be interesting to compare the selected model to a more simple approach, to really highlight the added value of a more complex model.
Response: In the manuscript, we have compared the LSTM model to a simple linear regression model (see lines 189-191 and figure 3b). The regression model shows much worse performance than MEC.
- 20: In your abstract in line 20 you write that in your spring EC always has a negative correlation with spring discharge. However, in line 126-130 you mention that there is occasionally a positive correlation (EC peak at the initial runoff).
Response: Thanks. We will revise line 20. The spring EC always has a negative correlation with spring discharge in most times.
- 23-25: “LSTM results indicate that the spring discharge can be predicted well with EC, particularly in storms when the dilution dominates the EC dynamic; however, the prediction may have relatively large uncertainties in the small or middle recharge events.” It seems the findings of your study do not support this conclusion at all. As I understood, spring discharge could ONLY be predicted well for large storm events; there are large uncertainties when it comes to intermediate and small events and it was not possible at all to use EC for the estimation of baseflow/low flow. So, one might conclude that overall spring discharge can actually not be predicted well.
Response: We will further revise the sentences. Actually, most spring discharges including the baseflow under large storms can be well predicted by EC. This can be seen in Fig.4 that MEC has a large NSE value in storm events. In the manuscript, we define the discharge in the storm event as the period from the end of the last recharge events to the beginning of the next recharge events. It is also possible to predict the discharge under intermediate events but with a large uncertainty compared to the predictions under storm events. It is true that it was not possible to use EC for the estimation of low flow since a low correlation between EC and discharge in the small recharge events. Although this drawback, our approach is still promising because the discharge dynamic under storms or intermediate recharge events is the key information for flood management or understanding the behavior of hydrological systems combined with other hydrochemical indices.
- 130: It is unclear why a there is a need to correct the maximum EC values in 2017 to match them with 2018 and 2019. Please elaborate why the maximum EC should be the same in all years.
Response: The hydrochemistry of the studies spring is mainly controlled by the dissolution of carbonate rocks. The maximum EC of the spring water corresponds to the calcium carbonate equilibria. Meanwhile, the spring locates in the phreatic zone and its most hydrochemical indices, such as temperature, pH and EC,basically do not show obvious seasonal variation according to the previous monitoring. Therefore, the maximum EC of this spring is always relatively stable in different years. Given two different data loggers were used to monitor EC in 2017 and the other two years, it is reasonable to assert that the discrepancy in maximum EC between 2017 and the other two years is mainly caused by the instrumental drift.
We will add these clarifications in the revised version.
- 130: You corrected for drift of the sensor by subtracting 23µS/cm. Please elaborate why you choose this specific value. Also: A simple subtraction of measured EC does not adequately account for gradual drift.
Response: The selection of 23 µS/cm is based on the assumption that the maximum EC value in the 2017 is same to that in the other two years. Actually, although the maximum EC of spring water is relatively stable without an obvious seasonal variation at the study site, the maximum EC value may still have a slight variation. In the revised manuscript, we will add another plot to show the variation of the simulation result of MEC with the different EC adjustment values in test period 1 to further illustrate the uncertainty caused by this drift adjustment.
- 424: You elaborate that the EC dynamics of the investigated spring are relatively simple without temporal EC peaks at the beginning of storms. However, in line 126-130 you describe that you found indeed initial EC peaks at the beginning of storm events in your 2018 and 2019 data and you state that you excluded these observations from your analysis.
Response: I will further revise the sentence.
- 426: To my knowledge, the cited paper of Hess and White (1993) does not give any reference to “piston flow”, it doesn’t mention the words ‘piston flow’
Response: Thanks. There is a mistake here. The cited reference is a paper published also by Hess and White (1988) in which they found the phenomenon that the spring EC may rise firstly before beginning to drop during the storm. The ‘piston effect’ was named by Goldscheider and Drew (2007) in the book <Methods in Karst Hydrogeology>. We will add this reference in the revised version.
MINOR
83 –geographical coordinates of the spring might be useful
Response: We will add the geographical coordinate.
83-91 citation might be useful
Response: We will add the relevant references.
Figure 1a: labels in map are too small to read
Response: We will increase the label size accordingly.
120 -121: “the spring`s EC dynamic is MAINLY controlled by the rock dissolution and the dilution from the low-EC event water during storms.” – what other minor influencing factors are there?
Response: The spring EC may also slightly influenced by the concentration variation of some other irons during storms, such as K+, Na+, Cl-, and SO42-.
133: wrong unit: 23us/cm -> 23µs/cm
Response: Thanks, we will revise the unit.
170: “LSTM belongs to a special kind of recurrent neural network” – I suggest different wording
Response: Revise to ‘LSTM is a special recurrent neural network’.
253: “The performances of MP and MECP deteriorate obviously probably due to …” – obviously or probably, which one is it?
Response: Revise the sentence to ‘The performances of MP and MECP show an obvious deterioration which is probably due to…’.
Figure 3e: red line in legend is missing
Response: Thanks, we will update the legend.
285: wording: middle -> intermediate
Response: Thank you. We will change ‘middle’ to ‘intermediate’ in the revised version.
Reference:
Hess, J. W., & White, W. B. (1988). Storm response of the karstic carbonate aquifer of southcentral Kentucky. Journal of Hydrology, 99(3), 235–252. https://doi.org/10.1016/0022-1694(88)90051-0
Goldscheider, N., & Drew, D. (2007). Methods in karst hydrogeology: IAH: International contributions to hydrogeology (Vol. 26). CRC Press.
Citation: https://doi.org/10.5194/hess-2022-77-AC1
-
AC1: 'Reply on RC1', Yong Chang, 16 Jul 2022
-
RC2: 'Comment on hess-2022-77', Anonymous Referee #2, 18 Jun 2022
Yong Chang et al. present a study on estimating hourly discharge in a small 1 km2 karst catchment from precipitation and EC measurements using a LSTM. They set up three different LSTMs based on EC, precipitation and both signals together. Moreover, they explore the performance of other versions of these models with a reduced amount of provided training data. The topic of the study is an interesting contribution to the field, since the added value of EC measurements with gauge levels is indeed underexplored. Also the question about gauging strategies to build a rating curve is of interest. However, I see a couple of severe issues with the study which are in conflict with the strong claims raised and which require to be resolved before final publication.
Major Points
If I understand correctly, the only true estimate of discharge from EC is done with the M_EC model. Given the claims of the title, abstract and introduction, I would not expect precipitation as further variable. L77ff. again precipitation is not mentioned but the use of EC as a proxy. I think the rest of the paper does not really follow this line.
The models including precipitation input are directly predicting discharge. Hence a simple hydrological model and not a linear regression should be the benchmark for these models. Given the situation that the models including precipitation input perform worst in the 2nd evaluation period and otherwise in training and the 1st evaluation period, this raises concerns about what the LSTM actually learned during training. Apparently the temporal patterns of discharge in 2017 and 2019 are more similar than in 2018. What would happen if the model was trained in a different period? Why do the authors expect that the LSTM got sufficient data, when it obviously fails for the test period 2?
Why do the authors use a mean squared error as objective function (L200) instead of a more specific or several complementary evaluation functions?
Using the NSE for evaluation has the known shortcomings and tendency to high values with seasonal climate (Schaefli and Gupta 2007). Given the monsoon climate in the study region, a NSE >0.5 in the evaluation period should not at all be surprising or convincing. Given the adaptability of a LSTM a NSE near 1 should be expected during training. A NSE<0 refers to predictions worse than the mean value. Hence I would expect that the authors would not show arbitrary y-axes limits but to give clear guidance that the performance is not really impressive. Moreover, I would expect further performance measures like KGE, Spearman rank correlation etc.
If I understood correctly, the LSTM is allowed to receive forecasted EC values. I wonder if this is a fair comparison if P is only given in hindcast. If P and EC measurements could be used as proxy measurements, why should I bother about not using forecasted P too? How did the authors assess the chosen time window? I was also unable to identify the m-parameter defining this window. Moreover, I did not really understand the selection of a 7 h time delay factor (L192) since the LSTM should well be capable to learn this.
The authors rightfully expose discharge as central hydrological variable (L36f). But if I would replace this measurement with a model, why should I still be at least somewhat confident about my water balance to be met? Why should I use precipitation as a further explanatory variable to predict discharge if I then would use discharge and precipitation to estimate further characteristics? This fundamentally opens the gates for spurious correlation ill-posing the matter of measuring discharge in the first place.
Given these questions, I am under the impression that the second part of the analyses with different subsets of training data is actually highly case specific. This does not only relate to the selected arrangement of training period, objective function and evaluation procedure. It also refers to the system under study: 1) The authors already modified the EC data (L128ff.). 2) A Karst system should rather directly relate to fill-and-spill dynamics (McDonnell et al. 2020), which are a perfect learning case for LSTMs rarely met in other hydrological systems. 3) The catchment is very small (1 km2). Hence, I would be very cautious about the capabilities to perform this kind of analysis and the strong claims interpreted from the results. In the current form, I would not really agree that the findings are sufficiently supported.
Minor Points (only points in addition to the major ones are listed)
Title: I find the title not really in line with the content of the paper.
L21: What complex relationship? What special ML architecture? This is far too fuzzy.
L25: I did not spot any assessment of uncertainties. I guess you refer to the overall model performance evaluation.
L39f: depth? water level!; defined relationship? rating curve! Why omitting the established terminology?
Fig 1: I do not really get anything from the maps a and b. Map c is difficult to interpret.
L106: what is a combination of rectangular weirs? Do you have a rating curve for the weirs or is the discharge merely calculated with an empirical weir function? How is the gauge measured? Which uncertainty would you expect?
L108f: I suspect a Onset U24? Why do you report 15 min resolution if later on hourly data is used?
L124: What is unsaturated fast flow?
Fig 2b: Why are the side panels in reverse order and without annotated marks in the main panel?Why is the linear model used as reference not plotted? Why is (again) a different correlation measure used?
L148f: I guess you refer to discharge events (not rain events)?
L155f: A strong relationship? I would not claim a correlation of -0.51 to be specifically strong. Hence the relationship might be somewhat tangible there and is not found when plotting EC to Q for lower discharge.
Sec 3.1: Why dont you calrify your strategy with the three models M_EC, M_P and M_ECP upfront?
L192: What is really meant with the 7h forward shifting?
L206: Why do you report the NSE equation. Not needed. Better add further evaluation estimates.
L263: The benchmark is the linear regression which is slightly better than a pure mean value…
L265: See major points about the NSE and the expectations for an LSTM. Avoid normative claims. Certainly they do not expose excellent capability…
Fig 3: Caption reports Fig 2 instead of 3.
L276: Again, how do you support the claim? Test period 2 obviously fails and it is not analysed if this is due to the lack of precip data. Actually I do not expect that this is the case if the evaluation without OBGD remains that low.
Fig 5: Why do you show Nash values below -1?
[I have not recorded further minor points after L303 since I expect this to require substantial workover anyways.]
Code and data availability: Come on! We are in 2022! I find it absolutely necessary that we do not have to beg for seeing what is under the hood. HESS data and code policy is rather clear about this. I find it as an obligation for the authors to provide their data and code - especially for a study like yours which is merely applying a Keras LSTM so a very limited data set.
——
McDonnell, J. J., Spence, C., Karran, D. J., Meerveld, H. J. (Ilja) van, and Harman, C. J.: Fill-and-Spill: A Process Description of Runoff Generation at the Scale of the Beholder, Water Resour Res, 57, https://doi.org/10.1029/2020wr027514, 2021.
Schaefli, B. and Gupta, H. V.: Do Nash values have value?, Hydrol Process, 21, 2075–2080, https://doi.org/10.1002/hyp.6825, 2007.
Citation: https://doi.org/10.5194/hess-2022-77-RC2 -
AC2: 'Reply on RC2', Yong Chang, 16 Jul 2022
Thank you very much for the valuable comments.
Yong Chang et al. present a study on estimating hourly discharge in a small 1 km2 karst catchment from precipitation and EC measurements using a LSTM. They set up three different LSTMs based on EC, precipitation and both signals together. Moreover, they explore the performance of other versions of these models with a reduced amount of provided training data. The topic of the study is an interesting contribution to the field, since the added value of EC measurements with gauge levels is indeed underexplored. Also the question about gauging strategies to build a rating curve is of interest. However, I see a couple of severe issues with the study which are in conflict with the strong claims raised and which require to be resolved before final publication.
Major Points
1) If I understand correctly, the only true estimate of discharge from EC is done with the M_EC model. Given the claims of the title, abstract and introduction, I would not expect precipitation as further variable. L77ff. again precipitation is not mentioned but the use of EC as a proxy. I think the rest of the paper does not really follow this line.
Response: Thank you for this remark. The prediction of discharge by precipitation was just used as a comparison to the prediction result by EC. To avoid confusion, the simulation result of model MECP (using precipitation and EC to predict discharge) will be deleted in the revised manuscript, since it does not provide any effective information in the paper.
2) The models including precipitation input are directly predicting discharge. Hence a simple hydrological model and not a linear regression should be the benchmark for these models. Given the situation that the models including precipitation input perform worst in the 2nd evaluation period and otherwise in training and the 1st evaluation period, this raises concerns about what the LSTM actually learned during training. Apparently the temporal patterns of discharge in 2017 and 2019 are more similar than in 2018. What would happen if the model was trained in a different period? Why do the authors expect that the LSTM got sufficient data, when it obviously fails for the test period 2?
Response: We used the linear regression model was used as a benchmark model for MEC since currently there is yet no hydrological model that can predict the discharge using EC. For the model Mp, which uses prediction to predict discharge, we do not apply any benchmarking because this model is just used as a comparison to MEC. We will revise the sentence in lines 189-191.
The models including precipitation, like Mp and MECP (will be removed in the revised manuscript), has worse performance in the second test period due to a large error of precipitation data (OBGD). Whereas, the performance of model MEC is not severely influenced by the existence of OBGD (see Fig. 3b) because this mode just uses EC to predict discharge. That is, MEC does not fail to predict discharge in the test period 2. This also indicates the advantage of MEC to predict discharge over Mp in mountainous catchments where precipitation has a strong spatial variability. A sparse rain gage network would bring large precipitation uncertainty and bad discharge predictions by Mp.
Since the LSTM is a pure data-driven model, it may have a weak extrapolation ability. Therefore, when the LSTM was used for the discharge prediction by EC, we should collect EC-discharge data under a variety of rainfall conditions.
3) Why do the authors use a mean squared error as objective function (L200) instead of a more specific or several complementary evaluation functions?
Response: We used the mean squared error as it is a widely-used objective function in many machine learning works, see (Campolo et al., 1999; Gao et al., 2020; Kratzert et al., 2018). The aim of this paper is to explore the feasibility of predicting discharge with EC using a standard LSTM including a typical objective function, i.e. the MSE. Whether the selection of different objective functions affects the final simulation result is beyond the scope of this paper.
4) Using the NSE for evaluation has the known shortcomings and tendency to high values with seasonal climate (Schaefli and Gupta 2007). Given the monsoon climate in the study region, a NSE >0.5 in the evaluation period should not at all be surprising or convincing. Given the adaptability of a LSTM a NSE near 1 should be expected during training. A NSE<0 refers to predictions worse than the mean value. Hence I would expect that the authors would not show arbitrary y-axes limits but to give clear guidance that the performance is not really impressive. Moreover, I would expect further performance measures like KGE, Spearman rank correlation etc.
Response: We will revise the y-axes limits in Fig.3. In addition, we will provide the KGE and r values of the calibration and validation periods in the revised manuscript. The mean values of KGE of MEC are 0.86, 0.70 and 0.38 in the calibration and two test periods, respectively. The corresponding mean values of the correlation coefficients of MEC are 0.96, 0.82 and 0.73. The low KGE in the test period 2 is due to the poor performance of MEC on the low flows because the low discharge occupy most time in this period.
5) If I understood correctly, the LSTM is allowed to receive forecasted EC values. I wonder if this is a fair comparison if P is only given in hindcast. If P and EC measurements could be used as proxy measurements, why should I bother about not using forecasted P too? How did the authors assess the chosen time window? I was also unable to identify the m-parameter defining this window. Moreover, I did not really understand the selection of a 7 h time delay factor (L192) since the LSTM should well be capable to learn this.
Response: We only use the previous and current precipitation to predict the current discharge because of the obvious fact that observed spring discharge is just the catchment response to the previous precipitation. The model performance of Mp would not be improved even the precipitation data after the prediction time were used in the model. Whereas for MEC, because the EC dynamic always lags behind discharge, it is necessary to consider the EC data after the prediction time to forecast discharge. The procedure to determine input length (m) is shown in the appendix.
The 7 hours delay was only used in the simple regression benchmark model to account for delay between discharge and EC, not in the LSTM model.
6) The authors rightfully expose discharge as central hydrological variable (L36f). But if I would replace this measurement with a model, why should I still be at least somewhat confident about my water balance to be met? Why should I use precipitation as a further explanatory variable to predict discharge if I then would use discharge and precipitation to estimate further characteristics? This fundamentally opens the gates for spurious correlation ill-posing the matter of measuring discharge in the first place.
Response: The model MECP that uses the precipitation and EC to predict discharge will be deleted in the revised manuscript.
7) Given these questions, I am under the impression that the second part of the analyses with different subsets of training data is actually highly case specific. This does not only relate to the selected arrangement of training period, objective function and evaluation procedure. It also refers to the system under study: 1) The authors already modified the EC data (L128ff.). 2) A Karst system should rather directly relate to fill-and-spill dynamics (McDonnell et al. 2020), which are a perfect learning case for LSTMs rarely met in other hydrological systems. 3) The catchment is very small (1 km2). Hence, I would be very cautious about the capabilities to perform this kind of analysis and the strong claims interpreted from the results. In the current form, I would not really agree that the findings are sufficiently supported.
Response: First, we would like to clarify that the aim of this paper is to explore, for the first time, whether EC can be used to predict discharge with a standard LSTM. Exploring the impact of different objective functions on LSTM training is therefore outside the scope of this paper. The longest data series, from 1 March to 1 August 2019, was selected as the training period because the LSTM is a purely data-driven model and requires abundant data to produce stable simulation results. For the model evaluation, the performance of MEC is essentially unaffected by the precipitation error in test period 2, since this model uses only EC as input.
Second, the adjustment of the EC values in test period 1 rests on two facts: according to previous monitoring, the maximum EC of this spring is relatively stable from year to year, and the data logger used to monitor EC in 2017 differed from the one used in the other two years. To further assess the possible uncertainty caused by this adjustment, we will add a figure to the revised manuscript showing how model performance varies with different EC adjustment values in test period 1.
Finally, this paper is the first to apply an LSTM model to predict discharge from EC. Although the study catchment is small, the observed spring discharge and EC dynamics are similar to those of many other karst springs (Olarinoye et al., 2020). Therefore, we think the catchment area should not prevent the application of our approach. Whether our approach can also be used in other hydrological systems requires further work, which is our next step.
Minor Points (only points in addition to the major ones are listed)
Title: I find the title not really in line with the content of the paper.
Response: The title will be revised to ‘Using LSTM to monitor stormflow discharge indirectly with EC observations’ according to the comment from reviewer 1.
L21: What complex relationship? What special ML architecture? This is far too fuzzy.
Response: We will further revise the sentence.
L25: I did not spot any assessment of uncertainties. I guess you refer to the overall model performance evaluation.
Response: We will change the word ‘uncertainties’ to ‘model performance’.
L39f: depth? water level!; defined relationship? rating curve! Why omitting the established terminology?
Response: Accept. We will change the words.
Fig 1: I do not really get anything from the maps a and b. Map c is difficult to interpret.
Response: Map a shows the location of the study catchment in China. Map b displays the locations of the two climatic stations whose observations were used to fill two recording gaps. Map c shows the catchment area of the karst spring.
L106: what is a combination of rectangular weirs? Do you have a rating curve for the weirs or is the discharge merely calculated with an empirical weir function? How is the gauge measured? Which uncertainty would you expect?
Response: The discharge is calculated with an empirical weir function. The water level was measured by a HOBO U20 data logger with a precision of 0.3 cm.
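To illustrate how a 0.3 cm water-level precision can propagate into discharge, the following is a minimal sketch that assumes the textbook sharp-crested rectangular weir equation. The response does not state which weir function was used, so the equation form, the discharge coefficient `cd=0.62`, the 1 m crest width and the head values are all hypothetical.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def rectangular_weir_discharge(head_m, width_m, cd=0.62):
    """Discharge (m^3/s) over a sharp-crested rectangular weir.

    Textbook form Q = (2/3) * Cd * b * sqrt(2 g) * h^1.5.
    cd=0.62 is a generic assumed value, not taken from the paper.
    """
    if head_m <= 0:
        return 0.0
    return (2.0 / 3.0) * cd * width_m * math.sqrt(2.0 * G) * head_m ** 1.5


def relative_discharge_error(head_m, width_m, head_error_m=0.003):
    """Relative change in Q when the measured head is too high by head_error_m."""
    q = rectangular_weir_discharge(head_m, width_m)
    if q == 0.0:
        raise ValueError("head must be positive to compute a relative error")
    q_high = rectangular_weir_discharge(head_m + head_error_m, width_m)
    return (q_high - q) / q


# Hypothetical heads over a hypothetical 1 m wide crest.
for head in (0.05, 0.20):
    q = rectangular_weir_discharge(head, width_m=1.0)
    err = relative_discharge_error(head, width_m=1.0)
    print(f"h={head:.2f} m  Q={q:.4f} m^3/s  error for +0.3 cm: {err:.1%}")
```

Under this assumed equation the relative error depends only on the ratio of head error to head, so it is largest at low flow: about 9 % at a 5 cm head against about 2 % at a 20 cm head.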
L108f: I suspect a Onset U24? Why do you report 15 min resolution if later on hourly data is used?
Response: Yes, the Onset U24 was used for the EC monitoring. Hourly data were used because the resolution of the discharge record is one hour in some periods.
L124: What is unsaturated fast flow?
Response: We will change it to ‘low-EC event water’.
Fig 2b: Why are the side panels in reverse order and without annotated marks in the main panel? Why is the linear model used as reference not plotted? Why is (again) a different correlation measure used?
Response: We will add the annotation in the main panel. Figure 2b displays the overall relationship between observed discharge and EC. The different correlation coefficients in the right panel of Fig. 2b correspond to the different relationships between discharge and EC under different recharge events. Figure 2 only displays the observation data, without any simulation results from the models.
L148f: I guess you refer to discharge events (not rain events)?
Response: Thanks, we will change the words.
L155f: A strong relationship? I would not claim a correlation of -0.51 to be specifically strong. Hence the relationship might be somewhat tangible there and is not found when plotting EC to Q for lower discharge.
Response: We will further revise the sentence.
Sec 3.1: Why don't you clarify your strategy with the three models M_EC, M_P and M_ECP upfront?
Response: MECP will be deleted in the revised manuscript. Mp was used only as a comparison for the prediction from EC. We will add this description in the revised version.
L192: What is really meant with the 7h forward shifting?
Response: The 7 h forward shift was applied only in the simple regression model, since this model cannot itself learn the delay between EC and discharge.
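A minimal sketch of what such a shift can look like for a simple regression model is given below. It assumes that the shift pairs discharge at time t with EC at time t + lag and that the lag is chosen by maximising the absolute correlation; the response states neither detail. The function names, the synthetic series and the 3 h lag are made up for illustration and are not the study data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def align_with_lag(ec, q, lag):
    """Pair q[t] with ec[t + lag], dropping the unmatched ends."""
    if lag == 0:
        return list(ec), list(q)
    return list(ec[lag:]), list(q[:-lag])


def best_lag(ec, q, max_lag):
    """Lag (in time steps) that maximises |r| between shifted EC and Q."""
    return max(range(max_lag + 1),
               key=lambda k: abs(pearson_r(*align_with_lag(ec, q, k))))


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx


# Synthetic hourly series: EC responds to discharge 3 h later (made-up numbers).
TRUE_LAG = 3
q = [10 + 5 * math.sin(0.3 * t) + 0.02 * t for t in range(200)]
ec = [500 - 2 * q[max(t - TRUE_LAG, 0)] for t in range(200)]

lag = best_lag(ec, q, max_lag=12)
slope, intercept = fit_line(*align_with_lag(ec, q, lag))
print(lag, round(slope, 3), round(intercept, 1))
```

A linear regression has no memory, so the delay has to be removed by hand before fitting, whereas an LSTM receives a window of past EC values and can learn the delay from the data.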
L206: Why do you report the NSE equation. Not needed. Better add further evaluation estimates.
Response: We will delete the NSE equation.
L263: The benchmark is the linear regression which is slightly better than a pure mean value…
Response: The poor performance of the benchmark model is due to the weak linear relationship between discharge and EC. However, model MEC still performs better than the linear regression model.
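For readers who want to check the comparison with a pure mean value, the sketch below computes the Nash-Sutcliffe efficiency (NSE) in its standard form, under which predicting the observed mean at every step scores exactly 0. The discharge values are invented for illustration only.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance sum of the observations."""
    if len(observed) != len(simulated):
        raise ValueError("series must have the same length")
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("NSE is undefined for constant observations")
    return 1.0 - sse / spread


# Invented discharge values (not the study data).
obs = [1.0, 1.2, 0.9, 4.0, 2.5, 1.1]
mean_only = [sum(obs) / len(obs)] * len(obs)

print(nse(obs, obs))        # perfect simulation
print(nse(obs, mean_only))  # constant mean prediction
```

An NSE only slightly above 0 therefore means the model explains little more than the observed mean does, and negative values mean it does worse than the mean.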
L265: See major points about the NSE and the expectations for an LSTM. Avoid normative claims. Certainly they do not expose excellent capability…
Response: We will change ‘excellent’ to ‘good’.
Fig 3: Caption reports Fig 2 instead of 3.
Response: Thanks.
L276: Again, how do you support the claim? Test period 2 obviously fails and it is not analysed if this is due to the lack of precip data. Actually I do not expect that this is the case if the evaluation without OBGD remains that low.
Response: The poor performance of Mp in test period 2, even without OBGD, is mainly caused by its poor simulation of low flow, because low flow accounts for most of this period, which contains only two storm events. We will add this explanation in the revised manuscript.
Fig 5: Why do you show Nash values below -1?
Response: We will cap the y-axis at -1.
[I have not recorded further minor points after L303 since I expect this to require substantial workover anyways.]
Response: We will carefully revise the following sections in line with the reviewer’s earlier comments.
Code and data availability: Come on! We are in 2022! I find it absolutely necessary that we do not have to beg for seeing what is under the hood. HESS data and code policy is rather clear about this. I find it an obligation for the authors to provide their data and code - especially for a study like yours which is merely applying a Keras LSTM to a very limited data set.
Response: The code and data will be uploaded to a public repository.
Reference:
Campolo, M., Andreussi, P., Soldati, A., 1999. River flood forecasting with a neural network model. Water Resour. Res. 35, 1191–1197.
Gao, S., Huang, Y., Zhang, S., Han, J., Wang, G., Zhang, M., Lin, Q., 2020. Short-term runoff prediction with GRU and LSTM networks without requiring time step optimization during sample generation. J. Hydrol. 589, 125188. https://doi.org/10.1016/j.jhydrol.2020.125188
Kratzert, F., Klotz, D., Brenner, C., Schulz, K., Herrnegger, M., 2018. Rainfall–runoff modelling using Long Short-Term Memory (LSTM) networks. Hydrol. Earth Syst. Sci. 22, 6005–6022.
Olarinoye, T., Gleeson, T., Marx, V., Seeger, S., Adinehvand, R., Allocca, V., Andreo, B., Apaéstegui, J., Apolit, C., Arfib, B., Auler, A., Bailly-Comte, V., Barberá, J. A., Batiot-Guilhe, C., Bechtel, T., Binet, S., Bittner, D., Blatnik, M., Bolger, T., Hartmann, A. (2020). Global karst springs hydrograph dataset for research and management of the world’s fastest-flowing groundwater. Nature Scientific Data, 7(1). https://doi.org/10.1038/s41597-019-0346-5
Citation: https://doi.org/10.5194/hess-2022-77-AC2
-
AC2: 'Reply on RC2', Yong Chang, 16 Jul 2022