Spatio-temporal optimization of groundwater monitoring networks using data-driven sparse sensing methods
- Institute of Applied Geosciences, Division of Hydrogeology, Karlsruhe Institute of Technology, Karlsruhe, Germany
Abstract. Groundwater monitoring and specific collection of data on the spatio-temporal dynamics of the aquifer are prerequisites for effective groundwater management and determine nearly all downstream management decisions. An optimally designed groundwater monitoring network will provide the maximum information content at the minimum cost (Pareto optimum). In this study, PySensors, a Python package containing scalable, data-driven algorithms for sparse sensor selection and signal reconstruction with dimensionality reduction, is applied to an existing groundwater monitoring network (GMN) in 1D (hydrographs) and 2D (gridded groundwater contour maps). The algorithm first fits a basis object to the training data, then applies a computationally efficient QR algorithm that ranks existing monitoring wells (for 1D) or suitable sites for additional monitoring (for 2D) in order of "importance" based on the state reconstruction to this tailored basis. This procedure enables a network to be reduced or extended along the Pareto front. Moreover, we investigate the effect of basis choice on reconstruction performance by comparing three types typically used for sparse sensor selection (identity, random projection, and singular value decomposition or principal component analysis). We define a gridded cost function for the extension case that penalizes unsuitable locations. Our results show that this approach is generally better than the best randomly selected wells. The optimized reduction makes it possible to adequately reconstruct the removed hydrographs from a highly reduced subset with low loss. An average absolute reconstruction accuracy of 0.1 m is achieved with a subset of 6 % of the wells, 0.05 m with 31 %, and 0.01 m with 82 % of the wells.
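The ranking step described in the abstract can be illustrated with a minimal sketch. It is not the PySensors API and not the authors' code: it uses only NumPy and SciPy, assumes an SVD basis, and the function names `select_wells` and `reconstruct` as well as the synthetic rank-3 data are hypothetical. Column-pivoted QR returns a permutation `pivots` such that `basis.T[:, pivots] = Q R`, and the leading pivots are taken as the most informative wells.

```python
import numpy as np
from scipy.linalg import qr


def select_wells(X, n_modes):
    """Rank wells by column-pivoted QR on an SVD basis (illustrative only).

    X has shape (n_snapshots, n_wells); each column is one hydrograph.
    """
    mean = X.mean(axis=0)
    # Rows of Vt are the principal spatial modes across wells.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    basis = Vt[:n_modes].T  # shape (n_wells, n_modes)
    # Pivoted QR of basis.T: the pivot order ranks the wells (columns).
    _, _, pivots = qr(basis.T, mode="economic", pivoting=True)
    return pivots[:n_modes], basis, mean


def reconstruct(X_obs, wells, basis, mean):
    """Estimate all hydrographs from observations at the selected wells."""
    # Least-squares fit of the basis coefficients to the observed wells.
    coeffs, *_ = np.linalg.lstsq(
        basis[wells], (X_obs - mean[wells]).T, rcond=None
    )
    return (basis @ coeffs).T + mean


# Synthetic, exactly rank-3 data (assumption): 50 snapshots at 20 wells.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3)) @ rng.normal(size=(3, 20)) + 100.0

wells, basis, mean = select_wells(X, n_modes=3)
X_hat = reconstruct(X[:, wells], wells, basis, mean)
print("selected wells:", wells)
print("max abs reconstruction error:", np.abs(X_hat - X).max())
```

Because the synthetic data have exactly three modes, three wells suffice to recover the remaining hydrographs in this toy case; real head data would show a gradual error decrease as wells are added.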
Marc Ohmer et al.
Status: final response (author comments only)
-
RC1: 'Comment on hess-2022-69', Anonymous Referee #1, 28 Mar 2022
1. The paper is excellent; it needs only minor corrections and clarifications.
2. Abstract: "Our results show that this approach is generally better than the best randomly selected wells". This sentence must be changed because it seems to insinuate that one could change n monitoring wells randomly and they could perform better than the optimized approach (because of the word "generally").
3. Introduction: "there is a dualism between monitoring costs and monitoring quality (i.e., the information gained by monitoring)" This sentence is not logical. Perhaps you mean that the higher the cost of monitoring, the better the quality of the data, provided that the monitoring is well designed. In other words, one could spend a lot of money and yet not improve the quality of information unless the money is well spent. A deeper issue not addressed in the paper is how expenditure on monitoring may improve the quality of groundwater, which means that groundwater monitoring is tied to groundwater management.
4. Introduction: "How does the reconstruction/interpolation error develop when a given number of monitoring wells are reduced? How does the error of reducing wells according to information content compare to a random reduction?" These sentences are confusing. Perhaps you mean: "How does the reconstruction/interpolation error vary with changes in the number of monitoring wells? How does a random reduction of monitoring wells affect the information content gained by groundwater monitoring?" I urge the authors to improve the logical meaning of their paper's text.
5. Section 2.1.4: you propose that an m×n matrix A can be decomposed into matrices Q and R such that A = QR; then you propose that there is a matrix C such that A C^T = QR. It is unclear why A C^T is not equal to QR C^T.
6. Section 2.4.2: "Outlier values that exceeded a moving average (window size 11) of ±3 σ were removed during preprocessing". Perhaps you mean: "Data values that deviate by more than ±3 σ from the moving average (with a window size of 11 values) are considered outliers and were removed from further processing".
7. Section 2.4.3: "omnidirectional Gaussian semivariogram model". I believe you mean "an isotropic Gaussian semivariogram model".
8. Figure 2b: the Pareto front of number of wells vs. RMSE: what about Pareto fronts for the other goodness-of-fit criteria, such as the NSE or the KGE?
9. Figure 7: add GMW (groundwater monitoring well) to the list of acronyms
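The outlier rule discussed in comment 6 can be sketched as follows, under the referee's reading of it. This is only one possible interpretation: σ is assumed here to be the standard deviation of the residuals about a centered moving average, the function name `remove_outliers` is hypothetical, and the synthetic hydrograph is invented for illustration. The manuscript's actual preprocessing may differ.

```python
import numpy as np
import pandas as pd


def remove_outliers(series, window=11, n_sigma=3.0):
    """Mask values that deviate more than n_sigma * sigma from a moving average.

    Assumption: sigma is the standard deviation of the residuals about the
    centered moving average, computed over the whole series.
    """
    moving_avg = series.rolling(window=window, center=True, min_periods=1).mean()
    residual = series - moving_avg
    sigma = residual.std()
    is_outlier = residual.abs() > n_sigma * sigma
    # Flagged values become NaN; all other values are kept unchanged.
    return series.mask(is_outlier)


# Synthetic hydrograph (assumption): smooth seasonal signal plus one spike.
t = np.linspace(0.0, 2.0 * np.pi, 100)
heads = pd.Series(10.0 + 0.1 * np.sin(t))
heads.iloc[50] += 5.0  # artificial measurement error

cleaned = remove_outliers(heads)
print(cleaned[cleaned.isna()].index.tolist())
```

A rolling standard deviation over the same 11 values would be a different choice; with so few samples per window, a single spike inflates the local σ and is rarely flagged, which is why the sketch uses the residual spread of the whole series.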
-
AC1: 'Reply on RC1', Marc Ohmer, 16 May 2022
The comment was uploaded in the form of a supplement: https://hess.copernicus.org/preprints/hess-2022-69/hess-2022-69-AC1-supplement.pdf
-
RC2: 'Comment on hess-2022-69', Anonymous Referee #2, 05 May 2022
General comments:
The submitted manuscript presents a versatile optimization strategy for monitoring well networks that can be used on temporal and/or spatial hydraulic head data. The methodology and its demonstration using the example data set are presented nicely with sophisticated plots and graphs. The analysis and interpretation of results are considerably elaborate, which increases the quality of the manuscript but on the other hand, makes the discussion hard to follow in some of the paragraphs for the non-expert reader.
The manuscript is in general very well written. There are only some minor issues that need to be addressed for improved readability.
Specific comments:
- The terms “sensor” and “well” are used interchangeably throughout the manuscript. For the sake of clarity, I’d suggest sticking with only one term. The sensor could be viewed as a part of the monitoring well, therefore in my opinion it makes more sense to use “well”. The goal is to optimally select wells.
- P. 7 and Fig. 6 on p. 17: The performance metric nRMSE is used in Figure 6. Is it the RMSE relative to the standard deviation or to the range of observations? Please explain this detail on page 7, where the RMSE equation is given.
- Lines 248-253: Although it is clear why a single set of kriging parameters is used in the production of the map series, it is not clear on what basis this particular parameter set was selected. Perhaps the values for the parameters can be provided in one additional sentence.
- Line 346: It seems that reduction stages (or steps, as used in the caption of Fig. 5) actually refer to the fraction of the wells used in the analysis. My understanding of a 10% reduction is that 90% of the wells, that is, 432 wells, are used. Please consider rewording or clarifying this issue where necessary.
- Lines 346-347: “Consequently, well 154-304-1, the highest-ranked well shown with rank 59 (bottom),…” – This is confusing because one would expect the well with rank number 1 to be the “most important” well.
- I’d suggest adding a little discussion in the conclusions section about the relative value of 1-D (hydrograph) data versus 2-D data. Which should be preferred if both are available? For which type of monitoring data does the presented approach work better?
Technical corrections:
- The website link for the reference on lines 559-560 needs to be corrected as it does not seem to be active.
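The nRMSE question raised in the specific comments comes down to the choice of normalizer. The sketch below shows the two conventions the referee mentions; which one the manuscript uses is not stated on this page, and the function names and the small example arrays are hypothetical.

```python
import numpy as np


def rmse(obs, sim):
    """Root mean square error between observed and reconstructed values."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return float(np.sqrt(np.mean((sim - obs) ** 2)))


def nrmse(obs, sim, norm="range"):
    """RMSE divided by the range or the standard deviation of the observations."""
    obs = np.asarray(obs, dtype=float)
    if norm == "range":
        scale = obs.max() - obs.min()
    elif norm == "std":
        scale = obs.std()  # population standard deviation (ddof=0)
    else:
        raise ValueError("norm must be 'range' or 'std'")
    return rmse(obs, sim) / scale


# Invented example values: a constant 0.5 m offset on five observations.
obs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
sim = obs + 0.5

print(rmse(obs, sim), nrmse(obs, sim, "range"), nrmse(obs, sim, "std"))
```

The two normalizers can differ by a large factor for the same error, so values reported as nRMSE are only comparable when the convention is stated.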
-
AC2: 'Reply on RC2', Marc Ohmer, 16 May 2022
The comment was uploaded in the form of a supplement: https://hess.copernicus.org/preprints/hess-2022-69/hess-2022-69-AC2-supplement.pdf
Viewed
HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
---|---|---|---|---|---|---|
336 | 86 | 16 | 438 | 22 | 6 | 4 |