the Creative Commons Attribution 4.0 License.
Incorporating interpretation uncertainties from deterministic 3D hydrostratigraphic models in groundwater models
Rasmus Bødker Madsen
Torben O. Sonnenborg
Lærke Therese Andersen
Peter B. E. Sandersen
Jacob Kidmose
Ingelise Møller
Thomas Mejer Hansen
Karsten Høgh Jensen
Anne-Sophie Høyer
Abstract. Many 3D hydrostratigraphic models of the subsurface are interpreted as deterministic models, where an experienced modeler combines relevant geophysical and geological information with background geological knowledge. Depending on the quality of the information from the input data, the interpretation phase will typically be accompanied by an estimated qualitative interpretation uncertainty. Given the qualitative and subjective nature of uncertainty, it is difficult to propagate the uncertainty to groundwater models. In this study, a stochastic simulation-based methodology to characterize interpretation uncertainty within a manual interpretation-based layer model is applied in a groundwater modeling setting. Three levels of interpretation uncertainty scenarios are generated and three locations in the models representing different geological structures are analyzed. The impact of interpretation uncertainty on predictions of capture zone area and median travel time is compared to the impact of uncertainty on parameters in the groundwater model. The main result is that in areas with thick and large aquifers and low geological uncertainty, the impact of interpretation uncertainty is negligible compared to the hydrogeological parameterization, while it may introduce a significant contribution in areas with thinner and smaller aquifers with high geologic uncertainty. The influence of the interpretation uncertainties is thus dependent on the geological setting as well as the confidence of the interpreter. In areas with thick aquifers, this study confirms existing evidence that if the conceptual model is well-defined, interpretation uncertainties within the conceptual model have limited impact on groundwater model predictions.
Trine Enemark et al.
Status: closed

RC1: 'Comment on hess202374', Marc Bierkens, 12 Jun 2023
Review of “Incorporating interpretation uncertainties from deterministic 3D hydrostratigraphic models in groundwater models” by Enemark et al.
When regional groundwater models are developed, an important step is to build a conceptual hydrostratigraphic model based on the geological information at hand. Conceptual hydrostratigraphic models are based on mapping the 3D juxtaposition of geological layers and translating these to aquifers and aquitards, which are subsequently populated with hydraulic parameters (conductivities, transmissivities, storage coefficients) and used to schematize the 3D make-up of a groundwater flow and/or transport model. The mapping of geological layers is preferably done by expert geologists who combine their conceptual knowledge of the depositional or structural geological environment with in situ borehole descriptions, outcrop information and geophysical data (e.g. gamma logs, EM measurements, etc.). However, since there is much room for interpretation, no two geologists will provide the same conceptual hydrostratigraphic model.
In this paper, the authors use a recently developed method for assessing this “interpretation uncertainty” in hydrostratigraphic models to determine how the uncertainty about the layer boundaries between hydrostratigraphic units propagates to the uncertainty in groundwater model outcomes. They compare this degree of uncertainty with the uncertainty that accrues from unknown hydraulic parameters (a more common analysis). Apart from demonstrating the method in an uncertainty analysis (focused on capture zone size and median travel time), the authors also show that the schematization uncertainty is important if little in situ data are available and if the layers to be identified and mapped are thin.
This is a valuable paper that presents a nice approach that is worth being picked up by the groundwater modelling community in order to extend their toolbox of approaches in uncertainty assessment.
I think this paper deserves being published in HESS subject to resolving the following issues.
 The Low-Frequency model is insufficiently explained in the paper. It may well be based on work by Madsen et al., but this paper needs to be readable on its own. Particularly:
 How is the manual interpretation model constructed? Are the smooth lines between the interpretation points in Figure 2 actual kriged values? Was this a 2D kriging per surface? What semivariogram was used? How does one make sure that the boundaries between layers do not cross, or do cross in case of a presumed erosive surface? And how is this resolved? Is there a manual post-processing step?
 The simulation of vertical perturbations at the interpretation points is based on categories and then the standard deviation per category. A table with standard deviations should be given per case and per category.
 Please be more specific about the nature of the LF model. How is it fitted? What are its equations? Are they smoothing splines? Or moving averages of a zero-error interpretation model interpolated with kriging between the randomly perturbed interpretation points? Or is it kriging with uncertain data? Perhaps a step-by-step description of the procedure would be helpful.
 How does the degree of smoothing take account of spatially varying degrees of uncertainty? Is the level of smoothing spatially varying as well?
 It seems that with high uncertainty, the smoothing is more intense. But how does the LF model distinguish between actual locally higher spatial variability (while the data are certain and based on detailed borelogs) and spatial uncertainty?
 Section 3.2.3: here the 200 behavioural hydraulic parameter sets are selected with GLUE, assuming the manual interpretation model to be true. How is it guaranteed that these parameter sets are still behavioural for the other simulated realizations of layering? I understand that all combinations cannot be evaluated, but a few random hydrostratigraphies could be checked to see whether the 200 parameter sets are still close enough to be called behavioural.
 The conclusion that, with thick layers and lower uncertainty about the layer boundaries, the impact of interpretation uncertainty is small compared to the uncertainty from hydraulic parameters: to what extent is this conclusion dependent on the modelling approach? How were the heads and fluxes simulated in these models: with a quasi-3D aquifer-aquitard schematization or with a full 3D voxel model? In the first case, vertical fluxes in aquifers are ignored and this may underestimate the impact of the thickness of a layer, particularly near wells. In this case it is also understandable that thickness (one order of magnitude variation) has much less impact than conductivity (with multiple orders of magnitude variation).
 I urge the authors to make the datasets (schematization, hard data, interpretation points and manual interpretation model) available.
 There are some small remarks and suggestions for improvements that I have put in the pdf attached.
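The check suggested in the comment on Section 3.2.3 could be sketched as follows. This is a minimal illustration only, not the authors' workflow: the function name `behavioural_fraction`, the RMSE-based acceptance rule, the threshold `rmse_limit` and the toy `simulate` function are all assumptions introduced here. GLUE studies use a variety of likelihood measures, and a real check would call the groundwater model instead of the toy function.

```python
def behavioural_fraction(param_sets, realizations, simulate, observed, rmse_limit):
    """For each hydrostratigraphic realization, return the fraction of
    parameter sets whose RMSE against the observations is within rmse_limit.

    simulate(params, realization) is a hypothetical stand-in for a model run
    and must return simulated values in the same order as `observed`.
    """
    fractions = []
    for realization in realizations:
        kept = 0
        for params in param_sets:
            simulated = simulate(params, realization)
            squared_errors = [(s - o) ** 2 for s, o in zip(simulated, observed)]
            rmse = (sum(squared_errors) / len(observed)) ** 0.5
            if rmse <= rmse_limit:
                kept += 1
        fractions.append(kept / len(param_sets))
    return fractions


# Toy usage with made-up numbers: the "model" adds a parameter offset and a
# realization-specific shift to a single observation.
def toy_simulate(params, realization):
    return [params + realization]


toy_fractions = behavioural_fraction(
    param_sets=[0.0, 0.5, 1.0],
    realizations=[0.0, 0.4, 5.0],
    simulate=toy_simulate,
    observed=[0.0],
    rmse_limit=1.0,
)
print(toy_fractions)
```

A fraction close to 1 for a handful of randomly chosen realizations would support reusing the parameter sets; a low fraction would point to interaction between layering and parameters.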

AC1: 'Reply on RC1', Trine Enemark, 26 Aug 2023
The comment was uploaded in the form of a supplement: https://hess.copernicus.org/preprints/hess202374/hess202374AC1supplement.pdf

RC2: 'Comment on hess202374', Thomas Hermans, 16 Jun 2023
Dear authors,
I read with interest your paper entitled “Incorporating interpretation uncertainties from deterministic 3D hydrostratigraphic models in groundwater models” which investigates the role of both hydrostratigraphic uncertainty and model parameter uncertainty in the predictions of groundwater models. The approach starts from an interpreted model which is perturbed by adding uncertainty (at different levels) on the boundaries between categories to produce 50 different realizations. Then, each hydrostratigraphic realization is tested with a selected set of 200 model parameter combinations. It is concluded that the impact of hydrostratigraphic uncertainty is lower when the data density and reliability are high.
I find the paper well written, scientifically rigorous and an important tool to characterize uncertainty. I recommend publication after taking into account the following suggestions.
 It took me some time to understand that the paper would investigate both hydrostratigraphic interpretation and model parameters. In particular, the initial GLUE interpretation to select the model parameters was unexpected. I would therefore suggest i) clarifying this objective in the introduction and ii) presenting a step-by-step workflow in the methodology, ideally accompanied by a figure, to clarify the methodological approach from the beginning.
 I find the research context as currently presented in the introduction quite narrow. The field of uncertainty investigation in groundwater models is quite extensive, and the introduction is rather written as an incremental step in the Danish methodology. Actually, what you propose could have much broader applications, as the same ideas could easily be applied to other geological modelling approaches (for example conditioned multiple-point geostatistics). I also see some similarities with the work of Benoit et al. (2020, 2021) to simulate both hydrostratigraphic units and hydraulic conductivity uncertainty. In particular, your choice of first selecting 200 parameter distributions is justified by the desire to look at the marginal impact of hydrostratigraphy, but it ignores that zonation and model parameters likely interact (as likely illustrated by some of the rejected realizations you obtain), so that the posterior distributions for different scenarios are likely different (e.g., Hermans et al., 2015). Approaches that simultaneously simulate structural/scenario uncertainty with model parameter uncertainty (possibly with intra-facies variability) could also be introduced/discussed.
 I agree with reviewer 1 that the simulation approach to generate the hydrostratigraphic realizations should be better explained. Even if it is published in another paper, it is crucial for the current study and should therefore be included. In particular, the two-step procedure (first category boundary, second LF model) could be illustrated with an example for each uncertainty level, and corresponding simulations could be shown.
Below, I have a series of specific comments to further improve the manuscript.
 I wonder if using “deterministic” in the title is representative, as the uncertainty interpretation is actually based on stochastic simulations.
 “typically assigned” suggests that this is standard practice, but only a reference in preparation is added. Do you have other references to cite?
 With “large-scale”, do you mean the difficulty lies in upscaling the uncertainty? I am not getting the point: if an uncertainty measure is given at the proper scale, it should not be more difficult than at the small scale.
 L67-69. You seem to make a difference between your approach starting from an interpreted model, and a stochastic approach that would start from a definition of prior probabilities. Even if conceptually different, isn’t the end result equivalent (a set of realizations), with your interpreted model acting as a training image (e.g., Benoit et al., 2020)?
 L91-98. I miss a better explanation here (only dealt with (partly) later in section 3.1.1). From Figure 2, it seems that category 1 corresponds to wells, but it is unclear how the categories 2 to 4 are obtained (geophysics?). How is the density of interpretation points chosen? For example, in an AEM image, one could have interpretation points all along the flight lines (sounding every few meters). See also main comment 3.
 does or does not?
 L159-160. It might be interesting to show an example (in Supplementary material?). Although this is not the topic of the article, it would help the reader to grasp what type of uncertainty we are talking about. For example, in the absence of a borehole, an AEM survey would not distinguish between sand layers of different ages, and thin layers might be missed. If uncertainty about a boundary is included, is the uncertainty about whether a boundary is present at all also included in the different categories?
 L161-163. This is again not the main topic of the paper, but from a geophysicist’s perspective, the uncertainty about the presence of the layer is different from the uncertainty of the depth of the corresponding interface. I have no doubt that the clay is clearly visible on the geophysical inversion; however, the uncertainty related to its depth depends on the regularization (smoothing), the depth of the interface (loss of resolution with depth) and also the discretization (geophysical inversion typically uses a grid whose cell size increases with depth). I would at least refer to the papers where this has been dealt with.
 Section 3.1.2. Please see my main comment 3.
 L197-198. I don't get the sentence. The travel time is surely dependent on the distribution of hydraulic conductivity in the area, while the zonation can (should) be an integral part of a calibration process.
 L222-233. See main comments 1 and 2.
 Table 1. General Head and River conditions are mentioned in the table, but not in the description of the model and its boundary conditions. The outer boundary conditions (no flow?) and the recharge are not specified either. I guess the model is steady-state?
 There is no Enemark et al. (2021) in the reference list. Is it 2022?
 The transparency related to entropy is difficult to read on the figure because the initial color scale is made of nuances of the same colors (reddish, brownish). Maybe use a more diverse initial set of colors?
 This is a steady-state model; isn't a convergence problem then just a matter of convergence criteria rather than a similarity issue with the Manual interpretation? For example, because the solution would be further away from the initial state. Have you tried to use another solver or increase the number of iterations?
 Isn't it a problem of the set of parameter realizations that are not adequate, because of the interaction between zonation and model parameters? See main comment 2.
 I guess the river flow is an average flow in the river. Since Modflow will simulate the average base flow to the river, isn’t a large error expected (runoff component) in this case?
 “Have” instead of “has”
 Delete “impact of”
 L443-444. Other alternatives exist, using for example simulation-based learning that avoids calibration (see the recent review in HESS by Hermans et al. 2023 (section 3) and references therein, or Thibaut et al. 2021 for a recent application to well head protection area – so a similar context). But as this is related to my own work, I am clearly biased and I leave it to you to decide if this (and work from others) is relevant.
Sincerely yours,
Thomas Hermans
References
Benoit, N., Marcotte, D., Boucher, A., D’Or, D., Bajc, A., Rezaee, H., 2018. Directional hydrostratigraphic units simulation using MCP algorithm. Stoch Environ Res Risk Assess 32, 1435–1455. https://doi.org/10.1007/s0047701715069
Benoit, N., Marcotte, D., Molson, J., 2021. Stochastic correlated hydraulic conductivity tensor calibration using gradual deformation. Journal of Hydrology 594, 125880. https://doi.org/10.1016/j.jhydrol.2020.125880
Hermans, T., Goderniaux, P., Jougnot, D., Fleckenstein, J.H., Brunner, P., Nguyen, F., Linde, N., Huisman, J.A., Bour, O., Lopez Alvis, J., Hoffmann, R., Palacios, A., Cooke, A.K., Pardo-Álvarez, Á., Blazevic, L., Pouladi, B., Haruzi, P., Fernandez Visentini, A., Nogueira, G.E.H., Tirado-Conde, J., Looms, M.C., Kenshilikova, M., Davy, P., Le Borgne, T., 2023. Advancing measurements and representations of subsurface heterogeneity and dynamic processes: towards 4D hydrogeology. Hydrol. Earth Syst. Sci. 27, 255–287. https://doi.org/10.5194/hess272552023
Hermans, T., Nguyen, F., Caers, J., 2015. Uncertainty in training image-based inversion of hydraulic head data constrained to ERT data: Workflow and case study. Water Resources Research 51, 5332–5352. https://doi.org/10.1002/2014WR016460
Thibaut, R., Laloy, E., Hermans, T., 2021. A new framework for experimental design using Bayesian Evidential Learning: The case of wellhead protection area. Journal of Hydrology 603, 126903. https://doi.org/10.1016/j.jhydrol.2021.126903
Citation: https://doi.org/10.5194/hess202374RC2 
AC2: 'Reply on RC2', Trine Enemark, 26 Aug 2023
The comment was uploaded in the form of a supplement: https://hess.copernicus.org/preprints/hess202374/hess202374AC2supplement.pdf

RC3: 'Comment on hess202374', Anonymous Referee #3, 26 Jun 2023
The paper “Incorporating interpretation uncertainties from deterministic 3D hydrostratigraphic models in groundwater models” addresses the importance of characterizing uncertainties of both the hydrostratigraphic model and the model parameters of the groundwater model. The topic is well presented and of high importance. Therefore, I recommend publication after the following points are addressed.
General Remarks:
 At first, I found it challenging to understand the procedure of the performed uncertainty quantification. For example, it was not immediately clear that both the hydrostratigraphic and the model parameter uncertainties are investigated. This should be clarified in several parts of the paper. For instance, on p. 3 in l. 71-76, three scenarios are mentioned. However, it should be clarified that not a single realization but multiple realizations are run per scenario.
 There exists extensive literature about how to deal with uncertainties, especially in the field of geological modeling (e.g., Wellmann and Caumon, 2018), which needs to be added to the introduction to provide a broader perspective on this topic.
 I agree with the previous reviewers that the information provided about the low-frequency and manual interpretation model is insufficient and needs to be extended.
Further Remarks:
 p.1 l.14: The authors talk about “the qualitative and subjective nature of uncertainty”. In general, one distinguishes between epistemic and aleatoric uncertainties. While the statement is true for epistemic uncertainties it is not true for aleatoric uncertainties. So, the statement needs to be specified by explaining which type of uncertainties are addressed in the paper.
 p.2 l.42: A manuscript that is in preparation is cited. Please either publish that manuscript as a preprint and cite this preprint or use a different reference since the current reference is not available to the reader.
 p.2 l.42-43: Clarify which type of uncertainties the paper addresses (see also the first comment under further remarks).
 p. 5 Figure 1: It would be helpful to denote the profile lines with a, b, and c according to Figure 2, so that the two figures can be related to each other more easily.
 p. 5 l. 111: “The synthetic well field does exist in the real world … ”. Should the formulation not be “The synthetic well field does not exist in the real world”?
 p. 10 l. 183-184: Why were 50 realizations chosen? Has a convergence test been performed?
 p. 10 l. 189: Where are the uncertainties of the medium scenario listed?
 p. 11 Section 3.2.2: Provide the exact description and definitions of the boundary conditions and governing equations. It is not sufficient to list only the used packages.
 p. 11 l. 215: Why was a random sampling strategy chosen and not a quasirandom strategy such as the Latin Hypercube Sampling (LHS) method? The LHS would have the advantage of better sampling the parameter space with few samples and avoiding the clustering of sample points as it often occurs for the random sampling method.
 p. 11 l. 217: Why was a uniform prior chosen while all other considerations so far targeted normal distributions?
 p. 11 l. 220: Provide details on how it was determined that the parameters are insensitive. Which type of analysis was used to get to this conclusion?
 p. 15 l. 294-296: The reasons why the solutions are not converging should be listed. Especially if the non-convergence is related to specific parameter ranges, this might have a significant impact on the interpretation. Why is a trend of decreasing convergence problems observed with an increase in uncertainties? Would one not expect it to be the other way around?
 p. 15 l. 298-299: It should be explained whether and why the analysis is still representative if in one case 46 % of the realizations are discarded and in the other two scenarios only 6 % or 1 %.
 p. 16 l. 320: The standard deviation is only a valid measure for normal distributions. Have, for instance, Q-Q plots been generated to show that the data follow a normal distribution?
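To illustrate the sampling question raised for p. 11 l. 215, the following is a minimal sketch of Latin Hypercube Sampling using only the Python standard library. It is not the sampling code used in the manuscript. The function name `latin_hypercube` and the example parameter bounds (a log10 hydraulic conductivity range and a porosity range) are hypothetical and chosen here purely for illustration.

```python
import random


def latin_hypercube(n_samples, bounds, rng):
    """Draw n_samples points so that, in every dimension, each of the
    n_samples equal-width strata of the range contains exactly one point.

    bounds is a list of (low, high) pairs, one per parameter.
    """
    columns = []
    for low, high in bounds:
        # One uniform draw inside each stratum of the unit interval.
        unit = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so that strata are paired at random across dimensions.
        rng.shuffle(unit)
        columns.append([low + u * (high - low) for u in unit])
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


# Hypothetical bounds: log10 of hydraulic conductivity (m/s) and porosity (-).
example_bounds = [(-6.0, -2.0), (0.1, 0.4)]
samples = latin_hypercube(5, example_bounds, random.Random(42))
for point in samples:
    print(point)
```

Plain random sampling draws each coordinate independently over the full range, so some strata may receive several points and others none. The stratification above is what gives the more even coverage with few samples that the comment refers to.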
References:
 Wellmann, F., & Caumon, G. (2018). 3D Structural geological models: Concepts, methods, and uncertainties. In Advances in geophysics (Vol. 59, pp. 1-121). Elsevier.
Citation: https://doi.org/10.5194/hess202374RC3 
AC3: 'Reply on RC3', Trine Enemark, 26 Aug 2023
The comment was uploaded in the form of a supplement: https://hess.copernicus.org/preprints/hess202374/hess202374AC3supplement.pdf
Status: closed

RC1: 'Comment on hess202374', Marc Bierkens, 12 Jun 2023
Review of “Incorporating interpretation uncertainties from deterministic 3D hydrostratigraphic models in groundwater models” by Enemark et al.
When regional groundwater models are developed, an important step is to build a conceptual hydrostratigraphic model based on the geological information at hand. Conceptual hydrostratigraphic models are based on mapping the 3D juxtaposition of geological layers and translating these to aquifers and aquitards, which are subsequently populated with hydraulic parameters (conductivities, transmissivities, storage coefficients) and used to schematize the 3Dmakeup of a groundwater flow and/or transport model. The mapping of geological layers is preferably done by expert geologists that combine their conceptual knowledge of the depositional or structural geological environment with insitu borehole descriptions, outcrop information and geophysical data (e.g. gamma logs, EM measurements etc). However, since there is much room for interpretation, no two geologists will provide the same conceptual hydrostratigraphic model.
In this paper, the authors use a recently developed method to assess this “interpretation uncertainty” in hydrostratigraphic models to assess how the uncertainty about the layer boundaries between hydrostratigraphic propagates to the uncertainty in groundwater model outcomes. They compare this degree of uncertainty with the uncertainty that accrues from unknown hydraulic parameters (a more common analysis). Apart from demonstrating the method in an uncertainty analysis (focused on capture zone size and median travel time), the authors also show that the schematization uncertainty is important of little insitu data are available and if the layers to be identified and mapped are thin.
This is a valuable paper that presents a nice approach that is worth being be picked up by the groundwater modelling community in order to extent their toolbox of approaches in uncertainty assessment.
I think this paper deserves being published in HESS subject to resolving the following issues.
 The LowFrequency model is insufficiently explained in the paper. It may well be based on work by Madsen et al, but this paper needs be readable on its own. Particularly:
 How is the manual interpretation model constructed? Are the smooth lines between the interpretation points in Figure 2 actual kriged values? Was this a 2Dkriging per surface? What semivariogram was used? How does one make sure that the boundaries between layers do not cross or do cross in case of a presumed erosive surface? And how is this resolved? Is there a manul postprocessing?
 The simulation of vertical perturbations at the interpretation points is based on categories and then the standard deviation per category. A table with standard deviations should be given per case and per category.
 Please be more specific about the nature of the LF model? How is it fitted? What are its equations? Are they smoothing splines? Or moving averages of a zeroerror interpretation model interpolated with kriging between the randomly perturbed interpretation points? Or is it kriging with uncertain data? Perhaps a stepbystep procedure description would be helpful.
 How does the degree of smoothing takes account of spatially varying degrees of uncertainty? Is the level of smoothing spatially varying as well?
 It seems that with high uncertainty, the smoothing is more intense. But how does the LF model distinguish between actual locally higher spatial variability (while the data are certain and based on detailed borelogs) and spatial uncertainty?
 Section 3.2.3: here the 200 behavioural hydraulic parameter sets are selected with GLUE and assuming the manual interpretation model to be true. How is it guaranteed that these parameter sets are still behavioural for the other simulated realizations of layering? I understand that all combinations cannot be evaluated, but a few random hydrostratigraphies could be checked whether the 200 parameter sets are still close enough to be called behavioural?
 The conclusion that with thick layers and lower uncertainty about the layer boundaries is small compared to the uncertainty from hydraulic parameters: To what extent is this conclusion dependent on model approach. How were the heads and fluxes simulated in these models: a quasi3D aquiferaquitard schematization or with a full 3D voxel model? In the first case, vertical fluxes in aquifers are ignored and this may underestimate the impacts of the thickness of a layer, particularly near wells. In this case it is also understandable that thickness (one order of magnitude variation) has much less impact than conductivity (with multiple orders of magnitude variation ).
 I urge the authors to make the datasets (schematization, hard data, interpretation points and manual interpretation model) available.
 There are some small remarks and suggestions for improvements that I have put in the pdf attached.

AC1: 'Reply on RC1', Trine Enemark, 26 Aug 2023
The comment was uploaded in the form of a supplement: https://hess.copernicus.org/preprints/hess202374/hess202374AC1supplement.pdf
 The LowFrequency model is insufficiently explained in the paper. It may well be based on work by Madsen et al, but this paper needs be readable on its own. Particularly:

RC2: 'Comment on hess202374', Thomas Hermans, 16 Jun 2023
Dear authors,
I read with interest your paper entitled “Incorporating interpretation uncertainties from deterministic 3D hydrostratigraphic models in groundwater models” which investigates the role of both hydrostratigraphic uncertainty and model parameter uncertainty on the prediction of groundwater models. The approach starts from an interpreted model which is perturbed by adding uncertainty (at different levels) on the boundaries between categories to produce 50 different realization. Then, each hydrostratigraphic realization is tested with a selected set of 200 model parameters combinations. It is concluded that the impact of hydrostratigraphic uncertainty is lower when the data density and reliability is large.
I find the paper well written, scientifically rigorous and an important tool to characterize uncertainty. I recommend publication after taking into account the following suggestions.
 It took me some time to understand that the paper would both investigate hydrostratigraphic interpretation and model parameters. In particular, the initial GLUE interpretation to select the model parameters was unexpected. I would therefore suggest to i) clarify this objective in the introduction, ii) to present a stepbystep workflow in the methodology, ideally accompanied by a figure, to clarify from the beginning the methodological approach.
 I find the research context as currently presented in the introduction quite narrow. The field of uncertainty investigation in groundwater models is quite extended, and the introduction is rather written as an incremental step in the Danish methodology. Actually, what you propose could have much broader applications, as the same ideas could easily be applied to other geological modelling approaches (for example conditioned multiplepoint geostatistics). Actually, I see some similarities with the work of Benoit et al. (2020, 2021) to simulate both hydrostratigraphic units and hydraulic conductivity uncertainty. In particular, your choice of selecting first 200 parameter distributions is justified by the desire to look at the marginal impact of hydrostratigraphy, but it ignores that zonation and model parameter likely interact (as likely illustrated by some of the rejected realizations you obtain), so that the posterior distribution for different scenarios are likely different (e.g., Hermans et al., 2015). Approaches that simultaneously simulate structural/scenario with model parameter uncertainty (possibly with intrafacies variability) could also be introduced/discussed.
 I agree with reviewer 1 that the simulation approach to generate the hydrostratigraphic realizations should be better explained. Even if it is published in another paper, it is crucial for the current study and should therefore be included. In particular, the two step procedure (first category boundary, second LF model) could be illustrated with an example for each uncertainty level and corresponding simulations could be shown.
Below, I have a series of specific comments to further improve the manuscript.
 I wonder if using “deterministic” in the title is representative, as the uncertainty interpretation is actually based on stochastic simulations.
 “typically assigned” suggests that this is standard practice, but only a reference in preparation is added. Do you have other references to cite ?
 With “largescale” do you mean the difficulty lies in upscaling the uncertainty? I am not getting the point, as if an uncertainty measure is given at the proper scale, it should not be more difficult that at the small scale.
 L6769. You seem to make a difference between your approach starting from an interpreted model, and a stochastic approach that would start from a definition of prior probabilities. If conceptually different, isn’t the endresult equivalent (a set of realizations), your interpreted model acting as a training image (e.g., Benoit et al., 2020).
 L9198. I miss a better explanation here (only dealt with (partly) later in section 3.1.1) . From Figure 2, it seems that category 1 corresponds to wells, but it is unclear how the categories 2 to 4 are obtained (geophysics ?). How is the density of interpretation points chosen? For example, in a AEM image, one could have interpretation points all along the flight lines (sounding every few meters). See also main comment 3.
 does or does not ?
 L159160. It might be interesting to show an example (in Supplementary material?). Although this is not the topic of the article, it would help the reader to grasp what type of uncertainty we are talking about. For example, in absence of borehole, an AEM survey would not make the difference between sand layers of different ages, and thin layers might be missed. If uncertainty about a boundary is included, is the uncertainty about the presence of a boundary or not included in the different categories?
 L161163. This is again not the main topic of the paper, but from a geophysicist’s perspective, the uncertainty about the presence of the layer is different from the uncertainty of the depth of the corresponding interface. I have no doubt that the clay is clearly visible on the geophysical inversion, however the uncertainty related to its depth depends on the regularization (smoothing), the depth of the interface (loss of resolution with depth) but also the discretization (geophysical inversion typically uses a grid whose cell size increases with depth). I would at least refer to the papers where this has been dealt with.
 Section 3.1.2. Please see my main comment 3.
 L197-198. I don't get the sentence. The travel time is surely dependent on the distribution of hydraulic conductivity in the area, while the zonation can (should) be an integral part of a calibration process.
 L222-233. See main comments 1 and 2.
 Table 1. General Head and River conditions are mentioned in the table, but not in the description of the model and its boundary conditions. The outer boundary conditions (no flow?) and the recharge are not specified either. I guess the model is steady-state?
 There is no Enemark et al. (2021) in the reference list. Is it 2022?
 The transparency related to entropy is difficult to read on the figure because the initial color scale is made of shades of the same colors (reddish, brownish). Maybe use a more diverse initial set of colors?
 This is a steady-state model, so isn't a convergence problem just a matter of convergence criteria rather than a similarity issue with the Manual interpretation? For example, because the solution would be further away from the initial state. Have you tried to use another solver or to increase the number of iterations?
 Isn't the problem that the set of parameter realizations is not adequate, because of the interaction between zonation and model parameters? See main comment 2.
 I guess the river flow is an average flow in the river. Since MODFLOW will simulate the average base flow to the river, isn’t a large error expected (runoff component) in this case?
 “Have” instead of “has”
 Delete “impact of”
 L443-444. Other alternatives exist, using for example simulation-based learning that avoids calibration (see the recent review in HESS by Hermans et al. 2023 (section 3) and references therein, or Thibaut et al. 2021 for a recent application to a wellhead protection area, so a similar context). But as this is related to my own work, I am clearly biased, and I leave it to you to decide whether this (and work from others) is relevant.
Sincerely yours,
Thomas Hermans
References
Benoit, N., Marcotte, D., Boucher, A., D’Or, D., Bajc, A., Rezaee, H., 2018. Directional hydrostratigraphic units simulation using MCP algorithm. Stoch Environ Res Risk Assess 32, 1435–1455. https://doi.org/10.1007/s0047701715069
Benoit, N., Marcotte, D., Molson, J., 2021. Stochastic correlated hydraulic conductivity tensor calibration using gradual deformation. Journal of Hydrology 594, 125880. https://doi.org/10.1016/j.jhydrol.2020.125880
Hermans, T., Goderniaux, P., Jougnot, D., Fleckenstein, J.H., Brunner, P., Nguyen, F., Linde, N., Huisman, J.A., Bour, O., Lopez Alvis, J., Hoffmann, R., Palacios, A., Cooke, A.K., Pardo-Álvarez, Á., Blazevic, L., Pouladi, B., Haruzi, P., Fernandez Visentini, A., Nogueira, G.E.H., Tirado-Conde, J., Looms, M.C., Kenshilikova, M., Davy, P., Le Borgne, T., 2023. Advancing measurements and representations of subsurface heterogeneity and dynamic processes: towards 4D hydrogeology. Hydrol. Earth Syst. Sci. 27, 255–287. https://doi.org/10.5194/hess272552023
Hermans, T., Nguyen, F., Caers, J., 2015. Uncertainty in training image-based inversion of hydraulic head data constrained to ERT data: Workflow and case study. Water Resources Research 51, 5332–5352. https://doi.org/10.1002/2014WR016460
Thibaut, R., Laloy, E., Hermans, T., 2021. A new framework for experimental design using Bayesian Evidential Learning: The case of wellhead protection area. Journal of Hydrology 603, 126903. https://doi.org/10.1016/j.jhydrol.2021.126903
Citation: https://doi.org/10.5194/hess202374RC2 
AC2: 'Reply on RC2', Trine Enemark, 26 Aug 2023
The comment was uploaded in the form of a supplement: https://hess.copernicus.org/preprints/hess202374/hess202374AC2supplement.pdf

RC3: 'Comment on hess202374', Anonymous Referee #3, 26 Jun 2023
The paper “Incorporating interpretation uncertainties from deterministic 3D hydrostratigraphic models in groundwater models” addresses the importance of characterizing uncertainties of both the hydrostratigraphic model and the model parameters of the groundwater model. The topic is well presented and of high importance. Therefore, I recommend publication after the following points are addressed.
General Remarks:
 At first, I found it challenging to understand the procedure used for the uncertainty quantification, for example that both the hydrostratigraphic and the model parameter uncertainties are investigated. This should be clarified in several parts of the paper. For instance, on p. 3, l. 71-76, three scenarios are mentioned. However, it should be clarified that not a single realization but multiple realizations are run per scenario.
 There exists extensive literature about how to deal with uncertainties, especially in the field of geological modeling (e.g., Wellmann and Caumon, 2018), which needs to be added to the introduction to provide a broader perspective on this topic.
 I agree with the previous reviewers that the information provided about the low-frequency and manual interpretation model is insufficient and needs to be extended.
Further Remarks:
 p. 1 l. 14: The authors talk about “the qualitative and subjective nature of uncertainty”. In general, one distinguishes between epistemic and aleatoric uncertainties. While the statement is true for epistemic uncertainties, it is not true for aleatoric uncertainties. So, the statement needs to be made more specific by explaining which type of uncertainty is addressed in the paper.
 p.2 l.42: A manuscript that is in preparation is cited. Please either publish that manuscript as a preprint and cite this preprint or use a different reference since the current reference is not available to the reader.
 p. 2 l. 42-43: Clarify which type of uncertainty the paper addresses (see also the first comment under Further Remarks).
 p. 5 Figure 1: It would be helpful to denote the profile lines with a, b, and c according to Figure 2, so that the two figures can be related to each other more easily.
 p. 5 l. 111: “The synthetic well field does exist in the real world … ”. Should the formulation not be “The synthetic well field does not exist in the real world”?
 p. 10 l. 183-184: Why were 50 realizations chosen? Has a convergence test been performed?
 p. 10 l. 189: Where are the uncertainties of the medium scenario listed?
 p. 11 Section 3.2.2: Provide the exact description and definitions of the boundary conditions and governing equations. It is not sufficient to list only the used packages.
 p. 11 l. 215: Why was a random sampling strategy chosen and not a quasi-random strategy such as the Latin Hypercube Sampling (LHS) method? LHS would have the advantage of sampling the parameter space better with few samples and of avoiding the clustering of sample points that often occurs with random sampling.
 p. 11 l. 217: Why was a uniform prior chosen while all other considerations so far targeted normal distributions?
 p. 11 l. 220: Provide details on how it was determined that the parameters are insensitive. Which type of analysis was used to get to this conclusion?
 p. 15 l. 294-296: The reasons why the solutions are not converging should be listed. Especially if the non-convergence is related to specific parameter ranges, this might have a significant impact on the interpretation. Why is a trend of decreasing convergence problems observed with an increase in uncertainties? Would one not expect it to be the other way around?
 p. 15 l. 298-299: It should be explained whether and why the analysis is still representative if in one scenario 46 % of the realizations are discarded and in the other two scenarios only 6 % or 1 %.
 p. 16 l. 320: The standard deviation is only a valid measure for normal distributions. Have, for instance, Q-Q plots been generated to show that the data follow a normal distribution?
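To illustrate the sampling remark on p. 11 l. 215, the following is a minimal sketch of the difference between plain random sampling and Latin Hypercube Sampling for uniform priors. It is not the authors' implementation. The function names, the two parameter ranges (read here as log10 hydraulic conductivities) and the sample size are hypothetical and chosen only for illustration.

```python
import random


def random_sample(n, bounds, rng):
    """Plain Monte Carlo: every parameter is drawn independently and uniformly."""
    return [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n)]


def latin_hypercube(n, bounds, rng):
    """LHS: split each parameter range into n equal strata, draw one value
    inside every stratum, then shuffle each parameter column independently."""
    columns = []
    for lo, hi in bounds:
        width = (hi - lo) / n
        column = [lo + (k + rng.random()) * width for k in range(n)]
        rng.shuffle(column)
        columns.append(column)
    return [list(row) for row in zip(*columns)]


def strata_hit(samples, bounds, dim):
    """Count how many of the n equal-width strata of one parameter are sampled."""
    n = len(samples)
    lo, hi = bounds[dim]
    hit = {min(int((s[dim] - lo) / (hi - lo) * n), n - 1) for s in samples}
    return len(hit)


# Hypothetical uniform prior ranges for two log10 hydraulic conductivities.
bounds = [(-6.0, -3.0), (-9.0, -6.0)]
rng = random.Random(42)
n = 20

mc = random_sample(n, bounds, rng)
lhs = latin_hypercube(n, bounds, rng)

for name, samples in (("random", mc), ("LHS", lhs)):
    covered = [strata_hit(samples, bounds, d) for d in range(len(bounds))]
    print(name, "strata covered per parameter:", covered, "of", n)
```

By construction, the LHS design places exactly one sample in each of the n strata of every parameter, whereas plain random sampling usually leaves some strata empty and others sampled several times. That is the clustering effect referred to in the comment.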
References:
 Wellmann, F., & Caumon, G. (2018). 3D Structural geological models: Concepts, methods, and uncertainties. In Advances in Geophysics (Vol. 59, pp. 1–121). Elsevier.
Citation: https://doi.org/10.5194/hess202374RC3 
AC3: 'Reply on RC3', Trine Enemark, 26 Aug 2023
The comment was uploaded in the form of a supplement: https://hess.copernicus.org/preprints/hess202374/hess202374AC3supplement.pdf
Trine Enemark et al.