the Creative Commons Attribution 4.0 License.
the Creative Commons Attribution 4.0 License.
Datadriven estimates for the geostatistical characterization of subsurface hydraulic properties
Falk Heße
Sebastian Müller
Sabine Attinger
Abstract. The geostatistical characterization of the subsurface is confronted with the double challenge of large uncertainties and high exploration costs. Making use of all available data sources is consequently very important. Bayesian inference is able to mitigate uncertainties in such a data scarce context by drawing on available background information in form of a prior distribution. To make such a prior distribution transparent and objective, it should be calibrated against a data set containing estimates of the target variable from available sites. In this study, we provide a collection of covariance/variogram functions of the subsurface hydraulic parameters from a large number of sites. We analyze this data set by fitting a number of widely used variogram model functions and show how they can be used to derive prior distributions of the parameters of said functions. In addition, we discuss a number of conclusions that can be drawn for our analysis and possible uses for the data set.
Falk Heße et al.
Status: open (until 30 Mar 2023)

RC1: 'Comment on hess202315', Anonymous Referee #1, 22 Feb 2023
reply
The paper "Datadriven estimates for geostatistical characterization of subsurface hydraulic properties" by Hesse et al. is a nice try to build a comprehensive data set of variogram parameters that could be used as reference values or as prior distributions for cases in which there are not enough data to construct an experimental variogram from the available observations. While the proposal seems interesting, I have many reservations about the paper's contents and how the results are obtained and analyzed.
First of all, pretending to find the "universal" variogram parameters distribution is a little bit pretentious. The analysis is made from a purely mathematical/computer science perspective without considering that variograms relate to the underlying geology. In this sense, one would not expect variograms computed in fractured media to be the same as those computed in sedimentary formations. Also, and very importantly, a parameter such as hydraulic conductivity is not expected to vary in space smoothly under any circumstances; for this reason, the use of Gaussian variograms or, in general, variograms with low roughness should never be advocated (unless they are accompanied by a nugget effect that would break the smoothness imposed by those variograms).
Next, the terminology used departs sufficiently from the one used in standard geostatistical literature (see, Journel and Huijbrets, 1978; Isaaks and Srivastava, 1989; Deutsch and Journel, 1991; or Goovaerts, 1997) to make it very confusing, and, at some points incorrect.
Let's dive into more detail about my reservations. The variogram is a vectorial function that depends on the separation distance vector h; only in the case of isotropic variograms in all directions does such a function becomes a scalar one. The definition in section 2.2 and the analysis thereof is made assuming that the variogram is a scalar function depending only on the modulus of h. This is a significant conceptual flaw, which is not cleaned with the anisotropy analysis made in section 3.3. The expression of the variogram function in line 135 is incorrect. You cannot separate the variance and the nugget effect. The variance of the random function model is (n+σ^{2}) (using the notation in that section), and the covariance is the variance minus de variogram ((n+σ^{2})γ). The authors should recall that, generally, the variogram is modeled as a sum of variogram functions, each one contributing their share to the total variance (or variogram sill). They have chosen to model it as the sum of two variogram models, a nugget effect (with a contribution of n) and another parametric function (with a contribution of σ^{2}), which is perfectly OK; however, the authors should acknowledge that there could be fitting with more than two components. What is not OK is much of the later analysis in which this fact is disregarded: many of the findings are due to this fact, the fitting of a nugegt plus another variogram, and some of the conclusions are said to be due to, say, the type of parametric model used, disregarding that a nugget had been used to fit the variogram.
Very important, σ^{2} is not the variance of the process! It is only the contribution of one of the two structures fitted to the total variance.
The variogram computed as a function of the modulus of the separation vector is referred to as the omnidirectional variogram. It is generally used during the modeling process to assess the nugget effect and the sill of the variogram and to have an idea of the average range from the different directions, but it is not a variogram to be used in practice for anisotropic media. Variograms are vectorial functions, and as such, their value depends on the vector orientation reflecting the anisotropy in the correlation patterns of the variable under analysis. Such anisotropy is characterized by an ellipsoid (in 3D) or an ellipse (in 2D), which, in principle, may be arbitrarily oriented, although, in practice, it is assumed that one of the ellipsoid axes is oriented vertically and the other two in the XY plane, not necessarily parallel to the Cartesian axes. The lengths of the semiaxes of this ellipsoid are known as the anisotropy ranges and are crucial to characterizing the spatial continuity of the attribute. The authors disregard the vectorial nature of the variogram and introduce the "characteristic length" as one of the parameters of the parametric variogram components. The characteristic length, 'l' in the equations in the lines between 135 and 150, is not the range and may induce mistakes. In standard geostatistics, the range is defined as the distance at which the variogram value reaches the sill, or for variograms that are asymptotic to the sill, the distance at which the variogram reaches at least 95 % of the sill. This means the range will equal l for the spherical variogram but 3l for the Gaussian or exponential ones. Comparing the value of l for different variogram functions is like comparing two different quantities.
I am very concerned with the many published theoretical papers in which the variogram is assumed to be Gaussian. This assumption is always a matter of convenience, especially in those papers more focused on analytical approaches, disregarding the fact that Gaussian variograms make only sense for variables that vary smoothly in space, such as the thickness of a layer, but never should be used to model a parameter characterized by a very large spread with significant shortscale variability like the hydraulic conductivity.
It is not clear if the variograms analyzed are for K or logK. Line 157 mentions K, and nowhere is logK mentioned. The variogram of K may not of interest in geostatistical studies since it is common to use a lognormal random function to model the spatial variability of K. In addition, if K had been used, my criticism of using Gaussian variograms (or variograms with low roughness) is exacerbated.
Considering the significant uncertainties associated with experimental variograms, trying to distinguish whether a stable or a Matern variogram fits better than an exponential or a spherical one is meaningless. The analysis would have benefited from comparing just the spherical and exponential variograms (plus the nugget effect). Neither the Matern nor the stable models make sense when modeling hydraulic conductivity from a geological point of view. A much more meaningful fitting would have been a power variogram, which would relate to the fractal behavior claimed by Neuman.
More specific comments
What are observable base rates? (line 22)
What is a variogram cloud? (line 39, line 96, and many more times in the following pages)
The paragraph from lines 52 to 55 is meaningless.
After introducing the roughness coefficient in line 166, the authors analyze the roughness of the different parametric functions used in their variogram definitions, but they fail to recognize that as soon as a nugget effect is added, the roughness of the fitted variogram is zero. This explains why, later, some of the analyses seem to favor fitting a "Gaussian variogram" when in truth, they are fitting a variogram composed of a nugget effect plus a Gaussian component. The nugget removes the implicit roughness of the Gaussian variogram, justifying its application in practice.
When introducing section 3, the authors discuss the length scale and vertical and horizontal anisotropy. For the first time, the authors recognize that anisotropy may be an issue, but it is poorly defined. The length scale could be the larger range, and the horizontal and vertical anisotropies could be the ratios of the ranges in the other two principal directions with respect to the largest range. But this is not what the authors analyze; they focus on the range of the omnidirectional variogram, plus the ranges in some arbitrary directions in the XY plane (not necessarily in the directions of the principal ranges) and the range in the vertical direction.
In section 2.3, it would be nice to discuss the fitting procedure used to get the variogram parameters.
In the fitting algorithm, the authors should use an algorithm capable of fitting a fully vectorial function and attempt to identify not just a single omnidirectional range but actually the whole anisotropy ellipsoid (or ellipse), that is, the three (or two) principal ranges and their orientations.
The statement in line 215 does not seem to be much justified. Why should all models fit data? There may be physical reasons why one kind of variogram is more appropriate than other. In some cases, as already mentioned, some variogram models are inconsistent with the sample data (such as the Gaussian variogram without nugget effect to fit the spatial variability of hydraulic conductivity).
Line 223. The authors realize that the nugget effect is responsible for the similar results yielded but the different parametric models.
The matrices of scatterplots must be explained. What does each point in each scatterplot represent? The diagonal scatterplots make no sense unless they represent something different from the rest of the scatterplots. If they show the same parameters, all points in the diagonal scatterplots should fall in the diagonal at 45 degrees.
The end of the sentence in line 229 is very important. It is always so! The particularities of the site under analysis should drive the analysis. Trying to find the de "universal" parameter distribution is a mistake.
The second sentence in the paragraph starting at line 230 is hard to understand.
Unclear what is meant in the sentence that starts in line 251 to the end of the paragraph.
End of page 12: A power variogram could explain this behavior.
In the paragraph in line 260, the relationship between the lengths should be at the 45degree line when comparing spherical and Gaussian and at the y=3x for the comparison with the spherical variogram.
Section 3.3 makes absolutely no sense. Anisotropy is a fact and should be appropriately analyzed and accounted for. Analyzing λ_{x} and λy when they do not represent the actual principal directions of the anisotropy ellipse is useless.
The analysis of λz would make sense if compared with the largest range in the plane but not when compared with an arbitrary range calculated in the x direction.
In section 3.4, I hope that the normalization of the nugget was done with respect to the total variance (n+σ^{2}), not with regard to σ^{2}. There is no reason why n could not be larger than σ^{2}, therefore escaping from the [0,1] limits.
The nugget not only accounts for measurement errors but also for shortscale variability. The discussion in this section is poor and, at times, flawed.
Line 354 is explained because the Gaussian model needs the nugget to fit most experimental variograms, whereas the exponential does not. Consequently, the results are not an artifact of the fitting procedure.
In the opening of section 3.5, the authors forget that the length parameter is not enough; there is an anisotropy ellipse.
Why does Figure 15 not show a histogram as before?
The discussion about survivor bias is very interesting.
The quality of the figures must be improved.
I think a reanalysis of the data accounting for the mistakes made in this evaluation, probably reducing the number of variogram functions and avoiding the construction of the prior probability functions for the different models, would be worth publishing.
Citation: https://doi.org/10.5194/hess202315RC1 
RC2: 'Comment on hess202315', Anonymous Referee #2, 12 Mar 2023
reply
The paper presents present a datadriven approach to a classical problem in subsurface hydrology, the estimation of parameters characterizing the variogram of subsurface properties. The proposed method advocates the use of Bayesian inference to set up a prior distribution for models that describe spatial correlations (covariance or variogram). A remarkable data set is examined, and available data are classifiable into primary (point measurements), secondary (empirical variogram functions), or tertiary (statistical estimates of subsurface properties) data. Data were processed to avoid overlaps and overrepresentation.
The first available result from the manuscript is the comparison between different variogram model functions, that could be improved in my view (see comment #2). The scale dependency of the hydraulic conductivity is examined next, confirming earlier literature results. An example application of the Bayesian approach to the estimate of correlation length given the maximum length scale is then presented. Other variogram parameters examined are anisotropy, nugget effect, and shape parameter.
The discussion addresses important issues such as the unbiasedness of the data set employed, among others.
The paper looks as a mature contribution; given the topic and the type of paper, I see however some room for further improvement. Results are of interest to the readership of Hydrology and Earth System Sciences. The methods are adequate, the paper subdivision into sections sound, and the figures illustrative. I recommend minor revisions for the reason explained below.
 The manuscript examines only stationary variograms, I suggest to mention that nonstationary variograms (see, e.g.,Di Federico and Neuman, 1997) were excluded from the analysis.
 The comparison among variograms having a different numbers of degrees of freedom (section 3.1) could be rendered more qualitative by model identification criteria (AOC, AIC_c, KIC, …), incorporating the number of parameters involved and the principle of parsimony. The same holds probably for other comparisons performed.
 How crucial is the assumption of two independent Gaussian distributions in Section 3.2 ? Could they develop a more general theory without it, maybe subject to other limitations ?
Citation: https://doi.org/10.5194/hess202315RC2 
RC3: 'Comment on hess202315', Shlomo P. Neuman, 14 Mar 2023
reply
The authors have conducted an interesting statistical analysis of raw variograms (I assume that this is what the authors mean by variogram clouds) of published data from a variety of geographic locations and geologic settings. Though the data are said to include saturated hydraulic conductivities from aquifers and shallow soils, as well as permeabilities and transmissivities, I assume (but the authors need to confirm or otherwise clarify) that the data consist of logarithms of these variables. A strength AND a weakness of the analysis is the lack of clear distinction between varied hydraulic properties (conductivity versus transmissivity) from varied settings (sediments, rocks, porous versus fractured media) measured by varied means (I presume single and multihole pressure tests, perhaps other) on varied scales of measurement. The strength of the approach is that it reveals commonalities among the data (scaling and amenability to analysis by a variety of variogram models); its weakness is lack of clarity about the way hydrogeologic settings, methods and scales of measurements, affect variogram behavior.
To me, one of the more interesting results of the analysis is confirmation of a scale effect that reveals itself, precisely, when one throws data from varied sites and settings into a single basket, as do the authors. It would therefore behove the authors to take the additional step of attempting to fit a truncated power variogram to their data, thereby potentially showing that such a variogram captures a greater range of behaviors with fewer parameters than do the stationary variograms they consider.
Not only might a truncated power variogram prove to be more general and parsimonious than do standard stationary variogram, but they would be much more suitable than the latter when one needs to predict, using a transport model, how a plume of contaminant might migrate and behave outside the domain originally used to define the variogram.
I think that with additional work along the lines I have just proposed, this paper could stand out as a much more significant contribution to the literature than it does now.
Citation: https://doi.org/10.5194/hess202315RC3
Falk Heße et al.
Falk Heße et al.
Viewed
HTML  XML  Total  BibTeX  EndNote  

263  77  10  350  2  5 
 HTML: 263
 PDF: 77
 XML: 10
 Total: 350
 BibTeX: 2
 EndNote: 5
Viewed (geographical distribution)
Country  #  Views  % 

Total:  0 
HTML:  0 
PDF:  0 
XML:  0 
 1