Challenges in conditioning a stochastic geological model of a heterogeneous glacial aquifer to a comprehensive soft data set

In traditional hydrogeological investigations, one geological model is often used based on subjective interpretations and sparse data availability. This deterministic approach usually does not account for any uncertainties. Stochastic simulation methods address this problem and can capture the geological structure uncertainty. In this study the geostatistical software TProGS is utilized to simulate an ensemble of realizations for a binary (sand/clay) hydrofacies model in the Norsminde catchment, Denmark. TProGS can incorporate soft data, which represent the associated level of uncertainty. High-density (20 m× 20 m× 2 m) airborne geophysical data (SkyTEM) and categorized borehole data are utilized to define the model of spatial variability in horizontal and vertical direction, respectively, and both are used for soft conditioning of the TProGS simulations. The category probabilities for the SkyTEM data set are derived from a histogram probability matching method, where resistivity is paired with the corresponding lithology from the categorized borehole data. This study integrates two distinct data sources into the stochastic modeling process that represent two extremes of the conditioning density spectrum: sparse borehole data and abundant SkyTEM data. In the latter the data have a strong spatial correlation caused by its high data density, which triggers the problem of overconditioning. This problem is addressed by a work-around utilizing a sampling/decimation of the data set, with the aim to reduce the spatial correlation of the conditioning data set. In the case of abundant conditioning data, it is shown that TProGS is capable of reproducing non-stationary trends. The stochastic realizations are validated by five performance criteria: (1) sand proportion, (2) mean length, (3) geobody connectivity, (4) facies probability distribution and (5) facies probability–resistivity bias. In conclusion, a stochastically generated set of realizations soft-conditioned to 200 m moving sampling of geophysical data performs most satisfactorily when balancing the five performance criteria. The ensemble can be used in subsequent hydrogeological flow modeling to address the predictive uncertainty originating from the geological structure uncertainty.


Introduction
Constraints in accurate and realistic solute transport modeling in hydrogeology are caused by the difficulty in characterizing the geological structure.The subsurface heterogeneity heavily influences the distribution of contaminants in the groundwater system.The scale of heterogeneity is often smaller than the data availability (e.g., borehole spacing).In traditional hydrogeological studies, one geological model is built based on the best comprehensive knowledge from often sparse borehole data and subjective interpretations.This can lead to alleged correct results, for instance when addressing the water balance on catchment scale, but will often prove to be inadequate for simulations beyond general flows and heads, e.g., contaminant transport modeling.Therefore, it is proposed by numerous studies that the uncertainty on the geological conceptualization is crucial when assessing uncertainties on flow paths (Neuman, 2003;Bredehoeft, 2005;Hojberg and Refsgaard, 2005;Troldborg et al., 2007;Seifert et al., 2008).One of the strategies often recommended for characterizing geological uncertainty and assessing its impact on hydrological predictive uncertainty is the use of multiple geological models (Renard, 2007;Refsgaard et al., 2012).
In this respect geostatistical tools such as two-point statistics, for example TProGS (Carle and Fogg, 1996;Carle et al., 1998), and multipoint statistics (MPS) (Caers and Zhang, 2002;Strebelle, 2002;Caers, 2003;Journel, 2004) are powerful tools as they enable the generation of multiple equally plausible realizations of geological facies structure.This study targets the realistic description of heterogeneity in a geological model by introducing diverse data into the stochastic modeling process to generate a set of equally plausible realizations of the subsurface using geostatistics (Refsgaard et al., 2006).
In geostatistical applications, field observations can constrain the simulation as soft or hard conditioning."Hard conditioning" forces the realizations to honor data completely, whereas "soft conditioning" honors the data partly with respect to the uncertainty of the observation (Falivene et al., 2007).This feature is essential because it enables the user to associate uncertainties with the conditioning data set that can be of either subjective or objective nature.Incorporating a comprehensive and continuous soft conditioning data set to a stochastic simulation such as TProGS is challenging.Alabert (1987) published an early study on the implications of using sparse soft conditioning data to a stochastic simulation.The analysis shows that soft conditioning significantly increases the quality of the realizations.The same was also observed by McKenna and Poeter (1995), where soft data from geophysical measurements could significantly improve the geostatistical simulation.In the past years, highly sophisticated geophysical methods and advanced computational power have allowed for stochastic simulations that are conditioned to a vast auxiliary data set.This poses new challenges to the data handling and to the simulation techniques.Chugunova and Hu (2008) present a study where continuous auxiliary data, in addition to the general training image, are introduced directly, to a MPS simulation.MPS requires a site-specific training image that represents the geological structure accordingly, which is often the main source of uncertainty in MPS simulations.The above mentioned MPS studies conduct mostly 2-D simulations, partly on synthetic data.The training image is the backbone of the MPS method and it has been acknowledged by dell 'Arciprete et al. (2012) and He et al. (2013) that reliable 3-D training images are difficult to acquire.
Alternative methods to integrate vast auxiliary information (e.g., geophysics) into the modeling process and at the same time force local accuracy are collocated cokriging or cosimulation techniques (Babak and Deutsch, 2009).Here a linear relationship between the auxiliary variable and the target variable is built in a model of cross covariance.The essentially linear relationship is often too restrictive and does not represent the complex physical processes.Mariethoz et al. (2009b) present a prospective method that extends the collocated simulation method by using a model of spatial variability of the target variable and a joint probability density distribution to depict the conditional distribution of the target variable and the auxiliary variable at any location.
The method of anchored distributions (MAD) (Rubin et al., 2010) is a suitable approach for the inverse modeling of spatial random fields with conditioning to local auxiliary information.Structural parameters such as global trends and geostatistical attributes are considered in a conditional simulation.The conditioning is undertaken by anchored distributions which statistically represent the relationship between any data and the target variable.
The truncated pluri-Gaussian simulation method (Mariethoz et al., 2009a) generates a Gaussian field for the target and the auxiliary variable using variogram statistics.These Gaussian fields are truncated to produce categorical variables that represent the hydrofacies.The truncation is controlled by threshold values that can be defined in a "lithotype rule" that represents the general geological concept.It is a very flexible method, because conceptual understandings are easily incorporated, but non-stationarity and especially directionally dependent lithotype rules are difficult to incorporate.
TProGS is a well-established stochastic modeling tool for 3-D applications and it has been successfully applied to simulate highly heterogeneous subsurface systems by constraining the simulation to borehole data (Carle et al., 1998;Fleckenstein et al., 2006).Weissmann et al. (1999), Weissmann and Fogg (1999) and Ye and Khaleel (2008) use additional spatial information obtained from soil surveys, sequence stratigraphy and soil moisture, respectively, for accessing the complex lateral sedimentary variability and thus improving the quality of the model in terms of spatial variability.It has not been tested whether TProGS is capable of handling abundant soft conditioning data.Moreover, the risk that cell-by-cell soft constraining may cause an overconditioning of the simulation has not been fully investigated.Overconditioning is defined by the authors as an effect triggered by dense and spatially correlated conditioning data that produce an altered picture of observable uncertainties.Therefore the self-consistency of the stochastic simulation is questioned, because soft constraining should be treated accordingly during the simulation.
Recent studies by Lee et al. (2007) and dell 'Arciprete et al. (2012) highlight that TProGS is comparable with other geostatistical methods like multipoint statistics, sequential Gaussian simulations and variogram statistics (Gringarten and Deutsch, 2001).The distinct strength of TProGS is the simple and direct incorporation of explicit facies manifestations like mean length, proportion and (asymmetric) juxtapositional tendencies of the facies.
Geophysical data sets are valuable information in many hydrogeological investigations.The data can considerably improve the conceptual understanding of a facies or hydraulic conductivity distribution and identify non-stationary trends.However, the integration of geophysical data and lithological borehole descriptions is often difficult.A recent study by Emery and Parra (2013) presents an approach to combine borehole data and seismic measurements in a geostatistical simulation to generate multiple realizations of porosity.Hubbard and Rubin (2000) review three methods that allow hydrogeological parameter estimation based on geophysical data.The three methods link seismic, ground penetrating radar (GPR) and tomographic data with sparse borehole data to support the hydrogeological description of the study site.Our study integrates high-resolution airborne geophysical data with borehole data to build a probabilistic classification of the subsurface at site.The geophysical data are collected by SkyTEM, an airborne transient electromagnetic method (TEM) that has been used extensively in Denmark for the purpose of groundwater mapping (Christiansen and Christensen, 2003;Jorgensen et al., 2003b;Sorensen and Auken, 2004;Auken et al., 2009).This study utilizes a method that translates SkyTEM observation data into facies probability, which enables associating the geophysical data with softness according to the level of uncertainty.Very few studies have integrated high-resolution airborne geophysical data in a stochastic modeling process (Gunnink and Siemon, 2009;He et al., 2013).
Most stochastic studies only make relatively simple validations of how well the simulations are able to reproduce known geostatistical properties.Carle (1997) and Carle et al. (1998) investigate the goodness of fit between the simulated and the defined model of spatial variability.The geobody connectivity is used by dell' Arciprete et al. (2012) to compare results originated from two-and multipoint geostatistics.Chugunova and Hu (2008) make a simple visual comparison between the auxiliary variable fracture density and stochastic realizations of the simulated fracture media.A more advanced validation is conducted by Mariethoz et al. (2009b), where simulated variograms and histograms are compared with reference data for the simulation of synthetic examples.In spite of these few studies that have addressed the validation issue, no guidance on which performance criteria to use and how to conduct a systematical validation of a stochastic simulation has been provided so far.
It should be noted that we, in line with Refsgaard and Henriksen (2004), do not use the term model validation in a universal manner, but in a site-specific context where a model validation is limited to the variables for which it has been tested as well as to the level of accuracy obtained during the validation tests.
The objectives of this study are (1) to set up TProGS for a study site based on lithological borehole data and highresolution airborne geophysical data and investigate the effect of the two distinct conditioning data sets to the simulation, (2) to assess the problem of overconditioning in a stochastic simulation, (3) to ensure that non-stationary trends are simulated accordingly by TProGS, and (4) to identify and test a set of performance criteria for stochastic simulations that allow for validation against geostatistical properties derived from field data.The results of the present study are intended for application in a hydrological modeling context (Refsgaard et al., 2014), where the geological uncertainty, represented by a set of plausible realizations of the subsurface, is propagated through the model.The varying flow paths between the realizations will affect solute transport and thus nitrate reduction and can be utilized to assess the scale of potential predictive capability of a hydrogeological flow model.

Study site
Figure 1 shows the 101 km 2 Norsminde catchment, located on the east coast of Jutland south of Aarhus.The topography allows a separation between an elevated western part, with changing terrain and a maximum elevation of 100 m, and a flat and low elevated eastern part, where the coastline represents the eastern boundary.Glacial morphologies, namely moraine landscapes, are predominant in most of the catchment.The geological stratigraphy indicated by borehole logs encompasses Paleogene and Neogene marine sediments underlying a heterogeneous stratigraphy of Pleistocene glacial deposits.The Paleogene sediments are characterized by very fine-grained impermeable marl and clay.Above the Neogene sequence shows sandy formations encased by a claydominated environment with Miocene marine sediments.The entire Miocene sequence varies in thickness by up to 40 m and the sandy formations reach thicknesses of more than 10 m.The Miocene sequence is only present in the western part of the catchment, where the stochastic modeling is conducted and forms the lower boundary of the simulation domain.Thus, only the upper Pleistocene glacial sequence is modeled.The glacial deposits in the western part of the catchment contain both sandy and clayey sediments, where clay is predominant.Borehole logs indicate that the Pleistocene clay spans from glaciolacustrine clay to clay till.Within the clay environment, the sandy units are allocated in small units and vary between gravel, meltwater sand and sandy tills.The total thickness of glacial sediments varies between 10 and 40 m with heterogeneous distributions of the mostly glaciofluvial sand features between less than 1 m and 20 m in thickness.The subject of the stochastic modeling, the delineated Pleistocene glacial sequence in the western part, provides interesting challenges like distinct heterogeneity and a diverse terrain.

Data
Two different sources of data, namely lithological borehole data and airborne-based geophysical data (SkyTEM), are used, where the former is utilized to describe the vertical sand and clay variability and the latter for assessing the lateral direction.

Borehole data
The borehole data set contains 112 borehole logs with varying depths.The descriptions in the borehole reports are converted to a categorical binary (sand/clay) system at 5 cm vertical discretization.Further, each borehole's uncertainty is validated according to the method of He et al. (2014).The uncertainty assessment allows for definition of individual trust scores and thus the definition of how much each borehole should constrain the conditional simulation in the form of soft data.Among other variables, drilling method, age and purpose of drilling are used to ensure a systematic approach to validate the uncertainty of each individual borehole.The boreholes are grouped into four quality groups with 100, 95, 90 and 85 % as trust scores.The classified borehole data set states an overall sand proportion of 30 %.

Geophysical data
The geophysical data set comprises resistivity data from SkyTEM helicopter surveys.The SkyTEM method has been extensively used for subsurface mapping in Denmark (Jorgensen et al., 2003a(Jorgensen et al., , 2005)), where it has proven to be a successful tool for hydrogeological investigations.SkyTEM data have the advantage of a high spatial resolution in the top 20 to 30 m and large spatial coverage.However, in some studies give rise to concern about the accuracy of interpretations of deep soundings (Andersen et al., 2013).In the Nors- minde catchment, data were collected at 2000 flight kilometers, containing over 100 000 sounding points.The distance between the flight lines is between 50 and 100 m.The data set is processed with a spatially constrained inversion algorithm (Schamper and Auken, 2012), giving a 3-D distribution of the underground resistivity.The sounding data were interpolated to a 20 m × 20 m × 2 m grid domain by using 3-D kriging as the interpolation method.The gridded resistivity data can be utilized as a proxy for lithology, as high-and low-resistivity cells indicate a high probability of sand and clay, respectively.Bowling et al. (2007) conducted a detailed study on the relationship between sediment and resistivity at a field site.Resistivity is linked to grain size distribution and is used to delineate major geological structures.A strong positive correlation between gravel content in the lithology and resistivity is observed.
The SkyTEM data set covers approximately 85 % of the delineated glacial sequence.Figure 2 shows the spatial variation of the median resistivity for a 4-and a 16-subarea grid.Higher median resistivity values are located in the southern part of the glacial sequence.This indicates a greater sand proportion in the given areas.The conclusion regarding the spatial pattern in Fig. 2 is that stationarity cannot be attested to the glacial sequence.This will have implications for the stochastic simulation.
The exact sand proportion can be derived by introducing a cutoff value that divides the SkyTEM data set into a sand and a clay fraction.Jorgensen et al. (2003b) estimate resistivity thresholds to differentiate between sediments in buried valleys in Denmark.Accordingly, glacial sand has a resistivity greater than 60 m whereas clayey till sediments are placed between 25 and 50 m, and thus the exact cutoff value varies between study sites.

Data integration
Figure 3 underlines some of the associated problems of the data integration of geophysical SkyTEM data and borehole descriptions.The lithological information from the borehole interprets thin layers of meltwater sand confined by clay in the top few meters.The SkyTEM data with a vertical resolution of 2 m cannot capture this small-scale variability.This supports the use of geophysical data only for the lateral model of spatial variability and incorporation of the fine descriptions from the borehole data for the vertical model of spatial variability.He et al. (2014) developed a method to manually calibrate the cutoff value by comparing borehole with SkyTEM data at different spatial domains with the aim to reduce the deviation in sand proportion between the two data types.It is assumed that the deviation has to be minimized at domains with a high borehole density where the boreholes are assumed to best represent the domain conditions.It is shown that a borehole density of 2 per km 2 reduces the representative error and that 46 m as a cutoff value reduces the deviation in sand proportion between the two data sets.The calibrated cutoff value yields a sand proportion of 23 %.
Further, He et al. ( 2014) developed a histogram probability matching (HPM) method that enables a direct translation from resistivity into facies probability.Resistivity is paired with the lithological borehole description at the coinciding cell.The data pairs are grouped in 10 m bins, and for each bin the sand/clay fraction is first calculated and then plotted as a histogram.Third-order polynomial curve fitting is applied to the histogram and the manually calibrated cutoff value is superimposed to the fitted curve (Fig. 4).The shape of the curve reflects the lumped uncertainties from both data sets.The flatness of the transition zone, around 50 %, sand probability indicates a high uncertainty for the corresponding resistivity values.There are many sources of uncertainty that will affect the relationship between electrical conductivity and facies information.The HPM method lumps various sources and the shape of the fitted curve reflects those, especially the width of the transition zone.He et al. (2014) discussed the prevalent uncertainties: first, borehole descriptions are not accurate, and classification of borehole lithology is subjective.Second, there are uncertainties on the resistivity data due to the resolution of the physics itself, the geophysics instruments, field measurements and signal processing (inversion).Third, there is no unique relationship between resistivity and lithology, and the curve can therefore be fitted in various ways.Last, there are uncertainties related to the scale of aggregation, since the borehole data and geophysical data have different resolutions and hence different supporting scales.The HPM method used in this study is study site specific and is purely based on spatial correlations and is not built up on physical relationships.Therefore it cannot be transferred to other catchments; however the relationship between resistivity and facies is manually calibrated for this site and is thus expected to be valid.
The HPM method is a probabilistic approach, and it is preferred to deterministic model approaches because of its suitability for soft conditioning in a stochastic geological simulation.In a more deterministic sense, a positive correlation between resistivity soundings and hydraulic conductivity estimates derived from pumping test has been acknowledged for glacial outwash aquifers (Urish, 1981).Linde et al. (2006) review different strategies to relate geophysical and hydrogeological properties and attest that geophysics doubtlessly add value to a hydrogeological characterization.One suggested approach which would be suitable for SkyTEM data for the estimation of hydrogeophysical parameters is the joint inversion, where the geophysical or hydrogeological inversion utilizes hydrogeological or geophysical data, respectively.Another review paper by Slater (2007) addresses joint inversion methods as well as petrophysical relations between geophysical (electrical) properties and effective hydraulic properties (pore volume and pore surface) at core and field scale, which allow for direct mapping of hydraulic properties (Kemna et al., 2004).Slug tests with an estimate of the local saturated hydraulic conductivity are available for the Norsminde catchment.However, due to differences in scale, a low number of slug tests and unclarity of the correlation we found the direct mapping approach unfeasible for our study.

TProGS -Transition Probability Geostatistical Software
The geostatistical software TProGS is applied in this study.
It is based on the transition probability (TP) approach (Carle and Fogg, 1996;Carle et al., 1998).Continuous Markov chain models (MCM) are used to represent the model of spatial variability (Krumbein and Dacey, 1969;Carle and Fogg, 1997;Ritzi, 2000).TProGS allows for the simulation of multiple realizations by utilizing a sequential indicator simulation (SIS) (Seifert and Jensen, 1999) and by performing simulated quenching (Deutsch and Cockerham, 1994;Carle, 1997).These two steps are mutually dependent and they make sure that the realizations honor local conditioning data as well as the defined model of spatial variability.The major advantage of TProGS is that fundamental observable attributes are parameterized in the modeling process: volumetric fractions (proportions), mean lengths (thickness and lateral extent) and (asymmetric) juxtapositional tendencies.These attributes can be assessed by data analysis and geological interpretations and control the shape of the MCM model.The facies proportion is related to the asymp-totic limit of the MCM.The mean length is indicated on a plot of autotransition probabilities as the intersection of the tangent at the origin with the x axis.Asymmetric juxtapositional tendencies are of interest when simulating a system with at least three categories and can thus be neglected in this study.TProGS computes the realizations of the geology in two uncoupled but mutually dependent steps.An initial configuration of facies distribution is produced by the SIS algorithm (Deutsch and Journel, 1992).Secondly, the initial configuration is reshuffled by the simulation quenching optimization algorithm (Deutsch and Cockerham, 1994).The TProGS simulation domain of this study is discretized into 20 m × 20 m × 2 m cells on a 450 ×600 × 40 cell grid.The horizontal TPs are based on SkyTEM data, which are categorized by a cutoff value of 46 m, and the vertical extent is purely based on borehole data.

Split sample test
The two incorporated conditioning data sets are very distinct and will affect the simulation in opposite ways: sparse borehole data allow for large simulation freedom, whereas dense SkyTEM data limit the simulation freedom.Naturally they will be combined in order to condition the simulation to the best combined knowledge of the system.However it is of interest to know how each individual conditioning data set affects the simulation.In this context a split sample test can reveal valuable information: one simulation conditioning to purely borehole data and the other one conditioned to purely SkyTEM data.It will be tested how well the simulations conditioned to borehole data reproduce the high-resistivity cells, where a high sand probability is evident, and how well the simulations conditioned to SkyTEM data reproduce the locations with borehole information.

Moving sampling
Most studies on stochastic modeling condition the simulation to sparse data.In this study a comprehensive cell-by-cell soft conditioning data set is applied, and it is anticipated that this may result in overconditioning.Decimating the conditioning data set is a very intuitive sampling approach to work around the problem of overconditioning.However, if the resampled conditioning data set is too sparse, information from the original data set might not be sufficiently accounted for.Thus the tradeoff between two extremes, too much and too few data is investigated.As opposed to the static sampling technique, a moving sampling method is applied.n different location grids with the same distance between the samples for each chosen distance (100 m, 200 m, etc.), where each has an accumulated shift of the origin (+ sampling distance / n in x and y direction).For the 100 m moving sampling approach the first sampling grid has the origin (0, 0) the second (20, 20), the third (40, 40), etc.For the TProGS application in this study, five location grids are generated, which yields five independent soft conditioning data sets.Five realizations are computed for each soft data set, giving a total of 25 realizations.In addition to the comparison between moving and static sampling, different sample densities are also be tested.

Sampling scenarios
In total, eight conditioning scenarios are tested in this study.For the split sample test, two scenarios are used, namely purely borehole data ("onlyBH") and purely cellby-cell SkyTEM data ("onlySky20").In the following both data sources are combined to represent the best combined knowledge of the system.Further, static and moving sampling are applied: borehole data and SkyTEM data sampled statically at 20, 100, 200 and 500 m ("BH-Sky20static", "BH-Sky100static", "BH-Sky200static" and "BH-Sky500static", respectively).Moving sampling is tested at 100 and 200 m sampling distance ("BH-Sky100moving" and "BH-Sky200moving", respectively).

Performance criteria
Five performance criteria are defined to evaluate an ensemble of realizations of the geology.They aim to validate the ensemble with respect to the TProGS input, namely the defined model of spatial variability (mean length and proportion) and the soft conditioning data set.The five performance criteria test the self-consistency of TProGS and thus whether all input parameters and data are treated accordingly.The glacial structure in the Norsminde catchment represents only approximately 20 % of the entire TProGS simulation domain and deviations in simulated spatial statistics between the entire model domain and the simulation target are expected.

Sand proportion
The deviation between the mean simulated sand proportion and the defined sand proportion in the MCM can be calculated for a set of realizations.The focus should be on the target area only -the area that will be extracted from the rectangular model domain for subsequent applications.The analysis of the sand proportion is based on 25 realizations.

Mean length
The simulated mean length can be estimated by recalculating the TPs from the TProGS output for the target area only.The simulated TPs for a set of realizations can be averaged (10 realizations in this case) and compared with the measured TPs to estimate the deviation in mean length between the predefined and the mean simulated length.

Geobody connectivity
The degree of connectivity of permeable areas in the subsurface has major implications for flow lines and particle ages.Renard and Allard ( 2013) conducted a methodology study on various static and dynamic connectivity metrics.These metrics can be utilized as a comparison and interpretation indicator for multiple stochastically generated realizations of the geology.The work by dell' Arciprete et al. (2012) shows the successful implementation of connectivity metrics to compare stochastic realizations computed by two-and multipoint statistics.Giudici et al. (2012) underline that evidence of a single "best" connectivity metric is still missing and further research is necessary in that field.
For this study two static connectivity metrics, θ and , are selected.They refer to the first and second geobody connectivity defined by Hovadik and Larue (2007).A geobody is defined as one connected 3-D cluster of sand.Hence it is a distinct sand feature that is confined by clay.The architectural elements are interpreted on a 20 m × 20 m × 2 m scale.
where V i is the volume of an individual geobody, n is the number of unconnected geobodies and V l is the volume of the largest occurring geobody.θ represents the ratio of the volume of the largest geobody to the total volume.Denoted as is the proportion of the pairs of cells that are connected among the entire pairs.The two selected connectivity metrics originate from the percolation theory, which describes the transition from many disconnected clusters to one large coherent cluster.This mainly depends on the facies proportion.As the proportion gradually increases, it reaches a point where one big cluster appears.The percolation threshold is expected to be approximately 0.59 and 0.31 for a 2-D and 3-D grid, respectively (Hovadik and Larue, 2007).Mean values of θ and are computed based on 10 realizations.

Facies probability distribution
The facies probability distribution reflects the intervariability among a set of realizations and can be extracted from a probability map.Each cell in the probability map reflects the simulated category probability within a set of realizations.The comparison between the distribution of the original soft data set which constrains the simulation and the simulated facies probability distribution allows for validation of the performance of the simulation.Ideally the distribution of the original soft data set is reproduced by the simulation, which does not allow for assumptions concerning the accuracy of the allocation pattern of the simulated facies probability.

Facies probability-resistivity bias
The validation of the facies probability-resistivity bias depicts whether the simulated facies probability corresponds to the fitted curve derived from the histogram probability matching method, and thereby test whether the simulated facies probability is according to the resistivity pattern.The simulated facies probability value is paired with the coinciding resistivity value of the gridded SkyTEM data set.The pairs are grouped in 5 m bins, and the median values of simulated facies probability can be plotted for each bin.Further, the root-mean-square error (RMSE) can be calculated between the simulated facies probability and the fitted curve at each bin in order to quantify the agreement.

TProGS setup
The computed TPs and the fitted Markov chain model (MCM) for both horizontal and vertical direction are given in Fig. 5.A sand proportion of 23 % and a mean length of a sand lens of 5 and 500 m for the vertical and horizontal direction, respectively, yield MCMs that are in good agreement with the measured TPs. Figure 3 indicates an increasing gradient in sand proportion from north to south.This non-stationary trend is also shown in Fig. 5, where the additional sand-sand transition MCMs are plotted that fit measured TP data from the northern and southern subdomain, defined by 13 %, 2 m, 400 m and 30 %, 5 m and 600 m, respectively.A total of 25 realizations are generated based on the MCMs that are specified in Fig. 5.

Split sample test
Two sets of 25 realizations are computed.The entire conditioning data set is split into two parts in order to analyze the effect of both extremes of the conditioning spectrum: abundant data (onlySky20) and sparse data (onlyBH).

Visual comparison
Figure 6 presents two individual realizations (a) and (b) and the resulting probability maps (c) and (e) from both conditioning data sets at an elevation of 49 m.Examining the individual realizations reveals that the spatial variability is much greater for the onlyBH scenario results.This is reasonable, because the amount of constraining data is also much less.This conclusion is supported by the probability maps.The probability map computed from the onlySky20 conditioning scenario shows only little intervariability among the 25 realizations and almost resembles a binary sand and clay image.
The onlyBH scenario simulates a probability map that shows high intervariability among the computed realizations, but the high-probability sand areas do not coincide with the highresistivity areas in the SkyTEM data (d), because many large sand features are not captured by borehole data.On the other hand, some high-probability sand features in the onlyBH scenario are not represented by the onlySky20 scenario, because small sand features that are indicated by the borehole data are not detected by the SkyTEM survey.

Quantitative comparison
High-resistivity areas are defined by a minimum resistivity value of 60 m, which is equivalent to 70 % probability of sand occurrence based on the fitted histogram curve in Fig. 4. The results of the split sample test are given in Table 1.The onlyBH scenario allocates only 20.1 % of the high-resistivity cells accordingly.Also, only 74.3 % of the cells where the lithology in the borehole reports shows sand are simulated correspondingly.Some of the borehole data are treated as soft data, which enables the simulation to overwrite the lithological information, during the SIS and the simulated quenching.This will happen especially when sand lenses are very thin and vertically confined by clay.The onlySky20 scenario simulates 44 % of those cells accordingly and considers almost all high-resistivity cells as sand.However, almost 60 % of the high-resistivity cells are simulated with 100 % sand probability.This is in poor agreement with field data, because the fitted histogram curve does not exceed sand probability values higher than 85 % (Fig. 4).The SkyTEM data set indicates a large high-resistivity cluster in the southwest at an elevation of 49 m (Fig. 6), which is not detected at all by the borehole data set, because there is only one borehole penetrating this area.

Local comparison
Figure 7 shows the vertical profile of one borehole (99.918) that penetrates the sand cluster and compares the simulation results from the onlyBH and onlySky20 scenarios.The borehole has a trust score of 95 %.While both data sets agree on the top layer being sandy and the occurrence of a thick clay layer below 75 m followed by a sand layer, they disagree on the location of the deeper sand layer.In the borehole data the deeper sand layer is detected at an elevation of 45 m and below" whereas the SkyTEM data set indicates sand occurrence approximately 8 m higher: 53 m and below.This discrepancy between 45 and 53 m has considerable implications for the simulation results at 49 m shown in Fig. 6.However, one borehole alone will not be sufficient to substantially influence the simulation over large areas.Marginal amplification of the onlyBH scenario is noticeable at borehole 99.918.On the other hand, sand probabilities are clearly amplified in the onlySky20 scenario; everything above 0.5 is amplified close to 1.0 and everything below 0.5 close to 0. The results from Table 1 and Figs.6 and 7 support the assumption of overconditioning caused by the comprehensive cell-by-cell soft conditioning.

Overconditioning
The observed problem of overconditioning is caused by spatially correlated data which are incorporated into the modeling process.A very intuitive approach to work around the problem of overconditioning is decimating the SkyTEM data set by only sampling part of it.This will only be necessary in horizontal direction because the correlation length of the data is much less in the vertical direction.There is a tradeoff between the correctly simulated facies probability and the accuracy of the spatial allocation pattern.To illustrate this tradeoff, three resampled conditioning scenarios are compiled: 100, 200 and 500 m sampling distance in the x and y direction, and at the same time also including the boreholes for conditioning.For each of the three conditioning scenarios (BH-Sky100static, BH-Sky200static and BH-Sky500static, respectively), 25 realizations are computed and  2).It is evident that the percentage of cells at the extreme ends of the simulated sand probability falls drastically after decimating the soft data.The 100 m distance scenarios still allocates more than 80 % of the high resistivity correctly.On the other hand, the BH-Sky500static performs poorly, only simulating 32.7 % of the high-resistivity cells correctly.It is also evident that the differences between static and moving sampling are small with regard to the correct allocation of the higher-resistivity cells.

Performance criteria
For further validation of the different sampling distances (20, 100, 200 and 500 m) and sampling schemes (static and moving), the five identified performance criteria will be applied to quantify the quality of the simulations.

Sand proportion
Table 3 shows the defined sand proportions of the delineated glacial structure.In order to investigate nonstationarities, the model domain is additionally subdivided into north and south.The SkyTEM data set indicates a higher sand fraction in the southern part compared to the north, 30 and 13 % respectively.The simulated sand proportions for the BH-Sky20static scenario show a good agreement with the defined values.Larger deviations are evident for the BH-Sky200moving scenario.Both conditioning scenarios are capable of reproducing the non-stationarity of the system in regard to the sand proportion.The sand proportions are somewhat overestimated for the BH-Sky200moving scenario, and much less for the BH-Sky20static scenario.Also, the overestimation of simulated sand proportion in the northern subarea is larger than in the southern subarea.

Mean length
The comparison of the early (first lag = 100 m) measured and simulated TPs for the sand-sand transitions in the x and y direction allows for validation of how well the lateral mean length is simulated by TProGS.Figure 9 comprises the measured TPs in horizontal direction, the fitted MCM and the computed mean TPs for the BH-Sky20static scenario and BH-Sky200moving scenario, based on 10 realizations, for the total and the subdomains.The effect of overconditioning is very evident, as the computed mean TPs based on 20 m sampling conditioning data purely represent the original measured TP values.Since no simulation freedom is present, the MCM cannot control the output.However, the BH-Sky200moving scenario computes mean TPs that are more independent from the original data and rather follow the defined MCM.The mean length of a sand lens can be derived by the steepness of the tangent where the lag approaches zero.In general, the TPs at lag 0 and 100 m are simulated too low, indicating that the simulated mean size of a sand lens is too small.This is more prominent in results by the BH-Sky200moving scenario.It is evident that the nonstationarity of the mean length of a sand lens is represented accordingly, although it is undersimulated in all domains.

Geobody connectivity
For the categorized SkyTEM data, θ and are computed as 98.7 and 99.3 %, respectively.This shows values close to unity and should not be seen as a real reference but rather as a benchmark, because the extremely low variability picture does not account for any uncertainties.The TProGS simulations based on the two conditioning scenarios both undersimulate the connectivity metrics.The BH-Sky20static scenario yields negative deviations of 2.1 and 1.1 %, respectively, and the BH-Sky200moving scenario 2.8 and 1.4 %, respectively.
The results indicate that θ and show a similar behavior, where appears to be decreasingly greater as the proportion increases.Values close to unity and the very small deviations are in good agreement with the general percolation theory, which sets the percolation threshold to approximately 30 % for 3-D grids (Hovadik and Larue, 2007).

Facies probability distribution
Figure 10 shows the probability distribution for all discussed conditioning scenarios, with static (a) and moving (b) sampling, with 25 realizations in each set.The original soft data distribution has its maximum at approximately 20 % and less than 5 % are with either 0 or 100 % sand probability.The BH-Sky20static scenario simulates approximately 70 % of the cells with zero change and thus has an extremely poor fit with the soft data set and the overconditioning is very prominent.It appears that overconditioning amplifies the conditioning values to the extremes (e.g., 0.6 is simulated as 1.0 and 0.4 as 0.0, Fig. 7).The BH-Sky500static scenario reproduces the probabilities from the original soft data set well, with only approximately 10 % zero-change cells.However, the allocation pattern shows little resemblance with the original data set (Fig. 8b).BH-Sky100static scenario gives an intermediate solution, as the probability is better reproduced than with the BH-Sky20static scenario, but still, more than 20 % of the cells are simulated as either purely sand or clay within the ensemble.Nevertheless, the BH-Sky100static scenario is dense enough to capture the full variability of the system, as indicated by the original SkyTEM data set.Additionally, the results of the BH-Sky200static scenario are plotted in (a).The number of zero variability cells is decreased to approximately 20 %, and the maximum at 20 % sand probability is close to the original soft data set.Figure 10b compares the static with the moving sampling approach for the 100 and 200 m distance scenarios.The simulated facies probability distribution shows no differences for the static and moving 100 m distance scenarios.However, at 200 m sampling distance, the two sampling techniques are distinguishable, as the moving sampling yields fewer zero variability cells than the static sampling.

Facies probability-resistivity bias
The results are given in Fig. 11 for the static sampling (a) and the moving sampling approach (b).The strong amplification of the resulting probabilities originating from the BH-Sky20static scenario is obvious in (a).The BH-Sky500static scenario performs poorly, especially in high-resistivity areas, because those areas are not sufficiently covered by the 500 m sampling distance.A better fit is represented by the BH-Sky100static scenario, because the amplification is much lower than for the BH-Sky20static scenario, especially for high-resistivity areas.On the other hand, low-resistivity areas are more amplified than high-resistivity areas.The BH-Sky200static scenario gives a satisfactory fit with the original fitted curve, especially in high-resistivity areas, which indicates that the high-probability sand cells are mostly allocated correctly by the model.The simulated facies uncertainty for the low-resistivity cells is rather amplified by the BH-Sky200static scenario.Figure 11b investigates the simulation differences caused by the static and moving sampling approach.The behavior is similar to Fig. 10b because the differences for the 100 m distance scenarios are marginal, while the BH-Sky200moving scenario generates a slightly lower facies probability-resistivity bias than the BH-Sky200static scenario.The RMSEs between the fitted curve (Fig. 4) and the simulations show that the BH-Sky200moving and BH-Sky200static sampling conditioning scenarios perform best, both with an RMSE of 0.06.Comparable are the BH-Sky100moving and BH-Sky100static sampling conditioning scenarios with an RMSE of 0.09 and 0.08, respectively.The BH-Sky20static scenario performs poorest with an RMSE of 0.2.

Choice of geostatistical method
The choice of the stochastic method for this study is application driven (Refsgaard et al., 2014).In the Norsminde catchment, it is evident from both borehole and geophysical data that the glacial sequence contains till clay and sand lenses distributed in extremely irregular patterns that are nonstationary.Without dense conditioning data the heterogeneous and non-stationary structures will not be simulated correctly.TProGS allows for conditioning and operates a straightforward way to build the model of spatial variability.
In multipoint statistics (MPS) the definition of a reliable 3-D training image is challenging, especially when simulating extremely irregular patterns (Honarkhah and Caers, 2012).Defining a MPS training image for the Norsminde catchment is peculiar, because it could only be based on interpreted SkyTEM data, with inflated length scales in the vertical direction.This makes the model of spatial variability in TProGS more reliable and objective, because it is based on measured transition probabilities and not on an interpreted training image.Further, the transition probabilities are based on the data type we trust best: borehole data in the verticaland SkyTEM data in the horizontal direction.However, MPS is broadly applied in 2-D and 3-D applications: the snesim (Single Normal Equation SIMulation) algorithm (Liu, 2006) combines object-based and pixel-based methods in the general MPS framework to enforce spatial pattern reproduction and local conditioning, respectively.It was successfully applied by He et al. (2013) in a 3-D application.Another promising approach is given by Chugunova and Hu (2008), where MPS is tested on non-stationary 2-D structures by means of continuous soft conditioning to a secondary variable.Here two training images from the geological structure and from the secondary variable are joined in the simulation.
Many promising geostatistical methods have advanced to incorporate auxiliary information to constrain the simulated target variable.Among them are, e.g.truncated pluri-Gaussian simulation (Mariethoz et al., 2009a) and collocated simulation with probability aggregation (Mariethoz et al., 2009b).Most of them are only tested on 2-D applications partly with synthetic data.This present study uses TProGS as the geostatistical tool because of its reliable model of spatial variability, and furthermore it is well established in 3-D applications with sparse conditioning data.

TProGS setup
Direct transformation of geophysical data, such as SkyTEM, into a deterministic subsurface model is risky, because too much reliance on geophysical mapping can lead to extremely inaccurate hydrogeological models (Andersen et al., 2013).Uncertainties are expected in both geophysical and lithological data, and the shape of the fitted histogram curve reflects those.High uncertainty is associated with the transition zone: around 50 % sand probability.Although the cutoff value that divides the SkyTEM data set into sand and clay is calibrated, there is a large quantity of highly uncertain cells included, which makes the measured TPs directly dependent on the cutoff value.Therefore the facies proportion and mean length are very sensitive to the selection of the cutoff value.As a result, the MCM in the lateral direction, as part of the TProGS setup, is highly dependent on the way the SkyTEM data is treated.Difficulties in the integration of the two data types are indicated in Fig. 2. Small-scale heterogeneities indicated by the borehole descriptions are not represented by the coarser SkyTEM data set.This supports computing the horizontal and vertical TPs individually using SkyTEM and borehole data, respectively.
The SkyTEM data set used in the present study is a 3-D grid of 20 m × 20 m × 2 m which was spatially interpolated from soundings with distances of about 17 m and 50-100 m along and between the flight lines, respectively.To reduce the overconditioning problem, it might have been preferable to use the direct sounding data instead of the interpolated data set.A similar effect is achieved by resampling, but here interpolated data with a higher uncertainty than the direct soundings are used.Simulating a binary system is a crude simplification of the broad range of sediments in the glacial sequence.However, classifying the SkyTEM data into discrete facies or deriving the soft information on facies membership are peculiar in a multi-facies environment.Additionally, less abundant facies (e.g., gravel) will show extremely uncertain correlations in the histogram probability matching method.Lastly, the less abundant facies might be represented on a 20 m domain, but it will often not be visible on the 100 m domain chosen for the subsequent hydrological flow simulations.Dell' Arciprete et al. (2012) present a study where geostatistics are implemented to simulate small-scale heterogeneities in a multi-facies environment.

Data footprint
Borehole and SkyTEM data are integrated by the histogram probability matching method (He et al., 2014), where differences in support scale are partly neglected.The support scales of the two data types are expected to vary.The lithological data from the boreholes are aggregated to 2 m to be in better vertical agreement with the geophysical data set.The agreement in the lateral direction is more questionable, because the footprint increases with depth for the geophysical data.The footprint is approximately 15-20 m on the surface and in the range of 50 m at 30 m penetration depth.Further, the footprint will depend on the material, with a larger energized volume for high-conductance materials (high clay content).The two steps of processing the sounding data, namely inversion and kriging, are both expected to inflate the footprint by smoothing values.However one can assume that the chosen grid size of 20 m × 20 m × 2 m is suitable for near-surface resistivity values, because the footprint of the geophysical data is constantly smaller than the correlation length, which is approximately 500 m in the vertical direction and 5 m in lateral direction.

Split sample test
Both data sources have advantages and disadvantages: borehole data have a higher data certainty and a finer spatial resolution in the vertical extent to better represent smaller sand features, but are essentially undersampled in the lateral extend.On the other hand, SkyTEM data have a good spatial coverage and represent the bigger sand features well, but at the same time the data are associated with a higher data uncertainty.At this point, four major sources of uncertainty can be defined: (1) the inversion that transforms the SkyTEM measurement into resistivity, (2) the borehole data, (3) the relationship between lithology and resistivity, and (4) the footprint mismatch between small-scale borehole data and large-scale SkyTEM data.So it is precarious to assume the SkyTEM data as true geology, but it can serve as a reference/benchmark when validating the simulation results.The onlyBH scenario does not capture all of the main sand features, which are revealed by the SkyTEM survey: only 20 % of the high-resistivity cells,where the resistivity is greater than 70 m are simulated correctly.For the onlySky20 scenario only 44 % of the sand descriptions in the boreholes are simulated correctly, which underlines that the SkyTEM data do not measure the finer sand features correctly.The conducted split sample test does not allow firm conclusions to be drawn on simulation performance; rather, it analyzes the agreement between the two data sets propagated through the model.

Overconditioning
Correlated data, both temporally and spatially, are a common problem in hydrogeological investigations.It has not been previously reported how TProGS is able to handle such a conditioning data set.TProGS stochastically simulates the subsurface facies system by utilizing the two mutually dependent steps: SIS and simulated quenching.Soft information is not considered accordingly during the cokriging of the local probability estimate in the SIS step, nor is it completely accounted for in the objective function used for the simulated quenching in the latest TProGS version.However, Deutsch and Wen (2000) successfully integrate exhaustive soft data in simulated quenching, which shows that the algorithms are generally capable of incorporating soft data.
Work-around methods have to be developed to overcome the problems associated with overconditioning.Decimating the soft conditioning data set may seem to be an overly simplistic and very crude approach, but the study aims at finding the balance between too few and too many data.The risk of missing important features is high when conditioning to too few data.This study mainly deals with the latter case, where too many data lead to an underestimation of the simulation uncertainty.Including a moving sampling strategy ensures that the spatial variation in the original data set is best represented.A drawback of this approach is that valuable information might be lost, which again underlines the need for model validation, where the entire geophysical data set is used for the evaluation.The decimation approach works as a very pragmatic solution for a study-specific problem and its generalization might be limited.Decimating the SkyTEM data set and only considering data on a 200 m spaced moving sampling grid gives the most satisfactory results.A 200 m sampling distance is expected to be sufficient to adequately capture all relevant geological features proxied by the entire data set; this can be argued by the fraction between the observed mean length and the conditioning spacing.The mean length of a sand lens is found to be 500 m and can proxy the correlation length.With a horizontal length scale of 500 m and sampling at 200 m, we still condition the simulation with two to three soft data points in each horizontal direction for each mean sized sand feature.
In summary, it cannot be directly concluded that overconditioning is a general problem in stochastic simulations where a vast conditioning data set is applied.However it can be presumed that heavily spatially correlated data will also affect other stochastic other stochastic simulation algorithms besides TProGS which was clearly not developed clearly not developed to run with such comprehensive conditioning.To our knowledge, the problem of overconditioning has not yet been reported, nor has it been discussed, and with our study we would like to create awareness.In regard to the technique of geophysical prospecting, it can be concluded that the problem of overconditioning is clearly not limited to airbornebased TEM data.

Performance criteria
We identified and tested five performance criteria for validating the model.
1. Sand proportion.Artificial conditioning data outside the target area honoring the defined proportion and MCM may help to make the simulation more homogeneous.
In that context, exhaustive hard conditioning outside the simulation target can be tested.
2. Mean length.The simulated and measured TPs are compared by Carle (1997) and Carle et al. (1998).Carle et al. (1998) simulate a four-category system and the simulated quenching yields a perfect match between the modeled TPs and the defined MCM.On the other hand, Carle (1997) underlines that small deviations are to be expected and shows this by various examples where different SIS and simulated quenching parameters are tested.
3. Geobody connectivity.The connectivity is partly dependent on the proportion.The sand connectivity for the simulation based on the BH-Sky200moving scenario is simulated lower and the sand proportion higher in comparison to the results from the BH-Sky20static scenario.This shows that the geobody connectivity is not fully dependent on the proportion in this study.However it is a more feasible performance criterion for proportions far below the percolation threshold.
4. Facies probability distribution.A good agreement between the simulated facies probability distribution and the original soft data set does not ensure that the allocation pattern of the simulated probability is correct.This becomes evident when validating the results of the BH-Sky500static scenario.
5. Facies probability-resistivity bias.The simulated facies probability should be in agreement with a corresponding resistivity observation to ensure that the spatial allocation pattern is simulated correctly.All bins are weighted the same, neglecting the inequality of data in each bin.
We used 25, 10 and 10 realizations to compute the first 3 performance criteria, respectively.Computing a moving average shows than the mean converges to ±2 % deviation to the final mean after ca. 15 realizations for the first criterion and after ca. 5 realizations for the second and third criteria, which justifies the selected number of realizations.The two latter criteria incorporate the computed probability map based on 25 realizations.Probability maps proved to be a useful tool to investigate the intervariability among realizations (Alabert, 1987;Carle, 2003;Mariethoz et al., 2009b).The results of the onlyBH scenario show the highest intervariability, and a moving average tested at 10 random locations in the grid shows that the mean converges to less than ±20 % from the final mean after 20 realizations, and to less than ±10 % after 23 realizations.These numbers are supposed to decrease as the conditioning data increase, and therefore the 25 realizations in the analysis of the two latter criteria are justifiable.Table 4 compiles the five performance criteria for two different TProGS simulations: the BH-Sky20static and the BH-Sky200moving scenario.A weighted and balanced analysis of the performance criteria helps to identify the best result.For example, if only considering sand proportion and mean length, it can be argued that the validation favors the BH-Sky20static scenario.However both the facies probability distribution as well as the facies probability-resistivity bias attest poor performance.On the other hand, if interpreting the probability distribution only, it seems that the validation favors the BH-Sky500static scenario.Collectively, the conclusion is that the BH-Sky200moving scenario generates the overall most balanced results.

Figure 1 .
Figure 1.Insert: the study site in eastern Jutland, Denmark.Main image: the Norsminde catchment with the delineated glacial structure in the western part of the catchment and, additionally, the river network and the topography.

Figure 2 .
Figure 2. The median resistivity values from the SkyTEM data for the 4-and 16-subarea grid.Dark colors indicate a high median (max: 43.2 and 45.0 m for the 4-and 16-subarea grid, respectively), light colors a low median (min: 32.0 and 29.5 m for the 4-and 16-subarea grid, respectively) and white colors the absence of data.Additionally the location of the boreholes, the river network and the delineated glacial structure are shown.The extent is 9 km in the x and 12 km in the y direction.

Figure 3 .
Figure 3. Side-by-side comparison of borehole lithological data and SkyTEM vertical sounding data at borehole number 99.625 (He et al., 2014).

Figure 4 .
Figure 4. Upper panel: two individual realizations for two different conditioning scenarios: onlyBH (a) and onlySky20 data (b).Lower panel: probability maps for the two scenarios (c) and (e) showing the probability of sand in each cell based on 25 realizations.The derived sand probability which is used for conditioning the simulation is shown in (d).All maps show data at an elevation of 49 m.

Figure 5 .
Figure 5.The bias-corrected histogram curve: the calibrated cutoff value (46 m) is added to the histogram and the fitted curve is forced to honor it (He et al., 2014).

Figure 6 .
Figure 6.The computed transition probabilities in the vertical and horizontal direction and the fitted MCM: vertical 5 m, horizontal 500 m mean length of a sand lens and 23 % sand proportion.Additionally the fitted MCM for the north and south subdomain are plotted for the vertical and horizontal sand-sand transitions: 2 m, 400 m and 13 % (vertical) and 5 m, 600 m and 30 % (horizontal).

Figure 7 .
Figure 7.The simulated versus conditioned sand probability over the vertical extent at one borehole (98.918), located in the southwestern part of the glacial structure.The results originate from the two different soft conditioning scenarios: onlyBH and onlySky20 (based on 25 realizations each).

Figure 8 .
Figure 8. (a): 100 m (small dots) and 500 m (big dots) sampling grids for thinning out the conditioning data set.(b-e): probability of sand at an elevation of 49 m for SkyTEM data set (b), and for static 20, 200 and 500 m conditioning (c-e).Red colors represent high sand probability and blue colors low sand probability (based on 25 realizations).

Table 2 .
Proportion of high-probability sand cells (resistivity > 60 m) that are simulated with corresponding sand probabilities (> 70 %) or fully deterministic (probability = 1.0) for six conditioning data sets based on 25 realizations.Prob. of sand > 0.7 Prob. of sand = 1for sand are presented in Fig.8.The simulated probability maps of the BH-Sky100static and BH-Sky200static conditioning scenarios are visually almost identical.Therefore only the latter is shown (d) and the image reflects already a higher variability than the results by the BH-Sky20static scenario (c).Reducing the conditioning data density increases the uncertainty of sand or clay.But at the same time the accuracy of correctly locating sand or clay units decreases, because the BH-Sky500static scenario (e) shows high-probability sand areas which are not indicated by the original data set (b).If for instance a high-resistivity cell embedded in low-resistivity cells is sampled for the conditioning, this cell may generate a sand lens in the thinned-out conditioning scenario but would be limited by the neighboring cells in the BH-Sky20static scenario.The moving sampling method can improve the spatial coverage of the conditioning data sets and thus improve the quality of a set of realizations.Again, the high-resistivity cells are investigated to analyze whether the bigger sand lenses are simulated correctly by the different conditioning data sets (Table

Figure 9 .
Figure 9.The simulated transition probabilities for the south, north and total domain are compared with the SkyTEM data and the fitted MCM.The results for two soft conditioning data set are shown: BH-Sky20static and BH-Sky200moving.The simulated TP and the MCM at lag 100 m are compared to quantify the underestimation of a sand lens.The TP values are mean values based on 10 realizations.The defined length of a sand lens (X) and the mean simulated length for the BH-Sky20static (Y) and BH-Sky200moving scenario (Z) are given in each graph.(Xm -Ym/Zm).

Figure 10 .
Figure 10.The simulated facies probability distributions based on sets of realizations conditioned to differently sampled soft data sets (based on 25 realizations): (a) static sampling at different sampling distances and (b) stationary and moving sampling at different sampling distances.Also shown is the sand probability distribution of the original soft data set which is desired to be reproduced.

Figure 11 .
Figure 11.The simulated facies probability-resistivity bias based on sets of realizations conditioned to differently sampled soft data sets (based on 25 realizations): (a) static sampling at different sampling distances and (b) stationary and moving sampling at different sampling distances.The simulated sand probability is paired with the original resistivity value, grouped into 5 m bins and then plotted as the median for each bin.Also shown are the observed data and the fitted curve from the histogram which is desired to be reproduced.

Table 1 .
Split sample test showing how many of the high-probability sand cells (resistivity > 60 m) are simulated with corresponding sand probabilities (> 70 %)

Table 3 .
Simulated and defined sand proportions for the total domain and two subdomains based on two simulations with different soft conditioning data sets (BH-Sky20static and BH-Sky200moving), based on 25 realizations.

Table 4 .
The five performance criteria and categorized SkyTEM data as benchmark that are applied to the two simulations with different soft conditioning data sets: cell-by-cell soft conditioning and 200 m moving sampling soft conditioning, both including borehole data.The first three criteria are expressed as deviation to the benchmark.