Aquifer heterogeneity in combination with data scarcity is a major challenge for reliable solute transport prediction. Velocity fluctuations cause irregular plume shapes with potentially long-tailing and/or fast-travelling mass fractions. High monitoring costs and a shortage of simple concepts have so far limited the incorporation of heterogeneity into many field transport models.

We present an easily applicable hierarchical conceptualization strategy for hydraulic conductivity to integrate aquifer heterogeneity into quantitative flow and transport modelling. The modular approach combines large-scale deterministic structures with random substructures. Depending on the modelling aim, the required structural complexity can be adapted. The same holds for the amount of monitoring data. The conductivity model is constructed step-wise following field evidence from observations, seeking a balance between model complexity and available field data. The starting point is a structure of deterministic blocks, derived from head profiles and pumping tests. Then, subscale heterogeneity in the form of random binary inclusions is introduced to each block. Structural parameters can be determined, for example, from flowmeter measurements or hydraulic profiling.

As proof of concept, we implemented a predictive transport model for the heterogeneous MADE site. The proposed hierarchical aquifer structure reproduces the plume development of the MADE-1 transport experiment without calibration. Thus, classical advection–dispersion equation (ADE) models are able to describe highly skewed tracer plumes by incorporating deterministic contrasts and effects of connectivity in a stochastic way without using uni-modal heterogeneity models with high variances. The reliance of the conceptual model on few observations makes it appealing for a goal-oriented site-specific transport analysis of less well investigated heterogeneous sites.

Groundwater is extensively used worldwide as a major drinking water resource and consequently needs to be protected with respect to quantity and quality. Increasing pressure on the quality originates from the intensification of agriculture using agrochemicals (non-point sources) and an increased urbanization with the resulting solid and liquid waste and contaminant spills from industrial applications (point sources).

Essential for groundwater protection is the quantitative analysis of the fate and transport of various contaminants in the groundwater body. This can be either for a provisional risk assessment or for the clean-up of an already existing groundwater contamination. Numerical models are common tools to quantify the flow and transport, where partial differential equations are solved using initial and boundary conditions.

For simplicity, we restrict ourselves to saturated flow and transport of a dissolved, non-reactive contaminant. The governing equation for its concentration

The adequate parameterization of the heterogeneous conductivity

Stochastic methods allow the heterogeneity to be resolved and thus capture the induced uncertainty in flow and transport predictions. However, the amount of observation data required is usually high, depending on the method's complexity.
Common methods are (i) Kriging

For many unconsolidated sediments, field observations showed that conductivity is approximately log-normal

A recent debates series

Here, we present a parsimonious hierarchical heterogeneity conceptualization which is easy to apply in quantitative models for predicting flow and solute transport. In a deterministic–stochastic framework we combine descriptive zonation with statistical methods, following the lines of

We create a deliberate connection between the model parameterization requirements and the field characterization methods beyond a single monitoring method. Pumping tests, for example, are best suited to determine the spatially averaged transmissivity, i.e. hydraulic conductivity, even in a heterogeneous aquifer environment

We demonstrate the methodology for MADE, a heterogeneous, well investigated research site (e.g.

The course of the paper is the following: Sect.

Large-scale hydraulic structures of hundreds of metres or more determine the groundwater flow direction and magnitude in combination with groundwater catchment boundaries. Subsequently, they set the mean transport velocity. This is the key parameter to predict the location of the bulk mass of substances dissolved in the groundwater when input conditions are known.
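As a rough illustration of how the large-scale conductivity sets the mean transport velocity, and hence the location of the bulk mass, the following sketch applies Darcy's law to a single homogeneous zone. The function names and all numerical values (conductivity, head gradient, porosity, elapsed time) are illustrative assumptions and not parameters of any particular site.

```python
SECONDS_PER_DAY = 86400.0


def seepage_velocity(conductivity, head_gradient, porosity):
    """Mean pore-water velocity v = K * i / n (Darcy flux divided by porosity)."""
    if not 0.0 < porosity <= 1.0:
        raise ValueError("porosity must be in (0, 1]")
    return conductivity * head_gradient / porosity


def bulk_position(velocity, elapsed_seconds, x_injection=0.0):
    """Advective position of the plume's bulk mass after a given time."""
    return x_injection + velocity * elapsed_seconds


# Hypothetical values chosen only for illustration.
K = 1e-4          # hydraulic conductivity in m/s (assumed)
gradient = 0.003  # mean head gradient, dimensionless (assumed)
porosity = 0.3    # effective porosity, dimensionless (assumed)

v = seepage_velocity(K, gradient, porosity)
x_bulk = bulk_position(v, 100 * SECONDS_PER_DAY)
print(f"mean velocity: {v:.2e} m/s, bulk position after 100 d: {x_bulk:.2f} m")
```

The estimate only locates the centre of mass under purely advective transport; spreading around that position is governed by the smaller heterogeneity scales.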

Variations of hydraulic properties on intermediate scale, in the range of tens of metres, generate spatially variable flow fields. They also render transport velocities variable at these scales, resulting in larger spreading of plumes. This is particularly important for modelling tailing or leading mass fronts. Fluctuations on scales smaller than these intermediate scales have a blending effect, generally increasing local mixing and enhancing dispersion

Following this conceptual view, we generate hydraulic conductivity fields composed of three components: Module (A), (B), and (C), which capture the effects at large-, intermediate-, and small-scale heterogeneity, respectively. Each component is selected according to the model aim and the data at hand to parameterize the hydraulic conductivity for this component.

The procedure is exemplified for the MADE site. This significantly heterogeneous site was intensively investigated with various measurement devices providing many different data sets, such as pumping tests and flowmeter and DPIL (direct push injection logging) measurements

In the approach, we consider several steps:

specifying the aim of the model (what do we want to predict?);

selecting processes and process components which need to be accounted for in the model (what does this imply for the conceptualization of hydraulic conductivity?);

selecting suitable measurement methods (which method can deliver the data needed for parameterizing hydraulic conductivity with affordable effort?);

conceptualizing hydraulic conductivity;

calculating flow and transport.

The piezometric surface map of MADE (

Borehole flowmeter logs at MADE

Such strong vertical variation indicates the presence of high conductivity channels acting as preferential flow paths and low conductivity zones with stagnant flow, which both impact plume spreading behaviour strongly. Consequently, when aiming to model early and late plume arrival, these features need to be accounted for in a flow and transport model for the MADE site.

Given the scale dependency of hydraulic conductivity features and their distinct relevance for flow and transport predictions, we propose three components: Module (A), (B), and (C), which capture large-, intermediate-, and small-scale heterogeneity effects, respectively. Given a certain model aim, components are selected (or not) with regard to the available field data. We briefly discuss the modules and the motivation for their use, based on data from the MADE site example, for different aims.

The aquifer domain of interest is divided into deterministic zones of significantly different mean conductivity (i.e. differing by more than 1 order of magnitude). The structure can comprise horizontal or vertical layering, simple blocks, or complex zone geometries, depending on the information available. The use of Module A is warranted when observation data indicate significant areal conductivity contrasts.
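A deterministic zonation of this kind can be sketched in a few lines. The example below assumes a one-dimensional row of cells split into two blocks at a single boundary; the function names, the boundary position, and the conductivity values are hypothetical and serve only to show the check for a contrast of more than 1 order of magnitude.

```python
import math


def zonal_log_conductivity(n_cells, dx, zone_boundary, k_upstream, k_downstream):
    """Return ln K at each cell centre for two blocks split at x = zone_boundary."""
    field = []
    for i in range(n_cells):
        x_centre = (i + 0.5) * dx
        k = k_upstream if x_centre < zone_boundary else k_downstream
        field.append(math.log(k))
    return field


def contrast_in_orders(log_field):
    """Conductivity contrast between extremes, in orders of magnitude."""
    return (max(log_field) - min(log_field)) / math.log(10.0)


# Hypothetical setup: 200 m domain, 1 m cells, boundary at 20 m (all assumed).
ln_k = zonal_log_conductivity(
    n_cells=200, dx=1.0, zone_boundary=20.0, k_upstream=1e-5, k_downstream=1e-3
)
orders = contrast_in_orders(ln_k)
print(f"contrast: {orders:.1f} orders of magnitude, Module A warranted: {orders > 1.0}")
```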

The zones represent large-scale geological structures exhibiting conductivity differences potentially over several orders of magnitude as a result of changes in deposition history or changes in the material's composition

The MADE site is an example for which the concept of two zones of different mean hydraulic conductivity (Fig.

When hydraulic conductivity shows heterogeneous features at the same length scale as the plume transport itself, they require proper resolution. A contaminant plume typically passes several of these intermediate-scale features but not enough to ensure ergodic transport behaviour. Thus, using effective parameters is not warranted. Since limited data availability precludes a deterministic representation of these features, stochastic approaches are best suited.
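One simple stochastic representation is a two-valued field built from non-overlapping blocks. The sketch below is a minimal version under stated assumptions: a regular two-dimensional grid (vertical index first), a regular tiling of blocks, and an independent random choice per block. The function name, the grid dimensions, the block sizes, the inclusion fraction, and the conductivity values are all hypothetical; the construction used in the study may differ in detail.

```python
import random


def binary_block_field(nx, nz, block_nx, block_nz, p_high, k_high, k_low, seed=None):
    """Tile an nz-by-nx grid with non-overlapping blocks of two conductivity values.

    Each block is independently assigned k_high with probability p_high,
    otherwise k_low. Blocks at the domain edge are truncated.
    """
    rng = random.Random(seed)
    field = [[k_low] * nx for _ in range(nz)]
    for z0 in range(0, nz, block_nz):
        for x0 in range(0, nx, block_nx):
            value = k_high if rng.random() < p_high else k_low
            for z in range(z0, min(z0 + block_nz, nz)):
                for x in range(x0, min(x0 + block_nx, nx)):
                    field[z][x] = value
    return field


# Hypothetical parameters: 20 x 6 cells, blocks of 5 x 2 cells, 30 % inclusions.
realization = binary_block_field(
    nx=20, nz=6, block_nx=5, block_nz=2, p_high=0.3, k_high=1e-3, k_low=1e-6, seed=1
)
for row in realization:
    print("".join("#" if k == 1e-3 else "." for k in row))
```

Calling the function with different seeds yields an ensemble of realizations, which is the basis for a Monte Carlo treatment of the unresolved structure.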

Binary stochastic models are a simple way to capture the effects of intermediate-scale features

The inclusion topology is a matter of choice and data availability.
A simple design is a distribution of non-overlapping blocks with horizontal length

Characteristic length scales in a vertical direction

The binary structure as in Fig.

Variations in grain size and soil texture form small-scale heterogeneities of characteristic length scales up to 1 m. Their relevance for transport predictions depends on the degree of heterogeneity and ergodicity. A plume is considered ergodic when the behaviour within one realization is statistically representative, i.e. exchangeable with ensemble behaviour.
Figuratively speaking, an ergodic plume has travelled long enough to sufficiently sample heterogeneity. This is usually assumed for transport distances of 10–100 characteristic lengths
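The rule of thumb above can be turned into a quick check by expressing the travel distance in units of the characteristic length. The sketch below is only an illustration: the function name and the three class labels are assumptions introduced here, and the travel distance and length scales in the example are hypothetical.

```python
def ergodicity_class(travel_distance, characteristic_length, lower=10.0, upper=100.0):
    """Classify a plume by the number of characteristic lengths it has travelled.

    Below `lower` lengths the plume has sampled too little heterogeneity,
    above `upper` ergodicity is commonly assumed, and in between the answer
    depends on the degree of heterogeneity.
    """
    if characteristic_length <= 0.0:
        raise ValueError("characteristic_length must be positive")
    ratio = travel_distance / characteristic_length
    if ratio < lower:
        return "non-ergodic"
    if ratio < upper:
        return "borderline"
    return "ergodic"


# Hypothetical plume that has travelled 200 m through features of three scales.
for length in (1.0, 10.0, 100.0):
    print(f"length scale {length:5.1f} m -> {ergodicity_class(200.0, length)}")
```

For features classified as ergodic, effective parameters are a reasonable simplification; for the other two classes the features need explicit or stochastic representation.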

If required, small-scale features can be conceptualized with a log-normal conductivity distribution

Geostatistical parameters can be inferred from spatially distributed observations (Fig.

Geostatistical values for MADE from DPIL (direct push injection logging)

When combined with larger heterogeneity structures, small-scale fluctuations are subordinate. In the case of field evidence, Module (C) can be combined with Modules (A) and (B) by adding zero-mean fluctuations. According to

The MADE site is a rare example with geostatistics from multiple observation methods (Figs.

The large variance determined for MADE could well be the result of preferential flow and/or trends in mean conductivity. Thus, explicitly representing deterministic zones (Module A) and preferential flow paths (Module B) might render the representation of small-scale features (Module C) redundant. Modelling hydraulic conductivity as log-normal fields solely based on Module (C) seems warranted when there is no indication of deterministic zones or preferential pathways.
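The effect of a trend in mean conductivity on the apparent variance can be made concrete with the law of total variance. The sketch below assumes two zones of equal size with the same within-zone log-conductivity variance; pooling their values adds a between-zone term to the variance. All names and numbers are hypothetical and are not the MADE statistics.

```python
import math
import random
import statistics


def pooled_variance(weight_1, mean_1, mean_2, var_within):
    """Variance of ln K pooled over two zones with equal within-zone variance.

    Law of total variance: var_within + w1 * w2 * (mean_1 - mean_2)**2.
    """
    weight_2 = 1.0 - weight_1
    return var_within + weight_1 * weight_2 * (mean_1 - mean_2) ** 2


def sample_two_zone_log_k(n_per_zone, mean_1, mean_2, var_within, seed=0):
    """Draw the same number of normal ln K samples from each of two zones."""
    rng = random.Random(seed)
    sigma = math.sqrt(var_within)
    zone_1 = [rng.gauss(mean_1, sigma) for _ in range(n_per_zone)]
    zone_2 = [rng.gauss(mean_2, sigma) for _ in range(n_per_zone)]
    return zone_1 + zone_2


# Hypothetical zones: mean K of 1e-5 and 1e-3 m/s, within-zone variance of 1.
mean_up, mean_down = math.log(1e-5), math.log(1e-3)
analytic = pooled_variance(0.5, mean_up, mean_down, 1.0)
samples = sample_two_zone_log_k(10000, mean_up, mean_down, 1.0)
empirical = statistics.pvariance(samples)
print(f"within-zone variance: 1.0, pooled variance: {analytic:.2f} (sampled: {empirical:.2f})")
```

In this assumed setting, a contrast of 2 orders of magnitude between the zone means raises the pooled variance to several times the within-zone value, which is why a unimodal fit to pooled data can suggest much stronger small-scale heterogeneity than is present within each zone.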

The hierarchy of scales poses an inherent problem for each groundwater model based on heterogeneous field data. Data interpretation often does not allow general trends to be clearly distinguished from randomness.
The three modules provide a simple classification of transport-relevant heterogeneity scales: (A) beyond the plume scale, i.e. above 100 m; (B) in the range of the plume scale (about 10–100 m); and (C) subscale (

Which module to integrate at a specific site depends on multiple aspects: (i) Is there field data evidence for a heterogeneity structure of a certain length scale? (ii) Are there sufficient data to parameterize a conceptual heterogeneity representation? (iii) Is it necessary to represent the heterogeneity given the travel distance of the plume (ergodicity)? A positive answer to each of these questions for a certain module warrants its consideration in the conductivity conceptual model.
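The three questions amount to a simple selection rule: a module is included only if all three answers are positive. A minimal sketch of that rule follows; the class name, field names, and the example answers are hypothetical and stand in for a site-specific assessment.

```python
from dataclasses import dataclass


@dataclass
class ModuleAssessment:
    """Answers to the three selection questions for one module."""
    field_evidence: bool     # (i) evidence for structure at this length scale?
    sufficient_data: bool    # (ii) enough data to parameterize it?
    needs_resolution: bool   # (iii) must it be represented given the travel distance?


def select_modules(assessments):
    """Return the names of all modules for which every question is answered yes."""
    return [
        name
        for name, a in assessments.items()
        if a.field_evidence and a.sufficient_data and a.needs_resolution
    ]


# Hypothetical answers for an example site.
site = {
    "A": ModuleAssessment(True, True, True),
    "B": ModuleAssessment(True, True, True),
    "C": ModuleAssessment(True, True, False),
}
print(select_modules(site))
```

In this example, Module C is dropped although data exist, because the third question is answered negatively.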

We validate our approach by performing flow and transport calculations for the MADE setting without parameter calibration. Although many approaches to model the transport at the MADE site exist, including detailed aquifer conceptualizations (e.g.

Based on the scale-dependent conductivity modules (Sect.

Following the approach steps outlined in Sect.

The MADE site is located on the Columbus Air Force Base in Mississippi, United States.
The aquifer was characterized as shallow, unconfined, of about 10–11 m thickness

The MADE-1 transport experiment was conducted in the years 1986–1988

Concentrations were observed within a spatially dense monitoring network at several times after injection. We focus on the reported longitudinal mass distribution of Adams and Gelhar (1992; Fig. 7) at six times: 49, 126, 202, 279, 370, and 503 d after injection. Values are integrated measures over transverse planes and accumulated over segments of

Three hydraulic conductivity conceptualizations are designed in line with the specifications for MADE in Sect.

Realizations of hydraulic conductivity structures.

Following the lines of

We fix mean conductivity values in the zones as

When fixing regional conductivities from pumping tests, the model scale coincides with the measurement scale. This way, our structures are independent of the upscaling of the method-specific (and location-specific) geometric means reported for MADE (Fig.

Flowmeter logs from MADE show a significant discontinuous heterogeneity in the layering (Fig.

The binary conductivity distribution is constructed for the entire domain comprising both deterministic zones. The upstream zone

We identify the specific values of

The inclusions' structure in both zones is designed according to the simplified block structure outlined in Sect.

Longitudinal mass distribution at

The parameter

For the Monte Carlo approach, we create ensembles of

We combine Modules (A), (B), and (C) to an inclusion structure in deterministic zones with small-scale fluctuations (A+B+C), depicted in Fig.

The log-normal fluctuations

Flow and transport are calculated making use of the finite element solver OpenGeoSys

Mass distributions at times

We checked the impact of dimensionality. A detailed discussion is provided in the Supplement. We found almost no differences between 2D and 3D simulation setups where the binary structure (Module B) dominates. Extending the binary structure in the horizontal direction perpendicular to the main flow does not provide additional degrees of freedom for the flow. Consequently, extending the model hardly affects the flow, and hence the transport pattern, while significantly increasing computational effort. However, dimensionality does matter for conductivity conceptualizations in which the log-normal distribution prevails, i.e. those dominated by Module (C). For this application, reducing complexity by using 2D instead of 3D models is warranted because the conductivity conceptualizations are dominated by the binary structure (Module B).

Simulation results are processed like the MADE-1 experimental data. Longitudinal mass distributions are vertical averages and accumulated horizontally over
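The post-processing step can be sketched for a two-dimensional concentration field on a regular grid. The example below sums concentrations vertically and accumulates them into horizontal segments, then normalizes by the total mass. The function name and the toy data are hypothetical, and the segment length is left as a free parameter (`bin_width`) rather than fixed to a particular value. Summing instead of averaging vertically gives the same normalized result when the aquifer thickness is uniform.

```python
def longitudinal_mass_distribution(conc, dx, dz, bin_width):
    """Normalized mass per horizontal segment from a 2D concentration field.

    conc[z][x] holds concentrations on a regular grid with cell size dx by dz.
    Porosity and transverse width are assumed constant and cancel on normalization.
    """
    nx = len(conc[0])
    # Vertically integrated mass in each grid column.
    column_mass = [sum(row[i] for row in conc) * dx * dz for i in range(nx)]
    # Accumulate columns into horizontal segments of length bin_width.
    cells_per_bin = max(1, round(bin_width / dx))
    bins = [
        sum(column_mass[i:i + cells_per_bin]) for i in range(0, nx, cells_per_bin)
    ]
    total = sum(bins)
    return [b / total for b in bins] if total > 0 else bins


# Toy field with 2 layers and 4 columns (hypothetical values).
toy_conc = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 2.0, 0.0],
]
distribution = longitudinal_mass_distribution(toy_conc, dx=1.0, dz=1.0, bin_width=2.0)
print(distribution)
```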

Figure

The mass distribution for the deterministic structure (concept A, yellow) shows a sharp peak close to the injection location and no mass downstream.
The conductivity structures with inclusions in deterministic zones (A + B, blue) and with subscale heterogeneity (A + B + C, green) result in skewed mass distributions, with a peak close to the injection area and a small amount of mass ahead of the bulk. Shaded areas indicate parametric uncertainty due to the variable inclusion length

A direct comparison of the mass distributions

Figure

Breakthrough curves: total mass

BTCs are not available for the MADE-1 transport experiment. However, we added the aggregated mass values at the three locations for the six reported times in a subplot to indicate a trend of temporal mass development. Note that mass values of the BTCs and those at MADE are at different scales due to data aggregation and mass recovery.

All conductivity structures were able to reproduce the skewed hydraulic head distribution as observed at MADE (Fig.

The deterministic block structure (A) failed to reproduce the skewed mass distribution observed at MADE. The leading front mass travelling through fast flow channels could not be predicted (Fig.

Tracer transport in a binary conductivity structure with inclusions (concept A + B) reproduces the observed mass, both for the peak near the injection site and the leading front.
The simulated longitudinal mass distribution shows a second peak downstream (Fig.

The horizontal inclusion length

The predicted plume shape for the conductivity structure with inclusions and subscale heterogeneity (A + B + C) is almost identical to the one without subscale heterogeneity (A + B). Consequently, the inclusion structure determines the shape of the distribution, whereas the impact of subscale heterogeneity is minor. Given the model aim of plume prediction, the additional effort of determining the geostatistical parameters that characterize the subscale heterogeneity is not warranted.

The binary conductivity conceptualization (A + B) was derived for MADE with few observations from standard methods, as can be expected to be present at many field sites. The price for the limited amount of data is parametric uncertainty. A sensitivity study revealed that the mass distribution resulting from the binary conductivity structure is very robust against the choice of parameters. The inclusion length

We introduce a modular concept of heterogeneous hydraulic conductivity for predictive modelling of field-scale subsurface flow and transport. The central idea is to combine deterministic structures with simple stochastic approaches in order to rely on few measurements and to forgo calibration. The scale hierarchy of hydraulic conductivity induces three structure modules which represent (A) deterministic large-scale features like facies, (B) intermediate-scale heterogeneity like preferential pathways or low conductivity inclusions, and (C) small-scale random fluctuations. Field evidence of heterogeneity features and the modules' input parameters are provided by observation methods with the appropriate detection scale. The specific form of the scale-dependent features depends on the site characteristics and field data. We propose a deterministic model for large-scale features, a simple binary statistical model for intermediate-scale features, and a log-normal random model for small-scale features. However, the integration of alternative conductivity structures is possible. Thus, the concept is easily adaptable to any field site, making aquifer heterogeneity accessible for practical applications.

An illustrative example is given for the heterogeneous MADE site. Three modular conductivity structures are constructed, based on two observations: (i) the existence of distinct zones of mean flow velocity and (ii) high conductivity contrasts in depth profiles, suggesting local inclusions acting as fast flow channels. The structures are used in a predictive flow and transport model which is free of calibration. The comparison of results to the MADE-1 field tracer experiment showed that all conceptualizations can be of value, depending on the modelling aim. However, predicting the mass plume behaviour required heterogeneity to be taken into account.

The combination of deterministic structures and simple binary stochastic structures showed the best results given the trade-off between transport prediction and the need for measurements. Realizations of hydraulic conductivity were composed of binary inclusions in two blocks with different average conductivity. Details of the topology are thereby secondary, since binary structures are robust towards the choice of specific parameters.

The simple binary structure was able to capture the overall characteristics of the MADE tracer plume with reasonable accuracy, requiring only a small number of observations. Among the few predictive transport models for the MADE site, the approach presented shows a higher level of simulation effort due to the Monte Carlo simulations. However, the lower level of data requirements makes it attractive for application at less investigated sites. Note that when applying the proposed heterogeneity conceptualization in other modelling applications, a 3D model setup should be considered first, in particular when heterogeneity is conceptualized by a log-normal distribution (Module C). A complexity reduction to 2D models is warranted when the heterogeneous conductivity conceptualizations do not impact the flow pattern in the transverse horizontal direction, such as the binary structure. The generality of the binary concept makes it easily transferable to other sites, particularly when focusing on a few, but scale-related measurements.

A hierarchical conductivity structure allows for a balance between complexity and available data. Large-scale structures determine the mean flow behaviour, which is most critical for flow predictions. They can be integrated into a model with reasonably low effort. Structural complexity increases with a decreasing heterogeneity scale, where small-scale features have the highest demand on observation data. However, even with limited information on the conductivity structure, simple stochastic modules can be used to incorporate the effect of heterogeneity. When additional measurements become available, the conductivity structure can be extended by further modules, for instance to represent small-scale features.

Distinguishing the effects of the scale-specific features on flow and transport also allows the need for further field investigations and potential strategies to be identified. The adaptive construction based on scale-specific modules allows a conductivity structure model to be created as complex as necessary but as simple as possible.

Simple binary models are very powerful when dealing with strongly heterogeneous aquifers. They require less observation data than uni-modal heterogeneity models, such as log-normal conductivity with high variance.
Binary models also allow effects of dual-domain transport models to be incorporated without the drawback of having non-measurable input parameters which require model calibration. Our work shows that highly skewed solute plumes can be reproduced with classical ADE models by incorporating deterministic contrasts and effects of connectivity statistically.
In summary, we conclude the following:

Modular concepts of conductivity structure allow the multiple scales of heterogeneity to be separated. Scale-related investigation methods provide field evidence and conductivity model parameters. A hierarchical approach for conductivity can thus reduce observation effort by focusing on the model aim.

Site-specific heterogeneous hydraulic conductivity can be easily constructed with simple methods, taking the (limited) amount of data into account. For aquifers with a high conductivity contrast, we recommend combining large-scale deterministic structures and simple binary stochastic models.

The application example at MADE showed that complex field structures can be represented appropriately for transport predictions with an economic use of investigation data.

Study-related Python scripts are publicly available at

The supplement related to this article is available online at:

All authors contributed to developing the approach and writing the paper. Simulations and figure preparation were performed by AZ.

The authors declare that they have no conflict of interest.

The authors would like to thank Jeff Bohling for background information on flowmeter and DPIL data. We thank the editor and reviewers for their helpful comments.

The article processing charges for this open-access publication were covered by a Research Centre of the Helmholtz Association.

This paper was edited by Nunzio Romano and reviewed by Joost Herweijer and three anonymous referees.