Unrepresented model errors influence the estimation of effective soil
hydraulic material properties. As the required model complexity for a
consistent description of the measurement data is application dependent and
unknown a priori, we implemented a structural error analysis based on the
inversion of increasingly complex models. We show that the method can
indicate unrepresented model errors and quantify their effects on the
resulting material properties. To this end, a complicated 2-D subsurface
architecture (ASSESS) was forced with a fluctuating groundwater table while
time domain reflectometry (TDR) and hydraulic potential measurement devices monitored
the hydraulic state. In this work, we analyze the quantitative effect of
unrepresented (i) sensor position uncertainty, (ii) small-scale
heterogeneity, and (iii) 2-D flow phenomena on estimated soil
hydraulic material properties with a 1-D and a 2-D study. The results of
these studies demonstrate three main points: (i) the fewer sensors are
available per material, the larger the effect of unrepresented model
errors on the resulting material properties; (ii) the 1-D study yields
biased parameters due to unrepresented lateral flow; and (iii) representing and
estimating sensor positions as well as small-scale heterogeneity decreased
the mean absolute error of the volumetric water content data by more than a
factor of

Soil hydraulic material properties are essential to advance quantitative
understanding of soil water dynamics. Despite decades of research, direct
identification of these properties is time-consuming and nearly impossible
at larger scales. Therefore, indirect identification methods, such as
inversion

Most of these studies describe
the given data with models chosen upfront with restricted complexity and a
minimum number of parameters. If the models are too simple, critical
uncertainties and processes may be neglected, leading to suboptimal results.
If the models are too complex, the resulting material properties are likely
to be application dependent. In general, the required model complexity is
unknown a priori

This
problem can be quantified with a Bayesian total error analysis (BATEA;

In this work, we change the perspective and associate the model with our quantitative understanding of reality that is tested against the given measurement data. To analyze the required model complexity, we prescribe temporally constant material properties, calculate the maximum likelihood of increasingly complex models, and analyze the corresponding structural model–data mismatch. We show that this structural error analysis indicates limitations of these models and quantifies the effect of the respective unrepresented model errors on the inversely estimated material properties. Specifically, we analyze measurement data acquired at the test site (ASSESS) while it was forced with a fluctuating groundwater table, which ensures a high dynamical range of the hydraulic state. We set up a basic representation accounting for uncertainties of the hydraulic material properties and the forcing. Following an uncertainty analysis, we additionally estimate the sensor position and small-scale heterogeneity. These increasingly complex models are applied to (i) three 1-D profiles in ASSESS with an increasing number of sensors per material and (ii) the full 2-D profile to additionally analyze the implications of the restriction to a 1-D subsurface architecture and to few sensors per material.

The grain size distribution in percent by weight displays the different granularity of the
materials A, B, and C of ASSESS (G. Schukraft, personal communication, Institute of Geography,
Heidelberg University, 2010). Whereas the composition of the materials B and C is similar, material A
features a higher percentage of fine sand. Since the mechanical wet analysis is time-consuming and laborious,
only material B was sampled twice. Thus,

The approximately

The test site is equipped with a
weather station, a tensiometer (UMS T4-191), and

View of ASSESS site with tensiometer access tube, weather station, and groundwater well along the
left boundary. The jump in color reveals different sands that crop out at the surface (figure adapted from

ASSESS features an effective 2-D architecture with three different kinds of
sand (A, B, and C). The hydraulic state can be manipulated with a groundwater
well (white square, at

For representing the soil water dynamics in ASSESS during the experiment, we
follow

The Richards equation

We choose the Brooks–Corey parameterization

Small-scale heterogeneities, i.e., the texture of the porous medium, can be
represented with Miller scaling if the pore spaces at any two points are
assumed geometrically similar

The hydraulic state was forced with a fluctuating
groundwater table by pumping water in or out of a groundwater well. The
experiment was arranged in three different phases: (i) initial
drainage phase, (ii) multistep imbibition phase, and (iii) multistep drainage phase. The detailed forcing is presented in Table

The position of the groundwater table was measured manually in the groundwater well and automatically
with the tensiometer (Fig.

The measured water content data for the three different phases (initial drainage, multistep
imbibition, and multistep drainage – separated by the solid vertical black lines in the figure) show a
high variability up to and beyond the validity limits of the Richards equation due to the fluctuating
groundwater table (Fig.

During the experiment, ASSESS was forced with a fluctuating groundwater table. Therefore,

The hydraulic state was monitored in particular with hydraulic potential and
water content measurements during the experiment. The hydraulic potential was
assessed via the position of the fluctuating groundwater table. This position
was measured (i) manually in the groundwater well and (ii) automatically with
the tensiometer (Fig.

The water content data are based on measured TDR traces which yield the
relative permittivity of the soil

The evaluated water content data
of those TDR sensors that were desaturated during the experiment are
displayed in Fig.

We attribute the spread of the water content during saturation mainly to
small-scale heterogeneity and quasi-saturation due to entrapped air

As outlined in Sect.

Preparing the tools for the method, we start this section
with the Levenberg–Marquardt algorithm (Sect.

We employ the Levenberg–Marquardt algorithm for parameter
estimation. Our implementation is based on

Assuming (i)

We follow

Finally, the parameter update

The described gradient-based algorithm heuristically balances performance and stability.
Expanding the stability measures, we introduce a damping vector
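To make the structure of such a damped gradient-based update concrete, the following is a minimal textbook Levenberg–Marquardt sketch for a two-parameter problem. It is not the implementation used in this study: the forward model `model` is a hypothetical exponential decay, the damping is a single scalar `lam` with Marquardt scaling of the diagonal, and the damping vector introduced here is not reproduced.

```python
import math

def model(t, params):
    """Hypothetical forward model (assumption): a * exp(-b * t)."""
    a, b = params
    return a * math.exp(-b * t)

def residuals(params, times, data):
    return [model(t, params) - y for t, y in zip(times, data)]

def jacobian(params, times):
    """Analytic derivatives of the model with respect to a and b."""
    a, b = params
    return [(math.exp(-b * t), -a * t * math.exp(-b * t)) for t in times]

def cost(res):
    return 0.5 * sum(r * r for r in res)

def levenberg_marquardt(params0, times, data, lam=1e-2, max_iter=500, step_tol=1e-12):
    params = list(params0)
    res = residuals(params, times, data)
    for _ in range(max_iter):
        jac = jacobian(params, times)
        # Normal equations: A = J^T J, g = J^T r
        a11 = sum(j[0] * j[0] for j in jac)
        a12 = sum(j[0] * j[1] for j in jac)
        a22 = sum(j[1] * j[1] for j in jac)
        g1 = sum(j[0] * r for j, r in zip(jac, res))
        g2 = sum(j[1] * r for j, r in zip(jac, res))
        # Damped system (A + lam * diag(A)) * dp = -g, solved by Cramer's rule
        d11 = a11 * (1.0 + lam)
        d22 = a22 * (1.0 + lam)
        det = d11 * d22 - a12 * a12
        if det <= 0.0:
            lam *= 10.0
            continue
        dp1 = (-g1 * d22 + a12 * g2) / det
        dp2 = (-d11 * g2 + a12 * g1) / det
        trial = [params[0] + dp1, params[1] + dp2]
        trial_res = residuals(trial, times, data)
        if cost(trial_res) < cost(res):
            # Accept the step and trust the quadratic model more
            params, res = trial, trial_res
            lam = max(lam / 10.0, 1e-12)
            if max(abs(dp1), abs(dp2)) < step_tol:
                break
        else:
            # Reject the step and move towards gradient descent
            lam *= 10.0
    return params

# Usage with noise-free synthetic data generated from a = 2.0, b = 0.5
times = [0.5 * k for k in range(10)]
data = [model(t, (2.0, 0.5)) for t in times]
estimate = levenberg_marquardt((1.0, 1.0), times, data)
```

Rejected steps increase the damping and shorten the update, accepted steps decrease it, which is the heuristic balance between performance and stability mentioned above.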

This overview specifies whether each considered model error is represented and explicitly estimated within the scope of this study.

By applying the

The structural error analysis and the assessment of
uncertainties result from iterative evaluations. To illustrate the method,
we present an iteration where the orientation of ASSESS was not yet
compensated by rotating the geometry and the gravitation vector (Sect.

The subsurface architecture of ASSESS (Fig.

A visual analysis of the standardized residual
increases the intuitive understanding of the model–data mismatch

In addition to the visual
analysis of the standardized residual, statistical measures help to quantify
the model–data mismatch. As a single measure might be misleading

The setup of the parameter estimation is explained in
Fig.

In
order to analyze the effect of the uncertainty of the sensor position,
small-scale heterogeneity, and lateral flow on the estimated material
properties along the lines presented in
Sect.

In the basic setup, we estimated the hydraulic material properties, an offset to the Dirichlet boundary condition, and the saturated hydraulic conductivity of the gravel layer.

With the position setup, we estimated the sensor positions in addition to the parameters in the basic setup.

For the Miller setup, we estimated one Miller scaling factor for each TDR sensor in addition to the parameters in the basic setup.

Finally, in the Miller and position setup, we estimated both the sensor positions and one Miller scaling factor for each TDR sensor in addition to the parameters in the basic setup.
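As an illustration of what one Miller scaling factor per TDR sensor changes in the Miller setups, the sketch below applies the standard Miller relations to Brooks–Corey reference functions: the water content is evaluated at the scaled matric head, and the conductivity is multiplied by the squared scaling factor. The Mualem–Brooks–Corey exponent, the symbol names (`xi`, `tau`, `lam`), and all parameter values are assumptions chosen for illustration; they are not the estimates of this study.

```python
from dataclasses import dataclass

@dataclass
class BrooksCorey:
    theta_r: float  # residual water content (-)
    theta_s: float  # saturated water content (-)
    h_0: float      # air-entry matric head (m), negative
    lam: float      # pore size distribution parameter (-)
    k_s: float      # saturated hydraulic conductivity (m/s)
    tau: float      # tortuosity-type exponent (-)

def theta_ref(h_m, p):
    """Reference soil water characteristic for matric head h_m (m)."""
    if h_m >= p.h_0:
        return p.theta_s
    return p.theta_r + (p.theta_s - p.theta_r) * (h_m / p.h_0) ** (-p.lam)

def conductivity_ref(theta, p):
    """Reference conductivity function in the commonly used Mualem form."""
    saturation = (theta - p.theta_r) / (p.theta_s - p.theta_r)
    return p.k_s * saturation ** (p.tau + 2.0 + 2.0 / p.lam)

def theta_scaled(h_m, xi, p):
    """Miller scaling: theta(h_m) = theta_ref(h_m * xi)."""
    return theta_ref(h_m * xi, p)

def conductivity_scaled(theta, xi, p):
    """Miller scaling: K(theta) = K_ref(theta) * xi**2."""
    return conductivity_ref(theta, p) * xi ** 2

# Hypothetical reference material and two sensor locations
sand = BrooksCorey(theta_r=0.03, theta_s=0.38, h_0=-0.2, lam=2.5, k_s=1e-4, tau=0.5)
theta_fine = theta_scaled(-0.5, 0.8, sand)    # xi < 1: locally finer pores
theta_coarse = theta_scaled(-0.5, 1.2, sand)  # xi > 1: locally coarser pores
```

With a factor of one, the reference material is recovered, so the basic setup is the special case in which all scaling factors are fixed to unity.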

The available hydraulic potential

In order to investigate the extent to which the experiment at
ASSESS can be described with a 1-D model, we set up three different cases with
an increasing number of TDR sensors per material (Table

As described above, the analysis is organized in four
different setups (basic, position, Miller, and
Miller and position). The basic setup is adjusted for the 1-D
studies: for case II and case III, we estimate not only the material
functions of the materials with sensors but also the saturated conductivity
of the third material (material A in case II and material C in case III).
The other setups are adapted accordingly. Further details concerning the
implementation of
the 1-D study are given in Sect.

For each of the different setups, we ran an ensemble of 20 inversions, starting
from Latin hypercube sampled initial parameter sets in order to analyze the
convergence behavior. The sampling algorithm was implemented with the help of
the pyDOE package (
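The idea behind the Latin hypercube sampled initial parameter sets can be sketched with the standard library alone. The function below is a stand-in for illustration and does not call pyDOE; the parameter bounds in the usage lines are hypothetical. Each parameter range is split into as many equal strata as there are ensemble members, one value is drawn per stratum, and the strata are shuffled independently per parameter.

```python
import random

def latin_hypercube(bounds, n_samples, seed=0):
    """Return n_samples parameter tuples, one per stratum in every dimension.

    bounds: list of (low, high) pairs, one per parameter.
    """
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # One uniformly drawn point inside each of the n_samples strata of [0, 1)
        points = [(k + rng.random()) / n_samples for k in range(n_samples)]
        # Shuffle so that strata are paired randomly across parameters
        rng.shuffle(points)
        columns.append([low + u * (high - low) for u in points])
    return [tuple(column[i] for column in columns) for i in range(n_samples)]

# Ensemble of 20 initial parameter sets for two hypothetical parameters
initial_sets = latin_hypercube([(0.1, 5.0), (-1.0, -0.05)], n_samples=20)
```

Compared to independent random draws, this stratification covers each parameter range evenly even for a small ensemble, which helps when analyzing the convergence behavior from different starting points.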

The 1-D study comprises three different cases which investigate the three materials
with an
increasing number of TDR sensors per material at different locations in ASSESS (Fig.

In this study, we expand the investigated domain
to two dimensions and analyze the performance of the improved representation. To this
end, we define four setups (basic, position,
Miller, and Miller and position) as described above. Since the
positions of both the tensiometer and the groundwater well are in the modeled
domain, we use the hydraulic potential measurement data as well as the TDR
measurement data in this study. Thus, the position setup is adjusted
such that both the positions of TDR sensors and the tensiometer are
estimated. All inversions for the 2-D study are initialized with the initial
state material functions (Sect.

In order to improve the quantitative understanding
of the hydraulic behavior of ASSESS (Sect.

The standardized residual
for each case is presented in Fig.

The large residuals are not
random and preferably occur in transient phases. We attribute them to missing
processes in the dynamics or to biased parameters. As the curves in the
probability plot are basically centered at the origin, a significant constant
bias in the residuals can be excluded. The corresponding statistical measures are
given in Table

The

For the 1-D study, the standardized residuals of the best ensemble member are visualized over time

In order to analyze the results of the 1-D study, the performance of the best ensemble member for each case and for each setup is benchmarked with statistical measures. With increasing numbers of included TDR sensors per material, the statistical measures for the basic setup indicate a worse description of the measurement data. However, estimating the position and the Miller scaling factor for each TDR sensor improves the description of the measurement data significantly according to the statistical measures.
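A benchmark of this kind can be sketched as follows. The selection of measures (mean error, mean absolute error, root mean square error, maximum absolute error) and the definition of the standardized residual as model minus measurement divided by an assumed measurement standard deviation `sigma` are illustrative assumptions; the numbers in the usage lines are made up.

```python
import math

def standardized_residuals(simulated, measured, sigma):
    """Model-data mismatch in units of the assumed measurement standard deviation."""
    return [(s - m) / sigma for s, m in zip(simulated, measured)]

def mismatch_statistics(simulated, measured):
    """Summary measures of the model-data mismatch."""
    errors = [s - m for s, m in zip(simulated, measured)]
    n = len(errors)
    return {
        "mean_error": sum(errors) / n,
        "mean_absolute_error": sum(abs(e) for e in errors) / n,
        "root_mean_square_error": math.sqrt(sum(e * e for e in errors) / n),
        "max_absolute_error": max(abs(e) for e in errors),
    }

# Hypothetical volumetric water contents of one TDR sensor
simulated = [0.30, 0.32, 0.28]
measured = [0.31, 0.30, 0.28]
stats = mismatch_statistics(simulated, measured)
scaled = standardized_residuals(simulated, measured, sigma=0.007)
```

Reporting several measures side by side guards against the case where a single number hides a structural mismatch, for example a small mean error produced by large residuals of opposite sign.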

Comparing the resulting material properties of the evaluated ensemble members
for the different cases and setups (Fig.

We also ran the
inversions without estimating the offset to the Dirichlet boundary condition
(Sect.

The estimated material functions of the best ensemble member are shown for
each of the three cases (case I,
case II, and
case III) and the four setups of the
1-D study. Additionally, we present the material functions resulting from the
initial state estimation (Sect.

The estimation of uncertain model components can lead to correlated
estimated parameters, e.g., as an incorrect position of the groundwater table
(

The three cases cover the three materials at different locations in ASSESS and are based on distinct data with respect to both quantity and data range.

This is most evident for material A
which is located at the bottom of ASSESS and nearly saturated in case
I, whereas it is at the top and rather dry in case III (colored dots
in Fig.

In order to minimize the structural model–data mismatch during this
equilibration phase, the parameter estimation algorithm increases the
hydraulic conductivity to compensate for the non-represented lateral flow
with additional vertical flow from above the sensor. Hence, the hydraulic
conductivity of case I is larger than the hydraulic conductivity for both
case III and the 2-D study, which is discussed in
Sect.

The measurement data
of material B used in the inversions of case II and case III do not emphasize the relaxation of the capillary fringe strongly. Hence,
we expect that the effect of the unrepresented lateral flow is not as
significant as for material A, leading to relatively congruent
resulting material functions. This expectation is confirmed by the results,
except for those setups of case II, in which no Miller scaling factor
was estimated. These setups show a larger curvature of the soil water
characteristic and of the hydraulic conductivity function which is explained
in Sect.

As for material B, the inversions for material C are not strongly influenced by the relaxation of the capillary fringe. The large uncertainty in the saturated hydraulic conductivity reflects the low sensitivity of the measurement data to this parameter due to the lack of measurements influenced by the saturated material C.

The curvature of the soil water characteristic for
the inversion results is reasonably close to the initial state material
functions (Sect.

The standardized residuals of the 2-D study are visualized over time

We show the resulting material functions for all three materials involved in
the 2-D study which is analyzed with the four setups
(basic,
position,
Miller, and
Miller and position). The plot range is
adjusted to the available water content range for each material.
The height of the histogram bars denotes the number of available water
content measurements and is normalized over all figures in this work. Since
the inversions for all setups are initialized with the material functions
resulting from the initial state estimation (Sect.

For the 2-D study, the
number of sensors is comparable to the number of hydraulic material
parameters. Therefore, estimating sensor positions and Miller scaling factors
increases the total number of parameters and thus the computational cost
considerably (basic:

In order to understand this deviation in more detail, we investigate the
remaining structural model–data mismatch during the final drainage and
equilibration phases between 30 and 40 h. The largest residuals
occurring during the drainage phase around

The largest residuals during the final equilibration phase between
30 and 40 h come from TDR sensors 2 and 22 close to the capillary
fringe. We attribute them to unrepresented processes in the dynamics, such as
hysteresis or 3-D flow (Sect.

Due to
the persisting large residuals during transient phases, the probability plot
(Fig.

For each setup of the 2-D study, the results are benchmarked with statistical measures. Similar to the 1-D study, estimating the sensor position and the Miller scaling factors improves the statistical measures related to the water content significantly. The statistical measures for the position of the groundwater table including both the tensiometer and the groundwater well data improve only for setups in which the sensor positions are estimated.

The description of the hydraulic potential
data only improves in those setups in which the sensor position is estimated
(Fig.

The 2-D study is based on a
larger number of water content measurements, additional hydraulic potential
measurements, and more complicated flow phenomena compared to the previously
discussed 1-D study (Sect.

We present the effective hydraulic material parameters obtained with the Miller and position setup of
the 2-D study. The formal standard deviations of the parameter estimation are given with the understanding that these are
specific to the applied algorithm and will change for different algorithm parameters.
The estimations for the saturated hydraulic conductivity of the gravel layer and for the offset to the Dirichlet boundary
condition are

Each setup starts from
the same initial material functions (Sect.

To investigate this, consider the initial state
estimation for material B shown in Fig.

It is worth
noting that although the uncertainty of the measured grain size distribution
(Table

We applied a structural error analysis to a representation of the effectively 2-D architecture of ASSESS. This representation includes TDR and hydraulic potential measurement data which were acquired during a fluctuating groundwater table experiment. Based on the assumption that structural model–data mismatch indicates incomplete quantitative understanding of reality, we implemented a 1-D and a 2-D study organized in different setups with increasingly complex models. We started with the estimation of effective hydraulic material properties and then added the estimation of sensor positions, small-scale heterogeneity, or both. The structural error analysis was shown to indicate significant unrepresented model errors, such as the slope of the ASSESS test site.

We showed that estimated material properties resulting from a
1-D study are biased due to unrepresented lateral flow. Analyzing
representations with increasing data quantity, it was also found that the
fewer sensors are available per material, the stronger is the influence of
the unrepresented model errors on the estimated material properties. We
illustrated that the more complicated flow phenomena are represented, the
better uncertain model components can be separated by the parameter
estimation algorithm leading to more reliable material properties. Generally,
representing sensor position uncertainty and small-scale heterogeneity
improved the description of the water content data quantitatively in setups
with many sensors. Yet, the residuals of the water content data still reach
more than

In order to minimize the error in the initial state, we developed a method to estimate the initial water content distribution based on TDR measurements and an approximation of the hydraulic head, which additionally yields an approximation of the soil water characteristic. We found that this approximation is reasonably close to the inversion results and that the corresponding parameters can be used as initial parameters for gradient-based optimization. Since all the inversions of the 2-D study are initialized with these parameters, the comparison of the results directly displays the quantitative effect of the corresponding unrepresented model errors on the estimated material properties.

Since all three approaches ((i) initial state estimation, (ii) 1-D inversion, and (iii) 2-D inversion) allow the estimation of effective hydraulic material parameters, we finally discuss how far each of them improves the quantitative understanding of soil water dynamics.

The initial state estimation requires at least three water content measurements
per material over the full water content range and the position of the
groundwater table to estimate the parameters of the soil water characteristic
for one specific equilibrated hydraulic state. Lacking direct measurements of
the unsaturated hydraulic conductivity, the method cannot estimate the
remaining parameters

The 1-D inversions are comparatively fast (several minutes up to several hours on a local machine) and can represent transient states. In contrast to the initial state estimation, 1-D inversions can estimate all parameters of the material functions. However, more complicated flow phenomena including lateral flow cannot be represented. This leads to biased parameters.

The unique characteristic of the 2-D inversions (days on a cluster with the same number of cores as parameters) is the ability to represent lateral flow phenomena, which are typically monitored with a high number of sensors. Hence, the consistency of the representation is implicitly checked. Therefore, we expect that, of the three approaches discussed, this one yields the most reliable material properties. Still, unrepresented model errors including 3-D flow phenomena influence the results.

The underlying measurement data are available at

The Richards equation (Eq.

ASSESS is not built completely rectangular.
Most importantly, both the surface and the ground are not horizontal but
primarily inclined towards the groundwater well with a mean slope of

The volumetric water content is
evaluated from measured TDR traces (Fig.

The evaluation of a TDR trace is based on the detection of the inflection points caused by the probe head and the end of the rod. This is done automatically after calculating the first temporal derivative of the trace. Parabolas are fitted to the maxima of the temporal derivative to increase the precision of the evaluated signal travel time.
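The parabola refinement of a derivative maximum can be sketched as follows. The sketch assumes a uniformly sampled trace, forward differences for the temporal derivative, and a parabola through exactly three samples around the largest derivative value; the number of points actually used for the fit and the treatment of several reflections are not specified here, and the function names are hypothetical.

```python
import math

def first_derivative(trace, dt):
    """Forward differences; value k is located at time (k + 0.5) * dt."""
    return [(trace[k + 1] - trace[k]) / dt for k in range(len(trace) - 1)]

def refine_maximum(values, i):
    """Fractional index of the vertex of the parabola through samples i-1, i, i+1."""
    left, mid, right = values[i - 1], values[i], values[i + 1]
    curvature = left - 2.0 * mid + right
    if curvature == 0.0:
        return float(i)
    return i + 0.5 * (left - right) / curvature

def reflection_time(trace, dt):
    """Time of the steepest rise of the trace with sub-sample precision."""
    deriv = first_derivative(trace, dt)
    # Restrict the search to interior samples so that both neighbors exist
    i_max = max(range(1, len(deriv) - 1), key=lambda k: deriv[k])
    return (refine_maximum(deriv, i_max) + 0.5) * dt

# Synthetic trace (assumption): smooth step with its inflection point at t = 2.34
dt = 0.1
trace = [math.tanh((k * dt - 2.34) / 0.5) for k in range(50)]
t_reflection = reflection_time(trace, dt)
```

The refinement matters because the signal travel time is the difference of two such times, so an error of one sampling interval in either of them propagates directly into the evaluated permittivity.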

The numerical solution of the Richards
equation (Eq.

In
order to represent the small-scale variability of the pore space, which is
not covered by describing the different sand types with distinct material
properties, the center of each grid cell
is associated with a Miller scaling factor (Eq.

Generally, the boundary of the simulation is
implemented with a Neumann no-flow condition. However, during the forcing
phases, we prescribe the measured groundwater table as a Dirichlet boundary
condition at the position of the groundwater well. In addition to the
orientation of ASSESS (Sect.

Since we use an inversion method
for parameter estimation (Sect.

Hence, we developed a method to estimate the initial water content distribution based on TDR measurement data.

In the first step, we
assume static hydraulic equilibrium and approximate the matric potential at
the measured position of the TDR sensors with the negative distance of this
position to the groundwater table. Subsequently, the approximated matric
potential is associated with the measured water content for each sensor.
Further, we assume spatially homogeneous and temporally constant material
properties which allow us to group the data of the TDR sensors by material,
together with the approximated matric potential and the measured water
content. For each material, we then fit the parameters

As the parameters for
the Brooks–Corey parameterization are derived from static measurement data,
we may use them as initial parameter values for computationally expensive
gradient-based inversions of dynamic measurement data (Sect.

In particular due to (i) a limited number of TDR sensors, (ii) missing hydraulic potential measurements at the position of the TDR sensors, and (iii) spatial small-scale heterogeneity present in the materials, structural deviations between the estimation and the measurements occur, which indicate limitations of describing ASSESS with effective soil hydraulic material properties.
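The initial state estimation described above can be sketched for a single material. Under static hydraulic equilibrium, the matric head at a sensor is the negative height of the sensor above the groundwater table. To keep the sketch short, it assumes that the residual and saturated water contents are known, which linearizes the Brooks–Corey fit in log space; this simplification, the function names, and the sensor data are assumptions and do not reproduce the fit used in this study.

```python
import math

def matric_head_at_equilibrium(z_sensor, z_groundwater):
    """Matric head (m) as negative distance to the groundwater table (z upwards)."""
    return -(z_sensor - z_groundwater)

def brooks_corey_theta(h_m, theta_r, theta_s, h_0, lam):
    """Brooks-Corey soil water characteristic with air-entry head h_0 < 0."""
    if h_m >= h_0:
        return theta_s
    return theta_r + (theta_s - theta_r) * (h_m / h_0) ** (-lam)

def fit_brooks_corey(heads, thetas, theta_r, theta_s):
    """Estimate h_0 and lam from ln(saturation) = -lam * ln(-h_m) + lam * ln(-h_0)."""
    xs, ys = [], []
    for h_m, theta in zip(heads, thetas):
        saturation = (theta - theta_r) / (theta_s - theta_r)
        # Only unsaturated points carry information on h_0 and lam
        if h_m < 0.0 and 0.0 < saturation < 1.0:
            xs.append(math.log(-h_m))
            ys.append(math.log(saturation))
    if len(xs) < 2:
        raise ValueError("need at least two unsaturated measurements")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    lam = -slope
    h_0 = -math.exp(intercept / lam)
    return h_0, lam

# Hypothetical sensors of one material: heights (m) and measured water contents
z_groundwater = 0.0
heights = [0.3, 0.5, 0.8, 1.2]
heads = [matric_head_at_equilibrium(z, z_groundwater) for z in heights]
measured = [brooks_corey_theta(h, 0.03, 0.38, -0.2, 2.5) for h in heads]
h_0_fit, lam_fit = fit_brooks_corey(heads, measured, theta_r=0.03, theta_s=0.38)
```

With real data, the scatter of the points around the fitted line corresponds to the structural deviations discussed above, for example those caused by small-scale heterogeneity or by an uncertain sensor position.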

The water contents measured by TDR sensors 5, 12, and 29 deviate
structurally from the estimation of the initial state for material B
(Fig.

We use the Brooks–Corey parameterization to estimate the initial water content distribution between
the TDR sensors. Assuming hydraulic equilibrium, we approximate the matric potential

The estimated initial water content distribution is based on the TDR measurement data
(Fig.

The forward simulations were calculated with a grid resolution of

The 2-D simulations in this work are calculated with a grid resolution of

SJ designed and conducted the experiment, developed the main ideas, implemented the algorithms, and analyzed the measurement data. KR contributed with guiding discussions. SJ prepared the paper with contributions of both authors.

The authors declare that they have no conflict of interest.

We thank Jens S. Buchner for the code to process the ASSESS architecture raw data, Angelika Gassama for technical assistance with respect to ASSESS, and Andreas Dörr for helping to set up a Beowulf cluster. Additionally, we thank Hannes H. Bauser, Andreas Dörr, and Patrick Klenk for discussions that improved the quality of the manuscript. We especially thank Patrick Klenk and Elwira Zur for assistance during the experiment. The authors acknowledge support by the state of Baden-Württemberg through bwHPC and the German Research Foundation (DFG) through grants INST 35/1134-1 FUGG and RO 1080/12-1. We are also grateful to the editor Roberto Greco, two anonymous reviewers, and Conrad Jackisch, who all helped to improve the manuscript significantly.

Edited by: Roberto Greco
Reviewed by: two anonymous referees