Coupled hydrogeophysical inversion of an artificial infiltration experiment monitored with ground penetrating radar: synthetic demonstration

. In this study, we investigate the use of ground penetrating radar (GPR) time-lapse monitoring of artificial soil infiltration experiments. The aim is to evaluate this protocol in the context of estimating the hydrodynamic unsaturated soil parameter values and their associated uncertainties. The originality of this work is to suggest a statistical parameter estimation approach using MCMC to have direct estimates of the parameter uncertainties. The use of the GPR time data from the moving wetting 5 front only does not provide reliable results. Thus, we propose to use additional information from other types of reflectors to optimize the quality of the parameter estimation. Water movement and electromagnetic wave propagation in the unsaturated zone are modeled using a one-dimensional hydrogeophysical model. The GPR travel time data are analyzed for different re-flectors: a moving reflector (the infiltration wetting front) and three fixed reflectors located at different depths in the soil. Global sensitivity analysis (GSA) is employed to assess the influence of the saturated hydraulic conductivity K s , the saturated and 10 residual water contents θ s and θ r , and the Mualem–van Genuchten shape parameters α and n of the soil on the GPR travel time data of the reflectors. Statistical calibration of the soil parameters is then performed using the Markov chain Monte Carlo (MCMC) method. The impact of the type of reflector (moving or fixed) is then evaluated by analyzing the calibrated model parameters and their confidence intervals for different scenarios. GSA results show that the sensitivities of the GPR data to the hydrodynamic soil parameters are different between moving and fixed reflectors, whereas fixed reflectors at various depths 15 have similar sensitivities. K s has a similar and


Introduction
The vadose zone is defined by the region between the ground surface and the groundwater table.Because of its location, it is at the center of the interactive atmospheric-surface-underground water system.Hence, understanding water flow in the vadose zone is crucial for hydrological modeling and forecasting that can be useful for water resources management, agricultural practices optimization, or geotechnical studies.The porous medium in the vadose zone is filled by both water and air phases.
The air phase is considered infinitely mobile and remains at atmospheric pressure.The movement of water has a non-linear behavior and is characterized by two fundamental hydraulic relationships, namely, the water retention and the hydraulic conductivity functions.Various mathematical expressions can describe these functions in terms of dependent variables and fitting parameters.In this work, we use the Mualem-van Genuchten (Mualem 1976, van Genuchten 1980) hydraulic conductivity and water retention models.These models include the following unsaturated soil hydraulic parameters: the saturated hydraulic conductivity, the saturated and residual water contents, and the Mualem-van Genuchten shape parameters α and n.
Different approaches can be applied to estimate the unsaturated soil parameters.In soil physics, the reference method relies on laboratory measurements conducted on soil core samples.Such experiments can use various techniques such as thermogravimetry or tensiometry, but common practices rely on hydraulic fluxes measurements (Vereecken et al., 2008).Laboratory measurements can provide direct measurements of the soil hydraulic properties or state variables and can therefore be of great accuracy at the column scale.On the other hand, it is prone to certain limitations when the objective is to deduce the soil parameter values at larger scales.Indeed, sample analysis through laboratory experiments is unlikely to provide parameter estimates at field conditions since the volume of the analyzed samples is often not representative of the field heterogeneity at the mesoscale (Scharnagl et al., 2011).In addition, the method is invasive and can be labor-intensive for deep or large scales investigations (Binley et al., 2015).Furthermore, the conservation of collected samples can be challenging because of issues of compaction and changes in porosity.
At the field scale, the soil hydraulic properties and state variables can be estimated using numerous approaches.Measurements of the soil water content, water pressure, and hydraulic conductivity can show significant variations because of their sensitivity to different hydrological processes.As a consequence, such measurements are convenient for the estimation of soil parameters of the subsurface at the field scale by inverse modeling.Soil hydraulic properties and state variables measuring techniques can be classified into two categories, based on whether the measuring devices do have to be in direct contact or not with the soil.In the first instance, when the measuring devices must be in direct contact with the soil, measurements can present a spatial support around the micro (mm -cm) and local scale (cm -m) with water content sensing techniques (using thermal or electromagnetic sensors, e.g., capacitance or time domain reflectometry, Jones et al., 2005;Belfort et al., 2019), water pressure measurements with tensiometers (Cassel and Klute, 1986) or psychometers (Rawlins and Campbell, 1986), and hydraulic conductivity measurements with permeameters (Kodešová et al., 1998) and infiltrometers (Muntz et al., 1905).These techniques can yield data with great resolution at one location and give information on the dynamics at the field scale (Vereecken et al., 2008).In addition, measurements taken at various locations can help to describe the distribution of water content, and thus, allow a good characterization of the state of the soil.For measurements with sensors, however, their installation is often laborious, time-consuming, and destructive (Huisman et al., 2003;Dal Bo et al., 2019).Furthermore, their reliability requires an accurate calibration (Robinson et al., 2008).
Other techniques use non-invasive devices that don't have to be in direct contact with the soil, like remote sensing and hydrogeophysical methods.Remote sensing techniques use devices that operate remotely and relatively far from the ground, such as unmanned aerial vehicles thermal infrared imagery (Zhang et al., 2019) or airborne ground penetrating radar (Edemsky et al., 2021).These methods provide the mapping of water content at a large scale and in locations where contact-based sensing measurements cannot be conducted.However, remote sensing methods exhibit a penetration depth of only a few centimeters and are often limited by the vegetation density (Vereecken et al., 2008;Robinson et al., 2008).
Common hydrogeophysical methods include electromagnetic induction (Doolittle and Brevik, 2014), direct current resistivity (de Jong et al., 2020), nuclear magnetic resonance (Costabel and Günther, 2014), and ground penetrating radar (GPR) (Huisman et al., 2003;Klotzsche et al., 2018) methods.These techniques supply indirect information on hydraulic properties or states, at various scales, from estimated geophysical properties.As mentioned by Binley et al. (2015), such conversion from geophysical to hydraulic properties or states requires the use of robust petrophysical relationships to provide reliable estimates of hydraulic parameters.
Nowadays, GPR is highly used in the field of hydrogeophysics.Different techniques have been reviewed and discussed by Huisman et al. (2003) and Klotzsche et al. (2018).Indeed, GPR is highly sensitive to water content, and, as such, it can close the gap between the spatial scales covered by direct and remote sensing techniques (Klotzsche et al., 2018).Note however that the hydraulic properties estimated from GPR data are subject to an inherent compromise between a deep investigation and a fine spatial resolution.For instance, the lowest frequencies (typically from 1 GHz down to 100 MHz) allow deeper penetrations (until a maximum depth between 1 m and 3 m in most organic media).The temporal variability of the soil water content can be characterized from time-lapse GPR measurements.In this case, the GPR method is applied in a static approach, where, instead of classically imaging the spatial variation of the properties of the subsurface, the device is set immobile and captures how the properties of the soil change over time.GPR data can be collected during artificial hydraulic processes (e.g., infiltration, runoff, drainage, imbibition) that can provide interesting information on the flow characteristics.Compared to other hydraulic processes, artificially forced infiltration is particularly fast.It also induces a rapidly evolving transient hydraulic perturbation.Time-lapse GPR is characterized by a high spatial and temporal resolution and is therefore well adapted for monitoring such a fast hydraulic process.Artificial infiltration process is also easy to establish since it only requires the application of a positive water pressure head on the soil surface.Hence, time-lapse GPR monitoring of artificial infiltration experiment is usually effortless and time-saving.Furthermore, except in the case of borehole investigations, the GPR device can be laid on or raised above the surface.For these reasons, time-lapse GPR monitoring of artificial infiltration is fast and easy-to-apply and repeat at multiple locations, and, when used on or above the soil surface, non-destructive.Therefore, it is one of the cheapest approach that fits well in the context of mapping the unsaturated soil parameters' heterogeneity at a small catchment scale.
At the laboratory scale, Léger et al. (2020) have monitored imbibition-drainage experiments using a single-offset surface GPR.Jaumann and Roth (2018) conducted similar experiments but at the test site scale, where they showed reasonable results when estimating the soil unsaturated parameters and the subsurface architecture.As already pointed out however, this hydraulic process can take longer than an infiltration to reach a steady state and is also practically harder to conduct at the field scale.Busch et al. (2013) calibrated the Mualem-van Genuchten parameters of their model by monitoring natural precipitation and evapotranspiration events at the field scale.Such slow hydraulic process can last several days or months and is therefore not suitable for easy and fast characterization.Infiltration experiments have been conducted at the laboratory scale by Moysey (2010), where they considered the GPR two-way travel time (TWT) from various sources of reflection.They showed that the Mualem-van Genuchten shape parameter n is the most poorly constrained among all unsaturated soil parameters On the field, infiltration processes have been monitored with borehole GPR (Scholer et al., 2011), single-offset (Léger et al., 2014;2016) and multi-offset (Saito et al., 2018) surface GPR, or off-ground GPR (Jadoon et al., 2012;Jonard et al., 2015).For practicality, surface GPR is preferred over off-ground and borehole GPR, the latter also being destructive by nature.Saito et al. (2018) used a more complex multi-offset and multi-channel surface GPR to directly monitor the wetting front progression.Mono-channel multi-offset technique is usually not suited for monitoring experiments with high temporal variability, as the offset must be adjusted between each measurement.The multi-channel technique has the advantage to be multi-offset and is, therefore, able to simultaneously determine the propagation speed and the depths of reflectors.
In the present study, we are interested in using a quick, easy-to-apply, and cheap field scale method to characterize the unsaturated soil parameters.To this end, time-lapse GPR monitoring of artificial infiltration is a well suited protocol.It is similar to ring infiltrometry methods but with additional information from GPR measurements.In the literature, the work of Léger et al. (2014) is the closest one considering this protocol for parameter estimation.The authors have demonstrated the relevance of such a methodology to evaluate the hydraulic parameters of sandy soil.They have investigated synthetic and field examples and showed that the inverted parameters were in agreement with the values obtained in the laboratory for soil samples and with disk infiltrometer measurements.However, in their study, Léger et al. (2014) used an optimization-based inversion algorithm which did not allowed to assess the reliability of the estimated values since the uncertainty associated with the calibrated parameters has not been evaluated.Furthermore, Léger et al. (2014) employed only the TWT data obtained from the GPR reflection on the wetting front for the calibration of the soil parameters, which was satisfactory enough for them to obtain such remarkable results.The original work presented here aims to extend the actual state of the art by: -Considering different reflectors at different depths: a moving reflector which corresponds to the infiltration dynamic wetting front and two fixed reflectors located at different depths in the soil.
-Investigating the influence of all soil parameters (the saturated hydraulic conductivity, the saturated and residual water contents, and the Mualem-van Genuchten shape parameters α and n) on the GPR TWT data of the three reflectors using Global Sensitivity Analysis (GSA).
-Performing statistical calibration of soil parameters using the Markov Chain Monte Carlo (MCMC) method and evaluating the reliability of the estimated parameters by analyzing not only the calibrated model parameters but also their associated uncertainty.
-Evaluating the impact of the type of reflector (moving or fixed) by analyzing the calibrated model parameters and their confidence intervals for different scenarios.
The plan of the paper is as follows: Section 2 describes the test case as well as the mathematical and numerical hydrogeophysical models.Section 3 reports on the GSA results of the different TWT signals.Then, Section 4 discusses the results of soil parameter estimation with MCMC for different scenarios.
2 Test case description and numerical solution

Test case description
In this work, we conduct a synthetic study on the time-lapse GPR monitoring of artificial infiltration protocol, prior to applying it in real conditions.The idea is to perform synthetic experiments under the same conditions of real experiments to better understand the pertinence of the investigated protocol when used for estimating the unsaturated soil parameters.The test case considered is a hypothetical one-dimensional experiment of water infiltration in a homogeneous sandy soil of 150 cm (Fig. 1a).The approach used to drive the artificial infiltration is comparable to other techniques commonly used to estimate the properties of the porous medium, such as single or double ring infiltrometry.As evidenced in other studies (e.g., Léger et al., 2014), the idea is to add information from the GPR data monitored during the infiltration to have access to more of the hydrodynamic parameters.In the present synthetic case, a constant pressure head of 10 cm is applied at the surface of the soil (i.e., a 10 cm water ponding Dirichlet-type boundary condition is maintained at the top).The medium is initially at the hydrostatic equilibrium with a water table maintained at 100 cm below the soil surface (Fig. 1b).The domain is initially formed by an unsaturated zone of 100 cm thick above a saturated zone of 50 cm thick.We assume the experiment to be monitored with a surface GPR.The propagating time (i.e., the TWT) of the GPR waves reflected by two types of reflectors are considered (Fig. 1c): (i) the moving infiltration wetting front and (ii) two fixed reflectors corresponding to a local heterogeneity at two different depths.For instance, these can be small objects that are artificially buried (e.g., moisture sensing probes) or naturally embedded (such as small rocks) in the porous medium.The fixed reflectors are supposed to be small enough compared to the section of the infiltrated area, so they do not significantly perturb the vertical flow.The upper fixed reflector, R50, is located in the initially unsaturated zone at 50 cm depth.The reflector R120 is located in the saturated zone, under the water table, at a distance of 120 cm from the soil surface (Fig. 1a).In the following, the time-lapse TWT signal for reflection caused by the infiltration wetting front is noted TWT f and that from the two immovable diffracting points R50 and R120, are respectively noted TWT 50 and TWT 120 .

Unsaturated flow model
Water infiltration in unsaturated/saturated soils is governed by the one-dimensional Richards' equation (Richards, 1931): where h (cm) is the pressure head; z is the depth (cm), taken positive in the downward direction; t is the time (s), θ (cm 3 /cm 3 ) is the actual water content, and K(θ) (cm/s) is the hydraulic conductivity which is a function of water content.The initial condition is a hydrostatic pressure distribution corresponding to a water table at 100 cm depth.The boundary condition at the top of the domain is a fixed Dirichlet condition of 10 cm maintained during the experiment.The boundary condition at the bottom is a piezometric head fixed at -100 cm which corresponds to the water table position (Fig. 1).
The interdependencies of the pressure head, conductivity, and water content are described using the standard models of Mualem (1976) andvan Genuchten (1980): where S e (h) (-) is the effective saturation, θ s and θ r (cm 3 /cm 3 ) are the saturated and residual water contents, respectively, K s (cm/s) is the saturated conductivity, m = 1 − 1/n, α (1/cm), n (-) are the Mualem-van Genuchten shape parameters, and L (-) is a parameter characterizing the tortuosity of the flow paths of moving water in the interconnected pores of the soil.It is set at L = 0.5 here, following the works of Mualem (1976) and van Genuchten (1980).

Petrophysical and Geophysical relationships
In GPR sounding, pulses of radiofrequency (MHz to GHz) electromagnetic waves are emitted from a transmitting antenna through the sounded medium.The electromagnetic response is then acquired with a receiving antenna.With a surface GPR, both antennas are installed at the surface of the soil (Fig. 1).To monitor the experiment of water infiltration with time-lapse GPR, the sounding system is set immobile above the infiltration zone in order to capture the time variation of the electromagnetic response due to the change of saturation.
To describe the dependency of the dielectric permittivity on the water content, we use the complex refractive index model (Birchak et al., 1974).This petrophysical relationship relates the dielectric constant ϵ (-) of a three-phase (water-solid-air) medium to its water content by: where ϕ (-) is the porosity, considered equal to the saturated water content θ s , ϵ w = 80, ϵ s = 2.5 (Léger et al., 2014) and ϵ a = 1 are the dielectric constants of water, silica (sand) and air, respectively.
In this work, the soil is considered as a linear and isotropic non-magnetic medium.When working with frequencies below 1 GHz, the soil electrical conductivity can be neglected.In this case, the electromagnetic waves propagate at a speed V (cm/ns) (Annan, 2003): where c ≈ 30 cm/ns is the speed of electromagnetic waves in air, and ϵ (-) is the dielectric constant of the porous medium.
Equations ( 4) and ( 5) evidence that GPR waves propagate at a much lower speed in wet conditions.Any source of reflection in the sounded soil produces a reflected wave that is recorded at a time corresponding to the duration of its propagation, from the transmitting antenna, down to the source of reflection, then back up to the receiving antenna, i.e., the TWT of the reflected wave.
We consider a one-dimensional scenario (the offset between the antennas is null) and discretize the domain into N cells i, centered at a depth z i , with element boundaries at z i−1/2 and z i+1/2 .The TWT for the reflection occurring at the interface (i − 1/2) between the elements i − 1 and i can be expressed as the sum of the vertical TWT in each element above i: in which |l j | (cm) is the length of the element j above i and V j (cm/ns) is the GPR propagation speed in the element j.
A reflection occurs at the interface between two successive elements if the reflection coefficient is not zero.The reflection coefficient expresses the contrast of dielectric constant (due to the contrast of water content) at the interface between the two elements i − 1 and i.When the offset between transmitting and receiving antennas is null, the reflection coefficient at interface where ϵ(z i ) is the dielectric constant of the element i.
For an 800 MHz antenna, the wavelength can typically vary from 6 cm in a wet medium to around 18 cm in a dry medium.The abrupt change in the reflection coefficient at the wetting front makes it easily detectable.This statement is true in the presented test case and for any parameter value taken from the prior distributions tested (Table 1).On the contrary, the water table may be hidden due to the softer change in the reflection coefficient at the capillary fringe (Bano, 2006;Saintenoy and Hopmans, 2011).

The numerical model
The variation of the water content in the soil during the infiltration is computed using the WAMOS-1D code (Belfort et al., 2018).The model describes the water movement in the porous medium using Richards' equation (1), and the constitutive relationships between the pressure, the hydraulic conductivity, and the volumetric water content given by Eq. ( 2) and Eq.
(3).The domain of 150 cm depth is discretized with uniform elements of 1 cm thick with homogeneous properties.Such discretization allows an appropriate model precision and a low enough computation time.The WAMOS-1D code solves the system of Eqs. ( 1)-( 3) and yields the vertical distribution of water content at each time step.This distribution is then converted into a vertical dielectric permittivity profile ϵ using the petrophysical relationship Eq. ( 4) and into a GPR wave propagation speed profile V using Eq. ( 5).Then, the time-lapse TWT signals for the fixed objects, TWT 50 and TWT 120 , are calculated at each time step using Eq.(6) (dashed and dotted curves in Fig. 2).
The time-lapse signal TWT f , induced by wave reflection on the wetting front because of the sharp water content variation at the front position is calculated in two steps.First, we search the wetting front position z * i−1/2 , which corresponds to the interface position having the maximum reflection coefficient from Eq. ( 7) as illustrated in Fig. 1.Then, the TWT signal of the wetting front is obtained using TWT f = TWT(z * i−1/2 ) from Eq. ( 6) (solid curve Fig. 2).
Note that TWT 50 and TWT 120 signals are induced by fixed objects, thus, these signals exist regardless of the position of the infiltration front.On the other hand, TWT f is induced by the infiltration wetting front whose position varies over time.
Besides, contrarily to TWT 50 , and TWT 120 , the TWT f signal disappears when the wetting front reaches the water table.To avoid numerical issues when simulations are performed with different soil parameter sets, the value of TWT f when the water table is reached, is artificially maintained for the remaining time steps until the end of simulation time.The water table is assumed to be reached when the maximum reflection coefficient of Eq. ( 7) is under a threshold of 10 -2 .This reflects a fully saturated domain with an almost uniform water content distribution (solid curve Fig. 2).An explanation of the computation of all TWT signals is summarized in Fig. 3. 3 Global sensitivity analysis of TWT signals

GSA method
The GSA method evaluates how the outputs of a model are influenced by the variation of the input parameters (Mara and Tarantola, 2008).Among the various forms of GSA, a variance-based sensitivity analysis, allowing the calculation of Sobol sensitivity indices (Sobol ′ , 2001) is employed.Such indices depict the contribution of the variation of any input variable x to the total variance of an output variable y.In our case, the input variables are the unsaturated soil parameters (K s , θ s , θ r , α, n) and the output variables are the TWT signals (TWT f , TWT 50 , TWT 120 ).

250
Given a model with a set of p independent random parameters X = {x 1 , x 2 , ..., x p } that yields a random response y(X), the two variance-based sensitivity measures, also called Sobol indices (Sobol ′ , 2001) are: the first-order sensitivity index: Ks (cm/s) θs (cm 3 /cm 3 ) θr (cm 3 /cm 3 ) α (1/cm) n (-) the total sensitivity index: where x −i = X \ x i is the set of all parameters except x i , E() and E(.|.) are the expectation and the conditional expectation operators, respectively, Var() and Var(.|.) are the variance and the conditional variance, respectively.The first-order index S i quantifies the contribution of the parameter x i alone to the total variance of y(X), while ST i also includes all interactions of x i with the other parameters x −i .
To perform a variance-based GSA, a practical approach (to save computational time) is to use Polynomial Chaos Expansion (PCE; Wiener, 1938).The PCE approach consists in developing any signal y(X) as a set of orthonormal multivariate polynomials of a degree not exceeding D: where Large prior distribution intervals are considered for all unsaturated soil parameters (Table 1).Such combination of parameters investigated in the GSA is exhaustive and allows to consider a large panel of soil types.Notice that, while simulations with values of n comprised between 1 and 1.5 would allow to investigate a wider range of porous media, they also take much longer to end.
The number of coefficients for a full PCE representation is P = (p + D)!/p!D!.A training dataset of M realizations of the forward coupled hydrogeophysical model is used to build the PCE surrogate model of order D (Fajraoui et al, 2011;Shao et al., 2017;Younes et al., 2013).The coefficients of the PCE are obtained by searching the best fit (in the least square sense) of the PCE surrogate model to the hydrogeophysical model for the M realizations.To work with low-discrepancy sets, the M realizations correspond to sets of input parameters sampled from their prior probability distributions, using quasi-random Sobol sequences (Shao et al., 2017).Because each parameter varies in its own range and has a proper unit, the parameter prior intervals are normalized to [−1, 1] during PCE computation.We illustrate the principle of the construction of the PCE with our hydrogeophysical model in Fig. 3 2 show that the relative difference between the two variances is very small for all investigated times.
Note that although the relative variance error for the TWT 50 at t = 2000 s is the largest (around 7%), it remains insignificant since the total variance of the signal at this time is negligible (less than 2 ns 2 ).The variance of the forward hydrogeophysical model is therefore well reproduced by the PCE surrogate model which will be employed for the GSA of the TWT signals using the variance decomposition.

GSA results
The temporal distribution of the output variance of the three TWT signals (TWT f , TWT 50 and TWT 120 ) are represented Fig. 4.
For each TWT signal, the variance is represented by the black curve and the relative contributions of the uncertain parameters to the variance are represented by the shaded area.The blank region between the variance curves and the shaded area represents interactions between parameters.
TWT f has a different behavior from the TWT signals of fixed reflectors TWT 50 and TWT 120 (Fig. 4).Although the TWT signals of fixed reflectors have different variance magnitudes, they exhibit similar behavior (Fig. 4b and 4c).The variance of the TWT signal is five times more significant for TWT 120 than for TWT 50 .This is in agreement with the physics since the zone of the porous medium affecting the GPR wave is more important for the TWT 120 signal than for the TWT 50 .In addition, the period of influence of the unsaturated parameters (θ r , α, n) is also more important for TWT 120 than for TWT 50 since saturated conditions for the reflector R120 are reached much later than for R50.Since fixed reflectors exhibit similar behavior, in the 300 following, we comment on the results of TWT f and TWT 120 signals.

GSA of the TWT f signal
TWT f variance is zero at the beginning of the infiltration (Fig. 4a) which means that the TWT f signal is not affected by the initial conditions.Indeed, the infiltration wetting front and the TWT f signal start at zero for all parameter sets.Then, the variance of the signal increases until a maximum of 60 ns 2 , reached at around 3 min.After that, the variance decreases, but keeps a significant value of around 25 ns 2 (Fig. 4a).Concerning parameter sensitivities, at the beginning, the TWT f signal is mainly affected by K s .The influence of this parameter decreases over time and reaches zero for long times when steady-state conditions (corresponding to a fully saturated soil) are reached.The parameter θ s has a moderate influence on the TWT f signal.
Its influence is not observable at short times since unsaturated conditions occur.Overall, the most influential parameter on the TWT f signal is the van Genuchten parameter α.This parameter seems influential even for saturated conditions.Note that this numerical artifact is observed because the value of TWT f is artificially maintained when the infiltration wetting front reaches the water table, while physically the signal disappears.The effects of the parameters θ r and n are not observable (Fig. 4a).The blank region between the variance curve and the shaded area in this figure is due to the interaction between the parameters.
To estimate this interaction, we plot the difference between the total (ST i ) and the first order (S i ) sensitivity indices for all parameters (Fig. 5a).This difference reflects the interaction between the parameters over time.Interactions between parameters are negligible for all parameters (ST i ≈ S i ), except for K s and α (Fig. 5a).Hence, the interaction between these two parameters affects the variance of the TWT f signal as represented by the blank region between the variance curve and the shaded area (Fig. 4a).
To evaluate further the effect of the unsaturated soil parameters on the TWT f , we plot the marginal effect of each parameter (Fig. 6).The marginal effect can be easily derived from the PCE coefficients and reflects the effect of one parameter on the output signal.Fig. 6 depicts the marginal effects of each hydraulic parameter, i.e., their influence on the TWT signals as a function of their value when they vary over the range of their prior distribution interval, while the other hydraulic parameters are kept fixed at their center value.This representation allows determining the regions of influence of the hydraulic parameters, given that the stronger the slope of the marginal effect curve, the higher the influence of the parameter.These marginal effects can vary over time, so we represent them at the three time steps (t 1 = 1 min, t 2 = 5 min, and t 3 = 200 min) highlighted with dotted vertical black lines in Fig. 4. The oscillations are caused by numerical artifacts related to the degree of the polynomials used in the PCE model.From Fig. 6a, it can be noticed that: -K s is highly influential at the beginning of the experiment.At t 1 = 1 min, the TWT f signal varies almost linearly with K s .Indeed, at the beginning of the experiment, when K s increases, the wetting front is more advanced, thus, the GPR wave propagates at a lower speed and the TWT f signal increases.At t 2 = 5 min, the TWT f signal is sensitive only for small K s values.Indeed, for high K s values, the soil is fully saturated and the perturbation of the high value of K s doesn't change the TWT f signal.At t 3 = 200 min, the soil is fully saturated for almost all K s values, thus, the TWT f signal becomes insensitive to K s .
θ s has no influence at the first times (t 1 = 1 min) since unsaturated conditions occur.For long times, the TWT f signal is very sensitive to θ s with an almost linear behavior.Indeed, when the soil is fully saturated, the dielectric permittivity and thus the TWT f signal is almost proportional to θ s .
the sensitivity of TWT f to θ r is moderate and can be observed only at the beginning of the experiment (unsaturated conditions) with an almost linear behavior observable at t = 1 min and 5 min.The positive slope of the curve is consistent with the physics of the process (when θ r increases, the speed of the electromagnetic wave decreases, and the TWT f signal increases).
-The van Genuchten parameter α is highly influential notably for long times (t 3 = 200 min).A small variation of the parameter α can induce a strong variation of the TWT f signal.Notably, the sensitivity of α is very high for α ≤ 0.05 -The sensitivity of TWT f to the parameter n is almost zero (flat curves) at all times (t = 1 min, 5 min, and 200 min).
The parameter n has therefore a negligible effect on the TWT f signal and, as a consequence, it is expected to be poorly identifiable from the TWT f data.

GSA of the TWT 120 signal
The variance of the TWT 120 signal is nonzero at the beginning of the experiment which means that the TWT 120 signal is affected by the initial conditions (Fig. 4c).Indeed, at the very beginning, the pressure distribution is hydrostatic and the water content distribution in the column is obtained from Eq.2 which depends on all soil parameters except K s .Therefore, the speed of the GPR wave depends on the initial water content distribution which is dependent on the unsaturated soil parameters θ s , θ r , α, and n.The most influential parameter at the beginning of the experiment is the parameter α.Over time, the effect of this parameter reduces, whereas the effect of θ s increases.For long times, θ s becomes the only sensitive parameter.The parameter K s is also very sensitive.Its effect starts at zero, and increases until a maximum is reached at around 3 min, then it slowly decreases and becomes negligible after 100 min.As with the TWT f signal, interactions between parameters are moderate.The difference between the total and first-order Sobol indices is negligible for all parameters except after 1 min for the parameters K s , α and θ s (see Fig. 5b).This interaction corresponds to the blank region, between the variance curve and the shaded area in -As for the TWT f signal, K s is highly sensitive, especially for t = 1 min and 5 min.
-The saturated water content θ s is very influential for all times.The TWT 120 varies almost linearly with θ s even at the beginning (t 1 = 1 min), since the fixed reflector is located in the lower saturated region.
-As for the TWT f signal, θ r is sensitive only at the beginning of the experiment (unsaturated conditions) with an almost linear behavior at t = 1 min and 5 min.When θ r increases, the water content increases, and hence, the TWT 120 increases.
-The van Genuchten parameter α is highly sensitive.However, contrarily to the TWT f signal where α is highly sensitive at long times (t 3 = 200 min), the sensitivity of α for the TWT 120 signal is high at short times (t 1 = 1 min).For long times, the influence of α disappears since the soil becomes fully saturated.The negative slope of the curve of the TWT 120 signal as a function of α observed at the beginning of the experiment is consistent with the physics of the process.Indeed, when α increases, the capillary fringe thickness decreases, hence, the water content in the unsaturated zone decreases, and thus the TWT 120 signal decreases.
-Surprisingly, and contrarily to the TWT f signal which showed a flat curve for the marginal effect of the parameter n for all parameter values and at all investigated times, the TWT 120 signal is sensitive to n at the beginning of the experiment (t 1 = 1 min) with a high sensitivity for n < 3.5 and a moderate sensitivity (the curve has a small slope) for n ≥ 3.5.
4 Bayesian soil parameter estimation from the TWT signals In this section, we estimate the unsaturated soil parameters in a Bayesian framework using the Markov Chain Monte Carlo (MCMC) sampler (Vrugt and Bouten, 2002;Vrugt et al., 2008).The statistical calibration is performed for a GPR monitored infiltration experiment in order to address the following questions: 1. Can we obtain an appropriate estimation of all unsaturated soil parameters from TWT data?
2. What is the impact of the kind of TWT data (moving/fixed reflectors) and of the number of reflectors on the calibrated model parameters and their confidence intervals?
3. What is the optimal set of TWT measurements that yields good reliability of all unsaturated soil parameters?
The MCMC method has been successfully employed in various inverse hydrological problems (e.g., Fajraoui et al., 2011;Younes et al., 2016;Younes et al., 2017;Younes et al., 2018).The method generates random sequences of parameter sets that asymptotically converge toward the target joint posterior distribution by searching the ensemble of possible parameter sets that satisfactorily fit the observations.The converged sets can then be used to assess the quality of the parameter estimation such as the optimal parameter values and the 95% Confidence Intervals (CIs) which allow for evaluating the reliability of the parameters via uncertainty quantification.
In the sequel, the MCMC method is performed with the DREAM (ZS) (DiffeRential Evolution Adaptive Metropolis) software (Laloy and Vrugt, 2012;Vrugt, 2016).This software samples the posterior probability density function (pdf) by running multiple Markov chains simultaneously for global exploration of the parameter space.The prior distributions of the parameters are the same than in the GSA (Table 1).The DREAM (ZS) then automatically tunes the scale and orientation of the proposal distribution until we get the posterior target pdf.A MATLAB toolbox of the DREAM (ZS) algorithm is available for Bayesian inference in fields ranging from physics, chemistry and engineering, to ecology, hydrology, and geophysics.The vector of unknowns is formed by the five unsaturated soil parameters (K s , θ s , θ r , α, n).Compared to the GSA, which allows an investigation of a large panel of soil types, the parameter estimation is demonstrated on a single synthetic case.A reference solution is generated by simulating the hydrogeophysical problem formed by the system of equations ( 1)-( 6) using the following reference parameter values K * s = 0.08 cm/s, θ * s = 0.4, θ * r = 0.07, α * = 0.145 cm -1 , n * = 2.68, as shown in Table 3.These parameter values corresponds to those of a sandy porous medium present in an experimental platform where we test the protocol under real conditions.The modeled TWT f , TWT 50 , and TWT 120 signals used as synthetic calibration data are deduced from the results of the simulation using the reference parameter values.These TWT signals are then independently corrupted using a normally distributed noise with a standard deviation σ = 0.5 ns.This error corresponds to an uncertainty of 1 ns, which is realistic in the instance of an 800 MHz GPR antenna.
The TWT f , TWT 50 and TWT 120 calibration signals, illustrated before noise corruption in Fig. 2, increase almost linearly until reaching a plateau.For the TWT 50 signal, the plateau is reached when the infiltration front attains the R50 reflector and the value of the plateau corresponds to the time needed by the electromagnetic wave to make a round trip from the surface to a 50 cm depth of a full saturated porous medium.For the TWT 120 signal, the plateau signal is reached when the infiltration front attains the water table (the domain becomes fully saturated) and the value of the plateau corresponds to the time needed by the electromagnetic wave to make a round trip from the surface to a 120 cm deep of a fully saturated porous medium.For TWT f , the plateau value is also reached when the infiltration front attains the water table and the value of the plateau corresponds to the time needed by the electromagnetic wave to make a round trip from the surface to the water table at 100 cm deep.
The reliability of the unsaturated soil parameters is assessed for 5 different scenarios of measurement sets.In the first scenario, only data of the wetting-front TWT f signal are used for the calibration.The second and third scenarios use only the TWT 50 and TWT 120 signal, respectively, obtained from reflection on the fixed reflector R50 and R120.The fourth scenario uses both data of TWT f and TWT 120 as fitting data.The last scenario investigates the benefit of adding a fixed reflector by using data of the TWT f , TWT 50 and TWT 120 signals as conditioning information.
In the five scenarios, the MCMC sampler uses three parallel chains and a total number of 50000 runs.The last 25% of the runs that adequately fit the model onto observations are used to estimate the joint posterior distribution.
The MCMC results of the five studied scenarios are given in Table 3 which depicts, for each parameter, the mean estimated value, its posterior CI size, and the ratio of prior to posterior intervals.Note that the CI and the last indicator are calculated from the standard deviation by assuming a Gaussian posterior distribution.A small CI indicates an accurate estimation of the parameter.A significant difference between the prior and posterior intervals is a sign of the high sensitivity of the model to that parameter (Dusek et al., 2015).The posterior histograms and the derived statistics are obtained from the the last 12500 simulations, as mentionned above, for which the Gelman Rubin (Gelman and Rubin, 1992) criterion is verified and the chains are stable and not autocorrelated.
Results of table 3 for scenario 1 using only data of the TWT f signal for the estimation of the unsaturated soil parameters show that: -An accurate estimation of K s , the most sensitive parameter (Fig. 4a), is obtained with a CI of 0.037 cm/s and a variation interval reduced by 4.
-A fair estimate of the parameters θ s with a standard deviation of 0.031 (-) and a reduction of the interval of variation by 5.This result is relatively surprising as this parameter did not show a strong influence on TWT f sensitivity (Fig. 4a).-The parameter θ r is not well estimated.Indeed, although its mean estimated value is very close to its reference value, the associated uncertainty of 0.14 is large and the posterior interval is as large as the prior one, which indicates the low reliability of the estimation.
-A poor estimation of α, while the sensitivity analysis showed it has a strong influence on TWT f (Fig. 4a).Its CI is large, with a value of 0.167 cm -1 and its posterior interval size is half the prior one.

435
-The TWT f signal yields a mean estimated value n = 2.75 ± 0.47 which is close to the reference value n * = 2.68.The parameter n is quite well identified since its posterior interval is 9 times smaller than the prior interval.This is relatively surprising since the sensitivity of n is negligible (Fig. 4a and 6a5).n values comprised between 1.5 and 3 could represent silty loam, sandy loam or sand.Therefore, it is not obvious to consider this result as a good identification.We have performed another estimation (the results are not presented here) with a target value of the n parameter equal to 6, which is located in the low sensitivity region of the parameter.In this case, the inversion led to a relatively good estimation (estimated mean value of 6.97).However, the parameter was poorly identified since its posterior interval was still large, though it was a bit smaller than the prior parameter interval.
In summary, using only data of the TWT f signal as conditioning information for the hydrogeophysical model calibration yielded well mean estimated parameter values, close to the reference values for all unsaturated soil parameters.However, the examination of the associated uncertainties, showed that only K s , θ s , and n are correctly identified (with narrow posterior intervals with respect to the prior ones).This points out the importance of statistical calibration methods for highly nonlinear problems to investigate not only estimated parameter values but also the associated uncertainties.
The estimation of the unsaturated soil parameters for scenarios 2 and 3, using only data of the TWT 50 or TWT 120 signal for the calibration shows that: -The parameters K s and θ s , which are the most sensitive parameters during most of the experiment (Fig. 4b and 4c), are well identified with small CI size and strong reductions by at least 6 for K s , and 19 for θ s , of their intervals of variation.
We note that the TWT 120 signal allows a much better estimate of both K s and θ s as their CIs are smaller than the ones estimated with TWT 50 .It is especially true for K s where there is almost a factor 2 between the reduction ratios.
-The soil parameters θ r , α, and n, although sensitive (Fig. 4b and 4c), cannot be identified from the TWT 50 and TWT 120 signals since their posterior intervals are as large as, or at best two times smaller than their prior ones.
The results of scenario 4 which combines data of TWT f and TWT 120 signals show that: -Both parameters K s , θ s , and n are very well identified, with very narrow posterior intervals showing a strong reduction by 46, 37, and 17 of their prior intervals, respectively.
-The parameters θ r and α are reasonably well estimated with mean values very close to their reference and intervals of variation reduced by 8 and 9, respectively.Fig. 7 shows the posterior histograms obtained from the scenarios 1, 3, and 4. For all parameters, the displayed intervals correspond to the prior upper and lower limits of Table 1.
Finally, the results of the last scenario which combines data of TWT f , TWT 50 and TWT 120 signals, show performances very similar to scenario 4. Additional information from TWT 50 helped to reduce slightly the posterior intervals of K s , θ s , and α that in that case show a reduction of 49, 44, and 10 times their prior intervals, respectively.
The results of MCMC for this last scenario are shown in Fig. 8 where diagonal plots depict the inferred posterior parameter distributions and the off-diagonal scatterplots represent the pairwise correlations in the MCMC draws.Almost bell-shaped posterior distributions are obtained for all unsaturated soil parameters.Negligible correlations are observed between the parameters, except moderate correlations observed between K s and θ r (r=-0.78)and between n and θ r (r=0.64) .
Note that the parameter n is relatively well estimated as the target reference value 2.68 is located in the high sensitivity region (n < 3.5) (Fig. 6).In the case of a reference value located in the low sensitivity region (n ≥ 3.5), the calibration of the The displayed parameter intervals correspond to the prior upper and lower limits of Table 1.
hydrogeophysical model using TWT f and TWT 120 signals yields a much poorer identification of the parameter n.For instance, using scenario 5 with a reference value n * = 4.25, the estimated mean value is 4.84 with a posterior CI size of 3.6, which corresponds to a reduction of the interval of variation by only 2.

Figure 1 .
Figure 1.Test case and experimental device illustration at an advanced time step (a).R50 and R120 are fixed reflectors considered in this experiment.TX and RX refer to the transmitter and receiver antennas of the GPR system.Effective saturation Se (b) and reflection coefficient r (c) profiles with depth. 165

Figure 3 .
Figure 3. Summary of the working process of the forward hydrogeophysical model and how it is used to build the PCE surrogate model.
s β are the PC coefficients, Ψ β are the generalized polynomial chaos of degree |β| = p i=1 β i .In this work, Legendre polynomials are used since uniform distributions are assumed for all uncertain parameters.Uniform distributions express the absence of prior information.This makes all parameter values in the given prior intervals equally likely.

Figure 4 .
Figure 4. Time distribution of the variance of TWTf (a), TWT50 (b) and TWT120 (c).The shaded area under the variance curve represents the partial marginal contributions of the uncertain parameters; the blank region between the shaded area and the variance curves represents the contribution of interactions between the parameters.The marginal effects shown in Fig.6 are represented at three time steps t1 = 1 min, t2 = 5 min, and t3 = 200 min, highlighted here (dotted black lines).

Figure 5 .
Figure 5. Difference between the total (STi) and the first order (Si) sensitivity indices for all parameters for the TWTf (a) and the TWT120 (b) signals.

Fig. 4c .
Fig.4c.The marginal effects of the soil parameters on the TWT 120 signal are plotted in Fig.6b for t = 1 min, 5 min, and 200 min.The curves in this figure show that: First line: Reference values used to build the synthetic calibration data.Then for the different scenarios: estimated mean values (bold), size of the posterior confidence intervals (CIs) (between brackets), and ratio of prior to posterior intervals (italic).

Figure 7 .
Figure 7. MCMC solutions using scenarios 1 (a1-a5), 3 (b1-b5), and 4 (c1-c5) for calibrating the hydrogeophysical model.The histograms are built from the posterior distributions.The estimated mean values are represented in dotted black line and compared to the exact target value (hard red line).The displayed parameter intervals correspond to the prior upper and lower limits of Table

Figure 8 .
Figure 8. MCMC solutions using TWTf, TWT50 and TWT120 signals for the calibration of the hydrogeophysical model.The diagonal plots represent the inferred posterior parameter distributions, showing the estimated mean value (dotted black line) and the target value (hard red line).The off-diagonal represents the pairwise correlations between parameters.

Table 1 .
Prior intervals of the unsaturated soil parameters for both GSA and Bayesian estimation.

Table 2 .
. A PCE is constructed at each time step for all model responses (TWT f , TWT 50 , and TWT 120 ) since we deal with transient simulations.In this work, M = 2048 hydrogeophysical model realizations are employed to obtain PCEs of degrees D = 5 Variance of TWTf, TWT50 and TWT120 signals at t = 50 s, 150 s and 2000 s calculated with the PCE surrogate model and with the hydrogeophysical model.
containing P = 252 coefficients.The obtained PCEs are sufficiently accurate as the variance of the TWT output signals is calculated with the surrogate PCE model and the forward hydrogeophysical model at three different times t = 50 s, 150 s, and 2000 s.The results of Table