The biophysical processes occurring in the unsaturated zone have a direct impact on water table dynamics. Representing these processes with unsaturated zone models of different complexity affects the estimated volumes of water flowing between the unsaturated zone and the aquifer. These fluxes, known as net recharge, are often used as the shared variable that couples unsaturated zone models to groundwater models. However, as recharge estimates are always affected by a degree of uncertainty, model–data fusion methods, such as data assimilation, can be used to inform these coupled models and reduce uncertainty. This study assesses the effect of unsaturated zone model complexity (conceptual versus physically based) on the updating of groundwater model outputs through the assimilation of actual evapotranspiration rates, for a water-limited site in South Australia. Actual evapotranspiration rates are assimilated because they have been shown to be related to the water table dynamics and thus form the link between remote sensing data and the deeper parts of the soil profile. Results have been quantified using standard metrics, such as the root mean square error and Pearson correlation coefficient, and reinforced by calculating the continuous ranked probability score, which is designed to give a more representative error measure for stochastic models. It has been found that, once properly calibrated to reproduce the actual evapotranspiration–water table dynamics, a simple conceptual model may be sufficient for this purpose; thus the choice of one configuration over the other should be motivated by the specific purpose of the simulation and the information available.

Actual evapotranspiration (AET) and groundwater recharge to the water table (WT) are two interrelated components of the water cycle. This is because AET is a function of the soil water content within the root zone, as the root water uptake is distributed along the entire root system

AET is often simulated through a variety of numerical models that reproduce the soil water–vegetation interaction with different levels of detail. Advanced integrated surface water–groundwater models (e.g. Hydrogeosphere,

Conceptual unsaturated zone models (UZMs) simplify the processes occurring in the unsaturated zone and are widely used for spatially distributed hydrological simulations

Given the spatial variability and number of parameters (e.g. the water retention curve and detailed vegetation characteristics) required by physically based models, their application, particularly in data-scarce areas, can be challenging

One way to make use of remote sensing observations is through data assimilation, which combines model results with independent observations to reduce model uncertainty. In the field of hydrology, there is a plethora of studies on the assimilation of diverse observations such as soil moisture (SM), leaf area index, streamflow, and groundwater levels

All satellite observations present a trade-off between accuracy, time frequency, and spatial coverage. In addition, no satellite retrievals are free from errors, as discussed in

This study aims to validate the AET assimilation framework proposed in

The study area is situated in the south-eastern part of South Australia, north of the city of Mount Gambier (Fig.

Location of the study area within Australia

The study site is a

AET data are derived from the remotely sensed CSIRO MODIS-reflectance-based scaling evapotranspiration (CMRSET) algorithm

The tests presented in this study used two different configurations of coupled groundwater–unsaturated zone models, which are depicted in Fig. 2. The following sections describe the models as well as the coupling framework.

Coupled models' representation.

The UnSAT (Unsaturated zone and SATellite) UZM is a one-dimensional soil water balance model. The unsaturated zone is divided into layers, and the water balance of each layer is solved at every time step. Water flows downward from the top layer to the last, and the latter delivers recharge (Fig.

The size of the layers (

AET is calculated as

For the layers below the first, including the last layer, which delivers recharge to the groundwater model, the water balance equation is
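As an illustration of the layered water balance described above, the following is a minimal sketch of a generic cascading-bucket column. It is not the UnSAT formulation: the function name `step_buckets`, the per-layer `capacity`, the removal of AET from the top layer only, and the simple overflow rule are all assumptions made for the example.

```python
import numpy as np

def step_buckets(storage, capacity, infiltration, aet_demand):
    """Advance a cascading-bucket soil column by one time step (hypothetical sketch).

    storage, capacity: water stored / maximum storage per layer (mm), top layer first.
    infiltration: water entering the top layer (mm).
    aet_demand: evapotranspiration demand, taken from the top layer only (assumption).
    Returns (new_storage, aet, recharge).
    """
    storage = np.asarray(storage, dtype=float).copy()
    capacity = np.asarray(capacity, dtype=float)

    # Top layer: add infiltration, then remove AET limited by the available water.
    storage[0] += infiltration
    aet = min(aet_demand, storage[0])
    storage[0] -= aet

    # Each layer passes any water above its capacity to the layer below.
    inflow = 0.0
    for i in range(storage.size):
        storage[i] += inflow
        inflow = max(storage[i] - capacity[i], 0.0)
        storage[i] -= inflow

    # Water leaving the last layer is the recharge handed to the groundwater model.
    return storage, aet, inflow
```

The sketch conserves mass by construction: initial storage plus infiltration equals final storage plus AET plus recharge.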

The Soil Water Atmosphere Plant (SWAP v. 4.0) model, developed by Alterra, is one of the most used physically based UZMs

In SWAP, the Richards equation is solved for the pressure head using finite differences. The soil hydraulic retention functions are based on the analytical formulations proposed by

The groundwater model chosen for the study is MODFLOW 2005

FloPy

UZMs require a shorter time step than MODFLOW as the water content varies at a higher frequency than the depth to the WT in the groundwater model
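One possible way to bridge the two time steps is to average the fine-step recharge over each groundwater time step. The sketch below assumes this averaging; the function name `aggregate_recharge` and the block length are hypothetical and are not taken from the coupling framework itself.

```python
import numpy as np

def aggregate_recharge(fine_recharge, steps_per_gw_step):
    """Average fine-step recharge over each coarser groundwater time step."""
    fine = np.asarray(fine_recharge, dtype=float)
    n_steps = fine.size // steps_per_gw_step
    # Drop any incomplete trailing period, then average within each block.
    blocks = fine[: n_steps * steps_per_gw_step].reshape(n_steps, steps_per_gw_step)
    return blocks.mean(axis=1)
```

For example, `aggregate_recharge(daily_values, 8)` would return one mean recharge rate per 8-step block, which could then be passed to the groundwater model as the recharge for that period.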

Configuration-1 (Fig.

For Configuration-2 (Fig.

The model configurations were applied to a domain of

Schematic of the model domain.

UnSAT can account for the decrease of

In order for the system to be observable, the link between WT levels and AET has to be accurately reproduced. It should be noted that this link has been described in the literature

Calibrated parameter values used for the simulations and their coefficient of variation.

Applying a calibration–validation approach, the observation data sets were divided into two periods. For calibration, 46 time steps covering roughly the year 2001 were used, while the rest of the data set (4.5 years in total) was used for validation.

The EnKF

Usually, in data assimilation studies, the assimilated observations are model states (also called prognostic variables) such as SM, pressure head, and WT levels. This paper uses AET flux observations, which are diagnostic variables. Therefore, the interaction between AET and model states occurs in the UZM, of which AET is a model result. Following

The two configurations apply a similar scheme of the EnKF, the difference lying in the composition of the aggregated state vector, as the state variables of the UZMs are different. Specifically, the state vector of Configuration-1, for a single ensemble member (

For Configuration-2, the vector of soil water pressure heads is

The average state vector reads

The observation from the CMRSET for the

The matrix for observation–simulation deviation is composed as

Combining the matrices calculated above, it is possible to calculate the background state covariance matrix
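The role of the ensemble deviations in the update can be sketched as follows. This is a minimal stochastic (perturbed-observation) EnKF for a single scalar observation, such as one AET value per time step. The function name `enkf_update`, the scalar-observation simplification and the assumed observation error `obs_std` are illustrative assumptions, not the exact scheme used in the study.

```python
import numpy as np

def enkf_update(states, sim_obs, obs, obs_std, rng):
    """Stochastic EnKF update for one scalar observation (hypothetical sketch).

    states:  (n_state, n_ens) ensemble of state vectors.
    sim_obs: (n_ens,) simulated observation (e.g. AET) for each member.
    obs:     observed value; obs_std: its assumed error standard deviation.
    """
    n_ens = states.shape[1]
    # Deviations of states and simulated observations from their ensemble means.
    state_dev = states - states.mean(axis=1, keepdims=True)
    obs_dev = sim_obs - sim_obs.mean()
    # Sample cross-covariance (state vs. simulated observation) and observation-space variance.
    cov_xy = state_dev @ obs_dev / (n_ens - 1)
    var_yy = obs_dev @ obs_dev / (n_ens - 1)
    # Kalman gain for a scalar observation.
    gain = cov_xy / (var_yy + obs_std**2)
    # Perturb the observation for each member and apply the update.
    perturbed = obs + rng.normal(0.0, obs_std, size=n_ens)
    return states + np.outer(gain, perturbed - sim_obs)
```

Because the gain is built from the cross-covariance between the states and the simulated observation, a state that is uncorrelated with simulated AET in the ensemble is left essentially unchanged, which is why the ensemble must preserve the AET–WT relationship.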

According to

The generation of a statistically meaningful ensemble, which preserves the relationship between AET and WT levels obtained during the calibration, is crucial for the application of the EnKF

First, a simple perturbation of forcing inputs, by adding a random number sampled from Gaussian distributions with different standard deviations, as performed by

In this section, the results for the open loop and assimilation runs are assessed. This is done by analysing the error between the model predictions and the observations. The common error metrics used to assess the overall errors in these models are the root mean square error (RMSE), the Pearson correlation coefficient (

The RMSE and

The second metric is the Pearson correlation coefficient, which describes the relationship between the observed values and the predicted model values. The Pearson correlation coefficient (

In particular, this investigates the strength of the linear relationship between the predicted and observed values as they proceed through time. A value of
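The two deterministic metrics can be computed as in the sketch below, which uses their standard textbook definitions. The function names `rmse` and `pearson_r` are chosen for illustration; when applied to an ensemble, the inputs would typically be the ensemble mean and the observations (an assumption here).

```python
import numpy as np

def rmse(sim, obs):
    """Root mean square error between simulated and observed series."""
    sim, obs = np.asarray(sim, dtype=float), np.asarray(obs, dtype=float)
    return float(np.sqrt(np.mean((sim - obs) ** 2)))

def pearson_r(sim, obs):
    """Pearson correlation coefficient between simulated and observed series."""
    sim, obs = np.asarray(sim, dtype=float), np.asarray(obs, dtype=float)
    sim_dev, obs_dev = sim - sim.mean(), obs - obs.mean()
    # Covariance of the deviations divided by the product of their magnitudes.
    return float(np.sum(sim_dev * obs_dev)
                 / np.sqrt(np.sum(sim_dev**2) * np.sum(obs_dev**2)))
```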

The CRPS quantifies, for each time step, the difference between the predicted cumulative distribution and the observed value. The CRPS is calculated, at a specific time step, from the cumulative distribution function given by the ensemble simulation of the variable of interest

This is calculated over the entire simulation period, and the average CRPS is defined as
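For an equally weighted ensemble, the CRPS of the empirical cumulative distribution can be evaluated without numerical integration, using the identity CRPS = E|X - y| - 0.5 E|X - X'|, where X and X' are ensemble members and y is the observation. The sketch below assumes this estimator; the function names are illustrative, and the study may have used a different numerical form.

```python
import numpy as np

def crps_ensemble(members, obs):
    """CRPS of one ensemble forecast against one observation."""
    x = np.asarray(members, dtype=float)
    # Mean absolute error of the members against the observation.
    term_obs = np.mean(np.abs(x - obs))
    # Mean absolute difference between all pairs of members (ensemble spread).
    term_spread = np.mean(np.abs(x[:, None] - x[None, :]))
    return float(term_obs - 0.5 * term_spread)

def mean_crps(ensemble, observations):
    """Average CRPS over time; ensemble has shape (n_times, n_members)."""
    return float(np.mean([crps_ensemble(m, o)
                          for m, o in zip(ensemble, observations)]))
```

With a single member the score reduces to the absolute error, so the CRPS can be read as a generalisation of the mean absolute error to probabilistic simulations. Lower values indicate better performance.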

During the calibration with the PSO, the dynamics of the parameter optimisation algorithm were monitored, showing that the MODFLOW saturated hydraulic conductivity (

With the calibration technique proposed in Sect.

Observed and modelled

Results for the calibrated runs.

The soil heterogeneity is represented differently by the two configurations. Configuration-2, which is physically based, can represent the heterogeneity of the soil column, as shown in Fig.

Temporal evolution of the SM contents and WT levels. Panels

For AET, Configuration-1 yields good results with a lower RMSE and similar correlation when compared to Configuration-2. In particular, Configuration-2, being physically based, underestimates the simulated AET for the Southern Hemisphere late summer and early autumn, as shown in Fig.

The generation of the ensemble was found to be a key step of the method. The simple perturbation of forcing inputs was not able to generate a sufficiently broad ensemble spread, particularly for Configuration-2. For both configurations, the combined perturbation of parameters and forcing inputs produced more accurate ensembles, according to the ensemble validation skills calculated on the first year of the data set. The first 10 time steps were excluded to avoid the influence of the initial conditions, so the validation is applied from the 10th to the 45th time step. For the meteorological data, the best ensembles are obtained by perturbing the input with a random number sampled from a Gaussian distribution having a standard deviation proportional to the value of the forcing inputs (i.e. 50 % for Configuration-1 and 10 % for Configuration-2). For parameters, the last column of Table
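The proportional perturbation of the meteorological forcing can be sketched as follows. The function name `perturb_forcing` and the clipping of negative values to zero are assumptions added for the example; only the Gaussian noise with a standard deviation proportional to the forcing value comes from the description above.

```python
import numpy as np

def perturb_forcing(forcing, rel_std, n_members, rng):
    """Build an ensemble of forcing series with value-proportional Gaussian noise.

    Each value receives additive noise with standard deviation rel_std * value.
    Returns an array of shape (n_members, n_times).
    """
    forcing = np.asarray(forcing, dtype=float)
    noise = rng.normal(0.0, 1.0, size=(n_members, forcing.size)) * rel_std * forcing
    # Clip at zero: negative rainfall or potential ET is not physical (assumption).
    return np.clip(forcing + noise, 0.0, None)
```

Under these assumptions, a call such as `perturb_forcing(rain, 0.5, 100, np.random.default_rng(1))` would correspond to the 50 % setting reported for Configuration-1, and `rel_std=0.1` to the 10 % setting for Configuration-2. The ensemble size of 100 is illustrative only. Time steps with zero forcing remain zero in every member, which is one consequence of a proportional error model.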

WT levels and AET and spread of the open-loop ensembles for Configuration-1

In the case of Configuration-1, which is conceptually based, the WT level spread of the open-loop ensemble consistently covers the observations (Fig.

WT levels and AET and spread of the assimilation run for Configuration-1

The spread of the WT levels for Configuration-2 (see Fig.

RMSE, correlation and

Table

In particular, there are instances in Configuration-2 where the assimilation is not able to improve AET, namely in the first quarter of 2001 and, to a lesser extent, at the beginning of 2003. This leads to poorer WT simulation performance during these periods, as seen in Fig.

CRPS WT levels and AET for Configuration-1

In Fig.

For both configurations, the assimilation improves the RMSE and

For SM, the results are reported in Table 3, divided into the upper and lower soil. The open loop of Configuration-1 presents an RMSE of 0.045

Generally, these results consolidate the synthetic approach in

This study validates the assimilation of the satellite-based actual evapotranspiration (AET) data set (CMRSET) into two unsaturated zone models (UZMs) coupled to MODFLOW. The two UZMs form two configurations, one using a conceptual water balance model (UnSAT) and the other using a physically based agro-hydrological model (SWAP). These configurations are applied to a semi-arid pine plantation in the south-east of South Australia, where the WT is within reach of the trees' root system.

The most important findings can be summarised as follows:

In conclusion, the numerical experiment explored the added value of AET information for constraining unobservable estimates (i.e. net recharge) calculated by hydrogeological models. Improving the AET fluxes led to better recharge estimates. Thus, as recharge is a key quantity driving the WT dynamics, the link between AET and WT in the model is strengthened. It was shown that it is possible to use either a conceptual or a physically based UZM in the assimilation of satellite-based AET estimates to inform hydrogeological models. The assimilation results have been quantified using standard metrics, such as RMSE and

This study contributes to unlocking the potential of using AET observations to inform hydrological models, with the aim of reducing the uncertainty in the outputs, and it represents a step towards the use of satellite-based AET retrievals for water resources management. For future applications at larger scales, more research is needed in areas with different groundwater, vegetation, and soil conditions, with the intent of prioritising regions where the AET assimilation is more effective.

Model forcing inputs, assimilated evapotranspiration from CMRSET, and experiment results are available at

SG performed the modelling work and wrote the manuscript. VRNP supervised the implementation of the EnKF, ED supervised the implementation of the UnSAT model, JvD supervised the implementation and coupling of the SWAP model, NFY contributed to the results analysis, and RD supervised the entire project. All co-authors provided input to the writing of the manuscript.

The authors declare that they have no conflict of interest.

This article is part of the special issue “Data acquisition and modelling of hydrological, hydrogeological and ecohydrological processes in arid and semi-arid regions”. It is not associated with a conference.

Simone Gelsinari acknowledges the financial support by the Faculty of Engineering at Monash University through the Graduate Research International Travel Award and thanks the chair group of Hydrology and Quantitative Water Management at Wageningen University & Research for the support during his visit. Simone Gelsinari also thanks Karina Gutierrez Jurado for her support and suggestions during the preparation of this paper.

This research has been supported by the Commonwealth Scientific and Industrial Research Organisation (Effective Floodplain Management Project).

This paper was edited by Harrie-Jan Hendricks Franssen and reviewed by Manuela Girotto and two anonymous referees.