Integrating coarse-scale uncertain soil moisture data into a fine-scale hydrological modelling scenario

In a hydrological modelling scenario, the modeller is often confronted with external data, such as remotely sensed soil moisture observations, that become available to update the model output. However, the scale triplet (spacing, extent and support) of these data is often inconsistent with that of the model. Furthermore, the external data can be affected by epistemic uncertainty. Hence, a method is needed that not only integrates the external data into the model, but also takes into account the difference in scale and the uncertainty of the observations. In this paper, a synthetic hydrological modelling scenario is set up in which a high-resolution distributed hydrological model is run over an agricultural field. At regular time steps, coarse-scale field-averaged soil moisture data, described by means of possibility distributions (epistemic uncertainty), are retrieved by synthetic aperture radar and assimilated into the model. A method is presented that integrates the coarse-scale possibility distribution of soil moisture content data with the fine-scale model-based soil moisture data. The method is subdivided into two steps. The first step, the disaggregation step, employs a scaling relationship between field-averaged soil moisture content data and its corresponding standard deviation. In the second step, the soil moisture content values are updated using two alternative methods.

Correspondence to: H. Vernieuwe (hilde.vernieuwe@ugent.be)


Published by Copernicus Publications on behalf of the European Geosciences Union.

Introduction
Soil moisture, one of the leading actors in the hydrological cycle, considerably influences evaporation, infiltration, and runoff processes. Other processes, such as plant growth and bio-geochemical fluxes in the terrestrial hydrosphere, are also largely determined by soil moisture. Soil moisture is therefore a key variable in hydrological models. Assimilation of soil moisture observations, which boils down to objectively combining soil moisture observations with the model results at the same time step in order to produce a "best" soil moisture estimate, can improve the predictive capability of hydrological models (e.g. Crow and Ryu, 2009; Brocca et al., 2009; De Lannoy et al., 2007a,b; Merlin et al., 2006). However, acquiring in situ soil moisture measurements with a high space-time resolution is often expensive and labour intensive. Therefore, radar remote sensing is often presented as an alternative to offer high spatial resolution soil moisture data in hydrological data assimilation studies. Yet, soil moisture estimation from radar remote sensing, for instance from the Synthetic Aperture Radar (SAR) backscattered signal, is hampered by, among other things, the difficulty of correctly parameterizing soil roughness (Álvarez-Mozos et al., 2009; Lievens et al., 2009; Verhoest et al., 2008; Callens et al., 2006; Davidson et al., 2000; Oh and Kay, 1998). Furthermore, as explained by Verhoest et al. (2007), it is unfeasible to measure soil roughness at each bare soil field when radar remote sensing is to be applied at the catchment or regional scale. Therefore, Verhoest et al. (2007) suggest using a priori roughness information based on the known tillage state of the field. For each tillage type, a possibility distribution (see Subsect. 2.2) of roughness parameters, reflecting the possible values of soil roughness parameters, is then used in the soil moisture retrieval algorithm. Using the possibility distribution of soil roughness in the soil moisture retrieval procedure from SAR results in a possibility distribution of soil moisture content. Vernieuwe et al. (2011) further elaborated on the use of possibility distributions in soil moisture estimation from SAR by taking into account the interactivity between the roughness parameters so as to reduce the non-specificity in the possibility distribution of soil moisture content.
Generally, the soil moisture content estimation of an agricultural field on the basis of high resolution SAR images is performed using field-averaged backscatter values. Application of the possibilistic soil moisture retrieval technique as in Verhoest et al. (2007) and Vernieuwe et al. (2011), using a field-averaged backscatter value, hence leads to a possibility distribution of soil moisture content reflecting its possible field-averaged values. When this possibility distribution of soil moisture content is then assimilated into a distributed hydrological model, some elementary difficulties arise. First, a scale gap exists when the distributed model is employed at a finer resolution than the field scale, for instance so as to meet precise agricultural needs or to model rainfall/runoff in a small hydrological catchment. The within-field variability of soil moisture may be significant (Minet et al., 2011b; Hupet and Vanclooster, 2002; Western and Blöschl, 1999) and may have a significant impact on field-scale hydrological behaviour (Minet et al., 2011a; Mallants et al., 1996), which justifies the spatial distribution of hydrological parameters within a field plot for accurate hydrological modelling. Second, the possibility distribution of soil moisture content represents the possible field-averaged soil moisture contents for a given field, whereas an empirical probability distribution, reflecting the within-field soil moisture variability, can be computed on the basis of the fine-scale soil moisture content values predicted by the hydrological model. So, different uncertainty representations of the field-averaged soil moisture content are to be considered. Therefore, in order to integrate a coarse-scale possibility distribution of soil moisture content within a fine-scale modelling framework, a technique is needed that can deal with the difference in scale and that can furthermore take into account the different types of uncertainty incorporated in both data types.
A number of studies have already dealt with the relationship between the averaged soil moisture content and soil moisture variability (e.g. Ivanov et al., 2010; Famiglietti et al., 2008; Vereecken et al., 2007; Western et al., 2003; Hupet and Vanclooster, 2002; Western and Blöschl, 1999). Famiglietti et al. (2008) and Western and Blöschl (1999) noted that soil moisture variability depends on the overall coverage or the extent within which soil moisture is measured. In Famiglietti et al. (2008), an exponential-based relationship is introduced that relates soil moisture variability, expressed as its standard deviation, to the averaged soil moisture content value.
Alternatively, Penna et al. (2009) and Ivanov et al. (2010) mention a parabolic relationship. Integrating an estimation of the field-averaged soil moisture content value, represented by means of a possibility distribution, on the one hand, and the within-field soil moisture variability, represented by means of a probability distribution, on the other hand, both stemming from different sources of information, should therefore take into account the existence of such a relationship.
In addition, some papers already exist in which probabilistic and possibilistic uncertainty are combined. Guyonnet et al. (2003) proposed a hybrid method in which Monte Carlo random sampling of probability distributions is combined with fuzzy interval analysis. They demonstrated their method in a risk assessment case study of human exposure to the presence of cadmium. Baudrit et al. (2006) further elaborated on this method. Baudrit et al. (2007) then compared different methods for the propagation of probabilistic and possibilistic uncertainty in a risk assessment case study of groundwater contamination.
In this paper, a method is developed that integrates coarse-scale uncertain field-averaged soil moisture content and fine-scale soil moisture variability. The method is based on a scaling relationship reflecting the relationship between the within-field soil moisture variability and its averaged soil moisture content. The scaling relationship is identified on the basis of synthetically generated soil moisture data and the method is demonstrated in a data assimilation framework by means of a twin experiment.
This paper is organised as follows. Section 2 provides some background on possibility theory and the possibilistic retrieval method to obtain the possibility distributions of field-averaged soil moisture content. Section 3 describes the identification of the relationship between mean soil moisture content and its variability. Section 4 then explains the method used to integrate the coarse-scale possibility distribution with the fine-scale modelled soil moisture content values. The integration method is subdivided into two steps. A first "disaggregation" step (Subsect. 4.1) describes how the possibility distribution of soil moisture content is combined with the relationship between field-averaged soil moisture content and its standard deviation in order to establish a bundle of cumulative normal distribution functions. A second "update" step (Subsect. 4.2) then demonstrates, by means of a data assimilation twin experiment, how this bundle can be further employed to update the soil moisture contents in a spatially distributed hydrological model. Finally, Sect. 5 formulates the conclusions and a few perspectives for future research.

SAR-based soil moisture estimation
It is widely known that one of the advantages of Synthetic Aperture Radar (SAR) is its potential to offer high resolution soil moisture content data at a regional extent. Several models have been proposed to relate soil moisture to the backscatter signal, ranging from purely empirical relationships to physically-based models. In this study, the Integral Equation Model (IEM) for small and medium roughness, developed by Fung (1994), is applied. This model, which only simulates the single scattering component of the backscattering process, has already been applied successfully in several remote sensing studies (Altese et al., 1996; Álvarez-Mozos et al., 2005, 2006; Hoeben and Troch, 2000; Mancini et al., 1999). It is only valid for surfaces with a single-scale roughness having small to moderate surface root mean square (rms) heights (ks ≤ 2, with k the wave number (k = 2π/λ, λ being the wavelength) and s the rms height).
The autocorrelation function is considered to be isotropic and is represented by an exponential function. Besides the roughness parameters, the model uses the dielectric constant of the soil to compute the backscattering value. After applying the inverse IEM, the obtained dielectric constant is converted to volumetric soil moisture using the dielectric mixing model (Dobson et al., 1985). If the latter results in soil moisture values larger than saturation, then the soil is considered to be saturated, whereas if the retrieved moisture value is smaller than the residual moisture content, it is replaced by the latter value. This operation, in accordance with Verhoest et al. (2007), is performed in order to ensure that only physically possible soil moisture values are retrieved. However, apart from the soil moisture content, the backscattered radar signal is also influenced by the soil roughness state of the field under consideration. Numerous studies have already reported the difficulty of determining the correct values of bare soil surface roughness parameters, described as root mean square (rms) height and correlation length (e.g. Álvarez-Mozos et al., 2009; Lievens et al., 2009; Verhoest et al., 2008; Callens et al., 2006; Mattia et al., 2003; Davidson et al., 2000; Oh and Kay, 1998).
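The restriction of retrieved values to the physically possible range can be sketched in a few lines of Python. This is only an illustration: the function name and the residual and saturated moisture contents used below are hypothetical placeholders, not the values of Table 1.

```python
def clamp_soil_moisture(theta, theta_res, theta_sat):
    """Restrict a retrieved soil moisture value to [theta_res, theta_sat].

    Values above saturation are set to saturation; values below the
    residual moisture content are set to the residual moisture content.
    """
    return min(max(theta, theta_res), theta_sat)


# Hypothetical soil limits (volumetric fractions), for illustration only.
THETA_RES = 0.05
THETA_SAT = 0.45

retrieved = [0.02, 0.30, 0.52]
physical = [clamp_soil_moisture(t, THETA_RES, THETA_SAT) for t in retrieved]
```

With these assumed limits, a retrieved value below the residual content or above saturation is moved to the nearest bound, while values inside the range are left unchanged.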

Possibility distributions
In contrast to a probability distribution representing uncertainty originating from variability (e.g. the distribution of precise roughness parameters within a field), a possibility distribution represents uncertainty stemming from a lack of knowledge, also called epistemic uncertainty. It assigns possibility degrees $\pi_v(x) \in [0,1]$ to the values x of a variable v, for which unsurprising parameter values receive a possibility degree equal to 1, whereas impossible parameter values receive a possibility degree equal to 0. The set of values that have a possibility degree greater than or equal to α, with 0 < α ≤ 1:
$$(\pi_v)_\alpha = \{\, x \mid \pi_v(x) \ge \alpha \,\} \qquad (1)$$
is called the α-cut of the possibility distribution. An example of a trapezoidal possibility distribution in which an α-cut is indicated is presented in Fig. 1. The possibilistic retrieval method as described in Verhoest et al. (2007) uses trapezoidal possibility distributions for rms height and correlation length, which are then propagated through the inverse IEM so as to obtain a possibility distribution of soil moisture content, following Zadeh's extension principle (Zadeh, 1975):
$$\pi_\theta(z) = \sup_{(x,y)\,:\,f(x,y) = z} \min\big(\pi_s(x), \pi_\ell(y)\big) \qquad (2)$$
with x, y and z values of rms height s [L], correlation length $\ell$ [L] and soil moisture content θ [-], respectively. The function f represents in this particular application the inverse IEM. By applying the extension principle, the couples (x, y) that are mapped to z are selected and their joint possibility degree is calculated as $\min(\pi_s(x), \pi_\ell(y))$. The possibility degree of z is then obtained by taking the supremum of all joint possibility degrees of the respective couples (x, y). For a continuous function f and for possibility distributions whose α-cuts are closed intervals, the extension principle can be applied more practically on the basis of α-cuts (Nguyen, 1978). In a first step, a level α is selected for which the α-cuts of the possibility distributions of the input variables are determined. Next, interval analysis or interval computation is applied to identify the corresponding α-cut of the possibility distribution of the output variable, which is the interval determined by the minimum and maximum output value obtained through application of f on the α-cuts of the input variables. By repeating the procedure for different α-levels, the possibility distribution of the output variable can be established. The minimum operator used in Eq. (2) furthermore indicates that the variables s and $\ell$ are treated as if they are separable or non-interactive. However, if the variables are interactive, a joint possibility distribution can be defined and directly used in Eq. (2) instead of the minimum operator on the individual possibility distributions. Vernieuwe et al. (2011) used the possibilistic Gustafson-Kessel fuzzy clustering algorithm (Krishnapuram and Keller, 1993) to determine the joint possibility distribution of rms height and correlation length so as to take into account the interaction between both variables.
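The α-cut procedure described above can be illustrated with a minimal Python sketch for two non-interactive inputs. Several things here are assumptions made for illustration: the trapezoid parameters are arbitrary, the function `toy_inverse_model` is a simple stand-in and not the inverse IEM, and the minimum and maximum of f over each box of input α-cuts are approximated by a grid search (which is exact for the monotone stand-in used here, but only approximate in general).

```python
import itertools


def trapezoid_alpha_cut(a, b, c, d, alpha):
    """Alpha-cut [lo, hi] of a trapezoidal possibility distribution
    with support [a, d] and core [b, c], for 0 < alpha <= 1."""
    return a + alpha * (b - a), d - alpha * (d - c)


def propagate(f, trapezoids, alphas, n_grid=21):
    """Interval analysis per alpha level.

    For every alpha, f is evaluated on a regular grid over the box
    spanned by the input alpha-cuts; the smallest and largest outputs
    approximate the alpha-cut of the output possibility distribution.
    """
    cuts = {}
    for alpha in alphas:
        axes = []
        for a, b, c, d in trapezoids:
            lo, hi = trapezoid_alpha_cut(a, b, c, d, alpha)
            axes.append([lo + (hi - lo) * i / (n_grid - 1) for i in range(n_grid)])
        values = [f(*point) for point in itertools.product(*axes)]
        cuts[alpha] = (min(values), max(values))
    return cuts


def toy_inverse_model(s, ell):
    """Hypothetical stand-in for the inverse IEM (not a physical model)."""
    return 0.1 + 0.2 * s - 0.01 * ell


# Arbitrary trapezoids for rms height s and correlation length ell.
rms_height = (0.5, 1.0, 1.5, 2.0)
corr_length = (2.0, 4.0, 6.0, 8.0)
output_cuts = propagate(toy_inverse_model, [rms_height, corr_length], alphas=(0.5, 1.0))
```

The resulting intervals are nested: the cut at α = 1 lies inside the cut at α = 0.5, as expected for α-cuts of a single possibility distribution. Handling interactive variables would require replacing the box of input cuts by the α-cut of a joint possibility distribution, which this sketch does not attempt.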

Identification of the SAR-based possibility distributions
In this paper, the joint possibility distribution of soil roughness parameters, as determined in Vernieuwe et al. (2011), for the rotary tilled roughness class (corresponding to seedbed) and a profile length of 4 m is employed (see Fig. 2). All applications of the possibilistic soil moisture retrieval procedure employed in this paper are performed with the inverse IEM for a VV polarised, C-band (frequency of 5.3 GHz) radar configuration, an incidence angle of 17° and an agricultural field for which an exponential correlation function is used. The soil characteristics of this agricultural field are listed in Table 1. Furthermore, as the field-averaged backscatter values were used as input to the possibilistic retrieval procedure, it is important to note that the obtained possibility distribution of soil moisture content reflects the possible values of field-averaged soil moisture content.

Identification of the scaling relationship
Several studies have already confirmed the existence of a relationship between the standard deviation of the soil moisture content values within a given extent and the corresponding averaged value. However, if such a scaling relationship is to be used within a modelling framework, the expected variability at the model resolution should be related to the average soil moisture value at a coarser scale (e.g. the field level). In order to establish such a scaling relationship, a large number of detailed within-field observations should be performed for very dry to very wet conditions. One way to obtain such information is through detailed soil moisture monitoring campaigns, using for instance nearby remote sensing platforms such as GPR platforms (e.g. Minet et al., 2011b). Unfortunately, due to a lack of sufficient measured field data to cover the full range of soil moisture conditions, modelled soil moisture content values are used in the present work. Therefore, the TOPMODEL-based (Beven and Kirkby, 1979) land-atmosphere transfer scheme (TOPLATS) (Famiglietti and Wood, 1994) was employed to synthetically generate the scaling relationship between mean soil moisture content $\theta_m$ and its standard deviation $\theta_s$ for the bare soil agricultural field under consideration (with soil parameters listed in Table 1), for which heights range from 125 m to 139 m. To this end, the topographic index for this field was determined at a 5 m × 5 m resolution, yielding values from 5 to 16. The other model parameters were obtained through a lumped application of TOPLATS to the Zwalm catchment as described in Pauwels et al. (2001). The model was forced with an hourly meteorological data set spanning four and a half years and containing air and dew point temperature (°C), solar radiation (W m⁻²), wind speed (m s⁻¹) and precipitation (m s⁻¹). In order to force the model to reach lower field-averaged soil moisture contents, the model was additionally run with the same meteorological data set, except that the precipitation data were decreased to 10 % of their original value. Figure 3 shows the obtained plot of field-averaged soil moisture content values versus their corresponding standard deviation. A large variability of the standard deviation around the mean soil moisture content values is observed. Several authors (e.g. Ivanov et al., 2010; Teuling et al., 2007; Vereecken et al., 2007; Teuling and Troch, 2005) have already argued that different factors such as hysteresis, climate variability, topography, antecedent states and soil heterogeneity influence the spatial soil moisture variability and lead to a non-unique relationship. In that respect, Western et al. (2003) found a large variability of relationships between the field average and the standard deviation of soil moisture content when comparing a large number of studies based on field measurements. Ivanov et al. (2010) hypothesize the existence of an attractor in the phase space of the hydrological system, explaining the existence of hysteresis in this relationship. As interactions between past weather conditions, topography, vegetation patterns and soil characteristics actually govern the spatial structure of soil moisture, one can argue that no unique relationship exists between mean soil moisture and its variability, but rather that the relationship moves between an upper and lower envelope set mostly by soil properties (Salvucci, 1998), mainly as a function of past climate (Teuling et al., 2007). However, modelling this behaviour is not straightforward. Still, in the remainder of this paper, so as to simplify the method presented hereafter, it was decided to ignore the different factors and processes underlying the non-uniqueness of this scaling relationship and to introduce a unique relationship which is fitted to the data in Fig. 3. Of course, if a model were available that describes the scaling relationship as a function of past weather conditions, topography, vegetation patterns and soil characteristics, one could use it instead of the simplified unique relationship applied in this paper.
In order to fit a single scaling relationship to the data presented in Fig. 3, four models were tested in a 10-fold cross-validation strategy. To this end, the data set was first randomly divided into 10 folds or groups. Each model was then identified using 9 folds, and its performance was validated on the remaining fold. This procedure was repeated ten times, such that each fold once served to validate the model. The models identified on the data consist of the exponential-based relationship proposed by Famiglietti et al. (2008) ($\mathrm{mod}_{\mathrm{exp}}$):
$$\theta_s = k_1 \, \theta_m \, \exp(-k_2 \, \theta_m) \qquad (3)$$
with $k_1$ and $k_2$ the model parameters, two concave cubic spline models with 3 and 4 knots respectively ($\mathrm{mod}_{\mathrm{s3}}$ and $\mathrm{mod}_{\mathrm{s4}}$), and a second order polynomial ($\mathrm{mod}_{\mathrm{p}}$). In order to fit the splines, the Shape Language Modelling (SLM) toolkit [...] 2008) is outperformed by the other three models. In order to decide on the best model, a non-parametric Kruskal-Wallis test (Kruskal and Wallis, 1952) was carried out to test for significant differences between the performance values of all models, followed by a non-parametric comparison of a control group to other groups (Zar, 1999), to seek one-tailed significant differences between one group, the control group, and each of the other groups. As the lowest RRSE values are obtained by the spline model with 4 knots, it is assumed that this model fits the data best. This group of RRSE values is therefore chosen as the control group. The results of this latter test reveal that the spline model with 4 knots outperforms the exponential-based model and the second order polynomial, yet no significant differences are found between the two spline models. The spline model with 4 knots was chosen to be used throughout the remainder of this paper.
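The 10-fold cross-validation strategy can be sketched as follows for two of the candidate models. The sketch rests on several assumptions: the data are synthetic and generated here purely for illustration, the exponential model is taken in the form $\theta_s = k_1 \theta_m \exp(-k_2 \theta_m)$ and fitted by a log-linear least-squares fit (one simple way to fit it, not necessarily the procedure of the paper), and RRSE is computed as the root relative squared error with respect to the mean of the validation fold. The concave spline models are left out because they require a dedicated toolbox.

```python
import numpy as np


def rrse(y, y_hat):
    """Root relative squared error of predictions y_hat against y."""
    return float(np.sqrt(np.sum((y - y_hat) ** 2) / np.sum((y - np.mean(y)) ** 2)))


def fit_exponential(x, y):
    """Fit y = k1 * x * exp(-k2 * x) via log(y / x) = log(k1) - k2 * x."""
    slope, intercept = np.polyfit(x, np.log(y / x), 1)
    k1, k2 = np.exp(intercept), -slope
    return lambda t: k1 * t * np.exp(-k2 * t)


def fit_polynomial2(x, y):
    """Fit a second order polynomial by least squares."""
    coeffs = np.polyfit(x, y, 2)
    return lambda t: np.polyval(coeffs, t)


def cross_validate(x, y, fitters, n_folds=10, seed=0):
    """Return, per model, the RRSE obtained on each validation fold."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), n_folds)
    scores = {name: [] for name in fitters}
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        for name, fit in fitters.items():
            model = fit(x[train], y[train])
            scores[name].append(rrse(y[test], model(x[test])))
    return scores


# Synthetic (mean, standard deviation) pairs, for illustration only.
rng = np.random.default_rng(42)
theta_m = rng.uniform(0.10, 0.45, 200)
theta_s = 0.9 * theta_m * np.exp(-6.0 * theta_m) * (1.0 + 0.05 * rng.standard_normal(200))

scores = cross_validate(theta_m, theta_s,
                        {"mod_exp": fit_exponential, "mod_p": fit_polynomial2})
```

Each entry of `scores` holds ten RRSE values, one per validation fold; these are the kind of per-model samples that a Kruskal-Wallis test (for instance `scipy.stats.kruskal`) could then compare.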

Integration method
The integration method integrates the coarse-scale SAR measurements into the fine-scale distributed model and is subdivided into two steps. The first step is called the disaggregation step and disaggregates the coarse-scale possibility distribution of field-averaged soil moisture content into a bundle of cumulative normal distribution functions. The second step, the update step, first establishes the empirical distribution function of modelled soil moisture content values, then uses the information of the bundle to update this distribution, and finally updates the modelled soil moisture content values.

Disaggregation step
The first step in the integration method consists of disaggregating the field-averaged soil moisture content to take into account the difference in scale between the model extent (e.g. the field) and the model resolution. On the one hand, the modeller has available a possibility distribution of soil moisture content obtained by the SAR (see Fig. 4) that reflects the more or less possible field-averaged soil moisture content values. On the other hand, a spline has been determined from TOPLATS simulations (see Sect. 3) that relates field-averaged soil moisture content values to their corresponding standard deviations. In order to integrate the information present in the possibility distribution with the soil moisture values predicted by the hydrological model, a method is established that employs this scaling relationship. Ryu and Famiglietti (2005) concluded that the spatial variability of soil moisture within a satellite footprint can be described by means of a normal distribution. Other authors (e.g. Nyberg, 1996; Wilson et al., 2003) also reported that soil moisture content is approximately normally distributed. In this disaggregation step, the simple assumption that a normal distribution can be used to describe the within-field soil moisture variability is hence adopted. For each field-averaged soil moisture content value with a given possibility degree, the corresponding standard deviation is obtained through the scaling relationship and the corresponding cumulative normal distribution is determined. As the possibility distribution of soil moisture content has closed α-cuts, and the spline model is a continuous function, the α-cut method, i.e. applying interval analysis to the α-cuts of the possibility distribution, can be employed. In Vernieuwe et al. (2011), a residual uncertainty $\varepsilon = 0.025$ was assigned to the entire interval of soil moisture content values. The lowest possibility level to which the disaggregation step is applied therefore corresponds to $\alpha_0 = \varepsilon + \delta$, with δ > 0 a small value. Algorithm 1 describes this disaggregation step. $\theta_{ml}$ and $\theta_{mr}$ denote respectively the left and right endpoints of the α-cuts, and $\Delta\theta_m$ the discretisation step. [...] (0.46). The bundle of cumulative normal distributions obtained after performing the procedure for 11 possibility levels, i.e. $\alpha_0 = \varepsilon + \delta$, $\alpha_1 = 0.1$, $\alpha_2 = 0.2$, ..., $\alpha_{10} = 1$, is presented in Fig. 6. In this figure, distributions given in solid lines have a mean value that corresponds to the endpoints of the α-cuts. The mean value of the central distribution, given in bold, corresponds to the single field-averaged soil moisture content value in $(\pi_\theta)_1$. As the mean value of each cumulative normal distribution originates from the possibility distribution and therefore has a possibility degree, this degree is transferred to the cumulative distribution. Therefore, a third dimension, reflecting the possibility degree of the cumulative normal distributions, is associated with the bundle. This indicates that, if a cross-section of the bundle is taken at a particular probability level, a possibility distribution of soil moisture content values is obtained (see Fig. 7).
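The disaggregation step can be sketched in Python as follows. This is an illustrative reading of the text, not a transcription of Algorithm 1: the α-cuts, the discretisation step and the function `scaling` (a simple concave parabola standing in for the fitted 4-knot spline) are all hypothetical.

```python
import math


def scaling(mean):
    """Hypothetical stand-in for the fitted spline (mean -> standard deviation)."""
    return 0.02 + 0.4 * mean * (0.5 - mean)


def normal_cdf(theta, mean, std):
    """Cumulative normal distribution function evaluated at theta."""
    return 0.5 * (1.0 + math.erf((theta - mean) / (std * math.sqrt(2.0))))


def disaggregate(alpha_cuts, step):
    """Build the bundle from alpha-cuts of the field-averaged possibility distribution.

    alpha_cuts maps each possibility level alpha to the endpoints
    (theta_ml, theta_mr) of its alpha-cut. Every discretised mean inside
    a cut becomes the mean of one normal distribution whose standard
    deviation follows from the scaling relationship. Levels are processed
    in ascending order, so each mean keeps the highest alpha whose cut
    contains it. Returns (possibility, mean, std) tuples sorted by mean.
    """
    possibility = {}
    for alpha in sorted(alpha_cuts):
        left, right = alpha_cuts[alpha]
        n_steps = int(round((right - left) / step))
        for i in range(n_steps + 1):
            mean = round(left + i * step, 10)
            possibility[mean] = alpha
    return [(alpha, mean, scaling(mean)) for mean, alpha in sorted(possibility.items())]


# Hypothetical alpha-cuts; 0.026 plays the role of the lowest level alpha_0.
alpha_cuts = {0.026: (0.20, 0.40), 0.5: (0.25, 0.35), 1.0: (0.30, 0.30)}
bundle = disaggregate(alpha_cuts, step=0.05)

# One cumulative normal distribution of the bundle, evaluated at its own mean.
alpha, mean, std = bundle[2]
central_value = normal_cdf(mean, mean, std)
```

Each tuple in `bundle` describes one cumulative normal distribution together with the possibility degree inherited from its mean, which is the third dimension of the bundle mentioned above.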

Update step
The second step in the integration method demonstrates the applicability of the bundle of cumulative normal distribution functions to update the modelled soil moisture content values. To this end, a synthetic data assimilation twin experiment is set up, in which a modelling scenario employing a distributed hydrological model is mimicked. At certain time steps in this experiment, SAR measurements of soil moisture content become available, yet are represented by means of a possibility distribution of field-averaged soil moisture content. At these time steps, the disaggregation step is first used to establish a bundle of cumulative normal distributions, followed by the update step to modify the modelled soil moisture content values using the information present in the bundle.

Twin experiment set up
For the twin experiment, TOPLATS (Famiglietti and Wood, 1994) was run on the agricultural field under consideration on a fine-scale basis of 5 m × 5 m. The reference, which will be referred to as the truth, was obtained with the same parameter values used to generate the scaling relationship between field-averaged soil moisture content values and the corresponding standard deviations. The model was forced with an hourly time series of meteorological data spanning half a year, different from the one used in Sect. 3, containing information about air and dew point temperature (°C), solar radiation (W m⁻²), wind speed (m s⁻¹) and precipitation (m s⁻¹). In a next step, field-averaged soil moisture values were sampled from this model run at four different time steps (DOY 42, DOY 105, DOY 117 and DOY 162) that were not directly followed by a rain event, and converted into a corresponding backscatter value by means of the IEM. To this end, the radar configuration described in Sect. 2 together with the roughness parameters corresponding to the centre of the joint possibility distribution were used. Subsequently, by applying the possibilistic soil moisture retrieval procedure with the joint possibility distribution (see Fig. 2) of soil roughness parameters to these backscatter values, the corresponding possibility distributions of field-averaged soil moisture content were obtained.
Next, a model scenario was obtained by modifying the values of two model parameters of TOPLATS, i.e. the exponential coefficient of the TOPMODEL baseflow equation and the water table depth, so as to allow the model output to deviate from the soil moisture content values obtained by the truth. The soil moisture content values of this model run (further referred to as the baseline run) were then obtained by forcing the model with the same meteorological data as used in the truth, but with these modified model parameters.

Update step using possibility degrees in the bundle
At the time steps corresponding to the sampling time steps (DOY 42, DOY 105, DOY 117 and DOY 162) at which the SAR-retrieved possibility distributions were acquired, the disaggregation step (see Sect. 4.1) was employed to establish a bundle of cumulative normal distributions. According to the information present in this bundle, i.e. the cumulative normal distributions and their possibility degrees, the soil moisture content values predicted by the model with the modified parameters at the considered time step can be updated, i.e. the external data of soil moisture content values are assimilated into the model (the so-called assimilated run). To this end, an empirical cumulative distribution function (cdf) of modelled soil moisture content values for the agricultural field, $F(\theta; \theta_m, \theta_s)$, having mean $\theta_m$ and standard deviation $\theta_s$, is first established (the green distribution in Fig. 8). This empirical distribution function is then optimised according to the information present in the bundle following an iterative optimisation procedure. In each iteration, the empirical mean value is shifted and the corresponding standard deviation is calculated according to the scaling relationship. The empirical distribution is then recomputed such that its mean value and standard deviation correspond to these new values. At each probability level of the modified empirical distribution, a cross-section of the bundle similar to the ones shown in Fig. 7 can be taken such that a possibility distribution is obtained. For each soil moisture content value in the modified empirical distribution, a possibility degree can then be determined from the possibility distribution of the cross-section corresponding to the probability level of that soil moisture content value. The "optimal" empirical distribution is then the one that is located as well as possible in the bundle, i.e. the one for which the minimum of all these possibility degrees is maximized:
$$F^{*} = \arg\max_{F} \, \min\big(\mathrm{pos}(F)\big) \qquad (4)$$
with F the cdf of modelled soil moisture content values to be optimised, and pos the possibility degrees of F in the bundle. It is important to note that, by using this method, the field-averaged soil moisture value given by the SAR and present in the bundle is fully trusted. Yet, the soil moisture pattern as predicted by the hydrological model is preserved. The optimisation procedure was performed in this experiment using the golden section search combined with the parabolic interpolation method (Forsythe et al., 1976; Brent, 1973) (available in Matlab®). The search interval was bounded by the mean values of the outer left and right distributions in the bundle, i.e. the distributions with possibility degree $\varepsilon + \delta$. In order to enhance the sensitivity of the optimisation procedure, only possibility degrees higher than $\varepsilon$ were taken into account, and intermediate possibility degrees were interpolated on the basis of the original 11 possibility levels obtained from the SAR-retrieved possibility distribution of field-averaged soil moisture content values. In this way, a new empirical cdf is obtained (the red cdf in Fig. 8) according to which the modelled soil moisture content values are updated such that the formerly wettest (driest) pixels receive the new wettest (driest) soil moisture content values.
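The max-min optimisation of the empirical distribution can be sketched as below. The sketch makes several simplifying assumptions that are not taken from the paper: the bundle and the modelled field are synthetic, `scaling` is a hypothetical stand-in for the fitted spline, the empirical distribution is "recomputed" by an affine rescaling of the pixel values (which keeps the ranking of the pixels), and a plain grid search over candidate means replaces the golden section search with parabolic interpolation.

```python
import numpy as np
from statistics import NormalDist


def scaling(mean):
    """Hypothetical stand-in for the fitted spline (mean -> standard deviation)."""
    return 0.02 + 0.4 * mean * (0.5 - mean)


def possibility_in_bundle(theta, p, bundle):
    """Possibility degree of value theta at probability level p.

    The cross-section of the bundle at level p consists of the p-quantiles
    of all normal distributions; possibility degrees are interpolated
    linearly between them and set to 0 outside the bundle.
    """
    z = NormalDist().inv_cdf(p)
    quantiles = np.array([mean + std * z for _, mean, std in bundle])
    degrees = np.array([alpha for alpha, _, _ in bundle])
    order = np.argsort(quantiles)
    return float(np.interp(theta, quantiles[order], degrees[order], left=0.0, right=0.0))


def rescale(values, new_mean, new_std):
    """Affine rescaling to a new mean and standard deviation (ranks are kept)."""
    return new_mean + new_std * (values - values.mean()) / values.std()


def update(values, bundle, n_candidates=101):
    """Grid search for the mean whose rescaled distribution maximises
    the minimum possibility degree in the bundle."""
    ranks = np.argsort(np.argsort(values))
    probs = (ranks + 0.5) / len(values)
    means = [mean for _, mean, _ in bundle]
    best_score, best_values = -1.0, None
    for m in np.linspace(min(means), max(means), n_candidates):
        candidate = rescale(values, m, scaling(m))
        score = min(possibility_in_bundle(t, p, bundle) for t, p in zip(candidate, probs))
        if score > best_score:
            best_score, best_values = score, candidate
    return best_score, best_values


# Hypothetical bundle: triangular possibility distribution with mode 0.30.
bundle_means = np.round(np.linspace(0.20, 0.40, 21), 2)
bundle = [(max(0.026, 1.0 - abs(m - 0.30) / 0.10), float(m), scaling(float(m)))
          for m in bundle_means]

# Hypothetical modelled field of 50 pixels centred on 0.22.
levels = (np.arange(50) + 0.5) / 50
modelled = np.array([0.22 + 0.03 * NormalDist().inv_cdf(p) for p in levels])

score, updated = update(modelled, bundle)
```

Because the rescaling is affine with a positive slope, the pixel that was wettest before the update is still the wettest afterwards, which mirrors the preservation of the modelled soil moisture pattern described above.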
In order to insert these soil moisture content values into the hydrological model along the different soil layers, a nudging procedure was carried out as follows. In this experiment, the soil was divided into four soil layers: a root zone of 0.05 m, two soil layers of 0.1 m and 0.2 m, and a bottom layer. It was furthermore presumed that soil moisture was uniformly distributed along each soil layer. The change in soil moisture content for each soil layer was interpolated, according to the soil depth at the beginning of the soil layer, between a maximum soil moisture change in the root zone and a zero soil moisture change in the bottom layer.

The results of this experiment are shown in Fig. 9 as a time series of root zone field-averaged soil moisture content values. The truth, baseline and assimilated run are given. From this figure, it can be seen that applying the integration method and Eq. (4) in this twin experiment to update the modelled soil moisture content values results in a shift of the baseline values towards the truth. This effect slightly persists after the third assimilation time step (DOY 117) at the lower soil moisture content values. This is also reflected in a small improvement of the RMSE, calculated over a seven-day time window starting at the assimilation time steps. Table 3 lists these RMSE values, in which RMSE_bundle and RMSE_baseline compare the assimilated and the baseline run with the truth, respectively. Figure 9b, c, d and e also show a more detailed view of the soil moisture assimilation. Each subfigure shows the soil moisture time series 7 days before and after the assimilation time steps. In Fig. 9b, it is observed that the field-averaged soil moisture content value at the assimilation time step exceeds the truth. This is due to the optimisation procedure, in which the empirical cdf has been optimised instead of only its mean value. Furthermore, it has been assumed that the within-field soil moisture variability is normally distributed; cumulative normal distributions were therefore used in the establishment of the bundle. However, it can be seen in Fig. 8 that a difference in shape exists between the cumulative normal distributions in the bundle and the empirical cdf.
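The nudging across the soil layers described above can be illustrated with a short sketch. It assumes a linear interpolation in depth (the text only states that the change was interpolated) and uses the layer thicknesses given in the text; the names `layer_weights` and `nudge_profile` are hypothetical.

```python
# Thickness of the root zone and the two intermediate layers [m];
# the bottom layer starts where these three end.
LAYER_THICKNESS_M = [0.05, 0.10, 0.20]


def layer_weights(thicknesses):
    """Weight per layer: 1 in the root zone, 0 in the bottom layer.

    The weight is interpolated linearly (an assumption) on the depth
    at which each layer begins.
    """
    tops = [0.0]
    for thickness in thicknesses:
        tops.append(tops[-1] + thickness)
    bottom_top = tops[-1]  # depth at which the bottom layer begins
    return [1.0 - top / bottom_top for top in tops]


def nudge_profile(profile, root_zone_change, weights):
    """Add the depth-weighted soil moisture change to every layer."""
    return [theta + w * root_zone_change for theta, w in zip(profile, weights)]


weights = layer_weights(LAYER_THICKNESS_M)
nudged = nudge_profile([0.20, 0.22, 0.25, 0.30], 0.07, weights)
```

With these thicknesses the layers begin at 0, 0.05, 0.15 and 0.35 m, so the full change is applied in the root zone and no change in the bottom layer.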

Update step using central cumulative distribution function
In order to check whether there is an added value of optimising the empirical cdf by taking into account the possibility degrees in the bundle, the empirical cdf was also optimised, not according to Eq. (4), but according to the minimum distance to the cumulative normal distribution with possibility degree equal to 1, i.e. the distribution corresponding to the mode of the possibility distribution. This distribution is shown in boldface in Fig. 8. The soil moisture content values were then updated according to this optimised distribution and inserted into the model similarly as described in Subsect. 4.2.2. In the latter optimisation approach, the Wasserstein distance (Gibbs and Su, 2002) between two cdfs F and G, with F^{-1} and G^{-1} their corresponding inverse functions, is employed:

$$d_W(F, G) = \int_0^1 \left| F^{-1}(t) - G^{-1}(t) \right| \, \mathrm{d}t \qquad (5)$$

Figure 9 shows the time series of the field-averaged soil moisture content values as obtained by the truth, the baseline run, the assimilated run when Eq. (4) is used and the assimilated run when the Wasserstein distance (Eq. 5) is minimised. Figure 9b, c, d and e show a more detailed view of these assimilations. From this figure, it can be seen that no major differences exist between both optimisation methods, which is confirmed by the RMSE values, RMSE_bundle vs. RMSE_distance in Table 3 for the first and the second optimisation procedure, respectively. Slightly lower RMSE values are obtained with the first optimisation method for the first two assimilation time steps, whereas slightly lower RMSE values are obtained with the second optimisation method for the other two assimilation time steps. Figure 10 shows the different cdfs for the four time steps. This figure shows that both optimised distribution functions only slightly differ for the assimilation dates DOY 105 and DOY 117, whereas a somewhat larger difference exists at the other two assimilation dates.

The Wasserstein distance between the optimised cdfs and the cdf of the truth, and the Jaccard Similarity Index between the updated soil moisture images and the images of the truth were also calculated (De Baets et al., 2009; De Baets and De Meyer, 2005):

$$J = \frac{\sum_{i=1}^{N} \min(\theta_{t,i}, \theta_{u,i})}{\sum_{i=1}^{N} \max(\theta_{t,i}, \theta_{u,i})} \qquad (6)$$

with N the number of pixels, θ_t the soil moisture content value [-] in the image of the truth, and θ_u the updated soil moisture content value [-]. The values of the Wasserstein distances and the Jaccard Similarity Indices (Table 4) show that, apart from the distance and the value of the similarity index at DOY 42, the cdfs and the corresponding soil moisture images that were optimised using Eq. (4) resemble the cdfs and images of the truth slightly better. Yet, although taking into account the information present in the bundle, i.e. the possibility degrees of all cumulative normal distributions, is more informative from a mathematical point of view, no large difference is observed when the more practical procedure of optimising the empirical cumulative distribution according to the distance between two cumulative distribution functions is employed.
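Both comparison measures can be computed in a few lines. The sketch below is illustrative only and rests on two assumptions: the two cdfs are represented by samples of equal size, in which case the integral over the inverse cdfs reduces to the mean absolute difference between order statistics, and the Jaccard Similarity Index takes the min/max form for values in [0, 1]. The function names are hypothetical.

```python
def wasserstein_distance(sample_f, sample_g):
    """Wasserstein distance between two empirical cdfs of equal sample size.

    For equal-size samples the integral of |F^-1(t) - G^-1(t)| over [0, 1]
    equals the mean absolute difference between the sorted samples.
    """
    if len(sample_f) != len(sample_g):
        raise ValueError("this sketch requires samples of equal size")
    ordered_f, ordered_g = sorted(sample_f), sorted(sample_g)
    return sum(abs(x - y) for x, y in zip(ordered_f, ordered_g)) / len(ordered_f)


def jaccard_index(theta_t, theta_u):
    """Jaccard Similarity Index (min/max form, assumed) of two images.

    theta_t, theta_u : pixel values of the truth and of the updated image,
                       listed in the same pixel order.
    """
    numerator = sum(min(t, u) for t, u in zip(theta_t, theta_u))
    denominator = sum(max(t, u) for t, u in zip(theta_t, theta_u))
    return numerator / denominator


truth = [0.20, 0.30, 0.40]
updated = [0.25, 0.30, 0.35]
d_w = wasserstein_distance(truth, updated)
j = jaccard_index(truth, updated)
```

The Wasserstein distance compares the distributions only, whereas the Jaccard Similarity Index is evaluated pixel by pixel and is therefore also sensitive to the spatial pattern.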

Conclusions
In a hydrological modelling scenario, the problem often arises that soil moisture content measurements become available at a certain time step. However, this information is not necessarily provided at the same scale at which the hydrological model is run. Furthermore, if field-averaged SAR-retrieved backscatter values are inverted using the possibilistic retrieval method (Verhoest et al., 2007; Vernieuwe et al., 2011), only a possibility distribution of field-averaged soil moisture content values can be obtained. Therefore, a method has been introduced in this paper that integrates soil moisture measured at a coarse scale (field scale) and represented by means of a possibility distribution with modelled soil moisture contents at a fine scale. To this end, a scaling relationship between the field-averaged soil moisture content and its corresponding standard deviation is employed.
In the first step of the method, a unique scaling relationship was fitted to synthetically obtained soil moisture data so as to obtain a mathematical expression for the scaling relationship. To this end, four different models were tested, out of which the splines yielded the best results. The integration method, in which a possibility distribution of field-averaged soil moisture content values is combined with the unique scaling relationship so as to obtain a bundle of cumulative normal distributions, was then demonstrated by means of a twin experiment in which TOPLATS was employed at a fine scale (5 m × 5 m). At certain time steps in the modelling scenario, the situation was mimicked in which possibilistic SAR-retrieved soil moisture data became available: field-averaged soil moisture content values were sampled from the truth and converted into possibility distributions of soil moisture content by means of the possibilistic retrieval method. In a real-world situation, however, field-averaged backscatter values would be obtained from the SAR and converted into a possibility distribution of soil moisture content values. By combining the possibility distributions of soil moisture content with the spline, a bundle of cumulative normal distributions was established, according to which the empirical cdf as obtained by TOPLATS was optimised. The modelled soil moisture content values at the fine scale were then changed following the optimised cdf. Two procedures were compared for the optimisation of the empirical cdf. In the first optimisation procedure, the empirical cdf was changed such that its minimum possibility degree in the bundle is maximised, whereas the second procedure only minimises its Wasserstein distance to the central cumulative normal distribution in the bundle, i.e. the cumulative normal distribution with a possibility degree of 1.
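The second optimisation procedure summarised above can be illustrated with a simplified sketch. It assumes that the empirical cdf is optimised by shifting its mean within a bounded interval, and it uses a plain golden section search without the parabolic interpolation step of the method cited in the paper. All names are hypothetical, and the central normal distribution is an arbitrary example rather than a value from the experiment.

```python
import math
from statistics import NormalDist


def golden_section_minimise(f, lo, hi, tol=1e-6):
    """Minimise a unimodal function f on [lo, hi] by golden section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
        if f(c) < f(d):
            hi = d  # the minimum lies in [lo, d]
        else:
            lo = c  # the minimum lies in [c, hi]
    return (lo + hi) / 2.0


def distance_to_normal(sample, target):
    """Approximate Wasserstein distance between a sample and a normal cdf."""
    n = len(sample)
    quantiles = [target.inv_cdf((i + 0.5) / n) for i in range(n)]
    return sum(abs(x - q) for x, q in zip(sorted(sample), quantiles)) / n


def optimise_mean(sample, target, lo, hi):
    """Find the mean in [lo, hi] whose shifted sample is closest to target."""
    current_mean = sum(sample) / len(sample)

    def objective(new_mean):
        shift = new_mean - current_mean
        return distance_to_normal([x + shift for x in sample], target)

    return golden_section_minimise(objective, lo, hi)


modelled = [0.18, 0.19, 0.20, 0.21, 0.22]
central = NormalDist(mu=0.30, sigma=0.02)  # illustrative central distribution
best_mean = optimise_mean(modelled, central, 0.25, 0.35)
```

Shifting all values by the same amount keeps the shape of the empirical cdf and the ranking of the pixels, so only the field-averaged value is adjusted.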
The results showed that, for both methods, soil moisture content values were shifted towards the truth after assimilation, and that this effect slightly persists after the third assimilation time step at the lower soil moisture content values. This shift resulted in only a small improvement of the RMSE values, calculated over a seven-day time window starting at the assimilation time steps. It was furthermore observed that no major differences exist between both optimisation procedures. This indicates that, although using the entire information present in the bundle is more correct from a mathematical point of view, no clear effect can be observed if one only takes into account the cumulative normal distribution corresponding to the mode of the possibility distribution.
In order to simplify the method, some assumptions were made that can be addressed in future research. First, an essential part of the methodology presented in this paper concerns the scaling relationship used. In this paper, a simplified unique relationship is fitted to modelled results, neglecting hysteresis issues and the impact of climate variables, topography, vegetation and soil characteristics on the soil moisture pattern. However, if the dependence of the scaling relationship on hysteresis and external variables could be modelled, then one could use this modelled relationship rather than the simplified relationship suggested in this paper. This relationship was furthermore identified on the basis of a synthetically generated data set. It would be more appropriate to use an independently derived relationship, obtained from field experiments (e.g. GPR-based), in order to avoid capturing model errors or model inefficiencies in the relationship. Second, it was assumed that the within-field variability can be represented by means of a normal probability distribution function. However, a clear difference in shape could be observed between the empirical cdf and those in the bundle. Third, the part of the method in which the modelled soil moisture content values are updated according to the information present in the bundle fully relies on the field-averaged soil moisture content value as provided by the SAR, whereas the modelled soil moisture pattern is preserved. Future research can therefore extend the method so as to address these shortcomings.

Fig. 3. Fitted spline with 4 knots (red) between mean soil moisture content values and their standard deviation (black dots) for the agricultural field considered.

Data: Possibility distribution of soil moisture content; scaling relationship (spline)
Result: Bundle of cumulative normal distributions
for α = + δ, ..., 1 do
    Determine the corresponding α-cut (π_θm)_α of the possibility distribution of θ_m
    for θ_m = θ_ml, θ_ml + Δθ_m, ..., θ_mr do
        Use the spline to calculate the corresponding standard deviation θ_s
        Compute the cumulative normal distribution N(θ_m, θ_s)
    end
end
Algorithm 1: Outline of the disaggregation step.
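The disaggregation step outlined in Algorithm 1 can be sketched in Python as follows. This is an illustration under stated assumptions: the possibility distribution is given as discrete (θ_m, degree) pairs, the α-levels and the step in θ_m are supplied by the caller, and `toy_scaling_sd` is a made-up stand-in for the fitted spline, not the relationship from the paper.

```python
from statistics import NormalDist


def alpha_cut(possibility, alpha):
    """Endpoints (theta_ml, theta_mr) of the alpha-cut of a discrete
    possibility distribution given as (theta_m, degree) pairs."""
    members = [theta for theta, degree in possibility if degree >= alpha]
    return min(members), max(members)


def toy_scaling_sd(theta_m):
    """Hypothetical scaling relationship: standard deviation from the mean."""
    return 0.02 + 0.1 * theta_m * (0.45 - theta_m)


def build_bundle(possibility, alphas, step, sd_of_mean):
    """Return the bundle as a list of (alpha, normal distribution) pairs."""
    bundle = []
    for alpha in alphas:
        theta_ml, theta_mr = alpha_cut(possibility, alpha)
        n_steps = int(round((theta_mr - theta_ml) / step))
        for k in range(n_steps + 1):
            theta_m = theta_ml + k * step    # field-averaged soil moisture
            theta_s = sd_of_mean(theta_m)    # within-field standard deviation
            bundle.append((alpha, NormalDist(mu=theta_m, sigma=theta_s)))
    return bundle


possibility = [(0.20, 0.0), (0.25, 0.5), (0.30, 1.0), (0.35, 0.5), (0.40, 0.0)]
bundle = build_bundle(possibility, [0.5, 1.0], 0.05, toy_scaling_sd)
```

Each entry of the bundle carries the α-level from which it originates, and its cumulative distribution can be evaluated with the `cdf` method of `NormalDist`.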

Figure 5 illustrates the cumulative normal distributions obtained when the disaggregation step is carried out for the α-cut (π)_0.6. It can be seen that the cumulative normal distributions have been cut off at higher soil moisture content values than the saturated soil moisture content value.

Fig. 8. Modelled empirical cdf (green dashed line) and its optimisation (red solid line) according to the information present in the bundle (cumulative normal distributions originating from the endpoints of the possibility distribution are given in black solid lines).

Fig. 9. Field-averaged soil moisture content modelled in different model runs (truth, baseline run and assimilated run). Time steps at which SAR data are acquired are also indicated. (a) Overview of the entire time series, (b) detail of assimilation at DOY 42, (c) detail of assimilation at DOY 105, (d) detail of assimilation at DOY 117 and (e) detail of assimilation at DOY 162. The optimisation in the assimilated runs was performed w.r.t. the information present in the bundle (bundle) or w.r.t. the Wasserstein distance between the empirical and the central cumulative normal distribution in the bundle (distance).

Table 1. Soil characteristics of the agricultural field considered.

Table 2. Values obtained during the cross-validation for the different models: the exponential-based function proposed by Famiglietti et al. (2008) (mod_exp), two spline functions with 3 and 4 knots, respectively (mod_s3 and mod_s4), and a second-order polynomial (mod_p). The results of the Kruskal-Wallis (KW) statistical test and the nonparametric comparison (NC) of a control group to other groups for a significance level α = 0.05 are given as well.

Table 3. RMSE values with respect to the truth, calculated over a seven-day time window starting at the assimilation time steps. The optimisation in the assimilated runs was performed w.r.t. the information present in the bundle (bundle) or w.r.t. the Wasserstein distance between the empirical and the central cumulative normal distribution in the bundle (distance).

Table 4. Values of the Wasserstein distance d_W between the cdf of the truth and the optimised cdfs. Values of the Jaccard similarity index J between the soil moisture image of the truth and the images corresponding to the optimised cdfs. The optimisation is performed w.r.t. the information present in the bundle (bundle) or w.r.t. the Wasserstein distance between the empirical and the central cumulative normal distribution in the bundle (distance).