State-space approach to evaluate spatial variability of field measured soil water status along a line transect in a volcanic-vesuvian soil

State-space approach to evaluate spatial variability of field measured soil water status along a line transect in a volcanic-vesuvian soil A. Comegna, A. Coppola, V. Comegna, G. Severino, A. Sommella, and C. Vitale Department for Agro-Forestry Systems Management (DITEC), Hydraulics Division, University of Basilicata, Potenza, Italy Division of Water Resources Management, University of Naples “Federico II”, Italy Department of Statistics, University of Salerno, Italy


Introduction
The increasing need for water for domestic and industrial purposes under ever more stringent environmental protection measures, combined with advances in irrigation, makes it necessary to gain in-depth knowledge of water and solute flow in the vadose zone, understood as the zone roughly extending from the soil surface to the water table.Mathematical models have for some time been available that allow the probable losses of water by evaporation and percolation to be estimated, as well as the probable solute residence times and the evolution of available water reserves (Feddes et al., 1988).Based on laws of water flow in unsaturated porous media, in order to be applied such models are known to require mathematical relations linking the local value of water content in volume θ to the water tension h and soil hydraulic conductivity k.Experimental observations to measure as directly as possible the relations between θ, h and k can be developed through field trials (Hillel, 1998).It is also well known that in field applications of models, to achieve results of a practical interest, there must be an evaluation in statistical terms of the variability of such observable parameters also in fairly homogeneous natural media.Deterministic evaluation of spatial heterogeneity of soil physical and hydraulic properties requires a large number of measurements and hence can only be performed for limited areas.This has led to the increasing use of statistical models in which hydraulic variables are considered stochastic (Freeze, 1975).
Given the complexity of the problem, several decades ago, systematic field measurements were conducted and the data were analysed in order to specify and describe heterogeneities (Nielsen et al., 1973).The conventional statistical approach adopted at the time consisted in treating observations concerning the property in question as statistically independent quantities abstracted from their spatial position.Only in recent years have surveys been conducted that have clearly shown the existence of a spatial structure of heterogeneities (Russo and Bresler, 1981).This structure has been described with geostatistical techniques essentially derived from regionalized variable theory (Matheron, 1971) in terms of semivariograms.Each physical property, in the case of isotropy, could thus be considered as the realization of a stochastic process which is a function of coordinates on a horizontal plane and, in the case of anisotropy, a function of direction.Applications of such techniques have proved promising for describing variability in space of soil hydraulic properties and have led to defining the number and distance at which to make determinations, thereby reducing sampling costs (Vieira et al., 1981).
Another group of techniques, also used to study the structure of variability, is based on a modified time series theory (Box and Jenkins, 1970).By using such techniques, the structure may be described in terms of autocorrelation functions and SARMA (Spatial Autoregressive-Moving Average) models with a view to estimating the stochastic properties of the data.Some of these applications in soil physics and hydrology include the studies by Morkoc et al. (1985), Anderson and Cassel (1986), Wendroth et al. (1992), Cassel et al. (2000), Heuvelink and Webster (2001), Wendroth et al. (2006).
In the present paper, reference is made to a state-space statistical model which was set up to analyze the water status of a volcanic Vesuvian soil.Section 2 illustrates the experiment from which observations were made on the two parameters in question θ and h.Section 3 deals with the state-space model formulation.Sections 4 and 5 analyze two applications of the model in the univariate and bivariate case.Finally in Sect.6 some conclusions will be drawn and comments made.

Description of the experiment
The experiment was conducted on a sandy soil (83% sand, 12% silt and 5% clay, USDA), located at Ponticelli, Naples (Italy; 40 • 52 00 N and 14 • 53 00 E) and pedologically classifiable as an Andosol.This soil was chosen because it is typical of a large, intensively cultivated area near Vesuvius.At the center of the field, where the trial was carried out, a plot with dimensions of 2 × 50 m 2 was prepared along a N-S axis, with a boundary ridge about 0.25 m high (Fig. 1).
At the center of the plot 50 three-rod time domain reflectometry (TDR) probes (0.15 m long and a wire spacing of 0.015 m) were inserted at constant distance of 1 m apart for measuring, at a depth of 0.3 m, volumetric soil water content θ.The TDR probes were multiplexed manually to a TDR 100 tester (Campbell Scientific, Inc, Logan, UT).On a parallel transect, at a distance of 0.5 m from the TDR probe line, 50 tensiometers were installed with their tip at a depth of 0.3 m to register tension h in the liquid phase.The ceramic tensiometer cups were made in our laboratory, with the following characteristics: (i) the bubbling pressure (P a ), definable as the pressure at which soil air enters the tensiometer, is greater than 0.5 hPa; (ii) the cup conductance (C) is greater than 0.0111 cm 3 s −1 hPa −1 of pressure difference across the wall; (iii) considering that the gauge sensitivity (S) is 1000 hPa cm −3 , an instrumental time constant in water τ = C −1 S −1 may be calculated equal to 90 s.Water tension was measured connecting tensiometers to a microdatalogger (Skye-Instruments, Ltd, UK) For the purposes of the trial, the plot was ponded by applying water in excess of the infiltration rate, while an overflow pipe guaranteed a constant water depth of 0.15 m.The time required for establishing steady-state flow in the profile at all depths to 1.5 m, was about 1 week.When infiltration was complete, the surface of the plot was covered with a plastic sheet so as to prevent evaporation from the soil surface and rainfall infiltration in the soil profile.
Measurements were carried out at twelve sampling times at increasing time intervals (5, 24, 48, 72, 120, 160, 240, 336, 432, 600, 768, 936 h, respectively) from the start of the drainage process.Such times on a logarithmic scale are distributed approximately along a straight line; in other words, the choice of measuring times on this scale may be considered approximately equidistant.
The water content θ and the tension h were always measured at the same time, thereby making it easier to determine the retention curves θ(h).Monitoring was interrupted 42 days after the end of infiltration when the drainage process was evolving so slowly as to make it pointless to continue with the experiment.

State-space model formulation
Clearly, there is a strict analogy between space and time, at least in the case of one-dimensional space.Hence, under the hypothesis of isotropy, analytical methods are to a broad extent equivalent.Typically, time series analysis allows us to analyse spatial structure in terms of autocorrelation functions and generalisation of state-space models.For this particular method of regression in the time and space domain, unlike the methods of kriging and cokriging (Vieira et al., 1983) the assumption of stationarity of observations is not required.The state-space method (Kalman, 1960) is particularly interesting when the phenomenon in question satisfies certain systems of differential equations.The method has been used in economics (Shumway and Stoffer, 2000) and has yielded good results in agronomic and soil science (Vieira et al., 1983;Morkoc et al., 1985;Wendroth et al., 1992;Wu et al., 1997;Cassel et al., 2000;Poulsen et al., 2003;Nielsen and Wendroth, 2003) Let us use Y(x), x = x o +1, ..., x o + n, to indicate the values assumed by n observations made for a certain soil parameter Y along a given transect (below we shall use the simpler notation Y t , t = 1, 2, ..., n).A state-space model consists, in the formulation most useful for our purposes (for details and generalisations see Anderson and Moore, 1979), of two equations: The first termed that of observations and the second that of transition, where F t is a known vector (p, 1), Z t is a vector (p,1) of the system state, G t, G 1t , G 2t are a set matrices (p, p); v t ∼ N(0;σ 2 v t ) independent of w t ∼ N p (0; w t ).The model ( 1) is wholly specified by the parameters (F t , G it, σ 2 v t , w t ) and includes, as particular cases, other statistical models such as regression, ARIMA and SARMA models.
Having set the initial values, we may obtain optimal forecasts and estimates of the non-observable components by using the Kalman filter.At the same time, from many observations made of soil physical and hydraulic properties, the latter may plausibly have been generated by stationary isotropic processes with parameters independent of the individual measuring points: Hence we may consider the case in which the equations in (1) are reduced to simple ARMA and SARMA models.The importance of being able to make the double representation (state-space and SARMA or ARMA) lies in the fact that ARMA and SARMA models are easy to identify and estimate, while state-space models allow a more straightforward, immediate interpretation of the phenomena to which they are applied.Indeed, from (1) it follows that Y t may be interpreted as the result of signal F t Z t which is overlaid by a random error v t .Evolution of many physical phenomena can be well represented with a logical scheme like that reported in Fig. 2.
The system structure is usually very straightforward and can be approximated, in the isotropic case, by an AR(1), given by: or, in the anisotropic case, a SAR(1) given by: Note that, if it is φ 1 = φ 2 then the SAR(1) model may be replaced by the simpler AR(1) model.
Moreover, if we assume p = 1,F t = 1,G t =φ then we obtain more simply: where B is the backshift autoregressive operator, φ i > α i and e t such that e t −α Thus both the equation of the observations and that of transition (i.e. the signal) are reduced to simple ARMA models and especially to an ARMA (1,1) for Y t and an AR(1) for Z t .Besides, as is widely acknowledged in soil physics, between the many parameters there may well be functional relations such that what applies to the univariate cases can be extended to the simultaneous analysis in which Y t is an rdimensional vector.Under isotropic hypothesis, a particular generalisation of (1) to the case r = 2 imply the following model: where  In the particular case of p = 2, F t = I with I the identical matrix and G t = , we have the equivalent bivariate model: where e t ∼WN(0; e ) is independent of w t ∼WN(0; w ) and are (2, 2) matrices of unknown parameters to be estimated.

Application in the univariate case
In this section we will analyze individually the two parameters which characterize the soil water status in terms of θ and h measured at 0.3 m depth, along the N-S line of the plot so as to highlight their intrinsic structure linked to regional variability and, for 3 of the 12 measuring sampling times (the 3rd, 6th and 11th carried out 48, 168 and 768 h respectively from the start of the drainage), the variations occurring in time (the parameters concerned are indicated by θ i and h i ).
The data were first elaborated using classical statistical techniques, hypothesizing that the parameters vary in an essentially random manner.From this point of view, the main statistical indices (min.value, max.value, mean, standard deviation, skewness, kurtosis, coefficient of variation) of the above parameters are reported in Table 1.
From Table 1 we may deduce, for all the measuring times considered, an increase in the standard deviation (SD) with its mean for parameter h, whereas the SD of θ is practically constant.We also note that the coefficient of variation (CV) of h is almost twice that of θ.Concluding, the two processes describing h and θ are, in time, both non-stationary on the mean, while h is also non-stationary in variance.Figure 3 illustrates the above points: it reports the 50 observations of θ and h for 10 of the 12 sampling times (from the 2nd to the 11th).This all agrees with the theoretical results obtained by Yeh et al. (1985), which predicted such behaviour on the basis of the stochastic analysis of unsaturated flow through heterogeneous media.
In the context of stochastic analysis it is essential to verify, for the parameters considered, the existence of a correlation structure.
Variables θ and h, given that they are recorded at constant intervals along the transect, are ordered in space and their evolution in the prefixed direction can therefore be evaluated by means of typical statistical analyses of the time series and in particular by means of the ARMA model (see Box and Jenkins, 1970) only when the hypothesis of isotropy can be justified.A preliminary test was than carried out on θ 3 and h 3 series.The model which supplied the most acceptable results  in terms of simplicity and interpretability, both because of the limited number of parameters and goodness of fit of the data, was SAR(1) model where the unknown parameters to be estimated are φ 1 and φ 2 .The above criteria of choice was followed for all models subsequently used.It should be noted that if φ 1 and φ 2 are substantially equal than θ 3 and h 3 are to be considered isotropic, conversely anisotropy may be taken into account.
The iterative least-squares method estimate of the model parameters in question, provided the results reported in Table 2 where standard deviations are in brackets.Having supposed that the phenomenon is isotropic and therefore invertible in space, the estimated φ 1 and φ 2 values are expected to be equal.In particular, in our case, we may observe that φ 1 = φ 2 .If the estimates are analyzed in greater detail, in the Fig. 4a, b it may be noted that the parameters in question are statistically identical.Then the model reported in (1) was applied, only to three series of data obtained along the transect.The series concerns, in particular, values of soil water content θ and tension h obtained 48 h from the beginning of the drainage.Furthermore the analysis will be extended to a section of the soil moisture retention curve θ(h) constructed for h = 100 cm, subsequently indicated as θ 100 .
More significantly, the essential characters assumed in the space from the distribution of the parameters in question may be deduced from the transects of Fig. 5, which report the relative values in the 50 observation points.
To identify the ARMA models to be adapted to the above three series, we estimated the autocorrelation (ACF) and partial autocorrelation (PACF) function.As transpires from Fig. 6a, b, c, the three series can be well represented by an ARMA (1,1) model.
Anyway analysis of ACF residuals (Fig. 6d) shows clearly that no structure whatever is present in the series of noises, which is further confirmation of the good fit of the model used to represent the examined parameters.
Parameter estimates, obtained with the least squares method, of the ARMA(1,1) model adapted to the above series and the goodness index fit R 2 (mean square errors in brackets) are reported in Table 3 below.Clearly, all three series show a strong inertia component which confirms the presence of the spatial structure ascribable to an AR(1), accompanied by marked fortnitosness as results from the low value of R 2 .The Jarque-Berà (Snedecor e Cochran, 1980) test is performed to evaluate the normality of the residual of www.hydrol-earth-syst-sci.net/14/2455/2010/ Hydrol.Earth Syst.Sci., 14, 2455-2463, 2010 both ARMA(1,1) models end the null hypothesis (normality distribution) is accepted.The kernel estimate residuals distributions are reported in Fig. 7.The estimated model was then used to obtain optimal predictions along the transect.Figure 8a, b reports the observed data, a signal estimate and the relative noise for series θ 3 and h 3 .The graphs for the other series were similar, with analogous signal in the general pattern, not reported here for the sake of brevity, confirming that the spatial structure of soil hydraulic parameters is a characteristic of the porous medium in question.

Application in the bivariate case
Consistent with the aim of simultaneously analysing parameters θ and h as a bivariate dynamic system and modelling statistically the intrinsic variability, in this section we seek to ascertain once again the suitability of the multivariate approach based on the use of state-space models.Preliminary qualitative assessment regarding the nature of the functional relationship between θ and h may be inferred from inspection of figure 9 which reports all the θ and h values measured contemporaneously for each of the 50 sites and for 10 of the 12 measuring times.
In particular, the figure shows the degree of heterogeneity of the moisture retention curve θ(h) irrespective of the velocity with which h varies in time in relation to the redistribution process of moisture in the soil profile.For a more straightforward interpretation, the scatter (θ, h) was fitted .00000 .00001 .00002 .00003 .00004 .00005 -30,000 -20,000 -10,000 0 10,000 20,000 30,000 .00000 .00001 .00002 .00003 .00004 .00005 -30,000 -20,000 -10,000 0 10,000 20,000 30,000

RES_H3
(right) with a curve to which the analytical expression proposed by van Genuchten (1980) was assigned: where θ s and θ r denote the saturated and residual water content respectively.The constants α, m and n are shape parameters and m = 1 − 1 n .The estimate of parameter (α, n) in model and the goodness index of fit R 2 , obtained by the least squares method, led to the following results: θ r = 0, α = 0.01, n = 1.46 and R 2 = 0.90.
The problem that arises at this point is to ascertain whether the bivariate model is compatible with the results obtained for the individual variables θ and h.In this respect, it can be easily verified that a bivariate ARMA(1,1) means that the single components are univariate ARMA(2,2) in contrast to ARMA(1,1) models that are adaptable to θ and h.Admittedly, the situations between the elements and may coincide, hence ARMA(2,2) are simplified into ARMA(1,1).However, the constraints to be met are such as to rule out that this may in practice occur.On the other hand, if in (3) we have v t = 0, then Y t coincides with Z t and this has two implications: (a) it can no longer be supposed that θ and h are broken down simultaneously into a signal and an error (this does not exclude the decomposition of single components); (b) the ARMA structure of Y t is simplified into the bivariate AR(1) model: which implies, for the single components, ARMA(2,1) models.As may be noted, we still have a different model from ARMA(1,1) obtained empirically for the components, but in this case the coincidental conditions such that an ARMA(2,1) is reduced to an ARMA(1,1), are not very constraining.
Hence it is plausible that the bivariate model for Y t is type (6).Moreover, it is easy to prove that model ( 6), through orthogonalization of w , may equally be represented by the following: which expresses the simultaneous functional relation between θ and h.Note that in Eq. ( 3), a t and b t are white noises independent among them and respectively with h t and θ t .Moreover, the proof is straightforward that if = Diag{φ 11 , φ 22 } then we also obtain while the first 10 auto cross-correlations matrices of the estimated residuals, for exploratory purposes are reported synthetically in Fig. 10.
From these it emerges that the bivariate AR(1) model fits the two phenomena well and highlights the existence of simultaneous causal relations, as was to be expected, between θ and h.Estimation with the least squares of model (3) supplied the following results (in brackets the mean square deviations of the estimates): As may be noted, this yields β2 ≈ φ11 and δ2 ≈ φ22 which may be considered further confirmation of the goodness of the statistical model used to interpret and describe the two parameters θ t and h t and the relations between them.In this respect, in Fig. 11 we report the θ t values observed and those estimated with the first of (4).The expression manages to capture the phenomenon's general trend.A similar relationship, albeit not presented, is obtained for the second of (4).

Conclusions
The soil water status may be better defined stochastically rather than deterministically since it is not always possible to evaluate with precision the behaviour of parameters θ and h of the flow system at an assigned point in time.This is due both to intrinsic and extrinsic heterogeneities of natural porous media, and to field crop root water uptake as well as natural contributory factors which are essentially stochastic.
The experiment carried out on a field plot on a Vesuvian volcanic soil, with regard to its water status, showed that θ and h are essentially multi-dimensional processes when observed in space and time.The isotropic nature of the phenomena are firstly empirical verified and then the space correlation structures were analyzed using state-space methods which allowed the setting-up of univariate models.The nonstationary nature of soil water tension, in mean and in variance was then ascertained, whereas water content was locally stationary on mean and variance for the whole period of observation.In any case, the signals in the observed series were fairly clear and marked even if influenced in complex manner by the water dynamics of the soil profile.
The deterministic relations between θ and h suggested the use of bivariate models which allowed for any simultaneous isotropic relations existing between the two parameters in space.The theoretical potential and the practical implications of these results in the modelling of water transport processes in unsaturated heterogeneous porous media require further in-depth studies above all with regard to different pedological contexts from those here analyzed.Since drainage is the predominant transport process in this simplified hydrological experiment, it is reasonable to suppose that it depends upon the combined effect of soil conducting properties within the entire soil profile.
Nevertheless, starting from available knowledge and verified combination of accuracy and flexibility in the models used, we hope that these instruments may be considered adequate for the study and interpretation of the statistical properties of soil hydraulic parameters.

Fig. 1 .
Fig. 1.View of the experimental plot displaying relative position of TDR probes and 0.3 m depth tensiometers.

Fig. 3 .
Fig. 3. Soil water tension (h) and volumetric water content (θ as a function of time at the 0.3 m depth for all redistribution times during the drainage period.

Fig. 10 .
Fig. 10.Schematic representation of cross correlation matrices of estimated residues: ( ) auto-cross correlations non significantly different from zero, ( -) auto-cross correlation significantly greater than zero.

Table 1 .
Descriptive indices of the spatial series observed along the transect at 0.3 m depth of soil profile.

Table 2 .
Parameter estimates and comparison of SAR(1) model for the series examined and goodness of fit index R 2 ; in parenthesis the standard deviation of the estimates.

Table 3 .
Parameter estimates of the ARMA(1,1) model, for θ 3 , h 3 , θ 100 and goodness of fit R 2 .In parenthesis the standard deviations of the estimated parameters. 13