Separating precipitation and evapotranspiration from noise – a new filter routine for high-resolution lysimeter data

Weighing lysimeters yield the most precise and realistic measures for evapotranspiration (ET) and precipitation (P), which are of great importance for many questions regarding soil and atmospheric sciences. An increase or a decrease of the system mass (lysimeter plus seepage) indicates P or ET, respectively. These real mass changes of the lysimeter system have to be separated from measurement noise (e.g., caused by wind). A promising approach to filter noisy lysimeter data is (i) to introduce a smoothing routine, like a moving average with a certain averaging window, w, and then (ii) to apply a certain threshold value, δ, accounting for measurement accuracy, separating significant from insignificant weight changes. Thus, two filter parameters are used, namely w and δ. In particular, the time-variable noise due to wind as well as strong signals due to heavy precipitation pose challenges for such noise-reduction algorithms. If w is too small, data noise might be interpreted as real system changes. If w is too wide, small weight changes in short time intervals might be disregarded. The same applies to too small or too large values for δ. Application of constant w and δ leads either to unnecessary losses of accuracy or to faulty data due to noise. The aim of this paper is to solve this problem with a new filter routine that is appropriate for any event, ranging from smooth evaporation to strong wind and heavy precipitation. Therefore, the new routine uses adaptive w and δ in dependence on signal strength and noise (AWAT – adaptive window and adaptive threshold filter). The AWAT filter, a moving-average filter and the Savitzky–Golay filter with constant w and δ were applied to real lysimeter data comprising the above-mentioned events. The AWAT filter was the only filter that could handle the data of all events very well.
A sensitivity study shows that the magnitude of the maximum threshold value has practically no influence on the results; thus only the maximum window width must be predefined by the user.


Introduction
Precise knowledge of the water fluxes between the soil-plant system and the atmosphere is of great importance for understanding and modeling water, solute and energy transfer in the soil-plant-atmosphere system. The water flux towards the soil-plant system within a certain time interval is precipitation (P [mm]), which can be rain, snow and dewfall, whereas the flux leaving the soil-plant system towards the atmosphere within a certain time interval is given by soil evaporation (E [mm]), evaporation of intercepted water (I [mm]) and transpiration (T [mm]), often summed up to evapotranspiration (ET [mm]).
The precipitation is usually measured by a standard gauge 1 m above the soil surface, which is prone to systematic errors due to its geometry, wind and other factors (Michelson, 2004). One method to determine the reference evapotranspiration (ET₀ [mm]) is the use of a class-A pan. Due to differences in albedo between water and grass and island effects, among other factors, these measured data have to be corrected by a so-called pan coefficient (Irmark et al., 2002; Gundekar et al., 2008), which is location dependent (Howell et al., 1983). Actual evapotranspiration is even more difficult to measure under field conditions.
Weighing lysimeters yield the most precise and realistic measures for P and ET, as they avoid all the above-mentioned systematic errors. In order to precisely distinguish between P and ET, both of which might occur within relatively short time intervals, the masses of lysimeter and seepage water have to be measured in high temporal resolution. This is of special importance if the energy balance of the soil-plant-atmosphere system is of interest, where a great fraction of total heat flux is given by latent heat flux (Foken, 2008). Note that for long-term water balances focusing on, for example, groundwater recharge, where a precise discrimination of P and ET is not needed, a high temporal resolution of measurements is not necessary.
Lysimeters have been used in agricultural studies to measure ground water recharge (Yang et al., 2000), solute transport towards the groundwater (Schoen et al., 1999) or water fluxes at the soil-plant-atmosphere interface (Meissner et al., 2007) as well as in urban sites to study surface runoff (Nehls et al., 2011).
Early weighable lysimeters were instrumented with lever-arm counterbalance systems (Aboukhaled et al., 1982), which are still in use today (Nolz et al., 2013). Depending on the measurement system, these lysimeters can reach resolutions of < 0.1 mm.
In the last decades, resolution and precision of the weighing systems have been substantially improved, and thus modern lysimeters, resting on weighing cells (von Unold and Fank, 2008), can reach resolutions of up to 0.01 mm. They are regarded as the most precise measurement devices for rainfall, actual evapotranspiration or even dewfall (Meissner et al., 2007).
As the resolution of the weighing systems increased, small mechanical disturbances (e.g., caused by wind) became visible in the data as noise (Ramier et al., 2004; Nolz et al., 2013). Therefore, precision and accuracy of the lysimeter measurements depend not only on the precision of the weighing device but also on external conditions, which cannot be controlled or turned off. Moreover, as the wind speed varies with time, the measurement noise also varies with time. In the study of Nolz et al. (2013) the accuracy of the system was up to 3 times lower due to wind (wind speed range 0 to 13 m s⁻¹). Ramier et al. (2004) report a reduced accuracy of up to about 5 times due to wind disturbance.
A mandatory requirement for the quantification of P or ET from lysimeter measurements is that in a reasonably small time interval, either P or ET is negligible; in other words, they do not happen simultaneously (Ramier et al., 2004; Schelle et al., 2012). Note that in the case of snow or rainfall, the air right above the soil surface need not necessarily be water saturated. Thus, ET and P may actually take place at the same time. However, it can be assumed that during such precipitation events evaporation is negligible (i.e., ET ≪ P).
With this assumption, every increase in system weight (lysimeter mass + cumulative seepage mass) is interpreted as P, whereas every decrease in system weight is interpreted as ET. To apply this concept correctly, the noise (e.g., due to wind) has to be separated from signals using a filtering routine. Such filtering can be carried out in two steps as outlined by Fank (2013) or Schrader et al. (2013). First, a smoothing routine with a certain window width w is applied. Such a routine can be the simple moving average or a more advanced routine, like the Savitzky-Golay filter (Savitzky and Golay, 1964). Second, all changes in weight smaller than a predefined accuracy threshold δ are discarded.
Both the window width w and the allowed accuracy δ have to be defined before using the filter routine. The problem with this procedure is the choice of the optimal values for w and δ. If the averaging window is too small, noisy data might be interpreted as real system changes. If the window width is too wide, small weight changes in short time intervals might be disregarded. The same applies to too small or too large values for δ.
The general requirement for such filters is that they have to be applicable to very different meteorological conditions, such as short, heavy rainfalls (strong signals), smooth evaporation events with low wind speed (low noise), and events with no or low P or ET but strong winds (high noise). Strong signals require narrow averaging windows, whereas high noise requires wide averaging windows. Moreover, in periods with low wind speed, the data are more accurate than in periods with high wind speed (Nolz et al., 2013). Application of constant w and δ leads either to unnecessary losses of accuracy or to faulty data due to noise. A new filtering approach should solve this dilemma.
The best way to test filter routines would be to conduct lysimeter experiments under defined conditions (precision irrigator, wind tunnel, etc.). However, it is easier to use artificial data, where the "true" signals are known (Schrader et al., 2013), or to test the routines by applying them to real lysimeter data from very different events, like strong wind or heavy rainfall, and to judge the filters through expert knowledge. The disadvantage of real data is that the true system response is not known. However, artificially composed data might not comprise the same complex system and noise behavior as in reality.
The aim of this paper is to introduce a new filter routine that is appropriate for any event, including events with low disturbances as well as strong wind and heavy precipitation in small time intervals. The novel approach is based on (i) an adaptive window width, w, which depends on the signal strength, i.e., intensity of P or ET, and on (ii) an adaptive threshold value, δ, that depends on noise severity. The filter is compared to other routines using real lysimeter data that comprise all above-mentioned events.

Lysimeter setup
The measurements were conducted at the lysimeter station Marienfelde, south of Berlin (52.396731° N, 13.367524° E). The lysimeter was 1.5 m deep with a surface area of 1 m². A lever-arm counterbalance system was used in combination with a laboratory scale, which had a resolution of 0.01 g. The resolution due to the lever-arm mechanism was 80 g for the lysimeter mass. With a water density of ≈ 1000 kg m⁻³, this results in a resolution of 0.08 mm for the upper boundary fluxes. The outflow of water at the lower boundary was directly recorded with a scale with a resolution of 5 g. All data were logged in a 1 min time interval. The soil material in the lysimeter was a packed sand from a partly hydrophobic Dystric Arenosol from Niederlehme (Brandenburg, Germany). No plants were on the lysimeter, so evapotranspiration was reduced to mere evaporation. The data used in this study were recorded from 25 May to 6 October 2012 under very different weather conditions.

Data processing
The total mass of the system, M [kg], is the sum of the masses of the lysimeter, $M_{\mathrm{lys}}$ [kg], and of the outflow, $M_{\mathrm{out}}$ [kg]:

$$M = M_{\mathrm{lys}} + M_{\mathrm{out}}. \tag{1}$$

Beginning at a certain time, $t_0$, the cumulative water mass flux at the upper boundary is given by $M - M_0$, where $M_0$ [kg] is the mass of the lysimeter system at $t_0$. Note that with the lysimeter geometry outlined above, a water storage change in kilograms is equal to a change in millimeters. Therefore, all water storage changes are given in millimeters in the following.
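The equivalence of kilograms and millimeters follows directly from the surface area of 1 m² and a water density of about 1000 kg m⁻³. As a short worked conversion (the symbols Δh, Δm, ρ_w and A are introduced here only for illustration), the 80 g resolution of the lever-arm system corresponds to:

```latex
\Delta h = \frac{\Delta m}{\rho_w A}
         = \frac{0.080\ \mathrm{kg}}{1000\ \mathrm{kg\,m^{-3}} \times 1\ \mathrm{m^2}}
         = 8 \times 10^{-5}\ \mathrm{m}
         = 0.08\ \mathrm{mm}
```

By the same relation, a mass change of 1 kg corresponds to 1 mm of water.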
In order to evaluate the new filter, we focus on three very different benchmark events, including a day of smooth evaporation (6 July 2012), a heavy rainfall event with an intensity of approximately 1 mm min⁻¹ (21 August 2012) and a day with strong wind and low evaporation (23 September 2012) (see Fig. 1). In the following these three events are denominated as "smooth evap", "heavy prec" and "strong wind". There was no precipitation on 23 September 2012 (detected by rain gauge). In the time between 1 July and 3 July 2012 a power breakdown led to data loss.

Calculating evaporation and precipitation from lysimeter data
As mentioned above, it is assumed that either ET or P, but not both, takes place within the same time interval. With this assumption and with perfect (i.e., non-noisy) data, a change in M is either precipitation or evapotranspiration. Thus, P and ET can be calculated by (Schrader et al., 2013)

$$P = \begin{cases} \Delta M & \text{for } \Delta M > 0 \\ 0 & \text{for } \Delta M \le 0, \end{cases} \qquad \mathrm{ET} = \begin{cases} -\Delta M & \text{for } \Delta M < 0 \\ 0 & \text{for } \Delta M \ge 0, \end{cases} \tag{2}$$

where ΔM [kg] is the change in cumulative upper boundary mass flux in the corresponding time interval. However, lysimeter data are usually noisy to some extent, and thus ΔM might merely be noise due to wind or other external disturbances. Thus, Eq. (2) is only valid after an appropriate data-filtering procedure is applied. Such a procedure must be a compromise between too "strong" and too "weak" filtering. If noise is filtered not at all or too little, both P and ET are overestimated. If the data filter is too "strong", both processes might be underestimated (Schrader et al., 2013). An appropriate filter routine must take this into account for a wide range of very different conditions, as will be discussed in the following.
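The separation rule can be illustrated with a minimal Python sketch. It assumes that the input is an already filtered series of cumulative upper boundary flux in millimeters; the function name `split_p_et` and the convention of reporting ET as a positive amount are illustrative assumptions, not part of the published routine.

```python
import numpy as np

def split_p_et(cumulative_flux):
    """Split changes of cumulative flux into P and ET per time step.

    Assumption: increases count as precipitation, decreases as
    evapotranspiration (reported here as a positive amount).
    """
    delta_m = np.diff(np.asarray(cumulative_flux, dtype=float))
    p = np.where(delta_m > 0, delta_m, 0.0)    # mass gain -> P
    et = np.where(delta_m < 0, -delta_m, 0.0)  # mass loss -> ET
    return p, et

p, et = split_p_et([0.0, 0.5, 0.3, 0.3, 1.0])
print(p.sum(), et.sum())
```

Applied to unfiltered data, every noise-induced up-and-down step would be counted once as P and once as ET, which is why both fluxes are overestimated without filtering.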

Separating P and ET from noise -general approach
A promising approach to filter noisy lysimeter data is (i) to introduce a smoothing routine, like a moving average with a certain averaging window w, and then (ii) to apply a certain threshold value δ, accounting for measurement accuracy, separating significant from insignificant weight changes (Fank, 2013; Schrader et al., 2013). In Fig. 2, the implementation of these two steps is illustrated for the case of the strong wind event (23 September 2012).
The simplest form of a smoothing routine is the simple moving average, hereafter denoted as MA. In the MA routine a certain window width (w [min]) is chosen and then the arithmetic mean of the data in the time window of $t_i - (w-1)/2$ to $t_i + (w-1)/2$ is calculated for each point in time $t_i$ [min]. Another, more complex smoothing routine is the Savitzky-Golay filter (Savitzky and Golay, 1964), which has been used in several lysimeter studies (Vaughan et al., 2007; Vaughan and Ayars, 2009; Huang et al., 2012; Schrader et al., 2013). The Savitzky-Golay filter, hereafter denoted as SG filter, is based on a local least-squares polynomial approximation. With either an MA or SG filter, the data are smoothed to a large extent, depending on the smoothing window width.
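A centered moving average of this kind might be sketched as follows. The function name `moving_average` and the edge handling (the window is simply truncated at the start and end of the series) are assumptions made for this illustration; the text does not specify how the edges are treated.

```python
import numpy as np

def moving_average(values, w):
    """Centered moving average with odd window width w (in samples)."""
    if w < 1 or w % 2 == 0:
        raise ValueError("w must be a positive odd integer")
    values = np.asarray(values, dtype=float)
    half = (w - 1) // 2
    smoothed = np.empty_like(values)
    for i in range(len(values)):
        # Assumption: truncate the window where it would leave the series.
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed[i] = values[lo:hi].mean()
    return smoothed

print(moving_average([1, 2, 3, 4, 5], 3))
```

With 1 min data, a window width of w = 31 samples corresponds to the 31 min window used in Fig. 2.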
After smoothing, there is usually still some noise left (Fig. 2, center panel), which would lead to an overestimation of both P and ET. Therefore, a threshold value, δ [mm], is introduced to reduce the fluctuations (Fig. 2, right panel). The threshold approach, which might more correctly be named "thresholding with memory", makes sure that significant weight changes are separated from insignificant changes in a way that all changes in weight smaller than a predefined accuracy threshold δ are discarded. As long as a change from $t_{i-1}$ to $t_i$ is smaller than δ, the value for $t_{i-1}$ is kept. Such a threshold value should be at least as high as the scale resolution.
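The "thresholding with memory" step can be sketched in a few lines of Python. This is one possible reading of the rule, under the assumption that each smoothed value is compared with the last kept value rather than with the previous raw value; the function name `threshold_with_memory` is hypothetical.

```python
import numpy as np

def threshold_with_memory(smoothed, delta):
    """Keep the previous output until the change reaches the threshold delta."""
    smoothed = np.asarray(smoothed, dtype=float)
    out = np.empty_like(smoothed)
    out[0] = smoothed[0]
    for i in range(1, len(smoothed)):
        if abs(smoothed[i] - out[i - 1]) < delta:
            out[i] = out[i - 1]   # insignificant change: keep old value
        else:
            out[i] = smoothed[i]  # significant change: accept new value
    return out

print(threshold_with_memory([0.0, 0.05, 0.09, 0.2, 0.21], delta=0.081))
```

Because the kept value is held until a change of at least δ has accumulated, slow but real trends are still followed, only in steps.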
Data with small noise ("smooth evap" in Fig. 1) need a relatively small value for δ, whereas data with large noise ("strong wind") need larger values for δ. Moreover, if small or no changes happen, w should be large, whereas it should be small in the case of a strong signal, like the heavy precipitation event in Fig. 1. Therefore, an optimal separation of ET and P cannot be achieved with constant values for w and δ. In other words, an appropriate filter must have different properties for the "strong wind", the "heavy prec" and the "smooth evap" events (Fig. 1). In conclusion, time-variable window widths for averaging and threshold values are required, where the window width should depend on signal strength and the threshold value on the amplitude of the data noise.

Adaptive window and adaptive threshold (AWAT) filter routine
We solve the above-mentioned problem in three steps (Fig. 3): first, a maximum window width, $w_{\max}$, is defined in which information for signal strength and data noise is collected for each data point, i. This information is derived from simple statistical measures by fitting a moving polynomial to the data within $w_{\max}$. Second, a moving average with an adaptive window width is applied, where the window width is a function of signal strength. Third, an adaptive threshold value is applied, where the threshold value depends on the measurement noise (the software is available from the authors). These three steps will be explained in detail in the next paragraphs.

Derivation of measures for signal strength and noise
For each data point, i, a polynomial of kth order (Eq. 3) is fitted to the neighboring data within a time window of a certain constant width, $w_{\max}$ (for example 31 min), by minimizing the residual sum of squares. The polynomial for data point i, $Y_i(t)$, is given for the time interval $t_{i-w_{\max}/2}$ to $t_{i+w_{\max}/2}$:

$$Y_i(t) = \sum_{j=0}^{k} a_j t^j \quad \text{for } t_{i-w_{\max}/2} \le t \le t_{i+w_{\max}/2}. \tag{3}$$

The order of the polynomial must be high enough to guarantee that it can describe the data in the time window reasonably well. However, it should be low enough to avoid the noise being described by the polynomial as well. To select the optimal order, we use an extension of Akaike's information criterion (Akaike, 1974) as suggested by Hurvich and Tsai (1989):

$$\mathrm{AICc} = r \ln\!\left(\frac{\mathrm{SSQ}}{r}\right) + 2n + \frac{2n(n+1)}{r-n-1}, \tag{4}$$

where SSQ is the sum of squared residuals, n = k + 1 is the number of adjustable parameters and r is the number of data within the time window. Note that r must be odd. The first term of Eq. (4) penalizes a poor fit, the second term the number of parameters and the third term is the correction term for small values of r/n. The polynomial with the smallest AICc is selected as the best one. If no or low P or ET take place, k is low, since the data might be best described by a straight line. In the case of strong changing signal response in the time window, e.g., strong P followed by ET or vice versa, k is high. Figure 4 shows the fitted polynomials and the order k as selected by the AICc for three points in time in each of the three benchmark events. Although the AICc is a well-suited and much-used identification tool for the best model, there is a possibility of "overfitting", e.g., if some kind of outlier is within the data. Therefore, we chose a maximum allowed order $k_{\max}$ of 6. As can be seen in Fig. 4, $k_{\max}$ is only reached for the heavy precipitation event.
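The order selection for a single time window might look as follows in Python. This is a sketch rather than the authors' software: the function names, the rescaling of time to [-1, 1] for numerical conditioning, the lower floor on SSQ and the search range k = 1 to $k_{\max}$ are assumptions made for illustration. The AICc expression is the standard least-squares form of the Hurvich and Tsai correction.

```python
import numpy as np

def aicc(ssq, r, k):
    """Corrected Akaike criterion for a least-squares polynomial of order k."""
    n = k + 1  # number of adjustable parameters
    return r * np.log(ssq / r) + 2 * n + 2 * n * (n + 1) / (r - n - 1)

def select_order(t, y, k_max=6):
    """Return (k, fitted values) for the order with the smallest AICc."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    r = len(y)
    # Assumption: rescale time to [-1, 1] to keep the fit well conditioned.
    x = t - t.mean()
    x = x / np.max(np.abs(x))
    best = None
    for k in range(1, k_max + 1):
        coeffs = np.polyfit(x, y, k)
        fitted = np.polyval(coeffs, x)
        # Assumption: floor SSQ so that log() stays finite for exact fits.
        ssq = max(float(np.sum((y - fitted) ** 2)), np.finfo(float).tiny)
        score = aicc(ssq, r, k)
        if best is None or score < best[0]:
            best = (score, k, fitted)
    return best[1], best[2]

t = np.arange(31.0)                       # one 31 min window, 1 min steps
y = t**2 + 1e-3 * np.sin(37.0 * t)        # curved signal plus small ripple
k, fitted = select_order(t, y)
print(k)
```

In a full filter run this selection is repeated for every data point, with the window centered on that point.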
Note that the polynomial is not a "perfect" model, as can be seen for the heavy precipitation event. However, the required information can be derived. For each data point i, $s_{\mathrm{res},i}$ and $s_{\mathrm{dat},i}$ are calculated:

$$s_{\mathrm{res},i} = \sqrt{\frac{1}{r} \sum_{j=1}^{r} \left(y_j - \hat{y}_j\right)^2} \tag{5}$$

and

$$s_{\mathrm{dat},i} = \sqrt{\frac{1}{r} \sum_{j=1}^{r} \left(y_j - \bar{y}\right)^2}, \tag{6}$$

where $y_j$, $\bar{y}$ and $\hat{y}_j$ are the measured data, the mean of the data within the time window and the fitted values, respectively. Considering the polynomial to be a good approximation for the system behavior, the value of $s_{\mathrm{res},i}$ is a measure for the noise, i.e., the accuracy of the measurements. This accuracy is not a single value and an intrinsic property of the used scales but also depends on the wind conditions and thus is time dependent. The quotient $B_i = s_{\mathrm{res},i}/s_{\mathrm{dat},i}$ is a measure of how much of the variation in the data is explained by the polynomial model and thus a measure of the signal strength. Note that $B_i = \sqrt{1 - R_i^2}$, where $R_i^2$ is the coefficient of determination. The values for $s_{\mathrm{res},i}$ and $R_i^2$ are also given in Fig. 4. Note that the polynomial regression is solely used to get information for data noise and signal strength. Other models, like splines with fixed or even variable knots, could be used as well to get the required information. We chose the polynomials because the parameters and thus the required information can be found by linear regression. This is especially important when the amount of data to be filtered is large. In this study we used approximately 2 × 10⁵ data points, meaning that with $k_{\max}$ = 6, approximately 1.2 × 10⁶ polynomial fits had to be conducted.
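Given the measured values in one window and the corresponding polynomial fit, the two statistics and their quotient can be computed as in the following sketch. The normalization by the number of data r in both standard deviations and the fallback B = 1 for a perfectly constant window are assumptions of this illustration; the function name `noise_and_signal` is hypothetical.

```python
import numpy as np

def noise_and_signal(y, y_fit):
    """Return (s_res, s_dat, B) for one time window."""
    y = np.asarray(y, dtype=float)
    y_fit = np.asarray(y_fit, dtype=float)
    r = len(y)
    s_res = np.sqrt(np.sum((y - y_fit) ** 2) / r)     # residual scatter (noise)
    s_dat = np.sqrt(np.sum((y - y.mean()) ** 2) / r)  # total scatter of the data
    # Assumption: a constant window carries no signal, so B is set to 1.
    b = s_res / s_dat if s_dat > 0 else 1.0
    return s_res, s_dat, b

s_res, s_dat, b = noise_and_signal([1.0, 2.0, 3.0, 4.0], [1.5, 1.5, 3.5, 3.5])
print(s_res, b)
```

Since both statistics share the same normalization, B² equals 1 − R² regardless of the chosen normalizing factor.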

Calculation of adaptive width of moving window
The window width at time step i, $w_i$ [min], in which the data are smoothed by the moving average is now a function of $B_i$ and is thus time dependent. We use a simple linear relationship for $w_i(B_i)$:

$$w_i(B_i) = \max\left(w_{\min},\, B_i\, w_{\max}\right), \tag{7}$$

where $w_{\min}$ and $w_{\max}$ are the minimum and maximum allowed window widths. Since $B_i$ has a value of 0 if the polynomial explains the complete data variation and a value of 1 if the polynomial explains nothing of the variation, the window width varies between $w_{\min}$ for evaporation and/or precipitation events with no noise and $w_{\max}$ for events with no evaporation or precipitation. Since $w_i$ must be an odd number, $w_i$ is rounded to the nearest odd integer. Figure 5 (left panel) illustrates the dependency of $w_i$ on $B_i$. We suggest using the temporal resolution of the measurements (1 min) for $w_{\min}$, so that for $B_i$ = 0 the data are not smoothed at all. Note that $w_{\max}$ is the time window in which the complete information for data point i is gained (see above). Table 1 shows the calculated values of $w_i$ for the depicted times of Fig. 4 with $w_{\max}$ = 31 min. A too low order of the polynomial (e.g., $k_{\max}$ = 1) would lead to larger window widths and thus to less accuracy for strong signals like the heavy precipitation event (not shown). As evaporation gives a relatively low signal with a maximum of approximately 0.015 mm min⁻¹ (van Bavel and Hillel, 1976), even little noise will lead to high values for B and thus to large window widths (Table 1).
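The mapping from signal strength to window width can be sketched as below, assuming the linear relation takes the form max($w_{\min}$, $B_i$ $w_{\max}$) with both limits being odd numbers of minutes. The function name `adaptive_window` and the particular rounding rule (ties go to the larger odd integer) are assumptions for illustration.

```python
import math

def adaptive_window(b, w_min=1, w_max=31):
    """Window width in minutes for signal-strength measure b in [0, 1]."""
    raw = max(w_min, b * w_max)
    # Round to the nearest odd integer (ties go to the larger odd value).
    w = 2 * math.floor(raw / 2) + 1
    # Keep the result inside the allowed range.
    return int(min(max(w, w_min), w_max))

for b in (0.0, 0.1, 0.5, 1.0):
    print(b, adaptive_window(b))
```

A window of 1 min leaves the data unsmoothed, which is the intended behavior when the polynomial explains essentially all of the variation.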

Calculation of adaptive threshold value
The dynamic impact of external mechanical disturbances on the accuracy of the system is taken into account by introducing a linear functional relationship between the threshold value and the 95 % confidence interval of the residuals:

$$\delta_i = \begin{cases} \delta_{\max} & \text{for } s_{\mathrm{res},i}\, t_{97.5,r} \ge \delta_{\max} \\ s_{\mathrm{res},i}\, t_{97.5,r} & \text{for } \delta_{\min} < s_{\mathrm{res},i}\, t_{97.5,r} < \delta_{\max} \\ \delta_{\min} & \text{for } s_{\mathrm{res},i}\, t_{97.5,r} \le \delta_{\min}, \end{cases} \tag{8}$$

where $\delta_{\min}$ and $\delta_{\max}$ are the minimum and maximum allowed accuracy for the fluxes and $t_{97.5,r}$ is the Student t value for the 95 % confidence level, meaning that 95 % of all data lie within the fitted polynomial ± $s_{\mathrm{res},i}\, t_{97.5,r}$. The threshold value, $\delta_i$, is minimal for low-noise conditions and maximal for high-noise conditions. Figure 5 (right panel) illustrates the dependency of δ on $s_{\mathrm{res}}$. The value for $\delta_{\min}$ is set slightly larger than the lowest scale resolution in the lysimeter system. In our case, $\delta_{\min}$ is set to 0.081 mm. The upper limit $\delta_{\max}$ is set to a value that is high enough to guarantee that changes due to noise are not interpreted as real signals. Table 1 shows the calculated values of $\delta_i$ for the depicted times of Fig. 4 with $\delta_{\max}$ = 0.24 mm, which is approximately 3 times $\delta_{\min}$.
In the typically applied filter routines (see above), there are two filter parameters that have to be defined before starting the filter, namely w and δ. In our new routine, $w_{\min}$ and $\delta_{\min}$ are given by the temporal resolution and the scale resolution. Again, only two parameters have to be defined, namely $w_{\max}$ and $\delta_{\max}$.
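The piecewise definition of the threshold amounts to clipping the scaled residual scatter to the allowed range, as the following sketch shows. The function name `adaptive_threshold` is hypothetical, and the Student t value is passed in as a plain number (roughly 2.04 for about 30 degrees of freedom) instead of being computed, which is a simplification made here.

```python
import numpy as np

def adaptive_threshold(s_res, t_value, delta_min=0.081, delta_max=0.24):
    """Threshold in mm: s_res * t_value, limited to [delta_min, delta_max]."""
    half_width = np.asarray(s_res, dtype=float) * t_value
    return np.clip(half_width, delta_min, delta_max)

# Low, intermediate and high noise levels (s_res in mm).
print(adaptive_threshold([0.01, 0.05, 0.5], t_value=2.04))
```

The defaults correspond to the values of $\delta_{\min}$ and $\delta_{\max}$ used for Table 1.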
In the following we will compare the performance of the new adaptive width and threshold filter (denoted as AWAT) to that of the MA and second-degree SG filters with fixed w and δ.

Results -test on data
The MA and SG filters were applied with three fixed window widths, namely 11, 31 and 61 min, and two threshold values, 0.081 and 0.24 mm. These values were also used as $w_{\max}$ and $\delta_{\max}$ for the AWAT filter. In summary, three filter routines with three window widths and two threshold values were applied, yielding a total of 18 variants.

Test of AWAT filter with variable w and fixed δ = 0.081 mm
In Fig. 6, the upper boundary fluxes of the three events are shown together with the applied filters. For all three filters, the threshold value was 0.081 mm and the window width was 11, 31 and 61 min.
In the case of a narrow window width of 11 min the smooth evaporation (left) and the heavy rainfall event (right) can be described reasonably well with the SG and MA filters. However, the data with strong wind (center) would be interpreted as a series of small evaporation and precipitation events. Since there was no precipitation on 23 September 2012 (detected with rain gauge), this is a misinterpretation and thus a wider window width is required. If the width is increased to 31 or 61 min, the data noise is reduced but still visible to some extent for that day. However, this noise reduction is done at the cost of the accuracy for the heavy rain event, where the narrow window is optimal. For the event with smooth evaporation, the window width has no significant impact on the results.
The SG filter does not smooth the heavy precipitation data as much as the MA filter does, but it tends to oscillate, which will lead to an overestimation of both precipitation and evapotranspiration. This oscillation behavior of SG filters was also reported by Bromba and Ziegler (1981).
Using the new AWAT filter leads to a better description of the data. Again, the smooth evaporation event is well described. Moreover, the heavy precipitation event is also very well described, with $w_{\max}$ being either 11 or 31 min. Even with $w_{\max}$ = 61 min, the data are described reasonably well. The strong wind event is better described by the AWAT filter than by the SG filter and equally well as by the MA filter. Thus, the noise for the strong wind event is greatly reduced but in none of the cases completely erased. It is obvious from the data that the measurement accuracy is worse than the scale accuracy in that time interval. Therefore, δ or $\delta_{\max}$ must be increased, as shall be discussed next.

Test of AWAT filter with variable w and δ
In Fig. 7, the threshold value for the MA and SG filters was now 0.24 mm, whereas for the AWAT filter, $\delta_i$ is given by Eq. (8), with $\delta_{\min}$ = 0.081 mm and $\delta_{\max}$ = 0.24 mm.
Increasing δ for the MA and SG filters leads to better filtering in the middle of the strong wind event, where δ = 0.24 mm might better represent the low measurement accuracy in that time interval. However, this large value is unsatisfactory for the beginning and the end of that day, when low noise and thus higher accuracy is observed. Moreover, with δ = 0.24 mm the smooth evaporation event is no longer well described. Thus, the quality increase in the middle of the strong wind event leads to an accuracy loss for the smooth evaporation event, where the measurement accuracy is actually better than 0.24 mm. Using a constant value of δ = 0.24 mm for the AWAT filter leads to the same disadvantages as for the MA and SG filters (not shown). For the heavy precipitation event, the higher value for δ does not significantly influence the results.
In contrast, the AWAT filter with variable $\delta_i$ leads to very good results if $w_{\max}$ = 31 min. Even in the case of $w_{\max}$ = 61 min, the new filter is well suited, although the data of the heavy precipitation event are now filtered slightly worse. Obviously the AWAT filter with variable window width and accuracy is better suited to separate evaporation and precipitation from noise than the MA and SG filters. In the following, this statement is underlined by an analysis of residuals.

Analyzing residuals
Figure 8 shows the frequency distributions of the residuals between filtered and measured data for the case with w and $w_{\max}$ = 31 min for the three filters. The blue bars show the residual distribution for filtering without threshold values. In that case the residuals are more or less symmetrically distributed with zero mean. However, as has been discussed above, omitting the threshold value would lead to an overestimation of both P and ET.
If a threshold value (red bars) of δ = $\delta_{\max}$ = 0.081 mm is introduced, all filters show a slight tendency towards negative residuals. Since a value of 0.081 mm for δ is too small (see above), a value of 0.24 mm is favored. Now the tendency of MA and SG filter towards negative residuals is strongly increased, whereas the increase is only slight for the AWAT filter. The mean of the residuals for the AWAT filter is −0.021, […] The tendency towards negative residuals for the filtered data when applying the threshold values is explained as follows: as long as a change from $t_{i-1}$ to $t_i$ is regarded as insignificant, the value for $t_{i-1}$ is kept (see Figs. 6 and 7). This leads to an underestimation, and thus to negative residuals for evaporation events, as well as to an overestimation, and thus positive residuals for precipitation events. In temperate climates, in which our data were measured, evaporation periods exceed periods with precipitation.

Comparison of estimated cumulative fluxes at upper boundary
The estimated cumulative evaporation for the time period […] In general, the influence of the magnitude of $\delta_{\max}$ on the estimated fluxes is only minimal for the AWAT filter (Fig. 10, left panel). From $\delta_{\max}$ = 0.081 to 0.24 mm, the estimated cumulative fluxes are reduced by ≈ 1.3 mm. For $\delta_{\max}$ > 0.24 mm, there is no influence on estimated cumulative fluxes anymore. This is different for the MA and SG filter, where the magnitude of δ has a drastic influence on the estimated evaporation and precipitation.
Varying w or $w_{\max}$ has a great influence on estimated fluxes for all three filters, with the highest fluxes being estimated for the smallest window widths. As expected, greater w or $w_{\max}$ lead to lower fluxes in the complete range of varied widths for the AWAT and the MA filters. The fluxes estimated with the SG filter can even increase as w increases. This might be due to the fact that the SG filter tends to oscillate depending on signal strength and w (see Figs. 6 and 7).

Summary and conclusions
A new filter routine for lysimeter data with adaptive averaging window width and threshold value was introduced. A test with benchmark events, including strong wind as well as smooth evaporation and heavy rainfall, showed that neither a simple moving average nor the more sophisticated Savitzky-Golay filter was able to meet all three events with high accuracy. In contrast, the new filter was able to meet the data of all three events very well. Thus, the new filter can greatly help to separate precipitation and evapotranspiration from noise with much better precision for different atmospheric conditions.
Although not perfectly matching the data, a moving polynomial was sufficient to yield the required information for window width and threshold value. The usage of spline functions with k knots might be more precise than a polynomial of kth order. However, such spline functions must be fitted by nonlinear regression, which would consume far more computer resources. This would particularly limit the procedure for large data sets. The suggested routine with polynomial regression requires approximately 30 s to 1 min on a regular personal computer for the analyzed time of approximately 140 days, including ≈ 2 × 10⁵ data points in 1 min resolution.
Using the Savitzky-Golay filter led to oscillation in the filtered output for the heavy precipitation event, resulting in an overestimation of both precipitation and evapotranspiration. As such events occur in most climates, it is not recommended to use the Savitzky-Golay filter for evaluating lysimeter data.
The SG and MA filters require two filter parameters, namely the window width w and the threshold value δ. The selected value for δ has a drastic influence on the estimated fluxes for the SG and MA filters. For the AWAT filter, the maximum threshold value, $\delta_{\max}$, had practically no influence if greater than 0.16 mm. Figures 6 and 7 show that $\delta_{\max}$ = 0.24 mm was a much better choice than $\delta_{\max}$ = 0.081 mm. Thus, it is concluded that $\delta_{\max}$ can be set to any reasonably high value. The value for w and $w_{\max}$ had great influence on the results for all three filters. Thus, if $\delta_{\max}$ is given a reasonably high value, only one filter parameter, $w_{\max}$, remains. Choosing $w_{\max}$ carefully through expert knowledge should result in high-quality filtering of lysimeter data with respect to precipitation and evapotranspiration estimations. For our benchmark events, including very different atmospheric conditions, $w_{\max}$ = 31 min led to the best results.
It is worth mentioning that noise caused by wind is not necessarily symmetric around the mean signal. Wind might lead to temporally different air pressures above the lysimeter compared to the lysimeter cellar, which in turn might lead to slightly systematic lower or higher values for lysimeter weights in such wind events. However, strong wind events do lead to greater noise, which leads to higher threshold values. In the strong wind event (Figs. 6 and 7), a systematic effect is barely visible, whereas the noise is very high. Lower wind speeds will lead to lower noise but also to lower systematic effects. Thus, a small systematic effect due to wind will not be accounted for in the analysis.
The new filter should be tested with other data sets and with artificial data (Schrader et al., 2013) to prove its general applicability and to figure out whether 31 min is a generally applicable maximum window width.

Fig. 1. Raw data for cumulative upper boundary flux of the lysimeter. The three subplots with zoomed data depict three different representative benchmark events (6 July, 21 August and 23 September 2012) that have to be met by the filter routine. Note that the time and flux intervals for the three cases are different.

Fig. 2. Cumulative upper boundary flux data on 23 September 2012 without filter (left panel), with moving-average (MA) filter (center panel) and with additional threshold value δ (right panel).Filter parameters were w = 31 min and δ = 0.081 mm.

Fig. 4. Polynomials fitted to raw data at selected times. Upper row: data from smooth evaporation event on 6 July 2012; middle row: data from strong wind event on 23 September 2012; lower row: data from heavy precipitation event on 21 August 2012. The chosen window width for the polynomial fit, $w_{\max}$, is 31 min. Note that for the smooth evaporation and strong wind events, only a small part of the complete day is shown.

Fig. 5. Schematic illustration of the dependencies of the averaging window width, w, on signal strength, B, (left panel) and the threshold value, δ, on fitting accuracy of the polynomial, $s_{\mathrm{res},i}\, t_{97.5,r}$ (right panel). See text for further explanations.

Fig. 6. Three benchmark events as depicted in Fig. 1 and the filter routines used with different window widths and threshold value δ = 0.081 mm. SG: Savitzky-Golay filter; MA: simple moving average; AWAT: new filter with adaptive window width and threshold value. In the case of AWAT, w ≡ $w_{\max}$ and δ ≡ $\delta_{\max}$. Note that the time and flux intervals are different for the three events.

Fig. 7. Same as Fig. 6 but with δ = 0.24 mm. SG: Savitzky-Golay filter; MA: simple moving average; AWAT: new filter with adaptive window width and threshold value. In the case of AWAT, w ≡ $w_{\max}$ and δ ≡ $\delta_{\max}$. Note that the time and flux intervals are different for the three events.

Fig. 8. Relative residual frequency distribution for the complete data set and the different filters with w = w max = 31 min.Blue bars indicate residuals between original and filtered data for the cases with mere smoothing, omitting the threshold values; red bars indicate cases with threshold values of either 0.081 (top panels) or 0.24 mm (bottom panels).The broad bars at plot edges comprise all residuals greater than 0.25 or smaller than −0.25 mm.

Fig. 9. Estimated cumulative evaporation for the period from 5 July to 7 October 2012 with two different values for δ or $\delta_{\max}$ and with w = $w_{\max}$ = 31 min.

Fig. 10. Estimated sum of evaporation and precipitation for the time from 5 July to 7 October 2012. Left panel: varied filter parameter was δ (MA and SG) or $\delta_{\max}$ (AWAT). Right panel: varied filter parameter was w (MA and SG) or $w_{\max}$ (AWAT).

Table 1. Calculated variables for the depicted times of Fig. 4. The letters refer to the subplots in Fig. 4.