A spatial neural fuzzy network for estimating pan evaporation at ungauged sites

Evaporation is an essential reference to the management of water resources. In this study, a hybrid model that integrates a spatial neural fuzzy network with the kringing method is developed to estimate pan evaporation at ungauged sites. The adaptive network-based fuzzy inference system (ANFIS) can extract the nonlinear relationship of observations, while kriging is an excellent geostatistical interpolator. Three-year daily data collected from nineteen meteorological stations covering the whole of Taiwan are used to train and test the constructed model. The pan evaporation (Epan) at ungauged sites can be obtained through summing up the outputs of the spatially weighted ANFIS and the residuals adjusted by kriging. Results indicate that the proposed AK model (hybriding ANFIS and kriging) can effectively improve the accuracy of Epan estimation as compared with that of empirical formula. This hybrid model demonstrates its reliability in estimating the spatial distribution of Epan and consequently provides precise Epan estimation by taking geographical features into consideration.


Introduction
Evaporation is one of the main elements that affect water storage and temperature in the hydrological cycle.An accurate estimation of evaporation is crucial for the management of agricultural irrigation, water balance and soil conservation.However, it is difficult to effectively simulate its variation due to the complex interactions between land and atmosphere systems.In previous hydrological applications, a number of direct and indirect methods were applied to the measurement and estimation of evaporation.One direct method was to use evaporation pans to accumulate the total amount of evaporation at a specific location during an observation period, while indirect methods such as mass transfer and empirical equations had difficulty in identifying the most suitable equation that best fits the wide range of data types.Those problems should be explored through better models that could satisfy inherently nonlinear systems.
Artificial neural networks (ANNs) are an adaptive system that adjusts its structure according to the given input-output patterns.The nonlinear characteristics of ANNs often serve as a viable tool for physical or stochastic models in various hydrological fields (Antar et al., 2006;Chen and Chang, 2009;Chiang and Chang, 2009;Chang et al., 2005;Kim and Barros, 2001).In recent years, the applications of ANN to evaporation estimation were presented in many studies (Trajkovic et al., 2003;Keskin and Terzi, 2006;Kisi, 2006;Kisi and Ozturk, 2007;Gonzalez-Camacho et al., 2008;Chang et al., 2010).In general, those applications of ANN-based models could only produce point estimates.Therefore, how to extend the applicability of ANN models from point estimation to spatial estimation would be an important scientific issue because ANNs are not distributed models but lumped models.Therefore, it will be of great help if ANNs can combine geostatistical methods such as kriging for estimating the spatial distribution of evaporation.In addition, Choudhury et al. (1994), Xu et al. (2006) and McVicar et al. (2007) had discussed the relationship between reference evapotranspiration (ET 0 ) and pan evaporation (E pan ) to obtain the pan coefficient (K pan ) in spatial interpolation issues.The aforementioned research provided evidences of the estimation and/or prediction of evaporation only at gauged sites using ANNs.In this study, a hybrid computational model is proposed by combining the ANN and kriging for estimating the evaporation at ungauged sites.Kriging has been recognized as a geostatistical tool for interpolating the value at a random location based on nearby information and has been applied to problems in hydrological fields (Schuurmans et al., 2007;Chen, 2008;Neuman and Jacobson, 1984;Legleiter, 2008).A number of studies have investigated the applicability of neural networks with geostatistics to environment, such as fallout (Kanevsky et al., 1996), temperature (Koike et al., 2001), etc.Nevertheless, all of the abovementioned studies merely performed the spatial estimations through two-dimensional coordinate (latitude and longitude).The spatial estimation of evaporation developed in this study was achieved by using three-dimensional information including latitude, longitude and elevation.Moreover we specifically take the meteorological variables related to evaporation for estimating the pan evaporation at ungauged sites by integrating kriging into ANN which never been investigated previously.
In this study, a novel approach that combines a spatial neural fuzzy network with kriging is implemented to improve the accuracy of evaporation estimation and to extend its capability for displaying the spatial distribution of evaporation.Evaporation was measured by Class A Pan (pan evaporation), and the collected data contained invariant information including geographical coordinates and elevation.Moreover, the relation between the ANN and kriging is that the ANN is assigned to extract knowledge only from changeable   parameters (evaporation and meteorological variables).The outputs of the spatially weighted ANN can then be combined with the residuals adjusted by simple kriging for producing evaporation estimation at ungauged sites.Another scientific issue raised in this study is that the hybrid model can reliably reflect the impacts of topographical features on evaporation.Section 2 presents the hybrid methodology that integrates the spatially weighted ANFIS and simple kriging method (hereinafter called the AK model).Section 3 addresses the study area, data sets, the model construction procedure and the empirical formula.Section 4 shows the results of the proposed hybrid model which are compared with those of the empirical equation.The contribution and findings in this study are given in Sect. 5. Finally, the conclusion of this study is drawn in Sect.6.

Methodologies
In the following sub-sections, the ANFIS, spatial weight method, kriging method and model construction process are introduced.

Adaptive network-based fuzzy inference system (ANFIS)
The adaptive network-based fuzzy inference system (AN-FIS), initially introduced by (Jang, 1993), has been successfully applied to many problems (Chang andChang, 2001, 2006;Chen et al., 2006;Shu and Ouarda, 2008).The ANFIS uses neural network learning algorithms coupled with fuzzy reasoning to map an input space onto an output space.The architecture of the constructed ANFIS model with a brief description is shown in Fig. 1.
The ANFIS is usually trained by a hybrid supervised learning algorithm to optimize both linear and nonlinear parameters.Furthermore, it is common to adopt the subtractive fuzzy clustering algorithm when establishing the fuzzy rules between input and output variables.The subtractive fuzzy clustering algorithm determines the minimum number of rules to discriminate the fuzzy quality associated with each cluster.Therefore, this algorithm determines the fuzzy rules of the constructed ANFIS model in this study.

Spatial weight method
Owing to great uncertainties in spatial distribution, evaporation at different locations is usually affected by various factors.Daly ( 2006) had an overview of commonly used interpolation techniques.The inverse distance weighting method is introduced to integrate the results given by the point estimator (ANFIS), and then the evaporation at specific ungauged sites of interest will be computed in this case study.The implementation procedure of this method is described as follows: 1. identify the TWD67 (Taiwan Datum 1967) coordinates (x and y) and elevation (z) of two sites (gauged and ungauged sites) and calculate the distance d between the two sites using Eq. ( 1), where subscripts o and i represent gauged and ungauged sites, respectively.Because elevation is a key factor for evaporation and the elevation difference (less than 2000 m) is much smaller than the distance (less than 300 km) between gauged and ungauged sites.Based on a great number of trial-anderror process, the emphatic weight is set as 1000, which can also refer to Hutchinson (1995) that indicated that the weighted values of elevation and distance should be approximately equal.
2. Define the inverse square ratio f as Eq. ( 2) 3. Compute the weight w by calculating the distance between every E pan and every grid cell of the ungauged site as Eq. ( 3): (3)   4. Calculate the evaporation E at every grid cell of the ungauged site based on the data from n gauged sites (n = 16 in this case) by using ANFIS output E instead of measurements.

Kriging method
Kriging, synonymous with optimal prediction (Journel and Huijbregts, 1981), is an interpolation method that uses a variogram to express the spatial variation and predicts unknown values from data observed at known locations, and therefore is applied to analyzing the spatial information extracted from the residuals by ANFIS.Compared with the other kriging methods, the simple kriging method is mathematically the simplest.The practical assumptions for the simple kriging are a known constant trend, second-order stationary and a covariance function.

Model construction
The hybrid AK model integrates the spatially weighted AN-FIS and kriging method.The construction procedure of the AK model is described below (see Fig. 2).
(a) Determining the optimal ANFIS structure To determine a suitable structure for the ANFIS model,  a number of network combinations with different numbers of rule nodes are calibrated based on training datasets, and then the optimal ANFIS structure is determined based on validation datasets (or "in the validation phase").

(b) Generating the residual covariance
To get the bias produced by the ANFIS outputs in spatial distribution, residuals are calculated based on the differences between the ANFIS outputs and the observations.The spatial coordinates corresponding to each residual are given to develop the covariance.
(c) Developing the residual map The residual map is developed by considering covariance functions, residuals, spatial coordinates and elevation.As a result, the residuals ∧ Z(x0) at ungauged sites can be interpolated according to spatial positions and altitudes.
(d) Computing the evaporation at ungauged sites First, the ANFIS outputs at gauged sites are taken into account to compute the evaporation at ungauged sites using the spatial weight method mentioned above.Second, the kriging map can provide the residual with its spatial coordinates and elevation.Finally, the spatial distribution E sp of evaporation at any ungauged site can be obtained by summing up the estimated evaporation E and residual ∧ Z(x0), shown as Eq. ( 5).

Study area and datasets
Taiwan, an island situated in East Asia within the subtropical monsoon zone of northwestern Pacific Ocean, has an area of 35 801 km 2 and is known for its variable climate.The observation data were collected from 19 meteorological stations of Taiwan Central Weather Bureau.In general, each weather station in Taiwan has installed a piston mercury barometer, a sheathed thermometer, a propeller anemometer, a tipping-bucket rain gauge, a class A pan, a hair hygrometer, a pyranometer, a solar-cell sunshine recorder and a psychrometer for measuring pressure, temperature, wind speed, rainfall, pan evaporation, humidity, global solar radiation, sunshine hour and humidity, respectively.Figure 3 shows the locations of the meteorological stations in study area.Nevertheless, the data collected at each station were incomplete and/or inconsistent, and therefore data in the same time scales will be arranged for spatial interpolation.Tables 1 and 2 show the statistics of meteorological variables and evaporation in training, validation and test subsets.The properties of all variables are similar in three subsets, which indicate the data structures of these three independent subsets for model construction have the same characteristics.Each of the first three variables (sunshine hour, radiation and temperature; see Table 1) that are highly related to evaporation has a correlation coefficient higher than 0.6.However, dew point temperature, humidity and wind speed have been frequently suggested in the evaporation estimation literature, and therefore are adopted in this case study.Figure 4 displays the spatial distribution of the measured evaporation in the aspect of average and standard deviation.It is obvious that the evaporation is less in northern Taiwan than in southern Taiwan; however the variance of evaporation is even larger in northern Taiwan than in central and southern Taiwan, which indicates the climate and weather conditions are more complicated and more variable in northern Taiwan.This is another reason for selecting station Nos.17-19 (located in northern, central and southern Taiwan, respectively) as test targets.In other words, the capability of model generalization can be identified when the constructed estimation model for evaporation performs well in the testing phase.Data arranged in training and validation datasets are used for building the model structure and optimizing the corresponding parameters, and the test data are used for performance comparison between the ANFIS-based model and the empirical formula when estimating the spatial distribution of evaporation.

Achievement comparison
The Penman equation is a popular empirical formula for evaporation estimation, and its original version that described evaporation from an open water surface was developed by Penman (1948).Donohue et al. (2010)   Penman formula gives the most reasonable result on the estimation of potential evaporation among five commonly used empirical formulas.Numerous Penman equations have been used to assess evaporation from water and lands.Specifically, the Food and Agricultural Organization of the United Nations (FAO) suggests using the definition of the reference evapotranspiration from Smith et al. (1997), and the FAO also recommends Penman-Monteith equation as a standard method for computing daily reference evapotranspirationcrop (Allen et al., 1998;Chang et al., 2010).Moreover, Xu and Singh (1998) indicated that the monthly values of the Penman method agree most closely with the pan evaporation values.The Penman-Monteith (PM) empirical formula as shown in Eq. ( 6) is therefore compared with the AK model in this study.The PM formula considers many physical conditions such as crop canopy resistance and aerodynamic resistance to replace the original wind speed function.To have the same comparison standard, Yeh et al. ( 2008) is referred, which analyzed reference evapotranspiration calculated by Penman-Monteith formula and pan evaporation to determine the pan coefficient (K pan ) based on 15-yr (1990-2004) meteorological data in Taiwan.The mean of 15-yr K pan in station Nos.17-19 are 0.82, 0.9 and 0.82, respectively, and the K pan equation is shown in Eq. ( 7).

indicated the
where ET 0 is the reference evapotranspiration (mm day −1 ), R n is the global solar radiation (MJ m −2 day −1 ), G is the soil heat flux density (MJ m −2 day −1 ), T is the mean daily air temperature at a height of 2 m above ground ( • C), u is the wind speed at a height of 2 m above ground (m s −1 ), (e s −e a ) is the deficit of saturation vapor pressure (kPa), is the slope vapor pressure curve (kPa • C −1 ), γ is the psychrometric constant (kPa • C −1 ), and E pan is pan evaporation (mm day −1 ).

Comparative performance
To compare and assess the performance and reliability between the AK model and the Penman-Monteith formula, the Nash-Sutcliffe model efficiency coefficient (CE) and rootmean-square error (RMSE) are used as performance criteria.CE (Eq.8) ranges from −∞ to 1, a CE of 1 means an excellent fit between measurements and model outputs.RMSE (Eq.9) is a quantitative statistics adopted to measure how close the model simulations are to the observations and to assess the total error of model outputs.
where o i , e i and o are the observed value, the predicted value and the average of observed values, respectively, and n is the number of data.

Results
To demonstrate the effectiveness of the AK model, the evaporation at ungauged sites is adopted as the target to test the reliability and stability of the constructed AK model.The number of fuzzy rules arranged in ANFIS is affected by the radius of the cluster, and thus the determination of the radius is a crucial step when constructing an appropriate AN-FIS structure.The error variations are associated with the radiuses ranging from 0.35 to 0.6 with an increment of 0.05 each time, and therefore 11, 8, 7, 7, 5 and 4 fuzzy rules are produced for corresponding radiuses, respectively.The criterion adopted to determine a radius is based on RMSE, which is a useful measure to illustrate the predictive capability of a model.In general, the training error gradually increases as the radius increases in the training phase, and the minimum validation error will be obtained when the radius equals 0.5.
Therefore, the optimal radius selected to build the best AN-FIS structure is 0.5, which then produces seven fuzzy rules.The fitted model is developed by the exponential function in Eq. ( 10).The parameters of the exponential function are set as c 1 = 0.28 and a = 60 000, and r is the spatial variable.The spatial effect between two sites occurs when the distance of the two sites is less than 40 000 m (Fig. 5).
Figure 6 shows the spatial distribution of the averages of residuals obtained from the differences between the ANFIS outputs and the observations by kriging, where kriging interpolates values onto the residual map based on the differences between the ANFIS outputs and the observations.The average of all residuals from the residual map is computed.First, the higher variation of residuals that occurs in northern Taiwan is due to a denser deployment of stations (7 sites) and the very complex terrain in this region.Second, it is obvious that the residual map follows a uniform distribution in central Taiwan because fewer stations are set up in the Central Mountain Range (see Fig. 3).Finally, the spatial residual information is helpful when trying to realize the error pattern of ANFIS outputs, thus reducing the spatial errors when estimation is required at ungauged sites.
Results displayed in Table 3 below indicate that performance obtained in the testing phase is comparable to that of the training and validation phases, demonstrating the generalization was well achieved by the proposed model.
Station Nos.17-19, excluded from training and validation datasets, are assumed to be ungauged sites, and an AK model is constructed to accurately estimate evaporation at these locations.Table 4 illustrates the test performance of the AK model and the empirical formula for evaporation estimation at three assumed ungauged sites.According to Table 4, several conclusions can be drawn: (1) the AK model effectively generates more precise and more consistent evaporation estimation than the PM model in terms of smaller RMSE and higher CE; (2) as far as higher evaporation is concerned, the AK model produces much better performance (smaller RMSE) than the PM model; (3) for station No. 18, the AK model highly improves the accuracy of evaporation estimation over the PM model because the CE value in PM is negative (−0.32).The reason for a smaller CE value is mainly because that station No. 18 is located in a mountainous area with an elevation of about 1000 m where the weather type, geographical features and temperature are quite different from those of the other two ungauged sites.
Figure 7 further shows the variation and tendency of evaporation at station Nos.17-19 with daily evaporation accumulated in a seasonal scale.It is clear that the variation of evaporation measured at station No. 18 is relatively flat and small as compared with that of the other two stations.In addition, the maximum seasonal evaporation observed at station No. 18 (191 mm) is much smaller than that at stations No. 17  (331 mm) and No. 19 (323 mm).This is the reason why both models overestimate evaporation at station No. 18 and therefore results in smaller CE values.However, results displayed in Table 4 and Fig. 7 demonstrate that the AK model provides more accurate evaporation estimation at ungauged sites and the improvement rates in terms of RMSE are 26 %, 18 % and 5 % in northern, central and southern Taiwan, respectively, which shows that the kriging method actually improves the capability of ANFIS in dealing with the targets related to spatial distribution.
Table 5 illustrates the test performance of the AK and the PM models in daily and seasonal scales.Because most of the daily evaporation observations are less than 7 mm, the statistics calculated from daily estimations are not conspicuous.Therefore, these statistics are evaluated mainly in seasonal scales by accumulating daily estimations.As far as the seasonal scales are concerned, it is obvious that the accuracy and efficiency of the AK model is much better than that of the empirical formula, indicating that the AK model is capable of producing more stable and reliable evaporation estimation.Figure 8 shows the scatter plots of AK estimations versus observations in the testing phase in daily and seasonal scales, respectively.Results demonstrate that the AK model fits the ideal line very well with slight underestimation only in the vicinity of high values.

Discussion
Results obtained in this study demonstrate that the AK model not only provides precise estimation of pan evaporation but also enhances the applicability of ANNs to spatialrelated computation.The major difference between the AK model and the single ANN model is that the AK model is capable of estimating the spatial distribution of pan evaporation at ungauged sites without using their meteorological measurements, whereas the single ANN model is not able to provide estimations at ungauged sites where meteorological measurements are not available.The proposed hybrid AK model extends the estimation of pan evaporation from point scales to spatial scales.Therefore, the spatial distribution of the estimated pan evaporation on a seasonal scale is achieved by implementing the AK method.The study area is first divided into 80 000 grids (400 × 200 with a resolution of 1 km × 1 km); then the seasonal evaporation during January 2007-December 2008 is computed by accumulating the daily evaporation for each grid accordingly.Consequently, two-year averages of seasonal evaporation for all the grids are produced (Fig. 9).It is obvious that evaporation is higher and more complex in summer than in the other seasons.The distribution of evaporation is also more diversified in northern Taiwan than in southern Taiwan.Nevertheless, evaporation is lower in northern Taiwan than in southern Taiwan no matter which season it is.Northern Taiwan with mountainous terrains belongs to subtropical zones, whereas southern Taiwan with plain topography belongs to tropical zones.In addition, it is worth noting that evaporation is much less in central Taiwan than in the other places, which is because most areas in this region are located within the Central Mountain Range (high elevation area, see Fig. 3).This study result can reflect the geographical features of Taiwan.Overall, the efficiency and precision of the proposed AK model outperforms that of the traditional method.The AK model also demonstrates its capability for producing accurate and reliable evaporation estimation and for introducing spatial relation to enhance the ANFIS when estimating evaporation at ungauged sites, and therefore is a helpful technique for hydrological applications such as precipitation estimation at ungauged sites.In sum, the contribution of this study greatly enhances the ANN-based model for displaying the spatial distribution of pan evaporation, for reflecting the impacts of geographical features on the estimation of pan evaporation, and for estimating pan evaporation at ungauged sites, which are difficult to be achieved by a single ANN model.

Conclusions
A hybrid AK model, which integrates a spatially weighted ANFIS with kriging, is developed for accurately estimating pan evaporation at ungauged sites.Results obtained from the AK model are compared with those from the Penman Monteith empirical formula when estimating evaporation at ungauged sites.It is important to note that the AK model estimates evaporation without using meteorological variables at ungauged sites, whereas the PM model uses meteorological variables directly.The role of ANFIS in the AK model is to estimate evaporation at gauged sites and extend its estimations to ungauged sites through the spatial weight method; whereas the use of kriging is to adjust the spatial error of ANFIS outputs.Once the AK model is well developed and trained, the operation of the AK model merely requires coordinates and elevation data at ungauged sites and coordinates, elevation data and the meterological variables at gauged sites in practice.In addition, the daily estimation of evaporation in the testing phase obtained from the AK model provides an RMSE of 1.09 mm day −1 , whereas the estimation accuracy of the PM model only achieves 1.34 mm day −1 .In summary, the ANFIS provides more accurate point estimation of evaporation at all stations than the Penman Monteith empirical formula; and the AK model not only significantly reduces the estimation errors but provides robust and precise temporal and spatial distribution of evaporation by integrating kriging into the spatially weighted ANFIS, which can be effectively extended to cover the whole of Taiwan in four seasons.We conclude that the proposed hybrid AK model demonstrates good reliability when estimating evaporation at both gauged and ungauged sites and is an effective and efficient method for use in related fields that involve temporal and spatial relationship management.

Fig. 1 .
Fig. 1.Architecture of an ANFIS model with seven fuzzy rules for evaporation estimation.W, S, R, T, H and D are wind speed, sunshine hour, radiation, temperature, humidity and dew point temperature, respectively.

Fig. 1 .
Fig. 1.Architecture of an ANFIS model with seven fuzzy rules for evaporation estimation.W , S, R, T , H and D are wind speed, sunshine hour, radiation, temperature, humidity and dew point temperature, respectively. 36

Fig. 2 .
Fig. 2. Flowchart of the constructed AK model for estimating the spatial distribution of evaporation.

Fig. 2 .
Fig. 2. Flowchart of the constructed AK model for estimating the spatial distribution of evaporation. 37

Fig. 4 .
Fig. 4. Statistical features of the spatial distribution of daily pan evaporation in 526 Taiwan: (a) average of evaporation; and (b) standard deviation.527 528 529 Fig. 4. Statistical features of the spatial distribution of daily pan evaporation in Taiwan: (a) average of evaporation; and (b) standard deviation.

Fig. 5 .
Fig. 5. Covariance values obtained from the fitted and the experimental models based on the residuals from ANFIS.Fig. 5. Covariance values obtained from the fitted and the experimental models based on the residuals from ANFIS.

Fig. 6 .
Fig. 6.Residual map of the averages of daily evaporation residuals.

Fig. 8 .
Fig. 8. Scatter plots of AK estimation versus observations in testing phase: (a) in daily 537 scale; and (b) in seasonal scale.538

Fig. 8 .
Fig. 8. Scatter plots of AK estimation versus observations in testing phase: (a) in daily scale; and (b) in seasonal scale.

Table 1 .
Statistics of meteorological variables in training, validation and test subsets.

Table 2 .
Statistics of evaporation in training, validation and test subsets.

Table 3 .
Performance of the ANFIS model in the training, validation and testing subsets.

Table 4 .
Performance of the AK and PM models at individual meteorological station.
532 Fig. 6.Residual map of the averages of daily evaporation residuals.533

Table 5 .
Test performance of the AK and PM models in daily, monthly and seasonal scales.