Comparison of MODIS and SWAT evapotranspiration over a complex terrain at different spatial scales

In most hydrological systems, evapotranspiration (ET) and precipitation are the largest components of the water balance, yet they are difficult to estimate, particularly over complex terrain. In recent decades, the advent of ET algorithms based on remotely sensed data and of distributed hydrological models has provided improved spatially upscaled ET estimates. However, information on the performance of these methods at various spatial scales is limited. This study compares the ET from the MODIS remotely sensed ET dataset (MOD16) with the ET estimates from a SWAT hydrological model on graduated spatial scales for the complex terrain of the Sixth Creek Catchment of the Western Mount Lofty Ranges, South Australia. ET from both models was further compared with the coarser-resolution AWRA-L model at catchment scale. The SWAT model analyses are performed on daily timescales with a 6-year calibration period (2000–2005) and 7-year validation period (2007–2013). Differences in ET estimation between the SWAT and MOD16 methods of up to 31, 19, 15, 11 and 9 % were observed at 1, 4, 9, 16 and 25 km² spatial resolutions respectively. Based on the results of the study, a spatial scale of confidence of 4 km² for catchment-scale evapotranspiration is suggested in complex terrain. Land cover differences, HRU parameterisation in AWRA-L and catchment-scale averaging of input climate data in the SWAT semi-distributed model were identified as the principal sources of weaker correlations at higher spatial resolution.


Introduction
In most hydrological systems, evapotranspiration (ET) and precipitation are the largest components of the water balance (Nachabe et al., 2005) and yet the most difficult to estimate, particularly over complex terrain (Wilson and Guan, 2004). In arid and semi-arid environments ET is a significant sink of groundwater, with ET often exceeding precipitation (Domingo et al., 2001; Cooper et al., 2006; Scott et al., 2008; Raz-Yaseef et al., 2012). Reliable estimation of ET is integral to environmental sustainability, conservation, biodiversity and effective water resource management (Cooper et al., 2006; Boé and Terray, 2008; B. Zhang et al., 2008; Tabari et al., 2013). Moreover, ET will be one of the most severely impacted hydrological components of the water cycle alongside precipitation and runoff as a consequence of global climate change (Abtew and Melesse, 2013).
Reliable, cheap and generally accessible methods of estimating ET are essential to understand its role in catchment processes. ET is principally measured and estimated using ground-based measurement tools and/or through various modelling techniques, often involving remote sensing (Drexler et al., 2004; Tabari et al., 2013). Ground-based measurement methods such as the Bowen Ratio Energy Balance (BREB), Eddy Covariance (EC), Large Aperture Scintillometers (LAS) and lysimeters have been regarded as the most accurate and reliable ET determination methods (Kim et al., 2012a; Rana and Katerji, 2000; Liu et al., 2013), but they are spatially and/or temporally limited (Wilson et al., 2001; Glenn et al., 2007). Despite the relative reliability of ground-based measurement methods, there are inherent uncertainties associated with the different methods, which affect the accuracy of ET measurements (Baldocchi, 2003; Brotzge and Crawford, 2003; Drexler et al., 2004; B. Zhang et al., 2008). Ground-based measurement methods are particularly prone to significant errors related to instrument installation (Allen et al., 2011). Mu et al. (2011) observed that multiple EC towers on a site can have uncertainties ranging between 10 and 30 %, and Liu et al. (2013) documented uncertainty ranges of over 27 % between EC and LAS measurements over the same site on an annual scale. EC towers have also been observed to encounter energy balance closure challenges (Wilson et al., 2002), while other challenges of the EC method such as inaccuracies due to complex terrains have been documented by Feigenwinter et al. (2008). Furthermore, Kalma et al. (2008) conducted a review of 30 remote sensing ET modelling results relative to ground-based measurements and contended that the ground-based measurement methods were not incontrovertibly more reliable than the remote sensing ET modelling methods. Moreover, most of the ground-based measurement methods are costly, which constrains measurements over large areas and makes spatial extrapolation difficult (Moran and Jackson, 1991; Verstraeten et al., 2008; Melesse et al., 2009; Fernandes et al., 2012).
In more recent years, the spatial challenges associated with ET estimation have been eased by the increased availability of remotely sensed data. The use of remotely sensed input data in many surface energy balance algorithms and highly parameterised hydrological models has been extensively documented (Kalma et al., 2008; Hu et al., 2015; Zhang et al., 2016). The advances in remote sensing have seen these methods become prominent in water resource assessment studies (Sun et al., 2009; Vinukollu et al., 2011; Anderson et al., 2011; Long et al., 2014; Zhang et al., 2016).
Several hydrological models and remote-sensing-based surface energy balance models are currently used in ET simulations globally (Zhao et al., 2013; Chen et al., 2014; Larsen et al., 2016; López López et al., 2016; Webster et al., 2017). However, the accuracy of these models relative to one another should be extensively explored to improve our understanding of the ET estimates from these algorithms. Two of the more prominent ones will be comprehensively evaluated in this study at various spatial scales: the Soil and Water Assessment Tool (SWAT) (Neitsch et al., 2011) and the MODIS ET product (Mu et al., 2013), which is derived from remotely sensed data from the Moderate Resolution Imaging Spectroradiometer (MODIS) instrument aboard the National Aeronautics and Space Administration (NASA) Aqua and Terra satellites. The evapotranspiration product of a third model with a coarser resolution, the Australian Water Resource Assessment model (AWRA-L), will also be evaluated at the catchment scale.
The MODIS ET (MOD16) is based on the Penman-Monteith equation, the AWRA-L uses the Penman equation, while the SWAT ET algorithm also has the Penman-Monteith equation as one of the three user-selectable methods of estimating ET. In this study, the Penman-Monteith method in SWAT is used for a direct comparison with the MOD16 and the AWRA-L. Moreover, the Penman-Monteith equation is regarded as one of the most reliable methods for ET estimation over various climates and regions (Allen et al., 2005, 2006). While both the MOD16 and SWAT ET use the Penman-Monteith equation, the methods for estimating the parameters of the equation are significantly different between them. For instance, the SWAT Penman-Monteith implementation requires wind speed data for the computation of the aerodynamic resistance, while the MOD16 Penman-Monteith variant does not require wind speed data, but instead uses the Biome-BGC model (Thornton, 1998) to estimate the aerodynamic resistance. This study does not seek to evaluate the individual accuracy of any method, but rather to compare the ET results from the water-balance-based hydrological models AWRA-L and SWAT and the energy-balance-based model (MOD16) over a complex terrain catchment. Two different land cover products are used in the SWAT model in this study (the Geoscience Australia and MODIS land cover products). The rationale for this is to analyse the effect of land cover on the ET modelling in SWAT; the use of the MODIS land cover also allows for a direct comparison with the MOD16, which uses the same land cover product. The results will be compared temporally on a catchment scale and spatio-temporally on sub-catchment scales to identify the effects of input data and other drivers of ET estimation in the MOD16 and SWAT ET algorithms.
While the MODIS evapotranspiration has been widely studied and compared to other methods, this is much less the case for SWAT ET (Table 1) and the AWRA-L. Moreover, a graduated spatial-scale comparison of the SWAT and MOD16 ET products is yet to be documented over a complex terrain. The objectives of this study are therefore (1) to simulate and compare the results of the evapotranspiration of SWAT, AWRA-L and MOD16 over a complex terrain at a catchment scale in a semi-arid climate; and (2) to analyse and determine the spatial scale at which the SWAT and MOD16 ET models tend towards agreement to enhance the confidence in ET estimation in a complex terrain.

SWAT model
The Soil and Water Assessment Tool (SWAT) is a physically based, semi-distributed hydrological model designed on the water balance concept. SWAT simulates catchment processes such as evapotranspiration, runoff, crop growth, nutrient and sediment transport on the basis of meteorological, soil, land cover data and operational land management practices (Neitsch et al., 2011). The SWAT model has been used in hydrological modelling from sub-catchment scales of under 1 km² (Govender and Everson, 2005) to subcontinental scales (Schuol et al., 2008). The model discretises a catchment into sub-catchments and further into hydrological response units (HRUs), which represent unique combinations of land cover, soil type and slope. The discretisation method employed by SWAT enables the model to simulate catchment processes in detail and to understand the response of unique HRUs to hydrological processes. Evapotranspiration is simulated at the HRU scale. A comprehensive outline of ET calculations in SWAT is included in Appendix A, and Fig. 1 summarises the SWAT ET algorithm in a flowchart, where PET is the potential evapotranspiration, $E_{can}$ is the evaporation from the canopy surface, $E_t$ is the transpiration, $E_{soil}$ is the evaporation from the soil and Revap is the amount of water transferred from the underlying shallow aquifer to the unsaturated zone in response to water demand for evapotranspiration.
Hydrol. Earth Syst. Sci., 22, 2775–2794, 2018, www.hydrol-earth-syst-sci.net/22/2775/2018/

MOD16 model
The MOD16 provides evapotranspiration estimates for 109.03 × 10⁶ km² of global vegetated land area at 1 km² spatial resolution and at 8-day, monthly and yearly temporal resolutions since the year 2000 (Mu et al., 2013). The initial version of the MOD16 algorithm used MODIS imagery as part of a Penman-Monteith method as described in Cleugh et al. (2007). The MOD16 algorithm was significantly improved by the inclusion of a sub-algorithm for estimating soil evaporation as a component of total ET (Mu et al., 2007). Further improvements to the MOD16 algorithm, such as the calculation and inclusion of nighttime evapotranspiration and the partitioning of evaporation from moist and wet soils, were incorporated into the new algorithm (Mu et al., 2011). In this study, the ET products from the new algorithm are used. Details of ET calculations in MOD16 are included in Appendix B.

AWRA-L model
The  that the shallow-rooted vegetation HRU can only access the first two soil storage layers, while the deep-rooted vegetation HRU can access the three layers. The AWRA-L model simulates catchment hydrological processes such as evapotranspiration, infiltration, runoff, drainage, interflow, and recharge. Evapotranspiration in the AWRA-L is a sum of six processes: canopy evaporation from intercepted precipitation, evaporation from the soil surface, groundwater evaporation, shallow storage transpiration, deep storage transpiration and groundwater transpiration. The evaporation in the model is constrained by the Penman equation (Penman, 1948). For a detailed structure of the AWRA-L model, see Viney et al. (2014).

Penman-Monteith algorithm parameterisation
Because the MOD16 and SWAT ET algorithms are both based on the Penman-Monteith equation but are parameterised differently, both similarities and differences can be expected in their results. Both algorithms are principally limited over time by the energy available to convert liquid water to atmospheric water vapour. Their transpiration and soil evaporation algorithms are also strongly dependent on vegetation/biome type, vapour pressure deficit (VPD) and the soil moisture constraint parameterisation (Fig. 3).
In the SWAT ET algorithm, the VPD significantly impacts the transpiration by constraining the stomatal conductance. Detailed soil data at the HRU scale, such as layer depth, number of layers, unsaturated hydraulic conductivity and water capacity, are crucial for constraining the soil moisture content, which in turn regulates the percolation and recharge into the system. Similarly, the calculated MOD16 ET is significantly impacted by the biome properties lookup table (BPLUT) and the soil moisture constraint function. The BPLUT was calibrated using the response of biomes at flux tower sites globally. The BPLUT contains information on the stomatal response of each biome to temperature, VPD and biophysical parameters. The soil moisture constraint function is applied in the estimation of the soil evaporation and is an important parameter in regions where the saturated zone is close to the ground surface, such as our study area.
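For orientation, the general form of the Penman-Monteith equation that both algorithms build on can be written as below. This is the standard textbook form, not the exact expression used in either SWAT or MOD16 (those are detailed in Appendices A and B), and the symbol names are the conventional ones rather than notation taken from this paper.

```latex
\lambda E = \frac{\Delta \,(R_n - G) + \rho_a c_p \,\dfrac{e_s - e_a}{r_a}}
                 {\Delta + \gamma \left(1 + \dfrac{r_s}{r_a}\right)}
```

Here $\lambda E$ is the latent heat flux, $\Delta$ is the slope of the saturation vapour pressure curve, $R_n$ is the net radiation, $G$ is the ground heat flux, $\rho_a$ is the air density, $c_p$ is the specific heat of air, $e_s - e_a$ is the VPD, $\gamma$ is the psychrometric constant, and $r_a$ and $r_s$ are the aerodynamic and surface resistances. The two algorithms differ mainly in how $r_a$ and $r_s$ are obtained, which is where the wind speed data (SWAT) and the BPLUT (MOD16) enter.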
Data and methods

Study area
The study area is the Sixth Creek Catchment of South Australia, located in the western part of the Mount Lofty Ranges, which is a range of highlands separating the Adelaide Plains in the west from the Murray-Darling basin in the east. The western part of the Mount Lofty Ranges runs 90 km north to south; its summit is at 680 m AHD (metres Australian Height Datum) (Sinclair, 1980). It extends from the southernmost part at McLaren Vale on the Fleurieu Peninsula to Freeling in the north over an area of 2189 km². The Sixth Creek Catchment is a complex area, with acute elevation changes over a few hundred metres (Fig. 4). The catchment is located close to the summit of the Western Mount Lofty Ranges.
It covers an area of 44 km² between 34°52′6.098″ and 34°57′54.541″ S and 138°42′55.855″ and 138°49′27.174″ E and has an elevation range of 140-625 m AHD (Fig. 4). The land cover consists of 95 % forestland with significant deep-rooted eucalyptus plantation and 5 % pasture, shrubs and grasslands (Fig. 5b). Most of the native vegetation is under conservation. The climate is Mediterranean, with warm dry summers and cool wet winters, and is of the type "Csb" according to the Köppen-Geiger classification. The Sixth Creek is a perennial stream with mean annual discharge of 0.25 m³ s⁻¹ which accounts for 20-25 % of the mean annual  (Gallant et al., 2011).
The Sixth Creek Catchment's complex terrain plays a significant role in its hydrology, with highly localised precipitation events recorded from the two weather stations in the catchment within the study period. The weather stations are located 4.5 km apart, with an elevation difference of over 200 m (Fig. 4). Differences in annual rainfall of over 400 mm have been recorded between the two weather stations.
The annual precipitation for the period 2002 to 2016 ranges between 500 and 900 mm for Station A and between 750 and 1500 mm for Station B, while the temperature ranges between 10.5 and 22.2 °C in the summer months and between 3.4 and 10 °C in the winter months.

Input datasets
The GIS-interfaced version of SWAT (ArcSWAT) was used in the hydrological modelling. A 30 m Digital Elevation Model (DEM) (Dowling et al., 2011) of the Sixth Creek Catchment was used to extract the stream network and the catchment area. A detailed soil properties database for the catchment was created from the soil data obtained from the Australian Soil Resource Information System (Johnston et al., 2003). The 250 m land cover map of Australia from Geoscience Australia's Dynamic Land Cover database (Fig. 5b) is preferred in the SWAT model over the 500 m MOD12 land cover map (Fig. 5a) due to its finer spatial resolution and better biome match with local field knowledge; however, for direct comparison with MOD16, both maps are used to run separate SWAT models. In this study, the 0.01° × 0.01° wind speed data (McVicar et al., 2008) and the 0.05° × 0.05° relative humidity, temperature, rainfall and solar radiation data (Jeffrey et al., 2001) were preferred to weather station data. Four 0.05° × 0.05° gridded data cells fall within the boundaries of the catchment and are therefore comparable to the climate components of the two weather stations in the catchment. Moreover, the gridded data used in this study are calibrated using the weather stations across Australia, including the two weather stations in the Sixth Creek Catchment, thus maintaining excellent correlation when compared to the weather stations' measured data. Details of the gridded data methodology and algorithm used in this study can be found in Jeffrey et al. (2001) and McVicar et al. (2008). The daily gridded climate datasets were simply averaged over the Sixth Creek Catchment to obtain the values used in this study.
The monthly MOD16 datasets for the years 2000 to 2013, at 1 km² spatial resolution, were used in this study (Mu et al., 2013). Catchment averages were calculated by simple averaging of all the 1 km² cells that fall within the catchment area.

SWAT model set-up and calibration
The soil, land cover and DEM-derived slope data were classified into classes and used to create 124 and 119 unique HRUs for the Geoscience Australia and MOD12 land covers respectively, ranging from 0.001 to 6 km² in area. While each unique HRU has a specific set of properties, several small areas with the same land cover, slope and soil type make up the total area of a single HRU. The properties of each unique HRU determine how it responds to precipitation, and how different hydrological processes such as streamflow, runoff, lateral flow and evapotranspiration are modelled in the catchment. The runoff from each HRU is accumulated and routed through the river network to the outlet of the catchment. Driven by the meteorological input, the model simulates catchment hydrological processes with a daily time step for the period 2000 to 2013.
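The way several disjoint patches with the same properties are pooled into one HRU can be illustrated with a minimal sketch. The tuple layout, class labels and the `build_hrus` helper are hypothetical stand-ins for the overlay that ArcSWAT performs internally; they are not part of SWAT itself.

```python
from collections import defaultdict

def build_hrus(cells):
    """Pool cells into HRUs keyed by (land cover, soil, slope class).

    `cells` is an iterable of (land_cover, soil, slope_class, area_km2)
    tuples; this layout is an assumption made for illustration only.
    Returns a dict mapping each unique combination to its total area.
    """
    hru_area = defaultdict(float)
    for land_cover, soil, slope_class, area_km2 in cells:
        hru_area[(land_cover, soil, slope_class)] += area_km2
    return dict(hru_area)

# Made-up example: two separate forest patches share one HRU.
cells = [
    ("forest", "loam", "steep", 0.5),
    ("pasture", "loam", "gentle", 0.25),
    ("forest", "loam", "steep", 0.25),
]
hrus = build_hrus(cells)
```

Because an HRU is defined by its property combination and not by its location, the resulting units are not spatially contiguous, which is why the HRU results have to be mapped back to grid cells before they can be compared with a gridded product.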
The SWAT model is calibrated by fitting simulated streamflow to observed streamflow with the SUFI-2 algorithm. This semi-automatic Latin hypercube sampling algorithm optimises SWAT model parameters while attempting to fit the simulated data as closely as possible to the observed data, using the user-preferred objective function from those detailed below as a measure of simulation accuracy (Abbaspour, 2007). Although a single objective function is used in the calibration and validation, the results of the other objective functions are also recorded for the optimal model run.

The Nash-Sutcliffe efficiency ($N_{SE}$) (Nash and Sutcliffe, 1970):

$$N_{SE} = 1 - \frac{\sum_{n=1}^{N} \left(Q_n - \hat{Q}_n\right)^2}{\sum_{n=1}^{N} \left(Q_n - \bar{Q}\right)^2}, \qquad (1)$$

where $Q_n$ (m³ s⁻¹) is the measured discharge at time $n$, $\hat{Q}_n$ (m³ s⁻¹) is the simulated discharge at time $n$, $\bar{Q}$ (m³ s⁻¹) is the mean measured discharge and $N$ is the number of time steps.

The ratio of the root mean squared error to the standard deviation of measured data ($R_{SR}$) (Moriasi et al., 2007):

$$R_{SR} = \frac{\sqrt{\sum_{n=1}^{N} \left(Q_n - \hat{Q}_n\right)^2}}{\sqrt{\sum_{n=1}^{N} \left(Q_n - \bar{Q}\right)^2}}. \qquad (2)$$

Percent bias ($P_{BIAS}$):

$$P_{BIAS} = 100 \times \frac{\sum_{n=1}^{N} \left(Q_n - \hat{Q}_n\right)}{\sum_{n=1}^{N} Q_n} = 100 \times \frac{\bar{Q} - \bar{\hat{Q}}}{\bar{Q}}, \qquad (3)$$

where $\bar{\hat{Q}}$ (m³ s⁻¹) is the mean simulated discharge.

The Kling-Gupta efficiency ($K_{GE}$) (Gupta et al., 2009):

$$K_{GE} = 1 - \sqrt{(r - 1)^2 + \left(\frac{\sigma_s}{\sigma_m} - 1\right)^2 + \left(\frac{\bar{\hat{Q}}}{\bar{Q}} - 1\right)^2}, \qquad (4)$$

where $r$ is the linear correlation coefficient between the simulated and measured variables, and $\sigma_s$ and $\sigma_m$ are the standard deviations of simulated and measured data.
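The four objective functions named above can be sketched in plain Python. The formulas follow the standard definitions in the cited sources (Nash and Sutcliffe, 1970; Moriasi et al., 2007; Gupta et al., 2009); the function names and the example discharge values are ours, and the use of population standard deviations in the Kling-Gupta efficiency is an assumption.

```python
import math

def _mean(values):
    return sum(values) / len(values)

def nse(obs, sim):
    """Nash-Sutcliffe efficiency (1 is a perfect fit)."""
    mean_obs = _mean(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / spread

def rsr(obs, sim):
    """RMSE divided by the standard deviation of the observations."""
    mean_obs = _mean(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return math.sqrt(residual) / math.sqrt(spread)

def pbias(obs, sim):
    """Percent bias; positive values indicate underestimation."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)

def kge(obs, sim):
    """Kling-Gupta efficiency from correlation, variability and bias ratios."""
    mean_obs, mean_sim = _mean(obs), _mean(sim)
    sd_obs = math.sqrt(_mean([(o - mean_obs) ** 2 for o in obs]))
    sd_sim = math.sqrt(_mean([(s - mean_sim) ** 2 for s in sim]))
    cov = _mean([(o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim)])
    r = cov / (sd_obs * sd_sim)
    return 1.0 - math.sqrt(
        (r - 1.0) ** 2
        + (sd_sim / sd_obs - 1.0) ** 2
        + (mean_sim / mean_obs - 1.0) ** 2
    )

# Made-up daily discharges (m3/s): the simulation is biased high by 1 m3/s.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [2.0, 3.0, 4.0, 5.0]
scores = {f.__name__: f(observed, simulated) for f in (nse, rsr, pbias, kge)}
```

In this example the simulation reproduces the timing and variability exactly, so the Kling-Gupta penalty comes only from the bias term, whereas the Nash-Sutcliffe efficiency drops sharply; this is one reason for recording several metrics for the same run.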
After obtaining a satisfactory fit between the simulated and observed streamflow data during calibration, the model is validated by running it for a different time period using the same parameters from the calibration period. SUFI-2 further incorporates the unitless P- and R-factor metrics, which give an indication of the confidence in the calibration exercise. The P-factor is the fraction of observed data bracketed by the 95 percent prediction uncertainty (95 PPU) band, which lies between the 2.5 and 97.5 percentiles of the simulated output, while the R-factor is the average width of the 95 PPU band relative to the standard deviation of the observed data. The P- and R-factors are iteratively determined using Latin hypercube sampling. For streamflow calibration and validation to be considered reliable, combined satisfactory values should be obtained for the P-factor (> 0.7), the R-factor (< 1) (Abbaspour, 2007) and one of the objective functions, $N_{SE}$ (> 0.5), $R_{SR}$ (≤ 0.7) and $P_{BIAS}$ (±25 %) (Moriasi et al., 2007). In this study, the $N_{SE}$ objective function combined with the P- and R-factors is used. The results of the other objective functions at the optimal $N_{SE}$ are also recorded. For a comprehensive explanation of the SUFI-2 algorithm, see Abbaspour (2007).
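A minimal sketch of how the two factors could be computed from an ensemble of simulations is given below. It assumes that the 95 PPU band is taken per time step from the 2.5 and 97.5 percentiles of the ensemble and that the R-factor normalises the mean band width by the population standard deviation of the observations; the function names are ours and this is not the SUFI-2 code.

```python
import statistics

def ppu_band(ensemble):
    """Return (lower, upper) 95 PPU limits for each time step.

    `ensemble` is a list of simulated series, one per sampled parameter
    set, all of equal length. With n=40 the first and last cut points
    are the 2.5th and 97.5th percentiles.
    """
    lower, upper = [], []
    for values in zip(*ensemble):
        cuts = statistics.quantiles(values, n=40, method="inclusive")
        lower.append(cuts[0])
        upper.append(cuts[-1])
    return lower, upper

def p_factor(obs, lower, upper):
    """Fraction of observations that fall inside the band."""
    inside = sum(1 for o, lo, up in zip(obs, lower, upper) if lo <= o <= up)
    return inside / len(obs)

def r_factor(obs, lower, upper):
    """Mean band width divided by the standard deviation of the observations."""
    mean_width = sum(up - lo for lo, up in zip(lower, upper)) / len(obs)
    return mean_width / statistics.pstdev(obs)

# Made-up band limits for four time steps.
obs = [1.0, 2.0, 3.0, 4.0]
lower = [0.0, 0.0, 4.0, 3.0]
upper = [2.0, 1.0, 5.0, 5.0]
p = p_factor(obs, lower, upper)
r = r_factor(obs, lower, upper)
```

The two factors pull in opposite directions: widening the band raises the P-factor but also raises the R-factor, so a calibration is only convincing when a high P-factor is reached with a narrow band.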
The calibration process was conducted on daily timescales for the years 2000 to 2005, while the validation was conducted for the years 2007 to 2013. A warm-up period of 5 years between 1995 and 1999 was used in the SWAT model to equilibrate the model mass budget and internal reservoirs. The relatively long periods of streamflow calibration and validation on daily timescales were specifically used to address the potential problem of equifinality of the parameters to be optimised. Equifinality is known to affect semi-distributed models such as SWAT (Qiao et al., 2013). Nevertheless, the use of many observation points has been observed to effectively constrain it (Tobin and Bennett, 2017). In this study, 21 sensitive SWAT model parameters (Table 3) are optimised with SUFI-2 to fit simulated streamflow to the observed streamflow data. In the SUFI-2 algorithm preparation for calibration, an "r_" and a "v_" prefix before a SWAT model parameter (Table 3) indicate, respectively, a relative change (a percentage increase or decrease in the SWAT modelled value) and a replacement of the original SWAT modelled value. The relative change is often used to fine-tune parameters that have been modelled within the acceptable range, while the replacement change is used when modelled parameter values are at odds with local field knowledge or established values.
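The difference between the two change types can be made concrete with a short sketch. The helper `apply_change` is hypothetical, and it assumes that a relative change is expressed as a fraction (0.1 for a 10 % increase), which is our reading of the description above.

```python
def apply_change(prefix, original, value):
    """Apply a calibration change to one parameter value.

    prefix   -- "r_" for a relative change, "v_" for a replacement
    original -- the value currently in the SWAT model
    value    -- the sampled change (a fraction for "r_", a new value for "v_")
    """
    if prefix == "r_":
        # Scale the existing value, preserving its spatial pattern.
        return original * (1.0 + value)
    if prefix == "v_":
        # Discard the existing value entirely.
        return value
    raise ValueError(f"unknown change type: {prefix!r}")

# Made-up values: a 10 % decrease versus an outright replacement.
scaled = apply_change("r_", 80.0, -0.1)
replaced = apply_change("v_", 80.0, 65.0)
```

A relative change keeps the differences between HRUs intact because every HRU is scaled by the same factor, whereas a replacement assigns the same value everywhere it is applied.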
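The resultant SWAT simulated ET was compared with the MOD16 ET using the root mean square error ($R_{MSE}$), mean difference ($M_D$), Pearson's correlation coefficient ($R$) and coefficient of determination ($R^2$) metrics:

$$R_{MSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(x_i - y_i\right)^2},$$

$$M_D = \frac{1}{N} \sum_{i=1}^{N} \left(x_i - y_i\right),$$

$$R = \frac{\sum_{i=1}^{N} \left(x_i - \bar{x}\right)\left(y_i - \bar{y}\right)}{\sqrt{\sum_{i=1}^{N} \left(x_i - \bar{x}\right)^2 \sum_{i=1}^{N} \left(y_i - \bar{y}\right)^2}},$$

where $x_i$ and $y_i$ are SWAT and MOD16 monthly ET values respectively, $\bar{x}$ and $\bar{y}$ are their means, $N$ is the number of months and $R^2$ is the square of $R$.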
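The comparison metrics can be sketched as follows, using their standard definitions. The function name, the sign convention of the mean difference (SWAT minus MOD16) and the example values are assumptions made for illustration.

```python
import math

def compare_et(x, y):
    """Return RMSE, mean difference, Pearson's R and R^2 for paired series.

    x -- SWAT monthly ET values (mm)
    y -- MOD16 monthly ET values (mm)
    """
    n = len(x)
    diffs = [a - b for a, b in zip(x, y)]
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    md = sum(diffs) / n
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    return {"rmse": rmse, "md": md, "r": r, "r2": r * r}

# Made-up monthly ET values (mm).
swat = [10.0, 20.0, 30.0]
mod16 = [12.0, 18.0, 33.0]
result = compare_et(swat, mod16)
```

The mean difference shows the direction of any systematic offset, while the RMSE also penalises differences that cancel out on average, so the two are best read together.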

Streamflow
The streamflow was calibrated and validated on daily timescales according to the guidelines set out in Moriasi et al. (2007) and Abbaspour (2007) (Table 4, Fig. 6). The result indicates an observed data bracketing of between 87 and 89 % for both calibration and validation, with R-factors under 1. Table 4 shows better results for the validation than for the calibration for the $N_{SE}$, $R^2$, $K_{GE}$ and $R_{SR}$ metrics, but slightly lower P-factors. The results of the calibration and validation exercise on daily timescales show that the model effectively represents the high- and low-flow periods (Fig. 6).

Sub-catchment-scale evapotranspiration
SWAT ET is calculated at the HRU scale (Fig. 7a, b); however, for direct comparison with the MOD16 ET (Fig. 7c), the HRU ET results were reprocessed into 1 km² cells using simple averaging. For cells on the boundary which do not aggregate up to the 1 km² resolution, a percentage weighting based on the area covered is applied. Figure 7d shows the mean annual difference between both SWAT models (the SWAT model with Geoscience Australia land cover as SWAT-GEO and the SWAT model with MOD12 land cover as SWATMOD12) over the validation period at the 1 km² spatial resolution. The SWATMOD12 and the MOD16 maps (Fig. 7b and c) show some spatial resemblance in the north, south, east and west corners of the catchment, principally due to the use of the MOD12 map in both models. Generally, a trend of higher ET in the north-east and central part of the catchment is seen, while lower ET is observed in the south-western parts of the catchment. The spatially distributed mean annual ET difference of the SWAT models compared to the MOD16 shows about 40 % of the catchment with a difference of ±100 mm yr⁻¹ at the 1 km² spatial scale. Clear spatial differences between the SWAT models are seen at the HRU scale, but at the 1 km² resolution the maximum mean annual difference between the SWAT models is 12 %. Further analyses were carried out to determine the effect of spatial aggregation on the correspondence between the ET methods. For the spatial aggregation analysis, the SWAT-GEO model was used due to its improved land cover accuracy based on field knowledge. The box and whisker plot in Fig. 8 shows the spread of the difference between the SWAT ET and the MOD16, with the bottom, middle and top of the box indicating the 25th, 50th and 75th percentiles of the distribution. The lowest and highest bars in the plot indicate the minimum and maximum differences between the ET products at the different spatial scales. Figure 8 shows that with increasing cell aggregation the difference in the ET between SWAT and MOD16 decreases. At 1, 4, 9, 16 and 25 km² the maximum cell differences between the SWAT and MOD16 ET are 31, 19, 15, 11 and 9 % respectively. The grand variances for the monthly data of the three models were calculated and partitioned into the spatial and temporal components at the 1, 4, 9, 16 and 25 km² resolutions (Table 5) using the Time-First formulation described in Sun et al. (2010). The partitioning presents the average of the temporal variances for each of the regions in the catchment as the temporal component and the spatial variance of the evapotranspiration as the spatial component, and it shows the spatial component to be consistently higher across the three models. The partitioning shows that at the finer resolution the variances in the evapotranspiration in the models are principally associated with the spatial component but that the temporal component of the variance increases with spatial aggregation.
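The idea behind a time-first partitioning can be sketched as below. The sketch assumes that the temporal component is the mean of the per-region temporal variances and that the spatial component is the variance of the regional temporal means, each weighted so that the two add up to the grand variance; the weights follow from the usual sum-of-squares identity. It is our illustration of the principle and has not been checked against the implementation used for Table 5.

```python
import statistics

def partition_variance(z):
    """Split the grand variance of z into temporal and spatial parts.

    z[i][j] is the value for region i at time j, with m >= 2 regions
    and n >= 2 time steps. Returns (grand, temporal, spatial), where
    grand equals temporal + spatial.
    """
    m, n = len(z), len(z[0])
    flat = [value for row in z for value in row]
    grand = statistics.variance(flat)

    # Average of the temporal variances of the individual regions.
    mean_temporal_var = sum(statistics.variance(row) for row in z) / m
    # Spatial variance of the temporal means of the regions.
    spatial_var = statistics.variance([statistics.fmean(row) for row in z])

    temporal = m * (n - 1) / (m * n - 1) * mean_temporal_var
    spatial = n * (m - 1) / (m * n - 1) * spatial_var
    return grand, temporal, spatial

# Made-up data: two regions, three months.
grand, temporal, spatial = partition_variance([[1.0, 2.0, 3.0],
                                               [4.0, 5.0, 6.0]])
spatial_share = 100.0 * spatial / grand
```

Expressing each component as a percentage of the grand variance, as in the last line, gives the kind of spatial versus temporal split that can be tracked across resolutions.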

Catchment-scale evapotranspiration
At catchment scale, the mean annual ET values of the SWAT-GEO, SWATMOD12 and MOD16 models are 873, 864 and 865 mm respectively. The means show better agreement between the SWATMOD12 and MOD16 models, which is attributed to the use of the same land cover in both models.
To compare the temporal dynamics of the MOD16, the SWAT ET and the AWRA-L ET, the data were aggregated to catchment scale. As both SWAT models converge at the catchment scale, with less than 1 % difference in their annual mean ET, only the SWAT-GEO model, as the more accurate of the two, is evaluated at catchment scale, in keeping with the approach of this study.
Monthly MOD16 ET and AWRA-L ET values at 1 and 25 km² resolution respectively were averaged to catchment-scale values using the spatial analyst tools in ArcGIS, while ET values from the validated SWAT model on catchment spatial extent and daily timescales were aggregated to monthly timescales. Using the $R_{MSE}$ and $R^2$ metrics, the analysis shows a good correspondence between the models (Fig. 9). The SWAT and MOD16 methods at catchment scale have a maximum annual ET difference and a mean ET difference of less than 13 and 6 % respectively for the period from 2007 to 2013. The MOD16 and the AWRA-L show similar temporal patterns, but the AWRA-L ET was significantly lower than both the MOD16 and SWAT ET results (Fig. 9). A direct comparison between the AWRA-L ET and the SWAT ET without the Revap component shows very high correlation and agreement between both models, with a maximum annual ET difference and a mean ET difference of 10 and 2 % respectively for the period from 2007 to 2013.
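The temporal aggregation step and a percentage difference between two products can be sketched as follows. The record layout, the helper names and the choice of the second product as the reference in the denominator are assumptions; the exact definition of the percentage difference is not spelled out in the text.

```python
from collections import defaultdict

def monthly_totals(daily):
    """Sum daily ET (mm) into monthly totals.

    `daily` is a list of ((year, month, day), et_mm) records.
    Returns a dict keyed by (year, month).
    """
    totals = defaultdict(float)
    for (year, month, _day), et_mm in daily:
        totals[(year, month)] += et_mm
    return dict(totals)

def percent_difference(value, reference):
    """Absolute difference as a percentage of the reference value."""
    return 100.0 * abs(value - reference) / reference

# Made-up daily SWAT ET records.
daily = [
    ((2007, 1, 1), 2.0),
    ((2007, 1, 2), 3.0),
    ((2007, 2, 1), 1.5),
]
totals = monthly_totals(daily)
difference = percent_difference(110.0, 100.0)
```

The same two helpers can be chained to go from monthly to annual totals before the annual differences are taken.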

Spatial aggregation analysis
The mean annual graduated spatial-scale analysis across the SWAT models and the MOD16 for 2007-2013 exhibits a wide spread at the 1 km² spatial resolution, with a maximum cell difference of 31 %. When the data were aggregated to 4 km² using the simple averaging method, the maximum difference reduced to an acceptable 19 %. Further aggregation to 9 km² reduced the maximum difference by a further 4 %, but also resulted in a significant degradation in the resolution of the evapotranspiration data. Table 5 also shows the impact of the spatial aggregation on the variance of the monthly ET data across the SWAT and MOD16 models. The aggregation from 1 to 4 km² altered the split of the variance between the spatial and temporal components by about 1 % across the three models, but beyond the 4 km² resolution the spatial component, which accounts for the larger portion of the variance, begins to degrade further. Hence our spatial scale of confidence for small-catchment-scale ET analysis is the 4 km² resolution, based on the comparison of the SWAT and MOD16 ET over a complex terrain. The differences between regions in the catchment are more significant at finer spatial resolutions due to the diverse input data and their associated errors; these impacts become less significant as the outputs are up-scaled (Fig. 8). This trend was also observed by Hong et al. (2009). The simple averaging method was preferred in this study over the bilinear, cubic and other methods, as it was found to perform best for flux aggregation in a comparison of various methods (Ershadi et al., 2013).
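The aggregation by simple averaging can be sketched as below, assuming the 1 km² cells are held in a rectangular grid with `None` marking cells outside the catchment. The function names are ours, and the sketch ignores the area weighting applied to boundary cells.

```python
def aggregate(grid, k):
    """Average a grid of 1 km2 cells into k x k blocks (k=2 gives 4 km2).

    Cells equal to None are treated as lying outside the catchment and
    are skipped; a block with no valid cells is returned as None.
    """
    rows, cols = len(grid), len(grid[0])
    out = []
    for r0 in range(0, rows, k):
        out_row = []
        for c0 in range(0, cols, k):
            block = [
                grid[r][c]
                for r in range(r0, min(r0 + k, rows))
                for c in range(c0, min(c0 + k, cols))
                if grid[r][c] is not None
            ]
            out_row.append(sum(block) / len(block) if block else None)
        out.append(out_row)
    return out

def max_percent_difference(a, b):
    """Largest cell-wise difference between two grids, in % of grid b."""
    diffs = [
        100.0 * abs(x - y) / y
        for row_a, row_b in zip(a, b)
        for x, y in zip(row_a, row_b)
        if x is not None and y is not None
    ]
    return max(diffs)

# Made-up 4 x 4 grid of annual ET values.
grid = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0, 12.0],
    [13.0, 14.0, 15.0, 16.0],
]
coarse = aggregate(grid, 2)
```

Applying `aggregate` with k = 2, 3, 4 and 5 to both products and then calling `max_percent_difference` on each pair would give one maximum difference per resolution, which is the quantity tracked in the analysis above.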

Sources of differences across the three models
The recognised principal sources of differences between the three ET methods are associated with land cover, the Revap component in SWAT and the HRU parameterisation in the AWRA-L; they are discussed in the following sections.

Land cover
The land cover is an important parameter in the MOD16 and SWAT ET algorithms as it determines the values allocated to biophysical properties such as leaf conductance and boundary layer resistance, which significantly impact ET calculations. The impact of the land cover on the SWAT models is evident from the spatially divergent high-resolution SWAT models (Fig. 7a and b) at the HRU scale, though the streamflow calibration and validation parameters and results were similar. With the spatial aggregation of the SWAT models to 1 km² resolution, the obvious spatial differences at the HRU scale reduce significantly and begin to disappear beyond the 1 km² resolution. Differences in the land cover in the SWAT models were responsible for the different spatial distribution of the ET across the catchment between the models. The effect of the land cover on the MOD16 was not evaluated; however, the SWATMOD12 model with the same land cover expectedly showed better agreement when compared with the MOD16, with a mean for the period of 2007-2013 within 1 mm at the catchment scale. The Geoscience Australia land cover map has 95 % forests, while the MOD12 has a classification of 67 % forests and 24 % woody savanna, with most of the region misclassified as woody savanna having some properties similar to those of the forests. At catchment scale, the data averaging contributes to the convergence of the MOD16 and SWAT ET results, albeit with closer agreement between the MOD16 and SWATMOD12, which share land cover.

Revap
The Revap component of the AET in SWAT is most significant in forested catchments with deep-rooted trees that can access the saturated zone, and as such it is governed by land use parameters (Neitsch et al., 2011). However, the relative accuracy of the Revap component of ET at HRU scale has been questioned (Liu et al., 2015) because SWAT scales Revap linearly with potential evapotranspiration (PET) through the Revap coefficient (see Eq. A23). The Revap component in this study appears consistent with the findings of Benyon et al. (2006) in south-eastern Australia, where climatic conditions are similar to those of the Sixth Creek Catchment. Benyon et al. (2006) observed that under the combined conditions of highly permeable soils, available groundwater resources of low salinity (< 2000 mg L⁻¹), a high-transmissivity aquifer and groundwater depths of up to 6 m, the annual groundwater contribution to total ET ranged from 13 to 72 % for the sampled Eucalyptus tree species. The Sixth Creek Catchment is principally underlain by the highly transmissive and permeable Aldgate Sandstone aquifer, with salinity levels well below 2000 mg L⁻¹ (Gerges, 1999). Monitoring bores in the Sixth Creek Catchment have recorded standing water levels of less than 1.5 m at the end of the rainy winter months in parts of the catchment. The Sixth Creek Catchment has been identified as one of the principal recharge zones in the Western Mount Lofty Ranges based on the catchment geology and hydrochemical analysis (Green and Zulfic, 2008). A significant portion of the 95 % forested part of the Sixth Creek Catchment is a mosaic of various Eucalyptus tree species, which is consistent with the conditions studied by Benyon et al. (2006).
The AWRA-L ET model does not appear to include a separate groundwater ET model in its algorithm such as is found in the SWAT model (Eqs. A23–A26), hence the correlation and strong agreement between the AWRA-L model and the SWAT model when the Revap is unaccounted for in the SWAT ET. The results suggest that the Revap is a significant contributor to ET in the Sixth Creek Catchment (Fig. 10), with a mean annual contribution of 20 % for the years 2007–2013, while monthly contributions ranged from 15 to 52 % over the same period. It is possible that the linear relationship with PET employed in its calculation at HRU scale contributes to the wider range of ET fluctuation seen in the SWAT model at the 1 km² scale compared with MOD16; however, that is beyond the scope of this study. At catchment scale, the results show that MOD16 simulates higher ET in the winter periods, while SWAT simulates higher ET during the summer periods (Fig. 9). Generally, the agreement between the products is more consistent during the winter seasons, when ET is lower. The weaker correlation during higher-ET seasons may be related to the linearly determined Revap component of the ET, which is a more dominant process in the summer months, when the demand for soil evaporation, plant transpiration and groundwater ET is significantly higher.
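The linear dependence of Revap on PET can be made concrete with a minimal sketch. It assumes the form given in the SWAT theoretical documentation, in which the daily upper limit of Revap is the product of the Revap coefficient and PET; the shallow-aquifer threshold that switches Revap on is omitted here, and the coefficient and PET values are hypothetical rather than calibrated values from this study.

```python
def revap_upper_limit(beta_rev: float, pet_mm: float) -> float:
    """Daily upper limit of Revap (mm), assumed linear in PET.

    beta_rev is the dimensionless Revap coefficient and pet_mm the daily
    potential evapotranspiration in mm. The shallow-aquifer storage
    threshold used by SWAT is deliberately left out of this sketch.
    """
    return beta_rev * pet_mm


# Hypothetical values for illustration only.
beta_rev = 0.1
winter_pet_mm = 1.5
summer_pet_mm = 6.0

winter_limit = revap_upper_limit(beta_rev, winter_pet_mm)
summer_limit = revap_upper_limit(beta_rev, summer_pet_mm)
print(f"winter limit: {winter_limit:.2f} mm, summer limit: {summer_limit:.2f} mm")
```

Because the limit is proportional to PET, a fourfold seasonal increase in PET produces a fourfold increase in the Revap ceiling regardless of vegetation or groundwater state, which is one way the linear formulation could amplify summer ET fluctuations.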

HRU parameterisation in AWRA-L
The HRU parameterisation method in AWRA-L significantly affects the evapotranspiration modelling process. AWRA-L does not use a detailed land cover product that distinguishes between vegetation types, including trees; instead, it uses a tree cover fraction product to parameterise the HRUs. AWRA-L discretises each 5 km × 5 km (25 km²) grid cell into two HRUs: the shallow-rooted HRU and the deep-rooted HRU. The area of each grid cell apportioned to the deep-rooted and shallow-rooted HRUs is determined solely from the satellite-derived product of persistent and recurrent photosynthetically active absorbed radiation (F_par) from the Advanced Very High Resolution Radiometer (AVHRR) (Donohue et al., 2008). The fraction of persistent F_par is regarded as the fraction of tree cover, and hence it is used as the fraction of the deep-rooted HRU in each grid cell.
The SWAT ET and MOD16 methods both have challenges associated with input data, which are subsequently propagated through the algorithms. In semi-arid environments such as the Sixth Creek Catchment, high-intensity rainfall events are common, and they affect hydrological processes such as infiltration and evapotranspiration differently than precipitation that is evenly distributed through the day (Syed et al., 2003). Yang et al. (2016) observed that the use of hourly rainfall in SWAT significantly improved the modelling of streamflow and hydrological processes. In this study, hourly precipitation data were unavailable, so daily precipitation data were used, thus neglecting the impact of high-intensity precipitation events in the catchment.
Another challenge encountered with the SWAT model is associated with the semi-distributed model methodology. The use of a single value for wind speed, relative humidity and solar radiation for a sub-catchment, whose spatial scale could be in the order of tens of square kilometres, affects the accuracy of hydrological processes at the HRU scale. The "elevation band" method, which distributes temperature and precipitation according to elevation changes across a catchment, was introduced into the SWAT algorithm to account for orographic effects in complex terrain catchments (Neitsch et al., 2011). The elevation band algorithm in SWAT has performed well in predominantly snowy, complex terrain catchments, which are significantly larger than the Sixth Creek Catchment and have elevation changes in the order of kilometres (Abbaspour et al., 2007; X. Zhang et al., 2008; Pradhanang et al., 2011). However, the application of the elevation band algorithm in the non-snowy Odiel River basin (Spain), which has a Mediterranean climate similar to that of the Sixth Creek Catchment, yielded less than satisfactory results (Galván et al., 2014). In the non-snowy Sixth Creek Catchment, orographic effects are a dominant atmospheric process when winds move from the lower elevations in the north of the catchment to the higher elevations in the south, particularly during the winter months. The orographic lift leads to significantly higher precipitation in the southwesterly direction in the Sixth Creek Catchment, which the elevation band algorithm in SWAT would not represent accurately in non-snowy catchments.
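A minimal sketch of the elevation band adjustment for precipitation is given below. It assumes the lapse-rate form described in the SWAT theoretical documentation (Neitsch et al., 2011), in which the gauge precipitation is shifted by the elevation difference between band and gauge, scaled by an annual precipitation lapse rate and the mean number of wet days per year. The clamp to non-negative values, the function names and all numbers are assumptions made for illustration.

```python
from typing import List, Tuple


def band_precip(p_gauge_mm: float, el_band_m: float, el_gauge_m: float,
                plaps_mm_per_km: float, wet_days_per_year: float) -> float:
    """Daily precipitation (mm) assigned to one elevation band."""
    if p_gauge_mm <= 0.01:
        # No adjustment is applied on (effectively) dry days.
        return p_gauge_mm
    shift = (el_band_m - el_gauge_m) * plaps_mm_per_km / (wet_days_per_year * 1000.0)
    # Clamp at zero so a band below the gauge cannot receive negative rain
    # (an assumption of this sketch).
    return max(p_gauge_mm + shift, 0.0)


def subbasin_precip(p_gauge_mm: float, el_gauge_m: float, plaps_mm_per_km: float,
                    wet_days_per_year: float,
                    bands: List[Tuple[float, float]]) -> float:
    """Area-weighted precipitation over (band elevation, area fraction) pairs."""
    return sum(frac * band_precip(p_gauge_mm, el, el_gauge_m,
                                  plaps_mm_per_km, wet_days_per_year)
               for el, frac in bands)


# Hypothetical sub-catchment: three bands and a gauge at 300 m.
bands = [(300.0, 0.3), (450.0, 0.4), (600.0, 0.3)]
p_mean = subbasin_precip(p_gauge_mm=10.0, el_gauge_m=300.0,
                         plaps_mm_per_km=500.0, wet_days_per_year=120.0,
                         bands=bands)
print(f"sub-catchment precipitation: {p_mean:.3f} mm")
```

In this formulation the adjustment depends only on elevation, so two areas at the same height on the windward and leeward sides of a ridge receive identical precipitation. This is consistent with the limitation noted above for wind-direction-dependent orographic rainfall.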
The various meteorological and remote sensing input data used in the processing of MOD16 all have inherent uncertainties, including cloud cover challenges and coarse-resolution resampling (Mu et al., 2011), while errors have also been associated with the land cover product used (Ruhoff et al., 2013). The land cover map (MOD12) used in MOD16 (Fig. 5a), in conjunction with the calibrated Biome Properties Lookup Table (BPLUT), significantly influences the ET output from the various land covers under different climatic conditions. A more detailed map and local knowledge of the Sixth Creek Catchment indicate that the MOD12 land cover spatially mismatches some biomes (Fig. 5a and b). Besides the obvious land cover mismatches that were observed between the input data of the two models, the variety of accepted national, regional and global land cover classification systems contributes to the challenges of hydrological modelling. In the MOD12 product, the "mixed forest" category covers over 50 % of the catchment, whereas this category does not exist in the local field map land cover classification. The global standardisation and harmonisation of land cover maps and biome classification at high resolution may improve model performance.

Conclusion
The main objectives of this paper are to compare three ET products (SWAT, MOD16 and AWRA-L) on a catchment scale, while also evaluating the two finer-resolution products (SWAT and MOD16) on a graduated spatial scale. We also attempted to determine the spatial scale at which the models tend towards agreement, while also seeking to understand the sources of disagreements between the models.
The SWAT model, calibrated using the SUFI-2 algorithm and various objective functions, could simulate annual ET to within 6 % of MOD16 at catchment scale. The P and R factor metrics were observed to be very reliable indicators of a good calibration exercise. Abbaspour et al. (2007) proposed P and R factor benchmarks of > 0.7 and < 1 respectively for streamflow calibration; in this study, P and R factors of > 0.8 and < 1 were found to produce reliable ET estimates at catchment scale. At a spatial scale of 4 km², annual cell differences were under 20 %, which suggests that 4 km² is a suitable scale of confidence in this complex terrain.
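For readers unfamiliar with the two metrics, a minimal sketch follows. It assumes the usual SUFI-2 definitions: the P factor is the fraction of observations bracketed by the 95 % prediction uncertainty (95PPU) band, and the R factor is the mean width of that band divided by the standard deviation of the observations. The use of the sample standard deviation and all values below are assumptions for illustration, not results from this study.

```python
from statistics import mean, stdev
from typing import Sequence


def p_factor(obs: Sequence[float], lower: Sequence[float],
             upper: Sequence[float]) -> float:
    """Fraction of observations that fall inside the 95PPU band."""
    inside = sum(1 for o, lo, up in zip(obs, lower, upper) if lo <= o <= up)
    return inside / len(obs)


def r_factor(obs: Sequence[float], lower: Sequence[float],
             upper: Sequence[float]) -> float:
    """Mean width of the 95PPU band relative to the spread of the observations."""
    mean_width = mean(up - lo for lo, up in zip(lower, upper))
    return mean_width / stdev(obs)


# Hypothetical streamflow observations and 95PPU bounds (illustrative only).
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
band_lower = [0.5, 1.5, 3.2, 3.5, 4.0]
band_upper = [1.5, 2.5, 4.0, 4.5, 5.5]

p_val = p_factor(observed, band_lower, band_upper)
r_val = r_factor(observed, band_lower, band_upper)
print(f"P factor = {p_val:.2f}, R factor = {r_val:.2f}")
```

A good calibration brackets most observations (high P factor) with a narrow band (low R factor), so the two metrics are read together rather than in isolation.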
SWAT and MOD16 show good correlation at catchment scale, while the AWRA-L and SWAT models show good agreement when the groundwater ET component of the SWAT model is excluded. Biome differences and the input spatial scale contribute to poor agreement at finer spatial scales. The lack of a globally accepted and harmonised land cover classification system at high resolution was a challenge in this study, with two products derived from MODIS satellite data classifying land cover differently and thus affecting the results from the SWAT models. The use of different land covers with different classification systems and parameters is observed to have limited impact on evapotranspiration modelling at coarse spatial resolutions due to spatial averaging. Nevertheless, the tree cover fraction used in place of a land cover product in AWRA-L is also observed to affect the ET modelling, particularly in a groundwater-dependent catchment like our study area. The inherent differences and uncertainties associated with these land cover products will continue to be propagated through the models, thereby promoting divergence in the drive towards more accurate and finer-resolution evapotranspiration data products. While many concerted research efforts have been made in the past (Latham, 2009; Friedl et al., 2010), a globally accepted, harmonised world land cover database at high resolution could significantly improve correlation and confidence in high-resolution ET products.
The results of the spatial-resolution analysis corroborate the view that prevailing ET algorithms and measurement methods will have a certain degree of variability due to the complexity of ET estimation and the various drivers of the contributory processes. The study shows that correlation at catchment scale does not necessarily translate to correlation at finer spatial scales. The study also highlights the possible challenges of the semi-distributed SWAT ET algorithm in complex terrain, where the spatial resolution of the input climate data may not capture the climate variability.
Data availability. The datasets for this research can be accessed through the Flinders University repository in the future. This work is part of ongoing PhD research; hence, until the research is completed, the datasets belong to Flinders University of South Australia.
AWRA-L is a daily, 25 km² grid-based hydrological model designed on the water balance concept over Australia. The model conceptualises each grid cell as two distinct HRUs: a shallow-rooted vegetation HRU and a deep-rooted vegetation HRU. The shallow-rooted vegetation corresponds to grass, while the deep-rooted vegetation corresponds to trees. The model conceptualises the soil as three layers with water storage capacity: the soil surface storage with a 0.1 m depth, the shallow storage from 0.1 to 1 m and the deep storage from 1 to 6 m. The principal difference between the two HRUs is

Figure 4. Digital elevation model of the Sixth Creek Catchment study area (Gallant et al., 2011).

Figure 8. Differences between SWATGEO ET and MOD16 for spatial aggregations between 1 and 25 km². The bottom, middle and top of each box indicate the 25th, 50th and 75th percentiles of the distribution; the lowest and highest bars indicate the minimum and maximum differences.

Figure 9. Monthly comparison of SWAT, AWRA-L and MOD16 ET at catchment scale.

Figure 10. Monthly comparison of the Revap component of the ET and total ET in SWAT.

Table 1. Literature studies of MODIS and SWAT evapotranspiration (see Table 2 for climate classification).

Table 3. Optimised SWAT parameters and their final range.

Table 4. Streamflow calibration and validation results.

Table 5. Variance partitioning into space and time components at various spatial resolutions.
The discretisation of the AWRA-L HRUs in the Sixth Creek Catchment, which suggests under 60 % tree cover in the catchment, severely limits the model's access to the deep soil storage and the computation of groundwater ET; hence, the close correlation and agreement of the AWRA-L model with the SWAT model when the Revap (groundwater ET) is unaccounted for are reasonable.
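The sensitivity of grid-cell ET to the tree cover fraction can be illustrated with a minimal sketch. It assumes that the grid-cell ET is the area-weighted mean of the ET from the deep-rooted and shallow-rooted HRUs, with the deep-rooted weight given by the tree cover fraction; the function name and the HRU ET values are hypothetical and are not AWRA-L outputs.

```python
def grid_cell_et(f_deep: float, et_deep_mm: float, et_shallow_mm: float) -> float:
    """Area-weighted ET (mm) of a grid cell split into two HRUs.

    f_deep is the fraction of the cell assigned to the deep-rooted HRU,
    so the shallow-rooted HRU occupies the remaining 1 - f_deep.
    """
    if not 0.0 <= f_deep <= 1.0:
        raise ValueError("f_deep must lie between 0 and 1")
    return f_deep * et_deep_mm + (1.0 - f_deep) * et_shallow_mm


# Hypothetical monthly ET for each HRU (illustrative only).
et_deep_mm = 70.0      # deep-rooted HRU with access to deep storage
et_shallow_mm = 40.0   # shallow-rooted HRU without that access

et_low_tree = grid_cell_et(0.60, et_deep_mm, et_shallow_mm)
et_high_tree = grid_cell_et(0.95, et_deep_mm, et_shallow_mm)
print(f"60 % tree cover: {et_low_tree:.1f} mm, 95 % tree cover: {et_high_tree:.1f} mm")
```

Under these assumptions, assigning 60 % rather than 95 % of a cell to the deep-rooted HRU lowers the cell ET because a larger share of the cell is cut off from the deep storage, which is the mechanism suggested above for the lower AWRA-L ET.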