Advances in
space-based observations have provided the capacity to develop regional- to
global-scale estimates of evaporation, offering insights into this key
component of the hydrological cycle. However, the evaluation of large-scale
evaporation retrievals is not a straightforward task. While a number of
studies have intercompared a range of these evaporation products by examining
the variance amongst them, or by comparison of pixel-scale retrievals against
ground-based observations, there is a need to explore more appropriate
techniques to comprehensively evaluate remote-sensing-based estimates. One
possible approach is to establish the level of product agreement between
related hydrological components: for instance, how well do evaporation
patterns and response match with precipitation or water storage changes? To
assess the suitability of this “consistency”-based approach for evaluating
evaporation products, we focused our investigation on four globally
distributed basins in arid and semi-arid environments, comprising the
Colorado River basin, Niger River basin, Aral Sea basin, and Lake Eyre basin.
In an effort to assess retrieval quality, three satellite-based global
evaporation products based on different methodologies and input data,
including CSIRO-PML, the MODIS Global Evapotranspiration product (MOD16), and
Global Land Evaporation: the Amsterdam Methodology (GLEAM), were evaluated
against rainfall data from the Global Precipitation Climatology Project
(GPCP) along with Gravity Recovery and Climate Experiment (GRACE) water storage anomalies. To ensure a fair
comparison, we evaluated consistency using a degree correlation approach
after transforming both evaporation and precipitation data into spherical
harmonics. Overall, we found no persistent hydrological consistency in these
dryland environments. Indeed, the degree correlation showed oscillating
values between periods of low and high water storage changes, with a phase
difference of about 2–3 months. Interestingly, after imposing a simple lag
in GRACE data to account for delayed surface runoff or baseflow components,
an improved match in terms of degree correlation was observed in the Niger
River basin. Significant improvements to the degree correlations (from
Space-based observations of the Earth system have provided the capacity to retrieve information across a wide range of land surface hydrological components and an opportunity to characterize terrestrial processes in space and time (Famiglietti et al., 2015). Indeed, remote sensing offers a number of independent means with which to retrieve various components of the hydrological cycle (e.g. rainfall, soil moisture, evaporation, terrestrial storage). Dedicated space missions such as the Gravity Recovery and Climate Experiment (GRACE) (Tapley et al., 2004b), the Global Precipitation Measurement Mission (GPM) (Hou et al., 2014), and a suite of microwave-based soil moisture platforms (Liu et al., 2012) represent important efforts that have contributed to these advances. Considering the spatial advantage that space-based observations have over ground-based measurements, there has been a proliferation of regional- to global-scale data products, providing knowledge on the multi-scale behaviour and patterns of hydrological states and fluxes useful for enhanced process description (Stisen et al., 2011). However, one of the challenges of space-based remote sensing is how to characterize the degree to which these products represent realistic estimates of the underlying variables they attempt to retrieve.
Terrestrial evaporation (
Beyond the assessment of evaporation models, a limited number of studies have sought to quantify large-scale water budgets using either satellite observations alone (Sheffield et al., 2009) or through a combination of satellite observations and data assimilation (Pan and Wood, 2006; Pan et al., 2008, 2012; Sahoo et al., 2011). While some of these studies (Sheffield et al., 2009; Gao et al., 2010) evaluate water budget closure by comparing the residual of the water budget (i.e. inferred runoff) with measured runoff, others aim to provide merged or observation-constrained estimates of the water cycle components, with estimates of uncertainty given in terms of the variability among the products (e.g. Long et al., 2014). The results of these studies have generally illustrated large water budget closure errors, focusing on the temporal scale and invoking the use of a hydrological model to guide analysis or force closure, rather than being solely observation-driven assessments. Observation-only studies are important, as they not only provide an unbiased perspective on hydrological closure, but also allow for a first-order examination of the underlying agreement between component variables. However, rather than just comparing the uncertainties between evaporation products and other hydrological components (which are poorly defined), there is still a need for alternative assessment techniques that exploit the inherent connection between hydrological variables at both temporal and spatial scales. One approach to determining this is to evaluate the hydrological consistency between observed products (McCabe et al., 2008). The term “hydrological consistency” refers to the spatial and temporal match that should exist between independent observations of hydrological states and fluxes, based upon physical considerations. 
It is a concept that encompasses the expectation of water cycle behaviour and mass balance: that is, changes in one term should be reflected in related variables, both spatially and temporally. For instance, a rainfall event should result in an observable change in soil water storage and a consequent increase in evaporative flux, which in turn should reduce the available soil moisture. This relatively simple concept has been explored in the recent past, including in efforts to improve the identification of precipitation events by employing cloud detection methodologies (Milewski et al., 2009); using soil moisture changes to infer precipitation amounts (Brocca et al., 2014); examining the connection between soil moisture state and changes in atmospheric variables such as humidity and sensible heat flux (McCabe et al., 2008); as well as in assessments of land–atmosphere coupling between observations and reanalysis data (Ferguson and Wood, 2011).
In considering these earlier contributions, there remains a need to determine whether the basic idea of hydrological consistency can be realistically extended to explore the agreement between independent global-scale satellite-based hydrological products. To examine this question, it makes sense to focus on catchments that have relatively simple hydrological interactions, as they represent natural laboratories within which the evaluation of large-scale products and the concept of hydrological consistency can be reasonably undertaken. For example, Wang et al. (2014) evaluated the level of agreement between three satellite-based hydrological cycle variables over arid regions in Australia, where surface and sub-surface runoff were minimal. With a sufficiently low runoff component, a lack of snow accumulation, and a relatively strong coupling of precipitation and evaporation components, arid and semi-arid environments represent potential candidates within which to undertake such process assessments. Recognizing the need to advance a more comprehensive evaluation strategy for remote sensing retrievals, this study seeks to explore the hydrological consistency within a number of basins where hydrological processes are relatively simple, i.e. reflecting the conditions described above. Our analysis constitutes a framework for assessing the utility of hydrological consistency to evaluate remotely sensed hydrological products. We undertake the analysis over four large river basins within arid and semi-arid environments distributed across the globe, with study regions comprising the Colorado River basin in North America, the Niger River basin in Africa, the Aral Sea basin in Asia, and the Lake Eyre basin in Australia.
In compiling data sets with which to evaluate and differentiate between candidate evaporation products, a number of product-specific considerations needed to be accounted for. Total water storage estimates, which comprise the summation of groundwater, soil moisture, snow, surface water, ice, and biomass, were derived from anomalies in the gravity field from GRACE satellites (Tapley et al., 2004b). Like any continuous function on a sphere, the gravity field can be represented as an expansion in spherical harmonics, which form a complete set of basis functions on the sphere, similar to the way in which a Fourier series expansion uses sines and cosines as basis functions. In contrast to precipitation and evaporation products (and most other hydrological remote sensing variables), spatial maps of GRACE water storage data are problematic to compare directly with other spatially distributed hydrological variables (Tapley et al., 2004a), since GRACE data are usually filtered in the spectral domain. While scaling the GRACE data to account for differences due to filtering has been proposed as a solution to this problem (Landerer and Swenson, 2012), it has recently been shown to affect results in certain cases (Long et al., 2015), including over arid regions. Given this restriction, we implement an alternative approach in which the precipitation and evaporation fields are transformed into spherical harmonics in order to remove the impact (and model dependence) of this scaling term. Such an approach allows for a more reasonable and equivalent intercomparison of hydrological variables, and represents a novel aspect of this work. Further details describing this process are presented in Sect. 3.
The overall objective of this study is to evaluate the hydrological consistency of three global-scale satellite-based evaporation products against remotely sensed retrievals of precipitation and terrestrial water storage across a selection of basins that exhibit relatively well-defined hydrological interactions. Throughout this analysis we aim to determine whether the hydrological consistency concept can expand the range of evaluation metrics used to assess large-scale hydrological data sets such as evaporation, and enable some differentiation of relative product quality to be made.
A range of globally distributed large-scale data sets derived primarily from satellite observations were used in this analysis. The study period, encompassing the years between 2003 and 2011, was based upon the availability of GRACE data and several recently developed global-scale evaporation products. In the following paragraphs we briefly describe the sources and nature of the data used in this contribution.
Description of the satellite products used in this study.
The temporal resolution is daily except for MOD16 (8-daily) and GRACE
(monthly). The original MOD16 product is available at 1 km resolution in the
sinusoidal projection. In this study, the product was reprojected onto a
0.05
GRACE water storage estimates have been used in a myriad of studies exploring
the indirect groundwater response across many different spatial and temporal
scales (Swenson et al., 2008; Rodell et al., 2009; Famiglietti et al.,
2011; Sun, 2013; Voss et al., 2013). The accuracy of GRACE terrestrial water
storage anomalies (TWSAs) is related to the number of degrees to which the
gravity field is solved for in spherical harmonics (Swenson and Wahr, 2002),
and an approximate global averaged accuracy of 20 mm month
GRACE data contain two types of errors (correlated and random) that need to be filtered before translating the data into water storage anomalies. Correlated errors are known to contaminate the signal in the form of north–south oriented stripes. A “de-striping” filter was applied to the coefficients (Swenson and Wahr, 2006; Duan et al., 2009) in order to remove this source of error. An isotropic filter (Gaussian filter with a radius of 300 km) was then used to remove random errors (Swenson and Wahr, 2002). Furthermore, it is a usual practice to replace the degree 2 coefficients with a more reliable estimate from a low-degree model of the gravity field calculated using satellite laser ranging (Cheng et al., 2011, 2013). While the effect that the filters have on the true geophysical signal is not known a priori, an indirect measure can be obtained by applying the filter to a synthetic water storage variation from a land surface model (LSM). This method has been used to obtain scaling factors for GRACE data in order to restore the signal (it has been observed that the filters typically reduce the signal) before using the GRACE data with other hydrological variables (Landerer and Swenson, 2012). Long et al. (2015) evaluated the impact of different land surface models on the scaling factor and showed that the impact was greatest in arid regions. To avoid this potential element of uncertainty in our study, which is focused on arid regions, we instead transformed the other water cycle components (i.e. evaporation and precipitation) into spherical harmonics, using an approximation similar to Eq. (1). The effect of the filters is therefore incorporated directly into the other hydrological components in spherical harmonics.
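As a hedged illustration of the isotropic Gaussian filter mentioned above, the sketch below computes per-degree smoothing weights with the recursion commonly attributed to Jekeli and used in the GRACE literature. The function name `gaussian_weights`, the Earth radius of 6371 km, and the normalization to a weight of 1 at degree 0 are assumptions made for this example and are not taken from the study. The forward recursion is known to lose accuracy at high degrees, so the sketch is meant only for the low degrees typical of GRACE solutions.

```python
import math


def gaussian_weights(radius_km, max_degree, earth_radius_km=6371.0):
    """Per-degree weights of an isotropic Gaussian filter (assumed Jekeli-type recursion).

    Weights are normalized so that the degree-0 weight equals 1.
    """
    b = math.log(2.0) / (1.0 - math.cos(radius_km / earth_radius_km))
    weights = [1.0]
    if max_degree >= 1:
        e = math.exp(-2.0 * b)  # underflows harmlessly to 0.0 for small radii
        weights.append((1.0 + e) / (1.0 - e) - 1.0 / b)
    for l in range(1, max_degree):
        # W_{l+1} = -(2l + 1) / b * W_l + W_{l-1}
        weights.append(-(2 * l + 1) / b * weights[l] + weights[l - 1])
    return weights


# Weights for a 300 km radius up to degree 60 (illustrative truncation).
w = gaussian_weights(300.0, 60)
```

Each spherical harmonic coefficient of degree l would then be multiplied by `w[l]`, so that the same smoothing can be applied to evaporation and precipitation coefficients as to the GRACE coefficients.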
Several satellite-based evaporation products have been developed over the last decade, based on a range of modelling schemes (Mu et al., 2011; Leuning et al., 2008; Miralles et al., 2011a) and global-scale input data. Given the importance of evaporation within studies of the global energy and water cycle, considerable effort has been directed towards accurately reproducing its spatial and temporal variability, with comprehensive reviews of various approaches to do this provided by Kalma et al. (2008) and Wang and Dickinson (2012). Here we employ a range of global evaporation data sets, which are briefly described in the following paragraphs and summarized in Table 1. To ensure consistency with the GRACE data, the evaporation products were aggregated from daily (or 8-daily in the case of the MODIS Global Evapotranspiration product – MOD16) to monthly estimates, centered on the dates specified in the GRACE monthly gravity field solutions. In this aggregation, pixels with more than 20 % of their data missing in a given month were excluded from the calculation.
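The missing-data rule in the aggregation step can be sketched as follows for a single pixel. This is a minimal illustration under stated assumptions: the function name `monthly_mean` is hypothetical, missing days are represented by `None` or NaN, and the monthly estimate is taken as the mean of the valid days (the text does not specify whether a mean or a sum was used).

```python
import math


def monthly_mean(daily_values, max_missing_fraction=0.20):
    """Average one pixel's daily values over one month.

    Returns None when more than `max_missing_fraction` of the days are missing.
    """
    n_days = len(daily_values)
    valid = [v for v in daily_values if v is not None and not math.isnan(v)]
    if n_days == 0 or not valid:
        return None
    missing_fraction = (n_days - len(valid)) / n_days
    if missing_fraction > max_missing_fraction:
        return None  # pixel excluded for this month
    return sum(valid) / len(valid)


# A 30-day month with 6 missing days (exactly 20 %) is kept; 7 missing days is not.
kept = monthly_mean([2.0] * 24 + [None] * 6)
dropped = monthly_mean([2.0] * 23 + [None] * 7)
```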
Cleugh et al. (2007) developed an algorithm for large-scale evaporation
monitoring based on the Penman–Monteith (PM) equation, using meteorological
forcing data and a surface resistance linearly modelled through a remotely
sensed leaf area index (LAI), as measured by the MODerate resolution Imaging
Spectroradiometer (MODIS). Improvements to this approach (Mu et al., 2007,
2011) led to the development of MOD16, a three-source scheme used for
terrestrial land flux estimation. In MOD16, the linearization of the surface
resistance is specified for each biome separately via a look-up table, with
the evaporation calculated for daytime and nighttime conditions. Other
adjustments incorporated into MOD16 include soil heat flux calculation,
distinction of dry and wet canopy, as well as moist and wet soil, and
improvements to the aerodynamic resistance. The MOD16 product comprises
transpiration, evaporation from the soil and wet canopy, as well as total
evaporation calculated as the sum of these three components. Each component
is weighted based on the fractional vegetation cover, relative surface
wetness, and available energy. Inputs to the model include net radiation
(
In parallel to the PM–Mu model, Leuning et al. (2008) introduced
improvements to the Cleugh et al. (2007) algorithm, resulting in the
two-source Penman–Monteith–Leuning (PML) model. An important new feature of
the PML approach was a biophysical algorithm for the calculation of the
surface resistance, which was previously calculated as LAI multiplied by a
constant
Global Land Evaporation: the Amsterdam Methodology (GLEAM) (Miralles et al., 2011a) is a satellite-based model developed to estimate evaporation at a global scale. In this approach, rainfall interception loss is evaluated using an analytical model (Gash, 1979) as a first step. GLEAM then employs the Priestley–Taylor equation to calculate the potential evaporation of bare soil and vegetation components (both short and tall canopy), with values constrained to actual evaporation via application of a stress factor. The stress factor is calculated using vegetation optical depth from a combination of different satellite passive microwave observations using the Land Parameter Retrieval Model (Liu et al., 2013). GLEAM also has the capacity to explicitly calculate sublimation of snow-covered surfaces (Takala et al., 2011) as well as open water evaporation. Satellite observations of surface soil moisture can be assimilated using a Kalman filter assimilation approach to estimate the moisture profile over several soil layers. Here we employ version 2A of GLEAM (D. G. Miralles, personal communication, 2014), which uses a combination of satellite, ground, and reanalysis input data. Precipitation is obtained from the Climate Prediction Center Unified data set, consisting of data from over 30 000 stations (CPC-Unified, Joyce et al., 2004). The radiation product used in this version of GLEAM is the European Center for Medium-Range Weather Forecasts (ECMWF) ERA-Interim meteorological reanalysis product (Dee et al., 2011). In this version of GLEAM, surface soil moisture data from the Water Cycle observation Multi-mission Strategy Climate Change Initiative (WACMOS-CCI) merged product (from a combination of several passive and active microwave products) are assimilated (Liu et al., 2012), while air temperature is derived from both the International Satellite Cloud Climatology Project (ISCCP) and the Atmospheric Infrared Sounder (AIRS) (Rossow and Dueñas, 2011). 
Further details of the model can be found in Miralles et al. (2010, 2011a, b).
Global daily precipitation (
Selected study basins used within the analysis. Criteria for the selection of basins included predominantly arid climate (more than 50 % areal coverage with any of the arid Köppen climates: BWk, BWh, BSk, or BSh), size, geographical location, and amplitude and trends in the water storage variations.
While runoff data were not used explicitly in the consistency analysis
presented in this paper, simulated runoff data were compared to precipitation and
evaporation observations in order to evaluate the assumption of a relatively
simple hydrological system in the study basins. Surface runoff, sub-surface
runoff, and snowmelt were derived from the NOAH land surface model included
in the Global Land Data Assimilation System (GLDAS) (Rodell et al., 2004).
GLDAS uses global satellite and ground-based observational products to obtain
optimal estimates of land surface states and fluxes from land surface models
using data assimilation techniques. Although these values were not
constrained with ground estimates and may therefore contain biases, the
runoff values were used only to assess the magnitude of runoff relative to the
observed precipitation and evaporation data. The version of the product used
in this study (GLDAS-2.0) is forced with meteorological data from the
Princeton University forcing data set (Sheffield et al., 2006) and is
available at 1
The study basins were targeted primarily on their climate classification, with river basins in regions with a predominantly arid or semi-arid climate preferentially selected. This criterion was established in order to seek a relatively simple hydrological system (i.e. constrain the range of possible hydrological interactions), thereby maximizing the conditions under which hydrological consistency between evaporation and precipitation and water storage changes might be achieved. A Köppen classification map, generated using data sets from the Climatic Research Unit and the Global Precipitation Climatology Centre up to 2006 (Kottek et al., 2006), was used to identify arid and semi-arid regions. The basins were selected from a set of 405 globally distributed river basins provided by the Global Runoff Data Centre (GRDC) and derived from flow direction data of the HYDRO1k Elevation Derivative Database, developed at the US Geological Survey (USGS). A threshold of 50 % areal extent containing any of the arid Köppen climates (BWk, BWh, BSk, or BSh) was used to select potential basins. Secondary criteria for basin selection from the GRDC data set focused on size, geographical distribution and amplitude, and trends in the water storage variations. In terms of the size of the basin, a smaller size would more likely satisfy the assumption of a relatively simple hydrological system. However, due to the coarse resolution of GRACE data (see Sect. 2.1), this requirement had to be compromised. Given these considerations, four basins were selected as focus regions of study: the Colorado River basin (CRB) in North America, the Niger River basin (NRB) in Africa, the Aral Sea basin (ASB) in Asia, and the Lake Eyre basin (LEB) in Australia (Fig. 1).
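The primary selection criterion can be sketched as an area-weighted fraction of arid grid cells within a basin. This is a minimal illustration, assuming an equal-angle latitude and longitude grid in which cell area scales with the cosine of latitude. The names `arid_area_fraction` and `is_candidate` and the input layout are hypothetical, and the secondary criteria (size, location, storage amplitude and trends) are not represented.

```python
import math

# Arid Köppen classes named in the selection criterion.
ARID_CLASSES = {"BWk", "BWh", "BSk", "BSh"}


def arid_area_fraction(cells):
    """Area-weighted arid fraction of a basin.

    `cells` is an iterable of (latitude_deg, koppen_class) pairs, one per
    equal-angle grid cell inside the basin (assumed input layout).
    """
    total = 0.0
    arid = 0.0
    for latitude_deg, koppen_class in cells:
        weight = math.cos(math.radians(latitude_deg))  # relative cell area
        total += weight
        if koppen_class in ARID_CLASSES:
            arid += weight
    return arid / total if total > 0.0 else 0.0


def is_candidate(cells, threshold=0.5):
    """True when more than `threshold` of the basin area is arid."""
    return arid_area_fraction(cells) > threshold


# Hypothetical basin: three arid cells and one temperate cell at the same latitude.
example = [(30.0, "BWh"), (30.0, "BSh"), (30.0, "BSk"), (30.0, "Cfa")]
```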
Average
Left: CSIRO-PML monthly evaporation (
Figure 2 shows the spatially averaged hydrological fluxes over the study
basins, including the sum of surface, sub-surface, and snowmelt runoff (
Even though these four basins were preselected based upon their location
within dryland systems (Wang et al., 2012), they reflect a range of trends in
water storage and precipitation. For example, the Colorado River basin
experienced intervals of wet and dry periods, while the Niger River basin
exhibits a small but steady increase in water storage with a clear seasonal
variability in both water storage and precipitation. Meanwhile, the Aral Sea
basin experienced a significant loss of water during the study period
(
In order to provide a meaningful spatial evaluation of the hydrological consistency between the data sets (i.e. at sub-basin scale) and to ensure that a fair comparison between GRACE data and satellite products could be undertaken, the analysis was carried out in spherical harmonics. The effects of the de-striping filter (see Sect. 2.1) are incorporated into the analysis directly instead of relying on a land surface model, the choice of which can severely impact the results of our analysis in arid regions (Long et al., 2015). In this section, we present a detailed account of how the transformation was carried out, as well as how the actual evaluation of hydrological consistency is performed in spherical harmonics.
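For orientation, a generic truncated spherical harmonic expansion of a field on the sphere is given below. The notation is an assumption made for this illustration and is not necessarily identical to that of Eq. (1): $\theta$ is colatitude, $\lambda$ is longitude, $\bar{P}_{lm}$ are normalized associated Legendre functions, $C_{lm}$ and $S_{lm}$ are the coefficients of degree $l$ and order $m$, and $L$ is the truncation degree.

```latex
f(\theta, \lambda) \approx \sum_{l=0}^{L} \sum_{m=0}^{l}
  \bar{P}_{lm}(\cos\theta)
  \left[ C_{lm} \cos(m\lambda) + S_{lm} \sin(m\lambda) \right]
```

Transforming a gridded precipitation or evaporation field then amounts to estimating the set of $C_{lm}$ and $S_{lm}$ that best reproduces the field up to degree $L$.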
The spherical harmonic analysis refers to the process of solving Eq. (1) for
a set of coefficients
In the analysis so far, the spherical harmonic coefficients have been
computed at the global scale (e.g. Fig. 3). In order to evaluate the
hydrological consistency of the study regions (Fig. 1), the data need to be
masked for the particular study basins. In Swenson and Wahr (2002), an exact
averaging kernel is defined as a function with a value of 1 inside the
boundaries of a region and 0 outside. To isolate the GRACE signal,
an approximated averaging kernel was computed in spherical harmonics and
convolved with the Gaussian filter in order to obtain a spatially
averaged value of the TWSA (at the basin scale). In this study, we instead
compute the spherical harmonic from the product of the global data sets (e.g.
the TWSA or
The spatial agreement between two data sets can be evaluated using spherical
harmonic coefficients by computing the degree correlation measure
(Arkani-Hamed, 1998; Tapley et al., 2004a) following Eq. (5):
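As a hedged illustration of this measure, the degree correlation is commonly computed, for each degree l, as the cross-power of the two coefficient sets divided by the square root of the product of their individual degree powers, giving a value between -1 and 1. The Python sketch below assumes that the cosine and sine coefficients of each data set are stored as nested lists indexed `[l][m]` with m running from 0 to l. The function name `degree_correlation` and this storage layout are choices made for the example only.

```python
import math


def degree_correlation(c_a, s_a, c_b, s_b):
    """Correlation per degree between two spherical harmonic coefficient sets.

    c_x[l][m] and s_x[l][m] hold the cosine and sine coefficients of data
    set x for degree l and order m (m = 0..l). Returns one value per degree.
    """
    correlations = []
    for l in range(len(c_a)):
        orders = range(l + 1)
        cross = sum(c_a[l][m] * c_b[l][m] + s_a[l][m] * s_b[l][m] for m in orders)
        power_a = sum(c_a[l][m] ** 2 + s_a[l][m] ** 2 for m in orders)
        power_b = sum(c_b[l][m] ** 2 + s_b[l][m] ** 2 for m in orders)
        denominator = math.sqrt(power_a * power_b)
        # A degree with no signal in either field has no defined correlation.
        correlations.append(cross / denominator if denominator > 0.0 else float("nan"))
    return correlations


# Two fields that differ only by a scale factor are perfectly correlated at every degree.
c_a = [[1.0], [0.5, -0.2], [0.3, 0.1, 0.4]]
s_a = [[0.0], [0.0, 0.7], [0.0, -0.6, 0.2]]
c_b = [[2.0 * v for v in row] for row in c_a]
s_b = [[2.0 * v for v in row] for row in s_a]
r = degree_correlation(c_a, s_a, c_b, s_b)
```

Because the measure is normalized per degree, it is insensitive to a uniform difference in amplitude between two fields and responds only to differences in their spatial pattern.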
An examination of the evaporation data sets indicates that there are evident
differences across the various products in each of the studied basins (see
Fig. 2). In general, MOD16 simulates lower flux estimates when compared
against both CSIRO-PML and GLEAM, a feature that has been noted in a number
of recent global intercomparison studies (McCabe et al., 2016; Michel et al.,
2016; Miralles et al., 2016). There are also clear differences in terms of
the variability in the temporal response of the models, although CSIRO-PML
and GLEAM show a greater level of agreement in terms of amplitude and timing,
if not in absolute values. For example, during the wet period of 2004–2005
in the Colorado River basin, the response to precipitation reflected in MOD16
was far more rapid than that of either CSIRO-PML or GLEAM. Of some concern
is that CSIRO-PML evaporation exceeds precipitation during much of the study
period in both the Colorado River and Lake Eyre basins, immediately negating
any type of hydrological consistency analysis. In the Niger River basin,
there is more consistent agreement between the evaporation products,
indicating greater confidence in the retrievals of evaporation in this
region. For the Aral Sea basin, the discrepancies in
Overall, even from a qualitative perspective, there are clear challenges in developing a hydrological consistency approach over these comparatively “simple” basins. Indeed, this has been demonstrated in other studies using either satellite data alone or a combination of satellite and ground data. While it is not the intention of the current work to explore the error characterization of these different evaporation models based on hydrological closure, the techniques being used to evaluate product consistency should provide some insight into retrieval quality, at least relative to the other hydrological products (precipitation and gravity-based water storage changes) that the evaporation is being compared against. These ideas are explored more quantitatively in the following sections.
Top: anomalies of the terrestrial water storage (TWSA) observed by
GRACE (with 20 mm uncertainty bounds) and
In this section, we examine the spatial and temporal patterns of the degree
correlations between water storage variations (TWSA) and
The start of the study period (2003) coincided with the end of an intense
multi-year drought in the Colorado River basin (Scanlon et al., 2015). During
the wet period of 2004–2005 (Fig. 2), the basin showed a corresponding
increase in the TWSA (see Fig. 4), although with a delay in time of 2 to
3 months. During this time of increase in the TWSA, there was a corresponding
increase in
Top: anomalies of the terrestrial water storage (TWSA) observed by
GRACE (with 20 mm uncertainty bounds) and
The TWSA in the Niger River basin was characterized by an overall steady
increase (5.79 mm yr
The endorheic Aral Sea basin reflected a historical trend of water loss
during the study period, most likely caused by anthropogenic consumption
related to agricultural activities (Zmijewski and Becker, 2014). Although
there were short intense precipitation events during much of the study period
(Fig. 2), the total annual precipitation showed a negative trend of
Top: anomalies of the terrestrial water storage (TWSA) observed by
GRACE (with 20 mm uncertainty bounds) and
Top: anomalies of the terrestrial water storage (TWSA) observed by
GRACE (with 20 mm uncertainty bounds) and
Another endorheic basin examined here was the Lake Eyre basin, which
experienced a marked increase in precipitation during the rainy seasons of
2009–2011 (Fig. 2), resulting in an increase in water storage anomalies of
about 40 mm yr
For GRACE to identify a water storage increase, the water mass resulting from precipitation needs to accumulate within the catchment beyond a detectable threshold. This accumulation process may take up to several months, during which time the spatially distributed rainfall drains via sub-surface processes or collects in rivers after travelling from different source areas within a basin (Rieser et al., 2010). The apparent lag that GRACE data exhibit relative to faster hydrological processes such as precipitation events has been observed in African basins (Ahmed et al., 2011; Hassan and Jin, 2016) as well as in Australia (Rieser et al., 2010; Wang et al., 2014). The clearest example from amongst the basins studied here is the Niger River basin (Fig. 5), where a lag of 2 months is evident throughout the study period. In other regions, such as the Colorado River basin and the Lake Eyre basin, the time needed to detect water storage changes after precipitation events tends to vary, perhaps due to changing spatial and temporal patterns in precipitation as well as geomorphological characteristics (Ahmed et al., 2011; Wang et al., 2014). Because of their large extent and geographical features, the Colorado River and Aral Sea basins include regions where snow storage plays an important role as a source of delayed runoff. The combination of snowmelt, groundwater flow, and other sources of delayed flow is defined as baseflow (Beck et al., 2013).
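One simple way to explore such a lag, sketched here under stated assumptions, is to shift the storage series by a whole number of months and check which shift gives the best match with the forcing series. The example uses a plain Pearson correlation between two basin-averaged monthly series as a stand-in. It is not the degree correlation used in this study, and the function names and the synthetic series are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    denominator = math.sqrt(var_x * var_y)
    return cov / denominator if denominator > 0.0 else float("nan")


def lagged_correlation(storage, forcing, lag):
    """Correlate storage at month t + lag with forcing at month t (lag >= 0)."""
    if lag > 0:
        return pearson(storage[lag:], forcing[:-lag])
    return pearson(storage, forcing)


def best_lag(storage, forcing, lags=(0, 1, 2, 3)):
    """Lag (in months) that maximizes the lagged correlation."""
    return max(lags, key=lambda k: lagged_correlation(storage, forcing, k))


# Synthetic seasonal forcing and a storage response delayed by 2 months.
months = range(48)
forcing = [math.sin(2.0 * math.pi * t / 12.0) for t in months]
storage = [math.sin(2.0 * math.pi * (t - 2) / 12.0) for t in months]
lag = best_lag(storage, forcing)
```

A constant lag of this kind is a strong simplification, since the delay between precipitation and a detectable storage change may vary through time within a basin.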
To examine this temporal component, at least in a simplified manner, lags of
1, 2, and 3 months were considered for all basins and assumed to remain
constant throughout the study period. In terms of changes to the degree
correlation, for the Niger River basin it was clear that a 2-month lag
produced an improved temporal match between the TWSA and
Top: average degree correlation statistics per study region and
evaporation product. Bottom: GRACE data were shifted by 2 months to match the
phase with
Figure 8 presents a statistical summary of the mean degree correlation values
over the study period, comparing the original analysis with that obtained using a constant
lag of 2 months. The results are presented as boxplots, where the median is
indicated as a bold black line inside a box confined by the first and third
quartiles (bottom and top of the box). The whiskers below and above the first
and third quartiles show a threshold of 1.5 times the inter-quartile range
(IQR), defining a number of outliers outside this range. As already noted,
the Niger River basin showed a significant improvement in
The development of methods and sensors to retrieve the various components of the water cycle has for the most part been undertaken independently of any evaluation against interrelated processes (see McCabe et al., 2008, and Brocca et al., 2014, for some examples of complementary retrieval). Large-scale retrievals of hydrological variables such as evaporation, soil moisture, and rainfall products do not come with well-defined accuracy metrics, let alone uncertainty bounds. This lack of any well-defined error structure associated with individual products complicates the task of product assessment. As such, the question of how to evaluate large-scale data sets remains an outstanding one. This is especially important in the context of global-scale products. While a number of global evaporation (and precipitation) evaluation papers have been published, none seek to identify consistency with related hydrological variables, and focus instead on comparisons against traditional point-scale or tower-based techniques (McCabe et al., 2016; Michel et al., 2016). Given the spatial mismatch between ground observations and large-scale retrievals (and the lack of continuous large-scale coverage of in situ data in remote regions), it is perhaps inappropriate to evaluate these large-scale products in such a manner. Determining whether individual products are at least consistent with each other (i.e. they reflect hydrological expectation) is a needed first step in product assessment. The motivation behind this study was to take a step back and determine whether a first-order hydrological assessment could be achieved. Rather than comparing the uncertainties between the evaporation products and the other hydrological components (which are poorly defined), we attempt to distinguish between the different evaporation products relative to their consistency with precipitation and terrestrial water storage. 
That is, are observed changes or patterns in the evaporation data sets reflected in these other hydrological variables? We explore this approach precisely because of the challenges in quantifying uncertainty based upon traditional in situ methods. As is discussed below, the challenge of how to do this remains, raising some important questions about both product quality and the techniques we use to evaluate global products.
For some regions, especially those where simpler and more defined water cycle behaviour dominates, it is reasonable to expect that significant and consistent inter-product agreement between hydrological components should be observable. To explore this idea, our study focused on basins where such a simplified water budget, consisting of water storage anomalies as a function of precipitation and evaporative fluxes, might be expected to predominate. The aim was to reduce the influence on the analysis of complicating variables such as snow, vegetation changes, large precipitation and streamflow contributions, and other hydrological processes. The assumption was that arid and semi-arid regions would best fit this profile. The role of the degree correlation was to evaluate the spatial agreement between the hydrological components, assuming that any non-closure errors due to unmodelled outflow components (e.g. long-term baseflow or minimal surface runoff) would not affect this measure. However, other sources of errors that directly affect evaporation estimates, such as the choice of algorithm, implicit model assumptions, choice of parameterizations, and an incorrect representation of the land cover, can directly impact the degree correlation measure. Given the relationship between basin size and retrieval accuracy for GRACE data, obtaining a geographical distribution of basins that could satisfy this simplified water budget assumption was non-trivial. Restrictions related to basin size affect the study in two conflicting ways. On the one hand, a large basin will inevitably present complications related to heterogeneity (including in climate zones, as was the case for the Colorado River basin and the Aral Sea basin) and also be more likely to contain areas affected by anthropogenic activities, such as irrigation, land cover changes, and building of dams and reservoirs.
On the other hand, a small catchment is more difficult to evaluate with this consistency approach, given the coarse resolution of most of the global products used here, and especially of the GRACE data. The spatial resolution of GRACE data is further limited by the use of filters to remove errors. Considering these restrictions, a compromise in the selection of study basins was required to allow for at least a narrow range of length scales (500–800 km) to be evaluated.
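To make the measure discussed above concrete, the following is a minimal sketch of a degree correlation between two fields expressed as spherical harmonic coefficients. It assumes the standard per-degree definition (the cross-power at degree n normalised by the power of each field at that degree). The array layout, the function names, and the rule of thumb relating degree n to a half-wavelength of roughly 20 000 km / n are assumptions made for illustration and are not taken from the processing code used in this study.

```python
import numpy as np


def degree_correlation(c1, s1, c2, s2):
    """Correlation per spherical harmonic degree between two fields.

    Each input is an (N+1, N+1) array of cosine (c) or sine (s)
    coefficients indexed [n, m]; entries with m > n are assumed zero.
    Returns an array of length N+1 with one value per degree n.
    """
    c1, s1, c2, s2 = (np.asarray(a, dtype=float) for a in (c1, s1, c2, s2))
    # Cross-power and power of each field, summed over order m.
    cross = np.sum(c1 * c2 + s1 * s2, axis=1)
    power1 = np.sum(c1**2 + s1**2, axis=1)
    power2 = np.sum(c2**2 + s2**2, axis=1)
    denom = np.sqrt(power1 * power2)
    # Degrees without power in either field have no defined correlation.
    out = np.full_like(cross, np.nan)
    return np.divide(cross, denom, out=out, where=denom > 0)


def half_wavelength_km(n):
    """Approximate spatial scale of degree n (assumed rule of thumb)."""
    return 20000.0 / n
```

Under that rule of thumb, the 500–800 km range of length scales noted above would correspond roughly to degrees 25–40.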
In the end, our study consisted of four major globally distributed river basins, including two endorheic systems. Although they mostly have an arid climate in terms of Köppen classification, both the Colorado River and Aral Sea basins include regions with the presence of snow and snowmelt-dominated runoff. While snow storage itself is not a problem, since GRACE detects changes in storage irrespective of their nature (snow, groundwater, soil moisture, etc.), snowmelt may contribute to delayed changes in storage that can affect gravity results. Likewise, evaporation models generally have a difficult time adequately estimating sublimation. However, the inclusion of these basins was considered important in order to test the hydrological consistency concept in regions that deviated from the ideal assumption. Indeed, the influence that snowmelt and other potential sources of lag in the system have is poorly defined and forms part of the motivation to explore the inclusion of a lag response in the GRACE data (see Sect. 4.3).
Apart from the issues of spatial scale, the use of satellite-based hydrological data presents additional challenges and sources of uncertainty to any consistency-based assessment. For instance, because GRACE data are smoothed to remove errors in small-scale terms (i.e. truncation of the spherical harmonic coefficients), the gravity signal contains contamination from outside of the studied basins (leakage), which represents a potential source of uncertainty in areas neighbouring the ocean or high-amplitude signals (particularly if the latter are out of phase with the study basin). Although the LSM-based scaling factor, which is static in time, has been used to correct for bias (e.g. signal reduction) and leakage contamination, dynamic changes in water storage trends outside the basin might still contaminate the signal (Long et al., 2015). In addition, the temporal lag in terrestrial storage response, as documented in previous studies (Rieser et al., 2010; Ahmed et al., 2011; Wang et al., 2014; Hassan and Jin, 2016) and observed in our analysis, represents an important source of potential error (see Sects. 4.3 and 5.2). Product errors are also evident in the precipitation and evaporation data sets. Global rainfall retrievals have well-recognized limitations, including the detection of both high- and low-intensity events (Hou et al., 2014), the discrimination of cloud-free and cloud precipitation scenes, as well as the sensitivity to parameters in the forward model of radiative transfer over different sensors (Stephens and Kummerow, 2007). In terms of evaporation, uncertainties related to algorithm choice, input data variability, and process parameterizations all complicate the accurate estimation of terrestrial fluxes (Ershadi et al., 2015). McCabe et al. (2016) present a thorough description of accuracy issues related to global products. However, it is not the intention of this work to explore these product uncertainty issues in detail.
Determining whether, and to what degree, these sources of product uncertainty affect hydrological consistency studies is an important area requiring further investigation. What is clear from this analysis is that there is still some way to go in terms of being able to confidently assert that any single global product outperforms any other, at least in terms of its inter-product consistency.
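The idea of a time-invariant scaling factor mentioned above can be sketched as a single least-squares gain. The example assumes that the factor is obtained by comparing an unfiltered, model-based storage series with the same series after GRACE-like filtering, and that the gain k minimises the squared difference between the two. This formulation, the function name, and the synthetic numbers are illustrative assumptions and do not describe the exact procedure behind the data used here.

```python
import numpy as np


def gain_factor(unfiltered, filtered):
    """Least-squares k minimising sum((unfiltered - k * filtered)**2)."""
    u = np.asarray(unfiltered, dtype=float)
    f = np.asarray(filtered, dtype=float)
    return float(np.dot(u, f) / np.dot(f, f))


# Hypothetical basin-average storage anomalies (arbitrary units): the
# filtered series is a damped copy of the unfiltered one.
months = np.arange(36)
unfiltered = np.sin(2 * np.pi * months / 12)
filtered = 0.6 * unfiltered
k = gain_factor(unfiltered, filtered)
rescaled = k * filtered
```

Because k is a single constant, it can restore the amplitude lost to filtering but cannot follow storage changes outside the basin that vary over time, which is the limitation noted above.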
In exploring the relationship between GRACE water storage changes and
precipitation and evaporation data, it was evident that water storage
anomalies peaked at a significantly later time than the corresponding
One motivating aspect of this work was to explore whether differences in
available global evaporation products (i.e. satellite-based evaporation
models forced with global input data) impacted the results of the consistency
analysis; that is, could we identify better agreement between water storage
anomalies and
As the focus of the study was to discriminate between evaporation products, the question of whether the choice of precipitation product affected the hydrological consistency analysis was somewhat beyond the scope of this work. However, a preliminary analysis was undertaken by replacing the GPCP precipitation data with another data set and reproducing the analysis. To do this, we processed the Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks (PERSIANN) product, which approximates the spatiotemporal non-linear relationships between physical variables and remotely sensed signals (Hsu et al., 1997). PERSIANN uses data from the long-wave infrared imager onboard the Geostationary Operational Environmental Satellite (GOES) as well as from the Tropical Rainfall Measuring Mission (TRMM) microwave imager (TMI). As shown in Fig. S1 in the Supplement, the results of this new analysis did not reveal any significant difference when compared to those based on the GPCP analysis. Figure S1 shows the average degree correlation statistics per study region and evaporation product, with and without the inclusion of a lag in GRACE data.
Evaluating global evaporation products remains an outstanding challenge. The purpose of implementing a hydrological consistency approach was to explore the evaluation of evaporative fluxes by comparing the spatial patterns between precipitation and changes in water storage. If such an approach could be shown to perform well in a relatively simple hydrological system, the potential for broader-scale application in regions with more complex behaviour would be the next logical step. However, the study showed that even in these relatively simple basins, it was not possible to demonstrate a consistent hydrological agreement between independent observations. Improvements in satellite-based evaporation products are likely to be delivered through advances in algorithm development, increases in the observable resolution, and also via the development of multi-product ensembles (with weighting based on validation analyses and uncertainty assessments). The prospects for improved precipitation monitoring are also promising given the Global Precipitation Measurement mission, which will allow for a more accurate representation of light rain, a challenge that has been a limitation in other precipitation products, including the GPCP (Huffman et al., 2001). Likewise, the next-generation gravity missions (GRACE Follow-On and GRACE II), with the incorporation of improved sensor design (Christophe et al., 2015), are anticipated to provide more accurate estimates of the water storage anomalies, albeit with no significant increase in resolution.
Given the inherent challenges in validating satellite-based products via the use of ground-based observations, a key motivation behind this study was to examine the capacity of independent observations of the water cycle to reflect some form of hydrological consistency. To do this, the study focused on regions where such responses would be most expected: arid and semi-arid regions with a simplified water budget, consisting primarily of precipitation and evaporation, and assuming minimal runoff and other long-term outflow components. Unfortunately, it was determined that, even in these simple environments, hydrological consistency was difficult to obtain. While there were times and locations at which some consistency was observed, there were more at which it was not. The lack of any persistent behaviour is problematic, both in the attempt at independently evaluating remote sensing data and also in any effort to discriminate between individual products. Although there were significant and known differences in evaporation estimates, especially with the MOD16 product in the Colorado River and Aral Sea basins, these differences, whether caused by model parameterizations or by input data, did not seem to play a significant role in the overall level of hydrological consistency.
While not envisioned as providing a comprehensive tool for product evaluation, the approach did help to reveal some interesting spatial and temporal patterns between the studied hydrological variables. In general, the correlation between the satellite products was higher at lower degrees (i.e. larger spatial scales). In simple water cycle systems such as in the Niger River basin, the correlation followed cyclical patterns along with the water storage anomalies, i.e. correlation increased together with water storage anomalies up to the point where these peaked, and then decreased to the point where these were minimal. A similar pattern, but in terms of negative correlations and negative anomalies, was also observed. This indicates that, at the least, the correlations are not random but roughly follow the cyclical hydrological variations within the basin. It is also quite reasonable to expect low agreement when fluxes and/or water storage anomalies are minimal, explaining some of the cyclical nature in the correlation. A lag between GRACE and precipitation data was also considered to account for delayed sources of water storage changes. It was shown that imposing even a simple correction (i.e. a constant phase shift to GRACE data) greatly improved the agreement, both in average degree correlation and variability of the results in time. Implementing techniques to better account for these delayed sources of outflow could prove highly beneficial for the analysis of hydrological consistency.
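The effect of a constant phase shift can be illustrated with a small time-domain sketch. Note that this is a simplification: the analysis in this study applies the lag before computing degree correlations in the spectral domain, whereas the example below uses an ordinary Pearson correlation between two basin-average monthly series. The function names, the synthetic sinusoidal series, and the search range of 0–6 months are all assumptions made for illustration.

```python
import numpy as np


def lagged_correlation(storage, flux, lag):
    """Pearson correlation after shifting `storage` earlier by `lag` steps."""
    storage = np.asarray(storage, dtype=float)
    flux = np.asarray(flux, dtype=float)
    if lag > 0:
        # Pair storage at time t + lag with flux at time t.
        storage, flux = storage[lag:], flux[:-lag]
    return float(np.corrcoef(storage, flux)[0, 1])


def best_constant_lag(storage, flux, max_lag=6):
    """Return the lag (in time steps) with the highest correlation."""
    scores = [lagged_correlation(storage, flux, k) for k in range(max_lag + 1)]
    return int(np.argmax(scores)), scores


# Synthetic example: an annual cycle in the flux, and a storage series
# that responds with a hypothetical 2-month delay.
months = np.arange(48)
flux = np.sin(2 * np.pi * months / 12)
storage = np.sin(2 * np.pi * (months - 2) / 12)
lag, scores = best_constant_lag(storage, flux)
```

In this synthetic case the search recovers the imposed 2-month delay, and the unshifted correlation (`scores[0]`) is visibly lower than the shifted one (`scores[lag]`), mirroring the qualitative improvement described above.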
The lack of persistent agreement in some of the studied basins may be explained in part by the added complexities that limit the validity of the assumption of a simple water cycle, i.e. snowmelt runoff, complex geomorphology or hydrogeology, changing patterns of precipitation, as well as anthropogenic influences on the water system. Other limitations to exploiting the hydrological consistency approach include the many challenges that still exist in the large-scale retrieval of precipitation, evaporation, and GRACE data, all of which complicate a thorough interpretation of product uncertainty. Despite these challenges, the expectation is that retrievals of global and regional products will inevitably improve with advances in resolution, process understanding, and forcing data accuracy. In concert with such product improvements, the way in which we evaluate remotely sensed variables should also evolve beyond the relatively simplistic comparisons against in situ data that form the basis of most current assessments. Such a strategy would include evaluation against related hydrological variables, reflecting the underlying rationale of hydrological consistency and hydrological closure studies. Only by implementing a more comprehensive evaluation framework in our assessment schemes will greater confidence in component retrievals be realized.
The MOD16 data set used in
this study (Mu et al., 2011) is publicly available as HDF files at
The spherical harmonic analysis code developed by Wang et al. (2006) is
available at
The authors declare that they have no conflict of interest.
Research reported in this publication was supported by the King Abdullah University of Science and Technology (KAUST). The authors would also like to thank the editor Marc Bierkens, as well as Rogier Westerhoff and an anonymous reviewer for their valuable comments which helped improve the manuscript.
Edited by: M. Bierkens
Reviewed by: R. S. Westerhoff and one anonymous referee