Hydrology and Earth System Sciences Discussions 2-d Empirical Mode Decomposition on the Sphere, Application to the Spatial Scales of Surface Temperature Variations

Papers published in Hydrology and Earth System Sciences Discussions are under open-access review for the journal Hydrology and Earth System Sciences Abstract The Empirical Mode Decomposition (EMD) is applied here in two dimensions over the sphere to demonstrate its potential as a data-driven method of separating the different scales of spatial variability in a geophysical (climatological/meteorological) field. After a brief description of the basics of the EMD in 1 then 2 dimensions, the principles of its 5 application on the sphere are explained, in particular via the use of a zonal equal area partitioning. The EMD is first applied to a artificial dataset, demonstrating its capability in extracting the different (known) scales embedded in the field. The decomposition is then applied to a global mean surface temperature dataset, and we show qualitatively that it extracts successively larger scales of temperature variations related for example 10 to the topographic and large-scale, solar radiation forcing. We propose that EMD can be used as a global data-adaptative filter, which will be useful in analyzing geophysical phenomena that arise as the result of forcings at multiple spatial scales.

For the last 50 years or so a great deal of effort has been put into the development, refinement and application of statistical methods that help to achieve the goal of separating signal from noise and identifying the preferred scales of variations in both time and space.
In one dimension, variations can be separated using Fourier series, wavelets, Singular Spectrum Analysis, etc.The power spectrum of a one dimensional time-series can be decomposed into its preferred frequency components, hence the preferred time-scales of variations.However, though widely used, these methods rely on quite strong assumptions: most especially stationarity and/or the choice of predefined basis functions.
In space, the problem of scale separation is much more difficult.If time is a natural, sensible direction along which perform the analysis in one dimension, in two dimensional space there is no obvious and natural preferred direction.So on a global scale, it would be arbitrary, although convenient if one is using Fourier or Wavelet methods, to perform scale separation following e.g.only latitudes and longitudes.Furthermore, although there are definite periodicities in the rotation of the Earth, weather and climate are not truly periodic, so the use of a lat-long co-ordinate system is pertinent only when one searches a-priori for given structures (e.g.wavelike patterns at high latitudes).However, this analysis presupposes that we have an a-priori knowledge of the form of the embedded scales.
Until recently, there has been no objective method that could allow for a true separation of the different scales of variations of a geophysical field in space.For example, Empirical Orthogonal Functions (EOFs) and their derivatives have been widely used in climatology/meteorology (e.g.Preisendorfer, 1988).They seek to maximise the variance in time, associated with a set of hierarchical EOF (spatial patterns) to Principal Components (time-series) that describe the variations in sign and amplitude of these patterns, but as such EOFs potentially mix different scales of variations in both in time and space.Singular Spectrum Analysis (Elsner and Tsonis, 1996), has also been used in this context for a number of years.
Recently a new method, namely Empirical Mode Decomposition (called EMD hereafter) was introduced by Huang et al. (1998) for the analysis of 1-D signals in the context of time-series analysis.The basic idea of EMD is to allow for an adaptive and unsupervised representation of the basic components of linear and non-linear signals and is designed to accommodate non-stationarity in the series.
This method has recently been extended to two dimensions (Linderheld, 2002;Nunes et al., 2003).The 2-D decomposition has also been used in the context of the analysis of radar rainfall fields at high resolution by Sinclair and Pegram (2005) where a full justification for the method is available, an outline will be offered later in this paper.
One of the difficulties associated with 1-D EMD is the treatment of the ends of the bounding functions, requiring some (non-objective) decisions to be made in order to proceed (Chiew et al., 2005;Peel et al., 2005;Pegram et al., 2008).In the application of 2-D EMD in radar fields, the edges of the wet areas are well defined, so the ending problem is solved, since the envelopes of the extrema at any stage of the decomposition process are set implicitly to zero.
The beauty of decomposing a field on the sphere is that there are no edge or end effects.What is more, on the sphere, it is particularly advantageous to dispense with a formal coordinate system and to compute the distances between points of interest (extrema in the decomposition process), as one would do in Kriging a field of randomly arranged points in 2-D.It is precisely this thinking that allows us to treat each observation with the same weight, in contrast with methods applied to the lat-long grid which cannot sensibly ascribe the same weight to values at high as well as low latitudes.
The goal of this paper is to demonstrate how 2-D EMD can be applied on the sphere to separate different spatial scales embedded in a field.2-D EMD will be applied in this context as a data-adaptive spatial filter to successively remove the smaller scale variations and retain only the larger ones.
The paper is organized as follows: Sect. 2 presents the zonal equal area partitioning of the sphere, used to adequately represent the data on the sphere on equal areas for the requirements of the EMD analysis.Section 3 presents an outline of the basic principles and algorithm of the EMD in one dimension, while the fourth section presents the principles of its extension to the 2-D case on a sphere.The algorithm is then applied on a simple artificial dataset to demonstrate its viability in Sect. 5.In Sect.6, a first application on the Surface Temperature field obtained from a reanalysis is then presented and we show how EMD on the sphere can be used to filter out the local scales of variations and retain only the larger ones.Conclusions, future developments and potential applications are discussed in Sect.7.

The zonal equal area partitioning of the sphere
The first step is to find an appropriate projection to represent/map the data on the sphere for the purpose of 2-D EMD analysis.
Global-scale climatological data, whether it comes from observations, numerical model outputs, reanalyses, etc. are usually provided on a regular grid, for which the number of longitude points per latitude is always the same, so that for a 2.5 • resolution, one obtains a 72×144 matrix.This presents a set of problems: i) the data points are not representative of equal areas; at the equator the mean area for a standard 2.5 • resolution grid is 12 345 square kilometers, while near the poles, at 87.5 degrees, it is 538 km 2 , i.e. reduced by a factor of 23 ii) the data at the pole are surrogate, stretched along the longitudes to match the regular grid.
The properties of the physical space that are required for EMD to work without supervision are two: i) continuous (no edges) ii) the points should be relatively equidistant, i.e. representative of a roughly equal area.Thus, in order to apply the EMD on the sphere, we need to first resample/interpolate the original data onto a physical partitioning of the sphere that matches these requirements.
Recently, a method for equal area partitioning of the sphere that respects these requirements has been presented (Leopardi, 2006).A Recursive Zonal Equal Area partition is a partition of S dim (the unit sphere in the dim+1 Euclidean space R dim+1 ) into a finite number of regions of equal area.The area of each region is defined using the Lebesgue measure inherited from R dim+1 .Each region is defined as the product of intervals in spherical polar coordinates.The centre point of a region (element) is defined via the centre of each interval, with the exception of the polar caps, where the centre of the region is simply the centre of the spherical cap (hence the geographical pole).One of the advantages of this partitioning (in contrast to triangulation which was considered but rejected) is that each edge of the spherical trapezium lies on a latitude or longitude, making interpolation from a lat-long grid relatively easy.Figure 1 provides an example of a partition of S 2 into 100 elements of equal area.
For a partition of the sphere into 6500 elements, the number of latitude bands is 73, and the maximum number of longitude separations (symmetrical each side of the equator at ±1.2697 • ) is 144, which fortuitously fits nicely with the 2.5 • resolution of the data examined in Sect.6.The regular grid was then interpolated onto this equal area partition, using the average value of the points whose centres fall in the target area -a variation of the nearest neighbour technique; at the equator there is an exact match, while at the poles, 144 points are averaged on the target.

The basic principles of the EMD in one dimension
The basic idea embodied in the EMD analysis, as introduced by Huang et al. (1998), is to allow for an adaptive and unsupervised representation of the intrinsic components of linear and non-linear signals based purely on the properties observed in the data, without appealing to the concept of stationarity.As Huang et al. (1998) point out in their abstract: "This decomposition method is adaptive and therefore highly efficient.Since the decomposition is based on the local characteristic time scale of the data, it is applicable to nonlinear and non-stationary processes."Few sequences of observations of natural phenomena are long enough to test the hypothesis of stationarity and frequently, the phenomena are patently non-stationary.The EMD algorithm copes with stationarity (or the lack of it) by ignoring the concept, embracing non-stationarity as a practical reality.For a fuller discussion of the genesis of these ideas, see the Introduction of Huang et al. (1998), who also heuristically demonstrate the implicit orthogonality of the sequences of Intrinsic Mode Functions (IMFs) defined by the EMD algorithm.
In the application of the 1-D EMD algorithm, the possibly nonlinear signal, which may exhibit varying amplitude and local frequency modulation, is linearly decomposed into a finite number of mutually quasi-orthogonal, zero mean, frequency and amplitude modulated signals, as well as a residual function which (i) exhibits a single extremum, (ii) is a monotonic trend or (iii) is simply a constant.Although EMD is a relatively new data analysis technique, its power and simplicity have encouraged its application in many fields.It is beyond the scope of this paper to give a complete review of the applications, however a few examples are cited here to give the reader a feeling for the broad scope of applications.Chiew et al. (2005) examine the one-dimensional EMD of several annual streamflow time series to search for significant trends in the data, using bootstrapping to test the statistical significance of identified trends.The technique has been used extensively in the analysis of ocean wave data (Huang et al., 1999;Hwang et al., 2003) as well as in the analysis of polar ice cover (Gloersen and Huang, 2003).EMD has also been applied in the analysis of seismological data by Zhang et al. (2003) and has even been used to diagnose heart rate fluctuations (Balocchi et al., 2004).The algorithm for 1-D EMD is readily available in the cited literature, so is not repeated here; the 2-D extension is based on the 1-D algorithm and is elaborated in Sect. 4.

The extension of EMD to 2 dimensions on the sphere
The 2-D EMD process is conceptually the same as in one dimension, except that the curve fitting of envelopes to the maxima and minima now becomes a surface fitting exercise and the identification of the local extrema is performed in space to take into account for the connectivity of the points.There is one difference in performing EMD on the sphere as against on a surface with edges, like a flat map: there are no edges requiring special treatment as the surface is fully continuous.The introduction to the 2-D EMD method applied on a 2-D plane, and references to earlier supporting work, are fully described by Sinclair and Pegram (2005), so only the essentials are repeated here.
2-D EMD provides a truly two-dimensional analysis of the intrinsic oscillatory modes inherent in the data.Twodimensional Fourier and Wavelet analyses are really applications of their one-dimensional counterparts in a number of principal directions.Fourier analysis concentrates on orthogonal "East-West" and "North-South" directions (e.g.Press et al., 1992).Wavelet analysis can, in general, consider any direction of the wavelet relative to the data, however a typical 2-D Wavelet analysis examines only horizontal, vertical and diagonal orthonormal wavelet basis functions (Daubechies, 1992, pp. 313;Kumar and Foufoula-Georgiou, 1993).In contrast, EMD produces a fully two dimensional decomposition of the data, based purely on spatial relationships between the extrema, independent of the orientation of the coordinate system in which the data are viewed.There is also a small change in terminology: in place of Implicit Mode Functions (IMFs) in 1-D EMD, we choose to use Implicit Mode Surfaces (IMSs) in 2-D EMD.

Description of the algorithm
The algorithm follows intuitively from the one-dimensional case.We consider a two-dimensional data-set z(θ, α), where θ and α describe the location of a point in 2-D space, or The decomposition may then be obtained as follows: 1. Set the (zeroth) residual r 0 (θ, α)=z(θ, α) and set i=1 2. Locate the extrema of r i−1 (θ, α) in the 2-D space including maximal and minimal plateaux.On a rectangular grid, this is done by finding those points in the center of a 3×3 square of pixels which are larger/smaller than any of the eight surrounding points.On the sphere partitioned using the Zonal Equal Area algorithm, it is sufficient to identify the extrema relative to the centres of the six closest surrounding elements, because of the staggering of the successive rings of elements.
3. Generate the bounding envelopes to the extrema, max i−1 (θ, α) and min i−1 (θ, α), using appropriate surface fitting techniques.We suggest Kriging with an exponential semivariogram whose correlation length is set equal to double the average spacing between the elements at each stage of the decomposition.This choice yields a relatively smooth surface without isolated peaks.
4. Compute the mean surface function of the local average value of the upper and lower envelopes fitted to the extrema, 2 5. Determine the first estimate of an IMS by subtracting the mean surface from the current residual IMS i (θ, α)=r i−1 (θ, α)−m i−1 (θ, α).
6. Iterate over steps 2-5 until IMS i is close to zero everywhere (setting r i−1 =IMS i before each iteration).
8. If the Residual r i (θ, α) is a constant or has a single extremum, then stop; else increment i and return to step 2.

Surface fitting for extremal envelope generation
The generation of maximal and minimal envelopes is of key importance to a successful 2-D EMD implementation and is the most computationally intensive task.The problem is a familiar one of collocating a smooth surface to randomly scattered data points in two-dimensions.There are several options available to achieve this.Ultimately the fitting procedure reduces to computing the unknown value of the surface at a point s i =(x i , y i ), by some linear (or nonlinear) weighting of the known data.In general, a basis function determines the influence of each known data point based on its spatial position relative to the unknown point s i .Nunes et al. (2003) and Delechelle et al. (2005) use radial basis functions while Linderhed uses bi-cubic splines (Linderhed, 2002) and later chooses the more suitable option of Thin Plate Splines (Linderhed, 2004).We tried radial basis functions (technically, conical Multiquadrics), which are identical to Kriging (Cressie, 1991) with a purely linear semivariogram model, but they produced spurious local extrema which only showed up in the IMSs.We decided it would be more appropriate to select an exponential variogram to model the envelopes of the maxima and minima.We choose a correlation length equal to the average distance between extrema at any stage of the decomposition, invoking Occam's razor in the spirit of Huang's original derivation of EMD.The bounding surfaces are then generated using Ordinary Kriging, ensuring that the surfaces are constrained to tend towards the mean surface of the extrema at distance.In essence, the 2-D EMD algorithm on the sphere provides a complete quasi-orthogonal decomposition of the data, hence allows for a summation of the data over a range of scales, thus providing, by analogy to the digital filter of one-dimensional vectors, a data-adaptive spatial filter.

Application to an artificial example
In order to test the 2-D algorithm on the sphere in controlled conditions, an artificial global dataset was first created by summing two different fields with contrasting patterns and different spatial scales.
The first field (Fig. 2a and b) consists of a simple meridional gradient from the North to the South pole.It constitutes the largest spatial scale of the artificial field.
The second field (Fig. 2c) has been constructed as a mix of scaled and translated Gaussian functions.It is assumed here to represent relatively small scale variations of the scalar field embedded in the larger spatial scale presented by the above gradient.
The artificial field analysed simply consists of the sum of the two interpolated fields and is presented in Fig. 2d.
The 2-D EMD algorithm was then applied to this artificial dataset to see if it is able to retrieve the two different fields that constitute it.
3 IMS were required before a constant field (of value 38.3) was obtained as a residual.Their variances are respectively 129.0, 10.0 and 125.3.They are presented in Fig. 3. Clearly the 1st IMS (Fig. 3a) accounts directly for the first scale embedded in the field, while the 3rd IMS (Fig. 3c) represents the large-scale gradient.The 2nd IMS (Fig. 3b) is evidently unrealistic as it represents a pattern that has not been imposed in the field.However its variance is extremely low and negligible in comparison to the two other IMS and thus can be legitimately absorbed as a minor perturbation to either of the others.In summary, 2-D EMD on the sphere has been successful in retrieving the two fields constituting the artificial field that we have constructed.In the next section we apply the same algorithm to the surface temperature.2b and c, interpolated onto a zonal equal area partitioning of the sphere using 6500 points.The surface air temperature (at 2 m) used here has been extracted from the ERA 15 reanalysis project (http://www.ecmwf.int).It is available on a 2.5×2.5 • regular grid from at a monthly time-step from January 1979 to December 1993.We make use here of the long-term annual mean computed from the monthly values.The field, shown interpolated on the zonal equal area partition of the sphere, is presented in Fig. 4. Surface air temperature is the ideal real-life example for the application of the EMD on the sphere, as its variations in space are the combination of different forcings operating at different scales, from small-scale temperature contrasts to the global scale related to the temperature gradient from the tropics to the poles.
The 2-D algorithm is then applied to the temperature field presented in Fig. 4 in the same way as it was applied to the artificial dataset presented in Sect. 5.
Here 5 IMSs are extracted before obtaining a residual constant field (of value 286.3).The variance explained by each IMS is presented in Table 1.
The 2 last IMSs have a very low variance and can safely be pooled with the 5th residual to form the 3rd residual.
The first IMS (Fig. 5a) represents the high frequency signal embedded in the temperature field.The most striking features are the local high spatial variances associated with high mountainous areas such as the Andes and the Himalayas.The Antarctic and Greenland Ice-Caps and the strong temperature gradients associated with the relatively warmer surrounding ocean masses are also responsible for large local variability.As such and as expected, the 1st IMS accounts mainly for the small scale variability in the surface temperature field associated with local forcings.
The second IMS (Fig. 5b) represents clearly larger scale variations and has the highest variance.The variance is related to a large extent to the temperature gradient from the tropics to the poles, that is forced by the distribution of solar radiation.A very prominent feature also extracted on this IMS is the presence of large areas of high temperature values found in the Pacific Ocean, which we attribute to the warm pool.
The third IMS (Fig. 5c), is related again to the large-scale gradient from the tropics to the poles, minus the prominent warm pool region that has been extracted in the 2nd IMS.
The results presented above confirm that the 3 first IMSs qualitatively represent three different and hierarchical scales of the spatial organization of the global mean temperature: local to regional scale variations linked to the topography and the presence of ice-caps, the prominent warm pool region in the pacific, and the large-scale gradient from the tropics to the poles related to mean solar forcing.
As the sum of the all the IMSs plus the final residual is exactly equal to the original field (see Sect. 3), one then can use 2-D EMD as a filter to remove from a field the high frequency variations, as a data-adaptative spatial filter.The field in Fig. 6 has been obtained by summing IMSs 2 to 5 with the 5th residual and is therefore equal to the first residual.The effect is to filter out the smaller scale topographically forced variations extracted in the 1st IMS.The resulting field shows the contribution of large scale forcings to global mean surface temperature.

Conclusions
Empirical Mode Decomposition has been applied to the analysis of a two-dimensional field on the sphere, taking advantage of the data-adaptive nature of the method and the absence of edge or end effects on the sphere.In addition, the interpolation of data from a regular grid to a zonal equal area partition of the sphere allows each observation to be treated with the same weight.
An artificial dataset was analyzed and the results demonstrate the ability of EMD on the sphere to extract the different scales of spatial variations contained in a global field.The same algorithm is thereafter applied to a real data set consisting of the long-term mean (1979)(1980)(1981)(1982)(1983)(1984)(1985)(1986)(1987)(1988)(1989)(1990)(1991)(1992)(1993) of annual surface temperatures as given by the ECMWF ERA 15 reanalysis system.The first three IMSs obtained from the decomposition of the temperature field represent schematically three different and hierarchical scales of the spatial organization of the global temperature: (i) local to regional scale variations linked to the topography and the presence of icecaps, (ii) the prominent warm pool region in the Pacific, and (iii) the largescale gradient from the tropics to the poles related to mean solar forcing.The temperature field was then reconstructed using only the large scale IMSs and the final residual, to demonstrate the ability of EMD to act as a data-adaptive spatial filter, able (for example) to filter out the smaller scale variations related to local forcings.The potential applications of EMD on the sphere for climate data are numerous: in the context of statistical analysis of space-time climate variability, the algorithm can be used prior to traditional techniques (such as PCA/EOF) to uncover the intrinsic spatial scales contained in a field and filter out the unwanted scales of variations.EMD could also be used, in the context of numerical simulations on Atmospheric General Circulation Models, to create forcing fields having predefined scales of spatial variation.We propose EMD on the sphere as an appropriate a data-adaptive and largely datadriven exploratory analysis tool for global geophysical data.

Fig. 1 .
Fig. 1.A Zonal Equal Area partition of the sphere using 100 points.

Fig. 2 .
Fig. 2. (a) 1st field consisting of a meridional gradient from the North to the South pole:plotted on a regular 73×144 grid.(b) 1st field consisting of a meridional gradient from the North to the South pole: interpolation onto a zonal equal area partitioning of the sphere using 6500 points.(c) 2nd field created by scaling and translating Gaussian functions: interpolation onto a zonal equal area partitioning of the sphere using 6500 points.(d) Artificial field produced by summing the fields shown in Fig.2b and c, interpolated onto a zonal equal area partitioning of the sphere using 6500 points.

Fig. 3 .
Fig. 3. (a) The 1st IMS of the artificial field, it's variance is equal to 129.0.(b) The 2nd IMS of the artificial field, it's variance is equal to 10. (c) The 3rd IMS of the artificial field, it's variance is equal to 125.3.

Fig. 5 .
Fig. 5. (a) The 1st IMS produced when performing EMD on the ERA 15 surface temperature field.(b) The 2nd IMS produced when performing EMD on the ERA 15 surface temperature field.(c) The 3rd IMS produced when performing EMD on the ERA 15 surface temperature field.

Fig. 6 .
Fig. 6.Temperature field reconstructed from the sum of IMSs 2 to 5 and the 5th residual.This is equivalent to the first residual.