Articles | Volume 25, issue 1
https://doi.org/10.5194/hess-25-321-2021
Technical note | 19 Jan 2021

Technical Note: Improved partial wavelet coherency for understanding scale-specific and localized bivariate relationships in geosciences

Wei Hu and Bing Si
Abstract

Bivariate wavelet coherency is a measure of correlation between two variables in the location–scale (spatial data) or time–frequency (time series) domain. It is particularly suited to geoscience, where relationships between multiple variables differ with locations (times) and/or scales (frequencies) because of the various processes involved. However, it is well-known that bivariate relationships can be misleading when both variables are dependent on other variables. Partial wavelet coherency (PWC) has been proposed to detect scale-specific and localized bivariate relationships by excluding the effects of other variables but is limited to one excluding variable and provides no phase information. We aim to develop a new PWC method that can deal with multiple excluding variables and provide phase information. Both stationary and non-stationary artificial datasets with the response variable being the sum of five cosine waves at 256 locations are used to test the method. The new method was also applied to a free water evaporation dataset. Our results verified the advantages of the new method in capturing phase information and dealing with multiple excluding variables. Where there is one excluding variable, the new PWC implementation produces higher and more accurate PWC values than the previously published PWC implementation that mistakenly considered bivariate real coherence rather than bivariate complex coherence. We suggest the PWC method is used to untangle scale-specific and localized bivariate relationships after removing the effects of other variables in geosciences. The PWC implementations were coded with Matlab and are freely accessible (https://figshare.com/s/bc97956f43fe5734c784, last access: 14 January 2021).

1 Introduction

Geoscience data, such as the spatial distribution of soil moisture in undulating terrains and time series of climatic variables, usually consist of a variety of transient processes with different scales or frequencies that may be localized in space or time (Torrence and Compo, 1998; Si, 2008; Graf et al., 2014). For example, time series of air temperature usually fluctuate periodically at different scales (e.g., daily and yearly), but abrupt changes in air temperature (e.g., extremely high or low) may occur at certain time points as a result of extreme weather and climate events (e.g., heat and rain). Wavelet methods are widely used to detect localized features of geoscience data.

Wavelet analyses are based on the wavelet transform using the mother wavelet function, which expands spatial data (or time series) into location–scale (or time–frequency) space for identification of localized intermittent scales (or frequencies). For convenience, we will mainly refer to location and scale irrespective of spatial or time series data unless otherwise mentioned. Bivariate wavelet coherency (BWC) is widely accepted as a tool for detecting scale-specific and localized bivariate relationships in a range of areas in geoscience (Lakshmi et al., 2004; Si and Zeleke, 2005; Das and Mohanty, 2008; Polansky et al., 2010; Biswas and Si, 2011). The BWC partitions correlation between two variables into different locations and scales, which are different from the overall relationships at the sampling scale as shown by the traditional correlation coefficient. For example, BWC analysis indicated that soil water content of a hummocky landscape in the Canadian Prairies was negatively correlated with soil organic carbon content at a slope scale (50 m), but they were positively correlated at a watershed scale (120 m) in summer because of the different processes involved at different scales (Hu et al., 2017b). Because the positive correlation may cancel out with the negative one at different scales and/or locations, the traditional correlation coefficient between soil water content and soil organic carbon content does not differ significantly from zero, which can be misleading.
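The cancellation described above can be illustrated with a minimal sketch. The two series below are hypothetical (they are not the soil data of Hu et al., 2017b): they are built to be perfectly in phase at a scale of 8 and perfectly out of phase at a scale of 32, so the overall correlation coefficient is close to zero even though each scale carries a perfect relationship.

```python
import math

def pearson(a, b):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)

n = 256
small = [math.cos(2 * math.pi * k / 8) for k in range(n)]    # scale-8 component
large = [math.cos(2 * math.pi * k / 32) for k in range(n)]   # scale-32 component

# Hypothetical pair: in phase at scale 8, out of phase at scale 32.
y = [a + b for a, b in zip(small, large)]
x = [a - b for a, b in zip(small, large)]

r_overall = pearson(y, x)                          # the two scales cancel
r_small = pearson(small, small)                    # +1 at scale 8
r_large = pearson(large, [-b for b in large])      # -1 at scale 32
```

A scale-resolved measure such as BWC is designed to recover the two opposite relationships that the single coefficient `r_overall` hides.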

Recently, Hu and Si (2016) extended BWC to multiple wavelet coherence (MWC) that can be used to untangle multivariate (≥3 variables) relationships in multiple location–scale domains. This method has been successfully used in hydrology (Hu et al., 2017b; Nalley et al., 2019; Su et al., 2019; Gu et al., 2020; Mares et al., 2020) and other areas such as soil science (Centeno et al., 2020), environmental science (Zhao et al., 2018), meteorology (Song et al., 2020), and economics (Sen and Choudhury, 2020). The MWC application has shown that an increased number of predictor variables does not necessarily explain more variations in the response variable, partly because predictor variables are usually cross-correlated (Hu and Si, 2016). For the same reason, bivariate relationships can be misleading if the predictor variable is correlated with other variables that control the response variable. Partial correlation analysis is one such method to avoid the misleading relationships resulting from the interdependence between predictor and other variables (Kenney and Keeping, 1939). For example, soil water content of the root zone was found to be positively related to grass yield throughout the year in a small watershed on the Chinese Loess Plateau (Hu et al., 2017a). This was because higher grass yield usually coincided with finer soils that usually have higher water-holding capacity. After removing the effects of other factors including sand content, partial correlation analysis indicated that soil water content was negatively affected by grass yield during growing seasons and not affected by grass yield during non-growing seasons as expected. The study of Hu et al. (2017a) clearly demonstrated that partial correlation analysis can be an effective method to avoid misleading relationships between response (e.g., soil water content) and predictor variables (e.g., grass yield) when the latter was interdependent with other variables (e.g., sand content). 
However, the extension of partial correlation to the multiple location–scale domain is limited. In order to better understand the bivariate relationships at various scales and locations, BWC needs to be extended to partial wavelet coherency (PWC) by eliminating the effects of other variables.
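As a minimal sketch of the classical partial correlation idea referred to above, the following example uses small hypothetical series (not the data of Hu et al., 2017a). The variable `z` plays the role of a confounding factor: the bivariate correlation between `y` and `x` is positive, but it becomes negative once the effect of `z` is removed with the standard first-order partial correlation formula.

```python
import math

def pearson(a, b):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)

def partial_corr(y, x, z):
    """First-order partial correlation between y and x after excluding z."""
    r_yx, r_yz, r_xz = pearson(y, x), pearson(y, z), pearson(x, z)
    return (r_yx - r_yz * r_xz) / math.sqrt((1 - r_yz ** 2) * (1 - r_xz ** 2))

# Hypothetical data: z and e are orthogonal, zero-mean patterns.
z = [1, -1, 1, -1, 1, -1, 1, -1]
e = [1, 1, -1, -1, 1, 1, -1, -1]
x = [zi + ei for zi, ei in zip(z, e)]        # predictor depends on z
y = [3 * zi - ei for zi, ei in zip(z, e)]    # response: driven by z, reduced by e

r_bivariate = pearson(y, x)        # positive, because both follow z
r_partial = partial_corr(y, x, z)  # negative once z is excluded
```

PWC applies the same logic to complex wavelet spectra, separately at every scale and location.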

BWC was extended to PWC by Mihanović et al. (2009). Their method has been widely used in marine science (Ng and Chan, 2012a, b), meteorology (Tan et al., 2016; Rathinasamy et al., 2017), and economics (Aloui et al., 2018; Altarturi et al., 2018; Wu et al., 2020), as well as in the study of greenhouse gas emissions (Jia et al., 2018; Li et al., 2018; Mutascu and Sokic, 2020), among others. For example, PWC analysis indicated that the Southern Oscillation Index and Pacific Decadal Oscillation did not affect precipitation across India; the BWC analysis had suggested otherwise because both indices are interdependent with Niño 3.4, which affects precipitation (Rathinasamy et al., 2017). Unfortunately, the PWC implementation in many previous studies (Ng and Chan, 2012b; Rathinasamy et al., 2017; Aloui et al., 2018; Altarturi et al., 2018; Jia et al., 2018; Li et al., 2018; Mutascu and Sokic, 2020; Wu et al., 2020) was based on an incorrect Matlab code developed by Ng and Chan (2012a), who might have misinterpreted the equation of Mihanović et al. (2009) and mistakenly used bivariate real coherence rather than bivariate complex coherence for calculating PWC. Moreover, Mihanović et al. (2009) considered only one excluding variable (i.e., a variable whose influence on the response variable is removed) and did not include the phase angle difference between response and predictor variables. The PWC values between response and predictor variables can still be misleading if more than one variable is interdependent with the predictor variable. This is especially true if these variables are correlated with the predictor variable at different locations and/or scales. Without phase information, it is hard to tell whether the correlation at a given location and scale is positive or negative.

As an extension of previous studies (Mihanović et al., 2009; Hu and Si, 2016), this paper aims to develop a PWC method that considers more than one excluding variable and provides phase information. This new method reveals the magnitude and type of bivariate relationships after removing the effects from all potentially interdependent variables. We expect that the new method will produce more accurate PWC values than the implementation of Ng and Chan (2012a) when there is one excluding variable. The new method is an extension of the multivariate partial coherency in the frequency (scale) domain (Koopmans, 1974). The proposed method is first tested with artificial datasets following Yan and Gao (2007) and Hu and Si (2016) to demonstrate its capability of capturing the known relationships of the artificial data. Then it is applied to a real dataset, i.e., time series of free water evaporation at the Changwu site in China (Hu and Si, 2016). Finally, the advantages and weaknesses of the new method are discussed by comparing it with the previous PWC method (Mihanović et al., 2009) and implementation (Ng and Chan, 2012a).

2 Theory

Wavelet analysis is based on the wavelet transform, which includes continuous wavelet transform and discrete wavelet transform. While the discrete wavelet transform is mainly used for data compression and noise reduction, the continuous wavelet transform is widely used for extracting scale-specific and localized features, as in the case of this study (Grinsted et al., 2004). The wavelet transform decomposes the spatial data (or time series) into a set of location- and scale-specific wavelet coefficients, which are scaled (contracted or expanded) and shifted versions of mother wavelets. Different mother wavelets are available for wavelet transform, among which the Morlet wavelet, composed of a complex exponential multiplied by a Gaussian window, provides a good balance between location and scale localization. Therefore, continuous wavelet transform with the Morlet wavelet is suitable for transforming spatial data (or time series) into a location–scale (or time–frequency) domain, which allows us to identify both location-specific amplitude and phase information of wavelet coefficients at different scales (Torrence and Compo, 1998). Wavelet coefficients and their complex conjugates are used to calculate auto-wavelet power spectra and cross-wavelet power spectra. BWC is calculated as the ratio of smoothed cross-wavelet power spectra of two variables to the product of their auto-wavelet power spectra (Grinsted et al., 2004). Hu and Si (2016) extended wavelet coherence from two to multiple (≥3) variables and developed MWC. Detailed information on the calculations of wavelet coefficients, auto- and cross-wavelet power spectra, BWC, and MWC based on the continuous wavelet transform can be found in previous studies (e.g., Torrence and Compo, 1998; Grinsted et al., 2004; Si and Farrell, 2004; Si, 2008; Hu and Si, 2016; Hu et al., 2017b). Here, we will only introduce the theory and calculation that are most relevant to PWC.

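Similarly to BWC and MWC, PWC is calculated from auto- and cross-wavelet power spectra for the response variable $y$, predictor variable $x$, and excluding variables $Z$ ($Z = \{Z_1, Z_2, \ldots, Z_q\}$). Koopmans (1974) developed the multivariate complex PWC in the frequency (scale) domain. Here, we extend the Koopmans (1974) method from the frequency (scale) domain to the time–frequency (location–scale) domain. Therefore, the complex PWC between $y$ and $x$ after excluding variables $Z$ at scale $s$ and location $\tau$, $\gamma_{y,x\cdot Z}(s,\tau)$, can be written as

$$\gamma_{y,x\cdot Z}(s,\tau) = \frac{\left[1 - R^2_{y,x,Z}(s,\tau)\right]\gamma_{y,x}(s,\tau)}{\sqrt{\left[1 - R^2_{y,Z}(s,\tau)\right]\left[1 - R^2_{x,Z}(s,\tau)\right]}}, \tag{1}$$

where the symbol "$\cdot$" is the notation for excluding variables; $R^2_{y,x,Z}(s,\tau)$, $R^2_{y,Z}(s,\tau)$, and $R^2_{x,Z}(s,\tau)$ can be calculated by following Hu and Si (2016) as

$$R^2_{y,x,Z}(s,\tau) = \frac{\overleftrightarrow{W}^{y,Z}(s,\tau)\,\overleftrightarrow{W}^{Z,Z}(s,\tau)^{-1}\,\overline{\overleftrightarrow{W}^{x,Z}(s,\tau)}}{\overleftrightarrow{W}^{y,x}(s,\tau)}, \tag{2}$$

$$R^2_{y,Z}(s,\tau) = \frac{\overleftrightarrow{W}^{y,Z}(s,\tau)\,\overleftrightarrow{W}^{Z,Z}(s,\tau)^{-1}\,\overline{\overleftrightarrow{W}^{y,Z}(s,\tau)}}{\overleftrightarrow{W}^{y,y}(s,\tau)}, \tag{3}$$

$$R^2_{x,Z}(s,\tau) = \frac{\overleftrightarrow{W}^{x,Z}(s,\tau)\,\overleftrightarrow{W}^{Z,Z}(s,\tau)^{-1}\,\overline{\overleftrightarrow{W}^{x,Z}(s,\tau)}}{\overleftrightarrow{W}^{x,x}(s,\tau)}. \tag{4}$$

Equation (1) can also be derived analogously from the complex partial spectrum for the frequency domain according to the definition of complex coherence between two variables in the time–frequency domain (see Sect. S1 in the Supplement for the derivation process). Note that $R^2_{y,x,Z}(s,\tau)$ is a matrix with complex values, while $R^2_{y,Z}(s,\tau)$ and $R^2_{x,Z}(s,\tau)$ are matrices with real numbers. $\gamma_{y,x}(s,\tau)$ is the complex wavelet coherence between $y$ and $x$, which can be written as

$$\gamma_{y,x}(s,\tau) = \frac{\overleftrightarrow{W}^{y,x}(s,\tau)}{\sqrt{\overleftrightarrow{W}^{y,y}(s,\tau)\,\overleftrightarrow{W}^{x,x}(s,\tau)}}, \tag{5}$$

where $\overleftrightarrow{(\cdot)}$ is the smoothing operator, $\overline{(\cdot)}$ is the complex conjugate operator, $(\cdot)^{-1}$ indicates the inverse of the matrix, and

$$\overleftrightarrow{W}^{y,Z}(s,\tau) = \begin{bmatrix} \overleftrightarrow{W}^{y,Z_1}(s,\tau) & \overleftrightarrow{W}^{y,Z_2}(s,\tau) & \cdots & \overleftrightarrow{W}^{y,Z_q}(s,\tau) \end{bmatrix}, \tag{6}$$

$$\overleftrightarrow{W}^{x,Z}(s,\tau) = \begin{bmatrix} \overleftrightarrow{W}^{x,Z_1}(s,\tau) & \overleftrightarrow{W}^{x,Z_2}(s,\tau) & \cdots & \overleftrightarrow{W}^{x,Z_q}(s,\tau) \end{bmatrix}, \tag{7}$$

$$\overleftrightarrow{W}^{Z,Z}(s,\tau) = \begin{bmatrix} \overleftrightarrow{W}^{Z_1,Z_1}(s,\tau) & \cdots & \overleftrightarrow{W}^{Z_1,Z_q}(s,\tau) \\ \vdots & \ddots & \vdots \\ \overleftrightarrow{W}^{Z_q,Z_1}(s,\tau) & \cdots & \overleftrightarrow{W}^{Z_q,Z_q}(s,\tau) \end{bmatrix}, \tag{8}$$

where $\overleftrightarrow{W}^{A,B}(s,\tau)$ is the smoothed auto-wavelet power spectrum (when $A = B$) or cross-wavelet power spectrum (when $A \neq B$) at scale $s$ and location $\tau$.

The squared PWC (hereinafter referred to as PWC) at scale $s$ and location $\tau$, $\rho^2_{y,x\cdot Z}$, can be written as

$$\rho^2_{y,x\cdot Z} = \frac{\left|1 - R^2_{y,x,Z}(s,\tau)\right|^2 R^2_{y,x}(s,\tau)}{\left[1 - R^2_{y,Z}(s,\tau)\right]\left[1 - R^2_{x,Z}(s,\tau)\right]}, \tag{9}$$

where $R^2_{y,x}(s,\tau)$ is the squared BWC between $y$ and $x$, which can be expressed as

$$R^2_{y,x}(s,\tau) = \frac{\overleftrightarrow{W}^{y,x}(s,\tau)\,\overline{\overleftrightarrow{W}^{y,x}(s,\tau)}}{\overleftrightarrow{W}^{y,y}(s,\tau)\,\overleftrightarrow{W}^{x,x}(s,\tau)}. \tag{10}$$

The phase angle (i.e., angle between two complex numbers) between $y$ and $x$ after excluding the effect of $Z$ is

$$\vartheta_{y,x\cdot Z}(s,\tau) = \varphi_{y,x\cdot Z}(s,\tau) + \vartheta_{y,x}(s,\tau), \tag{11}$$

where

$$\varphi_{y,x\cdot Z}(s,\tau) = \arg\left(1 - R^2_{y,x,Z}(s,\tau)\right), \tag{12}$$

and $\vartheta_{y,x}(s,\tau)$ is the wavelet phase between $y$ and $x$, which can be expressed as

$$\vartheta_{y,x}(s,\tau) = \tan^{-1}\left(\mathrm{Im}\left(W^{y,x}(s,\tau)\right) / \mathrm{Re}\left(W^{y,x}(s,\tau)\right)\right), \tag{13}$$

where $\arg$ denotes the argument of the complex number and $W^{y,x}(s,\tau)$ is the cross-wavelet power spectrum between $y$ and $x$ at scale $s$ and location $\tau$; Im and Re denote the imaginary and real parts of $W^{y,x}(s,\tau)$, respectively.

When only one variable (e.g., $Z_1$) is excluded, Eq. (9) can be written as (see Sect. S2 in the Supplement for the derivation process)

$$\rho^2_{y,x\cdot Z_1} = \frac{\left|\gamma_{y,x}(s,\tau) - \gamma_{y,Z_1}(s,\tau)\,\overline{\gamma_{x,Z_1}(s,\tau)}\right|^2}{\left[1 - R^2_{y,Z_1}(s,\tau)\right]\left[1 - R^2_{x,Z_1}(s,\tau)\right]}. \tag{14}$$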
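The calculation in Eqs. (1)–(5) and (9) at a single scale–location point can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' Matlab implementation: the function names are hypothetical, the conjugate in Eqs. (2)–(4) is read as a conjugate transpose so that the product is a scalar, the cross-spectrum convention is taken as the first coefficient times the conjugate of the second, and the smoothing operator is replaced by a plain average over a few synthetic complex coefficients. The phase is taken directly as the argument of the complex PWC of Eq. (1).

```python
import cmath
import math
import random

def solve(a, b):
    """Solve the complex linear system a @ v = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    v = [0j] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * v[c] for c in range(r + 1, n))
        v[r] = (m[r][n] - s) / m[r][r]
    return v

def quad_form(w_aZ, w_ZZ, w_bZ):
    """w_aZ @ inv(w_ZZ) @ conj(w_bZ)^T: the numerator shared by Eqs. (2)-(4)."""
    v = solve(w_ZZ, [c.conjugate() for c in w_bZ])
    return sum(p * q for p, q in zip(w_aZ, v))

def pwc_point(w_yx, w_yy, w_xx, w_yZ, w_xZ, w_ZZ):
    """Return (complex PWC, squared PWC, phase) from smoothed spectra at one point."""
    r2_yxZ = quad_form(w_yZ, w_ZZ, w_xZ) / w_yx            # Eq. (2), complex
    r2_yZ = (quad_form(w_yZ, w_ZZ, w_yZ) / w_yy).real      # Eq. (3), real
    r2_xZ = (quad_form(w_xZ, w_ZZ, w_xZ) / w_xx).real      # Eq. (4), real
    gamma_yx = w_yx / math.sqrt((w_yy * w_xx).real)        # Eq. (5)
    gamma = (1 - r2_yxZ) * gamma_yx / math.sqrt((1 - r2_yZ) * (1 - r2_xZ))  # Eq. (1)
    return gamma, abs(gamma) ** 2, cmath.phase(gamma)      # |Eq. (1)|^2 equals Eq. (9)

# --- Hypothetical demonstration data (assumption: not from the paper) ---
rng = random.Random(0)

def coeffs(n):
    """Synthetic complex 'wavelet coefficients' standing in for a real transform."""
    return [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]

def smoothed(a, b):
    """Stand-in smoothing operator: average of a * conj(b) over neighbouring points."""
    return sum(p * q.conjugate() for p, q in zip(a, b)) / len(a)

x, z1, z2 = coeffs(8), coeffs(8), coeffs(8)
# The response is an exact linear mix of the predictor and two excluding variables.
y = [2 * a + (1 - 1j) * b + 0.5 * c for a, b, c in zip(x, z1, z2)]
Z = [z1, z2]

gamma, rho2, phase = pwc_point(
    smoothed(y, x), smoothed(y, y), smoothed(x, x),
    [smoothed(y, z) for z in Z],
    [smoothed(x, z) for z in Z],
    [[smoothed(zi, zj) for zj in Z] for zi in Z],
)
bwc = abs(smoothed(y, x)) ** 2 / (smoothed(y, y) * smoothed(x, x)).real  # Eq. (10)
```

Because `y` differs from a multiple of `x` only through the excluding variables, removing both of them should leave a squared PWC of 1 with zero phase, while the bivariate coherence `bwc` stays below 1.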

The widely used Monte Carlo method (Torrence and Compo, 1998; Grinsted et al., 2004; Si and Farrell, 2004) is used to calculate PWC at the 95 % confidence level. In brief, the PWC calculation is repeated a sufficient number of times (i.e., the minimum number required) using data generated by Monte Carlo simulations based on the first-order autocorrelation coefficient (r1). The first-order autoregressive model (AR(1)) is chosen because most geoscience data can be effectively simulated by it (Wendroth et al., 1992; Grinsted et al., 2004; Si and Farrell, 2004), although we recognize that time series with long-range dependence are also common in many areas such as hydrology (Szolgayová et al., 2014). Different combinations of r1 values (i.e., 0.0, 0.5, and 0.9) were used to generate 10 to 10 000 AR(1) series with three, four, and five variables. Our results indicate that the noise combination has little impact on the PWC values at the 95 % confidence level, as also found by Grinsted et al. (2004) for the BWC case (data not shown). The relative difference of PWC at the 95 % confidence level compared with that calculated from the 10 000 AR(1) series decreases as the number of AR(1) series increases (Fig. S1 of Sect. S3 in the Supplement). When the number of AR(1) series is above 300, a very low maximum relative difference (e.g., <2 %) is observed. Therefore, 300 repetitions seem to be sufficient for a significance test. However, if calculation time is not a barrier, a higher number of repetitions, such as ≥1000, is recommended. The 95th percentile of PWCs of all simulations at each scale represents PWC at the 95 % confidence level. The average PWC, percent area of significant coherence (PASC) relative to the whole wavelet location–scale domain (Hu and Si, 2016), and average value of significant PWC (PWCsig) are also calculated for different location–scale domains.
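The surrogate-based significance test and the PASC summary can be sketched roughly as below. This is a simplified illustration with hypothetical function names: `statistic` is a placeholder for whatever routine returns the PWC value of interest from a set of series (the wavelet transform itself is not included), the percentile uses simple linear interpolation, and the cone of influence is ignored when computing PASC.

```python
import random

def lag1_autocorr(series):
    """Sample first-order autocorrelation coefficient (r1)."""
    n = len(series)
    m = sum(series) / n
    num = sum((series[k] - m) * (series[k + 1] - m) for k in range(n - 1))
    den = sum((v - m) ** 2 for v in series)
    return num / den

def ar1_series(n, r1, rng):
    """One AR(1) surrogate with unit marginal variance: x[k] = r1 * x[k-1] + noise."""
    sd = (1 - r1 ** 2) ** 0.5
    x = [rng.gauss(0, 1)]
    for _ in range(n - 1):
        x.append(r1 * x[-1] + rng.gauss(0, sd))
    return x

def percentile(values, q):
    """Percentile with linear interpolation, q in [0, 100]."""
    v = sorted(values)
    pos = (len(v) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(v) - 1)
    return v[lo] + (v[hi] - v[lo]) * (pos - lo)

def significance_level(statistic, r1_values, n, n_sim=300, q=95, seed=0):
    """q-th percentile of `statistic` over n_sim sets of AR(1) surrogate series.

    `statistic` is a user-supplied callable (hypothetical placeholder for a PWC
    routine) that takes one surrogate series per entry in r1_values.
    """
    rng = random.Random(seed)
    sims = [statistic(*[ar1_series(n, r1, rng) for r1 in r1_values])
            for _ in range(n_sim)]
    return percentile(sims, q)

def pasc(pwc_grid, threshold_by_scale):
    """Percent of grid points whose PWC exceeds the threshold of their scale (row)."""
    total = sum(len(row) for row in pwc_grid)
    hits = sum(1 for row, thr in zip(pwc_grid, threshold_by_scale)
               for v in row if v > thr)
    return 100.0 * hits / total
```

In an actual analysis the percentile would be taken separately at each scale, as described above, and `n_sim` would be set to at least 300.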

In the case of one excluding variable ($Z = \{Z_1\}$), Mihanović et al. (2009) suggested that PWC can be calculated by an equation analogous to the traditional partial correlation squared (Kenney and Keeping, 1939) without giving a detailed derivation process. Their equation is the same as Eq. (14). Unfortunately, Ng and Chan (2012a) might have misinterpreted the equation of Mihanović et al. (2009) and developed Matlab code for calculating PWC using the equation expressed as

$$\rho^2_{y,x\cdot Z_1} = \frac{\left|R_{y,x}(s,\tau) - R_{y,Z_1}(s,\tau)\,R_{x,Z_1}(s,\tau)\right|^2}{\left[1 - R^2_{y,Z_1}(s,\tau)\right]\left[1 - R^2_{x,Z_1}(s,\tau)\right]}, \tag{15}$$

where $R_{y,x}(s,\tau)$, $R_{y,Z_1}(s,\tau)$, and $R_{x,Z_1}(s,\tau)$ are the square roots of $R^2_{y,x}(s,\tau)$, $R^2_{y,Z_1}(s,\tau)$, and $R^2_{x,Z_1}(s,\tau)$, respectively. $R^2_{y,Z_1}(s,\tau)$ and $R^2_{x,Z_1}(s,\tau)$ can be calculated from Eq. (10) by replacing $y$ and $x$ with their corresponding variables. Equation (15) has been widely used to calculate PWC in the case of one excluding variable (Ng and Chan, 2012b; Rathinasamy et al., 2017; Aloui et al., 2018; Altarturi et al., 2018; Jia et al., 2018; Li et al., 2018; Mutascu and Sokic, 2020; Wu et al., 2020). Note that complex coherence and real coherence are involved in the numerators of Eqs. (14) and (15), respectively, while the denominators are exactly the same. Further comparison indicates that Eq. (15) underestimates the PWC value relative to Eq. (14) unless $\gamma_{y,x}(s,\tau)$ and $\gamma_{y,Z_1}(s,\tau)\,\overline{\gamma_{x,Z_1}(s,\tau)}$ in Eq. (14) are collinear (i.e., their arguments are identical), in which case the two equations produce the same PWC values. Differences between Eqs. (14) and (15) will be discussed further using both artificial data and a real dataset. For comparison purposes, we refer to Eqs. (14) and (15) as the new implementation and the classical implementation, respectively.
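The difference between the two implementations can be made concrete with a small numerical sketch. The coherence values below are hypothetical, the function names are illustrative, and the conjugate on the third coherence in Eq. (14) follows from reducing Eqs. (1)–(2) to a single excluding variable. The real coherences used in Eq. (15) are simply the moduli of the complex coherences.

```python
def pwc_new(g_yx, g_yz, g_xz):
    """Eq. (14): squared PWC from complex coherences (one excluding variable)."""
    num = abs(g_yx - g_yz * g_xz.conjugate()) ** 2
    return num / ((1 - abs(g_yz) ** 2) * (1 - abs(g_xz) ** 2))

def pwc_classical(g_yx, g_yz, g_xz):
    """Eq. (15): same form, but using real coherences (moduli) in the numerator."""
    num = (abs(g_yx) - abs(g_yz) * abs(g_xz)) ** 2
    return num / ((1 - abs(g_yz) ** 2) * (1 - abs(g_xz) ** 2))

# Hypothetical complex coherences at one scale-location point.
g_yx = complex(0.6, 0.0)
g_yz = complex(0.5, 0.0)

# Case 1: x and Z1 are a quarter cycle out of phase, so the two terms
# in the numerator of Eq. (14) are not collinear.
g_xz_shifted = complex(0.0, 0.5)
new_shifted = pwc_new(g_yx, g_yz, g_xz_shifted)
classical_shifted = pwc_classical(g_yx, g_yz, g_xz_shifted)

# Case 2: all coherences share the same argument (collinear terms).
g_xz_aligned = complex(0.5, 0.0)
new_aligned = pwc_new(g_yx, g_yz, g_xz_aligned)
classical_aligned = pwc_classical(g_yx, g_yz, g_xz_aligned)
```

In the first case Eq. (14) gives about 0.75 while Eq. (15) gives about 0.22; in the second case both give the same value, as expected when the two numerator terms are collinear.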

3 Method test using artificial data

3.1 Artificial data and analysis

PWC is first tested using the cosine-like artificial dataset produced following Yan and Gao (2007). The cosine-like artificial datasets are suitable for testing the new method because they mimic many spatial or time series data in geoscience such as climatic variables, hydrologic fluxes, seismic signals, El Niño–Southern Oscillation, land surface topography, ocean waves, and soil moisture. The procedures to test PWC are largely based on Hu and Si (2016), where the same dataset has been used to test the MWC method (refer to Hu and Si, 2016, for a detailed description of the artificial dataset). The response variable (y and z for the stationary and non-stationary cases, respectively) is the sum of five cosine waves (y1 to y5 and z1 to z5 for the stationary and non-stationary cases, respectively) at 256 locations (Hu and Si, 2016). For y1 to y5, they have consistent dimensionless scales of 4, 8, 16, 32, and 64, respectively, across the series. From z1 to z5, the dimensionless scales gradually change with location, with the maximum dimensionless scales of 4, 8, 16, 32, and 64, respectively. The variance of the response variables y and z is 2.5. All other variables are orthogonal to each other with equal variance of 0.5. The predictor and excluding variables (Fig. S2 of Sect. S3 in the Supplement) are selected from two of the five cosine waves (i.e., y2 and y4 or z2 and z4) and/or their derivatives. The exact variables and procedures to test the new PWC method are explained below.

First, PWC between response variable y (or z) and predictor variable, i.e., y2 (or z2), is calculated after excluding the effect of one variable. Four types of excluding variable are involved (Fig. S2 of Sect. S3 in the Supplement): (a) original series of y4 (or z4); (b) second halves of the original series of y2 (or z2) are replaced by 0 to simulate abrupt changes (i.e., transient and localized feature) in the spatial data. They are referred to as y2,h0 (or z2,h0); (c) white noises with zero-mean and standard deviations of 0.3 (weak noise), 1 (moderate noise), and 4 (high noise) are added to y2 (or z2) as suggested by Hu and Si (2016) to simulate non-perfect cyclic patterns of the excluding variables. They are referred to as y2,w (or z2,w), y2,m (or z2,m), and y2,s (or z2,s), respectively; and (d) a combination of type b and type c. They are referred to as y2,w,h0 (or z2,w,h0), y2,m,h0 (or z2,m,h0), and y2,s,h0 (or z2,s,h0), respectively.

Second, PWC between response variable y (or z) and predictor variable, i.e., y24 (sum of y2 and y4) for the stationary case or z24 (sum of z2 and z4) for the non-stationary case, is calculated with two excluding variables, which is a combination of y4 (or z4) and y2 (or z2) or its noised series (y2,w or z2,w, y2,m or z2,m, and y2,s or z2,s).
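A rough sketch of how the stationary test variables could be generated is given below. It assumes unit-amplitude cosine waves with zero phase, cos(2πk/scale), which reproduces the stated variances (0.5 per wave, 2.5 for their sum) but may differ in detail from the exact formulas of Yan and Gao (2007) and Hu and Si (2016); the helper names and the random seed are hypothetical. The non-stationary case is not covered.

```python
import math
import random

N = 256
SCALES = [4, 8, 16, 32, 64]   # dimensionless scales of y1 to y5

def cosine_wave(scale, n=N):
    """Unit-amplitude cosine wave with the given scale (assumed zero phase)."""
    return [math.cos(2 * math.pi * k / scale) for k in range(n)]

def variance(series):
    """Population variance."""
    m = sum(series) / len(series)
    return sum((v - m) ** 2 for v in series) / len(series)

def half_zero(series):
    """Replace the second half of a series by 0 (type b)."""
    h = len(series) // 2
    return list(series[:h]) + [0.0] * (len(series) - h)

def add_noise(series, sd, rng):
    """Add zero-mean white noise with standard deviation sd (type c)."""
    return [v + rng.gauss(0, sd) for v in series]

waves = {s: cosine_wave(s) for s in SCALES}
y = [sum(waves[s][k] for s in SCALES) for k in range(N)]   # response variable
y2, y4 = waves[8], waves[32]
y24 = [a + b for a, b in zip(y2, y4)]                      # predictor for step 2

rng = random.Random(0)
y2_h0 = half_zero(y2)                                      # type (b)
y2_w = add_noise(y2, 0.3, rng)                             # type (c), weak noise
y2_m = add_noise(y2, 1.0, rng)                             # type (c), moderate noise
y2_s = add_noise(y2, 4.0, rng)                             # type (c), high noise
y2_w_h0, y2_m_h0, y2_s_h0 = (half_zero(v) for v in (y2_w, y2_m, y2_s))  # type (d)
```

Step 1 would then use `y2` as the predictor with one of these series as the excluding variable, and step 2 would use `y24` as the predictor with `y4` plus `y2` (or one of its noised versions) as the two excluding variables.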

The merit of the artificial data is that we know the exact scale-specific and localized bivariate relationships after the effect of excluding variables is removed. Theoretically, we expect that (a) PWC is 1 at scales corresponding to the relative complement of excluding variable scales in predictor variable scales and 0 at other scales. For example, PWC between y and y24 after excluding the effect of y4 is expected to be 1 at the scale of 8, which is the relative complement of scale of excluding variable y4 (32) in scales of predictor variable y24 (8 and 32), and 0 at other scales; (b) PWC remains 1 in the second half of series, where spatial series is replaced by 0, and is 0 at the first half of the original series. For example, PWC between y and y2 after excluding the effect of y2,h0 is expected to be 0 and 1 in the first and second halves of series, respectively, at the scale of 8; and (c) PWC increases as more noises are included in the excluding variables. For example, PWC between y and y2 after excluding the effect of noised series of y2 is expected to increase with increasing noises in an order of y2,w<y2,m<y2,s at the scale of 8.

3.2 PWC with artificial data

3.2.1 PWC with one excluding variable using the new method

Figure 1 shows PWC between response variable y (or z) and predictor variable y2 (or z2) by excluding one variable. For the stationary case, there is one horizontal band (red color) representing an in-phase high PWC value at scales around 8 for all locations after eliminating the effect of y4 (Fig. 1a). Note that the PWC values between y and y2 after excluding the effect of y4 are not exactly 1 as would be expected at all location–scale domains, because of the effect of smoothing along locations and scales. However, the PWC values at the center of the significance band, which corresponds to the predictor variable y2 at exactly the scale of 8, are very close to 1 (0.996), and the mean PWCsig values are very high (i.e., 0.96). The result is similar to the BWC between y and y2 (data not shown). This is understandable because y4 is orthogonal to y2, and excluding the effect of y4 does not affect the relationship between y and y2 at all.

https://hess.copernicus.org/articles/25/321/2021/hess-25-321-2021-f01

Figure 1. Partial wavelet coherency (PWC) between response variable y (or z) and predictor variable y2 (or z2) after excluding the effect of variables y4 (or z4), y2,s (or z2,s), y2,m (or z2,m), y2,w (or z2,w), y2,h0 (or z2,h0), y2,w,h0 (or z2,w,h0), y2,m,h0 (or z2,m,h0), and y2,s,h0 (or z2,s,h0) for the stationary (or non-stationary) case using the new method. Arrows represent the phase angles of the cross-wavelet power spectra between two variables after eliminating the effect of excluding variables. Arrows pointing to the right (left) indicate positive (negative) correlations. Thin and thick solid lines show the cones of influence and the 95 % confidence levels, respectively. All variables were generated by following Yan and Gao (2007) and Hu and Si (2016) and are explained in Sect. 3.1 and shown in Fig. S2 of Sect. S3 in the Supplement.

Compared with the case of the excluding variable of y4 (Fig. 1a), excluding the effect of y2,s (Fig. 1b) results in a slightly narrower band of significant PWC and slightly reduced mean PWCsig (0.94 versus 0.96). When less noise is included in the excluding variables (i.e., y2,m and y2,w) (Fig. 1c and d), the significant PWC band becomes narrower. The PASC values are 86 %, 77 %, and 32 % for excluding y2,s, y2,m, and y2,w, respectively, at scales of 6–10. Moreover, the mean PWCsig decreases from 0.94 (y2,s) to 0.93 (y2,m) and 0.89 (y2,w) when progressively less noise is added (Fig. 1b–d). For the non-stationary case, similar results are obtained (Fig. 1e–h). The only difference is that the scales with significant PWC values change with location, as is found for MWC (Hu and Si, 2016).

When the second half of the excluding variable series is replaced by 0, the PWC values in that half are close to 1, while those in the first half of the data series are 0 at scales corresponding to the predictor variable (Fig. 1i and m). For the stationary case, after excluding the effect of y2,h0, the PWC values are close to 1 (0.98) and 0 in the second and first halves of the data series, respectively, at the dimensionless scale of 8 (Fig. 1i). Similar results are observed for the non-stationary case (Fig. 1m). This is anticipated because the series of 0s is independent of the predictor variable and hence has no effect on the correlations between response and predictor variables at these locations. If different magnitudes of noises are added to the first half of the excluding variables (y2 or z2), the significant PWC band in the first half becomes wider as the magnitude of noises increases, while the significant PWC band in the second half remains almost unchanged (Fig. 1j–l and n–p). In the stationary case, for example, the PASC values at scales of 6–10 are 40 % (y2,w,h0), 74 % (y2,m,h0), and 86 % (y2,s,h0) in the first half, while those values vary from 86 % to 90 % in the second half (Fig. 1j–l). Meanwhile, the mean PWCsig in the first half at scales of 6–10 increases from 0.91 to 0.94 in both the stationary (Fig. 1j–l) and non-stationary (Fig. 1n–p) cases as more noises are added to the excluding variables y2 or z2. This indicates that the new PWC method can also capture the abrupt changes (Fig. 1i and m) in the data series and has the ability to deal with localized relationships.

3.2.2 PWC with two excluding variables using the new method

When both y2 and y4 (or z2 and z4) are considered in the predictor variables, there are two bands of wavelet coherence of 1 between y (or z) and y24 (or z24) (Hu and Si, 2016), which correspond to the scales of two predictor variables. However, after the effect of y4 (or z4) is removed, only one band with PWC of around 1 occurs at the scale of the predictor variable y2 (or z2) (Fig. 2a and f). After both predictor variables y2 and y4 (or z2 and z4) are excluded (Fig. 2b and g), PWC between y (or z) and y24 (or z24) is 0 at all location–scale domains as expected. When one of the excluding variables y2 (or z2) is added with noises, the relationship between response variable y (or z) and predictor variable y24 (or z24) becomes significant at scales of the excluding variable y2 (or z2) (Fig. 2c and h). Similar to the case of one excluding variable (Fig. 1), less noise in the excluding variable of y2 (or z2) results in a narrower significant PWC band and reduced mean PWCsig values, e.g., from 0.96 (y2,s) to 0.90 (y2,w) in the stationary case (Fig. 2c–e) and from 0.95 (z2,s) to 0.92 (z2,w) in the non-stationary case (Fig. 2h–j).

https://hess.copernicus.org/articles/25/321/2021/hess-25-321-2021-f02

Figure 2. PWC between response variable y (or z) and predictor variable y24 (or z24) after excluding the effect of variables y4 (or z4), y2+y4 (or z2+z4), y2,s+y4 (or z2,s+z4), y2,m+y4 (or z2,m+z4), and y2,w+y4 (or z2,w+z4) for the stationary (or non-stationary) case using the new method. All variables were generated by following Yan and Gao (2007) and Hu and Si (2016) and are explained in Sect. 3.1 and shown in Fig. S2 of Sect. S3.

4 Method application with the real dataset

4.1 Description of the free water evaporation dataset

The free water evaporation dataset was previously used to test MWC (Hu and Si, 2016). In brief, this dataset includes monthly free water evaporation (E), mean temperature (T), relative humidity (RH), sun hours (SH), and wind speed (WS) between January 1979 and December 2013 at the Changwu site in Shaanxi province, provided by the China Meteorological Administration. During this period, the average daily temperature was 9.4 °C, the average annual rainfall was 571 mm, and annual potential evapotranspiration was 883 mm. Because of its location between semi-arid and subhumid climates, agricultural production at the Changwu site is constrained by water availability. Results of the wavelet power spectrum of E and BWC between every two variables are shown in Figs. S3 and S4 (Sect. S3 in the Supplement), respectively.

4.2 PWC with the free water evaporation dataset

The PWC analysis indicates that the correlations between E and T after excluding the effect of each of the other three variables (RH, SH, and WS) were almost the same as those indicated by BWC (Figs. 3a–c and S4 of Sect. S3 in the Supplement). For example, E and T, after excluding the effect of RH, were positively correlated at the medium scales (8–32 months). The PASC was 61 % and the mean PWCsig value was 0.94. No significant correlations between E and T from 1979 to 1992 were found at scales around 64 months after eliminating the influence of RH (Fig. 3a–c). This implies that the influence of mean temperature on E at these scales and years may be associated with the negative influence of RH on both E and T (Fig. S4 of Sect. S3 in the Supplement).

https://hess.copernicus.org/articles/25/321/2021/hess-25-321-2021-f03

Figure 3. PWC between evaporation (E) and each meteorological factor (T, mean temperature; RH, relative humidity; SH, sun hours; WS, wind speed) after excluding the effect of each of the other three meteorological factors.

PWC between E and RH depended on the excluding variable and scale (Fig. 3d–f). The mean PWC and PASC between E and RH after excluding T were 0.60 and 34 %, respectively, which are comparable with the mean BWC (0.62) and PASC (40 %) between E and RH. The corresponding values after excluding SH and WS were 0.50 and 0.53 (PWC) and 22 % and 21 % (PASC), respectively. In addition, compared with the BWC between E and RH (Fig. S4 of Sect. S3), correlations between E and RH were weak at small scales (<8 months) and medium scales (8–32 months) after eliminating the influence of SH and WS (Fig. 3e–f), respectively. Therefore, excluding the variable of T had less influence on the coherence between E and RH compared with excluding the variables of SH and WS. This is mainly because RH and T are correlated with E at different scales (Fig. S4 of Sect. S3), i.e., mean temperature affected E mainly at medium scales, while RH affected E across all scales. However, the domain where SH and WS were correlated with E was a subset of that where RH and E were correlated (Fig. S4 of Sect. S3).

The relationships between E and SH after excluding the other three factors were less consistent (Fig. 3g–h). The areas with significant correlations were scattered over the whole location–scale domain but differed with the excluding factor. The PASC varied from 12 % (excluding RH) to 20 % (excluding T and WS), which is much lower than the PASC (28 %) in the case of BWC. The significant relationships between E and WS were limited to very small areas except for the case of SH being excluded, where E and WS were positively correlated at scales of 8–16 months most of the time (Fig. 3j–l).

In general, PASC decreased after excluding the effects of more factors (data not shown). The correlations between E and each variable after eliminating the effects of all other variables are shown in Fig. 4. The correlations between E and T were still significant at the medium scales (8–32 months) (Fig. 4a), where the PASC value was 52 % with a mean PWCsig of 0.92. E was still correlated with RH at large scales (>85 months) (Fig. 4b), where the PASC value was 35 % with a mean PWCsig of 0.96. Interestingly, the domain with significant correlation between E and SH or WS was very limited (Fig. 4c–d). This indicates that the influences of SH and WS on E have already been covered by RH and T. This is in agreement with the MWC results that RH and T were best for explaining E variations at all scales (Hu and Si, 2016). Although RH had the greatest mean wavelet coherence and PASC in the entire location–scale domain, the PWC analysis seems to support mean temperature being the dominant factor for free water evaporation at the 1-year cycle (8–16 months), which is the dominant scale of E variation (Fig. S3 of Sect. S3).

https://hess.copernicus.org/articles/25/321/2021/hess-25-321-2021-f04

Figure 4. PWC between evaporation (E) and each meteorological factor (T, mean temperature; RH, relative humidity; SH, sun hours; WS, wind speed) after excluding the effects of all three other factors.


5 Discussion on the advantages and weaknesses of the new method

5.1 Advantages

We extend the partial coherence method from the frequency (scale) domain (Koopmans, 1974) to the time–frequency (location–scale) domain. The new method is an extension of previous work on PWC and MWC (Mihanović et al., 2009; Hu and Si, 2016). The method test and application have verified that it has the advantage of dealing with more than one excluding variable and providing the phase information associated with PWC. In the case of one excluding variable, Mihanović et al. (2009) suggested calculating PWC by using an equation analogous to the traditional partial correlation squared (Eq. 14), which can be derived from our Eq. (9). Unfortunately, their equation has been widely applied with the complex coherence in Eq. (14) replaced by the real coherence, as expressed in Eq. (15) (Ng and Chan, 2012a, b; Rathinasamy et al., 2017; Aloui et al., 2018; Altarturi et al., 2018; Jia et al., 2018; Li et al., 2018; Mutascu and Sokic, 2020; Wu et al., 2020). This mistake is corrected in this paper.

The new (Eq. 14) and classical (Eq. 15) implementations are compared for the case of one excluding variable using both the artificial and real datasets. Apart from the phase information, the two implementations generally produce comparable coherence for the artificial dataset (Fig. S5 of Sect. S3 in the Supplement). However, the new implementation produces consistently, if slightly, higher coherence than the classical implementation. For example, the mean PWCs between y and y2 at the scale of 8 after excluding the effect of y4 are 1.00 (new) and 0.97 (classical). Thus, at the scale of y2 (i.e., 8), the new implementation yields coherence between y and y2 that is closer to 1, as expected. For the real dataset, the classical implementation produces PWC between E and the other meteorological factors similar to that of the new implementation in most cases, especially for the coherence between E and T after excluding the effects of others (Fig. S6 of Sect. S3 in the Supplement), but large differences between the two implementations can also be observed. For example, while the new implementation recognizes the strong coherence between E and RH after excluding the effect of T at scales of around 1 year (Fig. 3d), this coherence is negligible in the classical implementation (Fig. 5a). Mean PWC values from the new implementation were consistently higher than those from the classical implementation, and the differences ranged from 0.4 to 0.6 around the scale of 1 year (Fig. 5b). Using the real coherence (Eq. 15) rather than the complex coherence (Eq. 14) between every two variables in the numerators can therefore result in a large underestimation of the partial wavelet coherence. The ability of the new method and implementation to produce more accurate results than the classical implementation is thus one of its advantages.
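The underestimation described above can be reproduced with a small numerical sketch for a single location–scale cell. It assumes the textbook form of squared partial coherence for one excluding variable z, with numerator |γ_yx − γ_yz·conj(γ_xz)|² and denominator (1 − |γ_yz|²)(1 − |γ_xz|²), and a "classical" variant in which every complex coherence is replaced by its modulus. The exact notation of Eqs. (14) and (15) is defined earlier in the paper and is not reproduced here; the function names and the coherence values are hypothetical.

```python
def pwc_complex(g_yx, g_yz, g_xz):
    """Squared partial coherence using complex bivariate coherences."""
    num = abs(g_yx - g_yz * g_xz.conjugate()) ** 2
    den = (1 - abs(g_yz) ** 2) * (1 - abs(g_xz) ** 2)
    return num / den


def pwc_real(g_yx, g_yz, g_xz):
    """Same expression, but with moduli (real coherence) in the numerator."""
    r_yx, r_yz, r_xz = abs(g_yx), abs(g_yz), abs(g_xz)
    num = (r_yx - r_yz * r_xz) ** 2
    den = (1 - r_yz ** 2) * (1 - r_xz ** 2)
    return num / den


# Hypothetical coherences at one cell: y leads z by a quarter cycle,
# while x is in phase with both y and z.
g_yx = complex(0.5, 0.0)
g_yz = complex(0.0, 0.6)
g_xz = complex(0.6, 0.0)

new = pwc_complex(g_yx, g_yz, g_xz)
old = pwc_real(g_yx, g_yz, g_xz)
print(f"complex-coherence form: {new:.3f}")
print(f"real-coherence form:    {old:.3f}")
```

By the reverse triangle inequality, |a − b| ≥ ||a| − |b||, so under these assumptions the real-coherence form can never exceed the complex-coherence form. The two agree when the phases of γ_yx and γ_yz·conj(γ_xz) coincide and diverge as those phases separate.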

https://hess.copernicus.org/articles/25/321/2021/hess-25-321-2021-f05

Figure 5. PWC between evaporation (E) and relative humidity (RH) after excluding the effect of mean temperature (T) using the classical implementation (Eq. 15) (a) and differences in PWC between the new (Eq. 14) and classical implementations as a function of scale (b).


Compared with the Mihanović et al. (2009) method, the additional phase information from the new PWC is another advantage of this new method. Phase information is directly related to the type of correlation: in-phase and out-of-phase relationships indicate positive and negative correlation, respectively. Different types of correlations are usually found at different locations and scales (Hu et al., 2017b). The phase information helps in understanding the differences in associated mechanisms or processes at different locations and scales. In addition, the phase information allows us to detect changes not only in the degree of correlation (i.e., coherence) but also in the type of correlation after excluding the effect of other variables. For example, in the BWC analysis, E and RH were positively correlated at the 1-year cycle (8–16 months) from 1979 to 1995. This is because higher evaporation usually occurs in summer, when high T coincides with high RH as influenced by the monsoon climate in the study area (Fig. S4 of Sect. S3). Interestingly, after excluding the effect of T, E was negatively correlated with RH at the scale of 1 year, as expected (Fig. 3d).
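How a phase angle maps onto a correlation type, and how that type can flip once a third variable is excluded, can be sketched as follows. The half-plane rule used here (absolute phase below π/2 counts as in-phase) is a simplifying assumption for illustration, as are the function name and the coherence values; the numerator expression is the textbook partial coherency for one excluding variable rather than a transcription of the paper's equations.

```python
import cmath
import math


def correlation_type(coherency):
    """Classify a complex coherency by the half-plane of its phase angle."""
    phase = cmath.phase(coherency)  # radians in (-pi, pi]
    if abs(phase) < math.pi / 2:
        return "in-phase (positive)"
    return "out-of-phase (negative)"


# Hypothetical values: y and x are weakly in phase with each other,
# and both are strongly in phase with the excluding variable z.
g_yx = complex(0.3, 0.0)
g_yz = complex(0.8, 0.0)
g_xz = complex(0.8, 0.0)

# Numerator of the partial coherency; its phase is the partial phase.
partial_num = g_yx - g_yz * g_xz.conjugate()

before = correlation_type(g_yx)
after = correlation_type(partial_num)
print("before excluding z:", before)
print("after excluding z: ", after)
```

With these made-up numbers the bivariate relationship is in-phase, whereas the partial relationship is out-of-phase, which mirrors the sign change reported for E and RH once T is excluded.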

Moreover, our new PWC method applies to cases with more than one excluding variable, which addresses a gap in existing methods. When multiple variables are correlated with both the predictor and response variables, the correlations between predictor and response variables may be misleading if the effects of all these variables are not removed. For example, at the dominant scale (i.e., 1 year) of E variation, contrasting effects of RH on E existed after excluding the effects of T (negative) or SH (positive) (Fig. 3d and e). However, after the effects of all other variables were excluded, there were negligible effects of RH on E at this scale (Fig. 4b). In this case, the relationship between E and RH at the scale of 1 year can be misleading when the effects of only one variable are removed. In addition, the dominant role of mean temperature in driving free water evaporation at the 1-year cycle was confirmed by removing the effects of all other meteorological factors (Fig. 4a). This further verifies the suitability of the Hargreaves model (only air temperature and incident solar radiation required) (Hargreaves, 1989) for estimating potential evapotranspiration on the Chinese Loess Plateau (Li, 2012).

5.2 Weaknesses

The new method carries the risk of producing spuriously high correlations after excluding the effects of other variables. Take the artificial dataset as an example: at the scale of 32, PWC values between y and y2 after excluding y4 are not significant but are relatively high, partly because of the small number of octaves per scale (an octave is the scaled distance between two scales, one being twice or half the other; the default is 1/12). This spurious, unexpectedly high PWC arises because both the numerator (partly associated with the low coherence between the response y and the predictor variable y2 at the scale of 32) and the denominator (partly associated with the high coherence between the response y and the excluding variable y4 at the scale of 32) in Eq. (9) are small. The same problem also exists in the classical implementation (Fig. S5 of Sect. S3). Caution should therefore be taken when interpreting those results. However, the domain with spurious correlation calculated by the new method appears to be very limited and is located mainly outside of the cones of influence. Moreover, the unexpected results can be easily ruled out with knowledge of the BWC between response and predictor variables: the correlation between two variables is not expected to increase after excluding one or more variables. Therefore, BWC analysis is suggested for better interpretation of the PWC results.

As with BWC and MWC, the confidence level of PWC calculated from the Monte Carlo simulation is based on a single hypothesis test, but in reality the confidence level of PWC values at all locations and scales needs to be tested simultaneously. The significance test therefore has the problem of multiple testing; i.e., more than one individual hypothesis is tested simultaneously (Schaefli et al., 2007; Schulte et al., 2015). The new method may benefit from a better statistical significance testing method. Options for multiple testing include the Bonferroni adjusted p test (Westfall and Young, 1993) and the false discovery rate (Abramovich and Benjamini, 1996; Shen et al., 2002), the latter being less stringent than the former. The AR(1) model was used to generate noise series for testing the confidence level of PWC. High-order autoregressive models rather than AR(1) may be beneficial for a significance test where spatial data (or time series) are characterized by long-range dependence (Szolgayová et al., 2014).
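The difference in stringency between the two multiple-testing options can be sketched with a toy set of p values. The false discovery rate control shown here is the standard Benjamini–Hochberg step-up procedure, used as a stand-in for the cited approaches; it is not part of the published PWC code. The p values are made up, and in practice they would have to be derived per location–scale cell from the Monte Carlo surrogates. Neighbouring wavelet cells are also correlated, which the basic procedure does not account for.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject H0 where p <= alpha / m (controls the family-wise error rate)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (controls the false discovery rate)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0  # largest rank k with p_(k) <= alpha * k / m
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            k_max = rank
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            reject[idx] = True
    return reject


# Hypothetical p values for six location-scale cells
p = [0.001, 0.008, 0.012, 0.030, 0.20, 0.60]
bonf = bonferroni(p)
bh = benjamini_hochberg(p)
print("Bonferroni rejections:", sum(bonf))
print("Benjamini-Hochberg rejections:", sum(bh))
```

For these six values the Bonferroni threshold (0.05/6) retains two cells as significant, whereas the step-up procedure retains four.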

6 Conclusions

Partial wavelet coherency (PWC) is improved to investigate scale-specific and localized bivariate relationships after excluding the effect of one or more variables in geoscience. Method tests using stationary and non-stationary artificial datasets verified the known scale-specific and localized bivariate relationships after eliminating the effects of other variables. Compared with the previous PWC method, the new PWC method has the advantage of dealing with more than one excluding variable and providing the phase information (i.e., correlation type) associated with PWC. In the case of one excluding variable, the PWC implementation provided here (in the paper and the published code) produces more accurate coherence than the previously published PWC implementation, which wrongly considered real coherence rather than complex coherence between every two variables. Application of the new method to the real dataset has further demonstrated its robustness in untangling bivariate relationships after removing the effects of all other variables in multiple location–scale domains. The new method provides a much-needed data-driven tool for unraveling underlying mechanisms in both temporal and spatial data. Thus, combined with wavelet transform, BWC, and MWC, the new PWC method can be used to analyze various processes in geoscience, such as streamflow, droughts, greenhouse gas emissions (e.g., N2O, CO2, and CH4), atmospheric circulation, and oceanic processes (e.g., El Niño–Southern Oscillation).

Code and data availability

The Matlab codes for calculating PWC, along with the updated MWC codes, are freely accessible (https://doi.org/10.6084/m9.figshare.13031123; Hu and Si, 2020). The codes are developed based on those provided by Aslak Grinsted (http://www.glaciology.net/wavelet-coherence, last access: 14 January 2021). The meteorological dataset can be obtained from the China Meteorological Administration.

Supplement

The supplement related to this article is available online at: https://doi.org/10.5194/hess-25-321-2021-supplement.

Author contributions

WH wrote the paper, developed the Matlab code, and analyzed the data. Both authors conceived the study, interpreted the results, and revised the paper.

Competing interests

The authors declare that they have no conflict of interest.

Financial support

This research was supported by the New Zealand Institute for Plant and Food Research Limited under the Sustainable Agro-ecosystems programme.

Review statement

This paper was edited by Bettina Schaefli and reviewed by three anonymous referees.

References

Abramovich, F. and Benjamini, Y.: Adaptive thresholding of wavelet coefficients, Comput. Stat. Data Anal., 22, 351–361, 1996. 

Aloui, C., Hkiri, B., Hammoudeh, S., and Shahbaz, M.: A multiple and partial wavelet analysis of the oil price, inflation, exchange rate, and economic growth nexus in Saudi Arabia, Emerg. Mark. Finance Trade, 54, 935–956, 2018. 

Altarturi, B. H. M., Alshammari, A. A., Saiti, B., and Erol, T.: A three-way analysis of the relationship between the USD value and the prices of oil and gold: A wavelet analysis, AIMS Energy, 6, 487–504, 2018. 

Biswas, A. and Si, B. C.: Identifying scale specific controls of soil water storage in a hummocky landscape using wavelet coherency, Geoderma, 165, 50–59, 2011. 

Centeno, L. N., Hu, W., Timm, L. C., She, D. L., Ferreira, A. D., Barros, W. S., Beskow, S., and Caldeira, T. L.: Dominant Control of Macroporosity on Saturated Soil Hydraulic Conductivity at Multiple Scales and Locations Revealed by Wavelet Analyses, J. Soil Sci. Plant Nutr., 20, 1686–1702, 2020. 

Das, N. N. and Mohanty, B. P.: Temporal dynamics of PSR-based soil moisture across spatial scales in an agricultural landscape during SMEX02: A wavelet approach, Remote Sens. Environ., 112, 522–534, 2008. 

Graf, A., Bogena, H. R., Drüe, C., Hardelauf, H., Pütz, T., Heinemann, G., and Vereecken, H.: Spatiotemporal relations between water budget components and soil water content in a forested tributary catchment, Water Resour. Res., 50, 4837–4857, 2014. 

Grinsted, A., Moore, J. C., and Jevrejeva, S.: Application of the cross wavelet transform and wavelet coherence to geophysical time series, Nonlin. Processes Geophys., 11, 561–566, https://doi.org/10.5194/npg-11-561-2004, 2004. 

Gu, X. F., Sun, H. G., Tick, G. R., Lu, Y. H., Zhang, Y. K., Zhang, Y., and Schilling, K.: Identification and Scaling Behavior Assessment of the Dominant Hydrological Factors of Nitrate Concentrations in Streamflow, J. Hydrol. Eng., 25, 06020002, https://doi.org/10.1061/(ASCE)HE.1943-5584.0001934, 2020. 

Hargreaves, G. H.: Accuracy of estimated reference crop evapotranspiration, J. Irrig. Drain. Eng., 115, 1000–1007, 1989. 

Hu, W., Chau, H. W., Qiu, W. W., and Si, B. C.: Environmental controls on the spatial variability of soil water dynamics in a small watershed, J. Hydrol., 551, 47–55, 2017a. 

Hu, W. and Si, B. C.: Technical note: Multiple wavelet coherence for untangling scale-specific and localized multivariate relationships in geosciences, Hydrol. Earth Syst. Sci., 20, 3183–3191, https://doi.org/10.5194/hess-20-3183-2016, 2016. 

Hu, W. and Si, B.: Matlab code for multiple wavelet coherence and partial wavelet coherency, https://doi.org/10.6084/m9.figshare.13031123, 2020. 

Hu, W., Si, B. C., Biswas, A., and Chau, H. W.: Temporally stable patterns but seasonal dependent controls of soil water content: Evidence from wavelet analyses, Hydrol. Process., 31, 3697–3707, 2017b. 

Jia, X., Zha, T., Gong, J., Zhang, Y., Wu, B., Qin, S., and Peltola, H.: Multi-scale dynamics and environmental controls on net ecosystem CO2 exchange over a temperate semiarid shrubland, Agric. For. Meteorol., 259, 250–259, 2018. 

Kenney, J. F. and Keeping, E. S.: Mathematics of Statistics, D. van Nostrand, New York, United States, 1939. 

Koopmans, L. H.: The spectral analysis of time series, Academic Press, New York, United States, 1974. 

Lakshmi, V., Piechota, T., Narayan, U., and Tang, C.: Soil moisture as an indicator of weather extremes, Geophys. Res. Lett., 31, L11401, https://doi.org/10.1029/2004GL019930, 2004. 

Li, H., Dai, S., Ouyang, Z., et al.: Multi-scale temporal variation of methane flux and its controls in a subtropical tidal salt marsh in eastern China, Biogeochemistry, 137, 163–179, https://doi.org/10.1007/s10533-017-0413-y, 2018. 

Li, Z.: Applicability of simple estimating method for reference crop evapotranspiration in Loess Plateau, Trans. Chin. Soc. Agricult. Eng., 28, 106–111, 2012. 

Mares, I., Mares, C., Dobrica, V., and Demetrescu, C.: Comparative study of statistical methods to identify a predictor for discharge at Orsova in the Lower Danube Basin, Hydrolog. Sci. J., 65, 371–386, 2020. 

Mihanović, H., Orlić, M., and Pasarić, Z.: Diurnal thermocline oscillations driven by tidal flow around an island in the Middle Adriatic, J. Mar. Syst., 78, S157–S168, 2009. 

Mutascu, M. and Sokic, A.: Trade openness-CO2 emissions nexus: a wavelet evidence from EU, Environ. Model. Assess., 25, 1–18, 2020. 

Nalley, D., Adamowski, J., Biswas, A., Gharabaghi, B., and Hu, W.: A multiscale and multivariate analysis of precipitation and streamflow variability in relation to ENSO, NAO and PDO, J. Hydrol., 574, 288–307, 2019. 

Ng, E. K. and Chan, J. C.: Geophysical applications of partial wavelet coherence and multiple wavelet coherence, J. Atmos. Ocean. Technol., 29, 1845–1853, 2012a. 

Ng, E. K. and Chan, J. C.: Interannual variations of tropical cyclone activity over the north Indian Ocean, Int. J. Climatol., 32, 819–830, 2012b. 

Polansky, L., Wittemyer, G., Cross, P. C., Tambling, C. J., and Getz, W. M.: From moonlight to movement and synchronized randomness: Fourier and wavelet analyses of animal location time series data, Ecology, 91, 1506–1518, 2010. 

Rathinasamy, M., Agarwal, A., Parmar, V., Khosa, R., and Bairwa, A.: Partial wavelet coherence analysis for understanding the standalone relationship between Indian Precipitation and Teleconnection patterns, arXiv [preprint], arXiv:1702.06568, 2017. 

Schaefli, B., Maraun, D., and Holschneider, M.: What drives high flow events in the Swiss Alps? Recent developments in wavelet spectral analysis and their application to hydrology, Adv. Water Resour., 30, 2511–2525, 2007. 

Schulte, J. A., Duffy, C., and Najjar, R. G.: Geometric and topological approaches to significance testing in wavelet analysis, Nonlin. Processes Geophys., 22, 139–156, https://doi.org/10.5194/npg-22-139-2015, 2015.  

Sen, A. and Choudhury, K. D.: On the co-movement of crude, gold prices and stock index in the Indian market, International Journal of Financial Engineering, 7, 2050036, https://doi.org/10.1142/S242478632050036X, 2020. 

Shen, X., Huang, H.-C., and Cressie, N.: Nonparametric hypothesis testing for a spatial signal, J. Am. Stat. Assoc., 97, 1122–1140, 2002. 

Si, B. C.: Spatial scaling analyses of soil physical properties: A review of spectral and wavelet methods, Vadose Zone J., 7, 547–562, 2008. 

Si, B. C. and Farrell, R. E.: Scale-dependent relationship between wheat yield and topographic indices: A wavelet approach, Soil Sci. Soc. Am. J., 68, 577–587, 2004. 

Si, B. C. and Zeleke, T. B.: Wavelet coherency analysis to relate saturated hydraulic properties to soil physical properties, Water Resour. Res., 41, W11424, https://doi.org/10.1029/2005WR004118, 2005. 

Song, X. M., Zhang, C. H., Zhang, J. Y., Zou, X. J., Mo, Y. C., and Tian, Y. M.: Potential linkages of precipitation extremes in Beijing-Tianjin-Hebei region, China, with large-scale climate patterns using wavelet-based approaches, Theor. Appl. Climatol., 141, 1251–1269, 2020. 

Su, L., Miao, C., Duan, Q., Lei, X., and Li, H.: Multiple wavelet coherence of world's large rivers with meteorological factors and ocean signals, J. Geophys. Res. Atmos., 124, 4932–4954, 2019. 

Szolgayová, E., Arlt, J., Blöschl, G., and Szolgay, J.: Wavelet based deseasonalization for modelling and forecasting of daily discharge series considering long range dependence, J. Hydrol. Hydromech., 62, 24–32, 2014. 

Tan, X., Gan, T. Y., and Shao, D.: Wavelet analysis of precipitation extremes over Canadian ecoregions and teleconnections to large-climate anomalies, J. Geophys. Res.-Atmos., 121, 14469–14486, 2016. 

Torrence, C. and Compo, G. P.: A practical guide to wavelet analysis, B. Am. Meteorol. Soc., 79, 61–78, 1998. 

Wendroth, O., Alomran, A. M., Kirda, C., Reichardt, K., and Nielsen, D. R.: State-Space Approach to Spatial Variability of Crop Yield, Soil Sci. Soc. Am. J., 56, 801–807, 1992. 

Westfall, P. H. and Young, S. S.: Resampling-based multiple testing: Examples and methods for p-value adjustment, John Wiley & Sons, New York, United States, 1993. 

Wu, K., Zhu, J., Xu, M., and Yang, L.: Can crude oil drive the co-movement in the international stock market? Evidence from partial wavelet coherence analysis, North Am. J. Econ. Finance, 2020, 101194, https://doi.org/10.1016/j.najef.2020.101194, 2020. 

Yan, R. and Gao, R. X.: A tour of the Hilbert-Huang transform: an empirical tool for signal analysis, IEEE Instrum. Meas. Mag., 10, 40–45, 2007. 

Zhao, R., Biswas, A., Zhou, Y., Zhou, Y., Shi, Z., and Li, H.: Identifying localized and scale-specific multivariate controls of soil organic matter variations using multiple wavelet coherence, Sci. Total Environ., 643, 548–558, 2018. 

Short summary
The partial wavelet coherency method is improved to explore bivariate relationships at different scales and locations after excluding the effects of other variables. The method was tested with artificial datasets and applied to a measured dataset. Compared with previous methods, this method has the advantages of capturing phase information, dealing with multiple excluding variables, and producing more accurate results. This method can be used in different areas with spatial or temporal datasets.