Short memory or long memory



Introduction
Recent analyses have claimed that the possible presence of non-stationarity is produced by either a trend or long-term cyclic fluctuations. However, it is well known that a reliable assessment of the presence of non-stationarity in hydrological records is not an easy task, because of the limited length of the available data sets. This makes it difficult to distinguish between non-stationarity, sample variability and long-term climatic fluctuations (Brath et al., 1999).
Studies by Cheung (1993) and Diebold and Inoue (2001) have shown that there is a bias in favour of finding long-memory processes when structural breaks are not accounted for in the series. The observed long memory behaviour can therefore be due to neglected structural breaks. The presence of breaks in the series and/or of long memory behaviour in the break-free series could indicate whether the series shows real breaks and long memory. Further insight can be obtained by estimating the difference parameter on the subseries identified by splitting the original series according to the estimated break dates: in the case of erroneous long memory identification, the subseries are expected to show short memory.
If a time series has one or more structural breaks, then the series has one or more discontinuities in the data generating process (DGP). In this case a structural break method will report a number of breaks that divide the series into regimes belonging to different subpopulations. The statistical properties of the subpopulations within these regimes then need to be estimated, and the estimated differences will be the result of actual differences between the samples. The task of estimating the break dates can be accomplished within the framework of least squares regression.
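As a rough illustration of the least squares idea in the simplest case, a single shift in the mean, the sketch below scans all admissible split points and keeps the one that minimises the total within-regime sum of squared residuals. The function name `estimate_break_date` and the 15% trimming fraction are assumptions made for this example, not details taken from the paper.

```python
def estimate_break_date(x, trim=0.15):
    """Least squares estimate of a single break in the mean of x.

    Returns (k, ssr): the series is split into x[:k] and x[k:], and ssr is
    the sum of squared deviations from the two regime means.
    The trimming fraction `trim` is an illustrative assumption.
    """
    n = len(x)
    lo = max(1, int(trim * n))
    hi = min(n - 1, n - int(trim * n))

    # Prefix sums let us evaluate each candidate split in constant time.
    prefix = [0.0]
    prefix_sq = [0.0]
    for v in x:
        prefix.append(prefix[-1] + v)
        prefix_sq.append(prefix_sq[-1] + v * v)

    best_k, best_ssr = None, float("inf")
    for k in range(lo, hi + 1):
        left = prefix_sq[k] - prefix[k] ** 2 / k
        right = (prefix_sq[n] - prefix_sq[k]) - (prefix[n] - prefix[k]) ** 2 / (n - k)
        if left + right < best_ssr:
            best_k, best_ssr = k, left + right
    return best_k, best_ssr
```

Calling `estimate_break_date(series)` on a daily series would return the index of the first observation of the second regime. Multiple breaks require a more elaborate search, which is not sketched here.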
In the last decade a lot of interest, both in the econometric and in the statistical literature, has been paid to the issue of confusing long memory with occasional structural breaks in mean; see, among others, Diebold and Inoue (2001), Granger and Hyung (2004) and Smith (2005).
Indeed, there is evidence that a stationary short memory process that encounters occasional structural breaks in mean can show a slower rate of decay in the autocorrelation function and other properties of I(d) processes, where d can be a fraction. For the purpose of this paper, we use the term true long memory to refer to fractionally integrated series. On the other hand, when the DGP is an integrated or fractionally integrated process (with no breaks), several breaks can be detected spuriously: in fact, there is a relation between the number of breaks and the value of d.
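The first effect can be seen in a small simulation. The sketch below compares the sample autocorrelation of Gaussian white noise with that of the same noise after a single shift in mean is added halfway through. The sample size, the shift size of 3 and the seed are arbitrary choices for illustration and have no connection to the rainfall data.

```python
import random


def sample_acf(x, max_lag):
    """Sample autocorrelations r(0), ..., r(max_lag) of the sequence x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        for k in range(max_lag + 1)
    ]


rng = random.Random(42)  # fixed seed so the run is reproducible
n = 1000
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Same short memory noise, plus one break in mean at the midpoint.
shifted = [e + (3.0 if t >= n // 2 else 0.0) for t, e in enumerate(noise)]

acf_noise = sample_acf(noise, 50)
acf_shifted = sample_acf(shifted, 50)
print("lag 50, no break:  ", round(acf_noise[50], 3))
print("lag 50, with break:", round(acf_shifted[50], 3))
```

In this setting the autocorrelation of the shifted series stays large at lag 50 while that of the pure noise is close to zero, which is the kind of slow decay that can be mistaken for long memory.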
The literature on tests to distinguish between true long memory and various spurious long memory models has been growing steadily. For example, Berkes et al. (2006) and Shao (2011) proposed a testing procedure to discriminate a stationary long memory time series from a short-range dependent time series with change points in the mean. Their null hypothesis corresponds to changes in the mean and their alternative is that the series is stationary with long memory. The test statistic is a modification of the cumulative sum (CUSUM-type) test, which is quite popular in the change point detection literature.
Time series with structural breaks can generate strong persistence in the autocorrelation function; that is, the observed long memory behaviour can be due to neglected structural breaks and, conversely, long memory processes may cause breaks to be detected spuriously. In order to gain insight into "which is which", we propose to fit long memory and structural break models separately, following Cappelli and Angela (2006). In case both provide a plausible explanation of the data generating process (DGP) of the data at hand, the long memory and structural break analyses are repeated on the break-free series and on the filtered series, respectively. However, little has been done in this area in Malaysia.
Heavy rainfall can bring disasters such as floods and landslides. A shortage of rainfall, in turn, can affect the water management system in ways that cause problems for economic activities. Therefore, there is a need to investigate the characteristics of a country's rainfall intensively and comprehensively. Daily rainfall has been modelled with various mathematical models throughout the world to give a better understanding of the rainfall pattern and its characteristics (Suhaila and Abdul Aziz, 2008).
The purpose of this paper is to apply a simple strategy to study whether the data generating process of the daily rainfall data sets of nine weather stations across Malaysia exhibits true long memory behaviour. The presence of long memory in hydrological time series is a well-known phenomenon, while the presence of structural breaks in these series represents a relevant environmental issue.
The long memory, or long-term dependence, property describes the high-order correlation structure of a series. If a series exhibits long memory, there is persistent temporal dependence even between distant observations; such series are characterized by distinct but non-periodic cyclical patterns. Fractionally integrated processes can give rise to long memory (Alptekin, 2006). Fractional integration is part of the larger classification of time series commonly referred to as long memory models. Long memory models address the degree of persistence in the data. In empirical modelling of long memory processes, the autoregressive fractionally integrated moving average (ARFIMA) model proposed by Granger and Joyeux (1980) and Hosking (1981) is used.

Statistical model
A time series process $\{X_t, t = 0, \pm 1, \ldots\}$ is said to be (covariance) stationary if the mean and the variance do not depend on time and the covariance between any two observations depends on the temporal distance between them but not on their specific location in time. This is a minimal requirement in time series analysis to make statistical inference.
Given a zero-mean covariance-stationary process $\{X_t, t = 0, \pm 1, \ldots\}$, with autocovariance function $\gamma_\mu = E(X_t X_{t+\mu})$, we say that $X_t$ is integrated of order zero (denoted by $X_t \approx I(0)$) if

$$\sum_{\mu=-\infty}^{\infty} |\gamma_\mu| < \infty. \qquad (1)$$

If the time series is nonstationary, one possibility for transforming the series into a stationary one is to take first differences, such that

$$(1 - B) X_t = u_t, \qquad (2)$$

where $B$ is the lag operator ($B X_t = X_{t-1}$) and $u_t$ is I(0) as defined above. In such a case, $X_t$ is said to be integrated of order 1 (denoted $X_t \approx I(1)$). Likewise, if two differences are required, the series is integrated of order 2 (I(2)). If the number of differences required to get I(0) stationarity is not an integer value but a fractional one, the process is said to be fractionally integrated or I(d). In other words, we say that $X_t$ is integrated of order $d$, where $d$ is a fractional value (I(d)), if

$$(1 - B)^d X_t = u_t, \qquad (3)$$

with $u_t$ equal to I(0). Note that the polynomial on the left-hand side of Eq. (3) can be expressed in terms of its binomial expansion, such that, for all real $d$,

$$(1 - B)^d = \sum_{j=0}^{\infty} \binom{d}{j} (-1)^j B^j = 1 - dB + \frac{d(d-1)}{2} B^2 - \cdots. \qquad (4)$$

If $d$ is a positive integer value, $X_t$ will be a function of a finite number of past observations, while if $d$ is not an integer, $X_t$ depends strongly upon values of the time series far in the past (e.g. Granger and Ding, 1996; Dueker and Asea, 1998). Moreover, the higher the value of $d$, the higher will be the level of association between the observations.
The parameter $d$ plays an important role from a statistical viewpoint. If $-0.5 < d < 0.5$, $X_t$ is a stationary and ergodic process. One important class of processes occurs when $u_t$ is I(0) and covariance stationary, with a bounded and positively valued spectrum at all frequencies. For $0 < d < 1/2$, the process exhibits long memory in the sense that the sum in Eq. (1) diverges; its autocorrelations are all positive and decay at a hyperbolic rate. For $-0.5 < d < 0$, the sum of the absolute values of the process autocorrelations tends to a constant, so that the process has short memory according to Eq. (1). In this situation the ARFIMA(0, d, 0) process is said to be anti-persistent or to have intermediate memory, and all its autocorrelations excluding lag zero are negative and decay hyperbolically to zero. As $d$ increases beyond 1/2 and through 1 (the unit root case), $X_t$ can be viewed as becoming "more nonstationary" in the sense, for example, that the variance of the partial sums increases in magnitude. This is also true for $d > 1$.
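The binomial expansion of the fractional difference operator can be made concrete with a short sketch. The coefficients of $(1 - B)^d$ satisfy the recursion $\pi_0 = 1$, $\pi_j = \pi_{j-1}(j - 1 - d)/j$, which follows from the binomial coefficients. The function names below are chosen for this example, and the filter is truncated at the start of the sample, which is an approximation to the infinite expansion.

```python
def frac_diff_weights(d, n_weights):
    """First n_weights coefficients of (1 - B)^d, starting with pi_0 = 1."""
    weights = [1.0]
    for j in range(1, n_weights):
        weights.append(weights[-1] * (j - 1 - d) / j)
    return weights


def frac_diff(x, d):
    """Apply (1 - B)^d to x, truncating the expansion at the first observation."""
    w = frac_diff_weights(d, len(x))
    return [
        sum(w[j] * x[t - j] for j in range(t + 1))
        for t in range(len(x))
    ]
```

For d = 1 the weights reduce to 1, -1, 0, 0, ..., so only one past observation enters, whereas for a fractional value such as d = 0.4 every weight is non-zero and decays slowly, which is why distant observations keep influencing the current value.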

Test and estimation of order of integration
There exist several procedures for estimating the fractional differencing parameter in semiparametric contexts. Of these, the log-periodogram regression estimate proposed by Geweke and Porter-Hudak (1983), hereafter GPH, has been the most widely used (Shimotsu, 2002).
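Given a fractionally integrated process $\{Y_t\}$, its spectral density is given by

$$f(\omega) = \left[ 4 \sin^2(\omega/2) \right]^{-d} f_u(\omega), \qquad (5)$$

where $\omega$ is the Fourier frequency, $f_u(\omega)$ is the spectral density corresponding to $u_t$, and $u_t$ is a stationary short memory disturbance with zero mean. Consider the set of harmonic frequencies $\omega_j = 2\pi j / n$, $j = 0, 1, \ldots, n/2$, where $n$ is the sample size. Taking the logarithm of the spectral density we have

$$\ln f(\omega_j) = \ln f_u(0) - d \ln\left[ 4 \sin^2(\omega_j/2) \right] + \ln\left[ \frac{f_u(\omega_j)}{f_u(0)} \right],$$

which may be rewritten in the alternative form

$$\ln I(\omega_j) = \ln f_u(0) - d \ln\left[ 4 \sin^2(\omega_j/2) \right] + \ln\left[ \frac{f_u(\omega_j)}{f_u(0)} \right] + \ln\left[ \frac{I(\omega_j)}{f(\omega_j)} \right], \qquad (6)$$

where $I(\omega_j)$ is the periodogram at frequency $\omega_j$. The fractional differencing parameter $d$ can be estimated by the regression equation constructed from Eq. (6). GPH showed that, using a periodogram estimate of $f(\omega_j)$, if the number of frequencies $m$ used is a function $g(n)$ (a positive integer) of the sample size $n$, where $m = g(n) = n^{\alpha}$ with $0 < \alpha < 1$, the least squares estimate $\hat{d}$ from the above regression is asymptotically normally distributed in large samples:

$$\hat{d} = -\frac{\sum_{j=1}^{g(n)} (U_j - \bar{U}) \ln I(\omega_j)}{\sum_{j=1}^{g(n)} (U_j - \bar{U})^2}, \qquad \hat{d} \sim N\left( d, \frac{\pi^2}{6 \sum_{j=1}^{g(n)} (U_j - \bar{U})^2} \right),$$

where $U_j = \ln[4 \sin^2(\omega_j/2)]$ and $\bar{U}$ is the sample mean of $U_j$, $j = 1, \ldots, g(n)$.
Under the null hypothesis of no long memory ($d = 0$), the t-statistic

$$t_{d=0} = \hat{d} \left( \frac{\pi^2}{6 \sum_{j=1}^{g(n)} (U_j - \bar{U})^2} \right)^{-1/2}$$

has a limiting standard normal distribution.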
The value of the power factor $\alpha$ is the main determinant of the number of ordinates included in the regression. Traditionally the number of periodogram ordinates $m$ is chosen from the interval $[T^{0.45}, T^{0.55}]$, where $T$ denotes the sample size (written $n$ above). However, Hurvich and Deo (1998) showed that the optimal $m$ is of order $O(T^{0.8})$.
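The log-periodogram regression can be sketched in a few lines. The version below is a minimal illustration that assumes the standard GPH form, regressing $\ln I(\omega_j)$ on $U_j = \ln[4 \sin^2(\omega_j/2)]$ over the first $m = n^{\alpha}$ harmonic frequencies, with asymptotic variance $\pi^2 / (6 \sum (U_j - \bar{U})^2)$. The function names and the default $\alpha = 0.5$ are choices made for this example. The periodogram is computed by a direct sum so that only the standard library is needed; this is not claimed to be the implementation used in the study.

```python
import cmath
import math


def periodogram(x, j):
    """Periodogram ordinate of x at the harmonic frequency 2*pi*j/n."""
    n = len(x)
    w = 2.0 * math.pi * j / n
    dft = sum(v * cmath.exp(-1j * w * t) for t, v in enumerate(x))
    return abs(dft) ** 2 / (2.0 * math.pi * n)


def gph(x, alpha=0.5):
    """GPH estimate of d with its asymptotic standard error and t-statistic.

    Uses the first m = floor(n**alpha) non-zero harmonic frequencies.
    A constant series has zero periodogram ordinates and cannot be used.
    """
    n = len(x)
    m = int(n ** alpha)
    # omega_j / 2 = pi * j / n, hence U_j = ln[4 sin^2(pi * j / n)].
    u = [math.log(4.0 * math.sin(math.pi * j / n) ** 2) for j in range(1, m + 1)]
    y = [math.log(periodogram(x, j)) for j in range(1, m + 1)]
    u_bar = sum(u) / m
    sxx = sum((v - u_bar) ** 2 for v in u)
    d_hat = -sum((ui - u_bar) * yi for ui, yi in zip(u, y)) / sxx
    se = math.sqrt(math.pi ** 2 / (6.0 * sxx))
    return d_hat, se, d_hat / se
```

A call such as `d_hat, se, t = gph(series, alpha=0.5)` returns the estimate together with the statistic that is compared with standard normal critical values under the null hypothesis d = 0. Repeating the call for several values of `alpha` shows how sensitive the estimate is to the number of ordinates.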

Results and Discussion
Daily rainfall records for nine stations across Malaysia for the period 1 January 1968 to 31 December 2003, obtained from the Malaysian Meteorological Department, were analysed in this study. Table 1 gives the general geographic information of the selected weather stations. We start the analysis by discussing the descriptive statistics of the daily rainfall series considered. Table 2 gives some information about the distribution of the series. The skewness indicates that the mass of the distribution is concentrated on the right, or we may say that the distribution is left-skewed. The positive kurtosis shows a leptokurtic condition, with fatter tails than the normal distribution. The standard deviations are greater than the corresponding mean values, which indicates that daily rainfall fluctuates significantly through time.
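For reference, summary statistics of this kind can be computed as in the sketch below. It assumes the moment-based definitions, with skewness $m_3 / m_2^{3/2}$ and excess kurtosis $m_4 / m_2^2 - 3$, where $m_k$ is the $k$-th central sample moment with divisor $n$. The paper does not state which definitions were used for Table 2, so these are assumptions. Under this convention a negative skewness corresponds to a longer left tail and a positive one to a longer right tail.

```python
import math


def describe(x):
    """Mean, standard deviation, skewness and excess kurtosis of x.

    Central moments use divisor n (an assumption; other conventions exist).
    """
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return {
        "mean": mean,
        "sd": math.sqrt(m2),
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,
    }
```

For the toy sample `[0, 0, 0, 0, 10]`, four dry days and one wet day, the standard deviation is larger than the mean, which is the same pattern reported for the rainfall series.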
Figure 2 below depicts time series plots of Kota Bahru and Kuantan. From the figure, the time series plots show very persistent behaviour. The autocorrelation function (Fig. 2) provides a measure of temporal correlation between rainfall data points at different time lags. Thus, autocorrelation provides initial information on the internal organisation of each time series (Mahdi and Petra, 2002). The prevalence of autocorrelation in a data series is also an indication of persistence in the series of observations. The autocorrelation coefficients provide an essential hint as to whether forecasting models can be developed based on the given data (Janssen and Laatz, 1997). For a purely random process, all autocorrelation coefficients are zero, apart from r(0), which is equal to 1. The ACF plots in Fig. 2 decay at hyperbolic rates and indicate that the time series are strongly correlated, that is, the correlations persist up to long lags. These are the main characteristics of long memory.
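For reference, the usual sample estimate of the autocorrelation at lag $k$ for a series $x_1, \ldots, x_n$ with sample mean $\bar{x}$ is given below. It is assumed here that the plotted coefficients follow this standard definition, since the paper does not state the estimator explicitly.

$$r(k) = \frac{\sum_{t=1}^{n-k} (x_t - \bar{x})(x_{t+k} - \bar{x})}{\sum_{t=1}^{n} (x_t - \bar{x})^2}, \qquad r(0) = 1.$$

For a purely random series the sample values $r(k)$, $k \geq 1$, are not exactly zero but fall approximately within $\pm 1.96 / \sqrt{n}$ about 95% of the time, which gives a rough benchmark against which slow hyperbolic decay can be judged.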
Plots of the first differenced data (Fig. 3), with their corresponding correlograms and periodograms, are displayed in Fig. 4. The series may now be stationary, though the correlograms still show significant values even at some lags relatively far from zero. This may indicate that fractional differencing of an order smaller or greater than 1 is more appropriate than first differencing. In addition, the periodograms show values close to 0 at the zero frequency, which might suggest that these series are now overdifferenced.
It is a well-known fact that, when analysing climatic time series, distinguishing between long-term fluctuations and non-stationarity is not a simple task. Nevertheless, such a distinction would be extremely interesting. In fact, the presence of long-term climatic fluctuations, rather than non-stationarity, would imply that the patterns found in the data set could likely be attributed to cyclical behaviour rather than to irreversible tendencies. A useful tool for making this distinction is the detection of the possible presence of long memory in the data. As mentioned before, one of the effects of long memory is the tendency of the time series to be subject to long-term cycles.
A time series that is generated by a true long memory process has a uniform data generating process (DGP) throughout the entire series. Thus, if a structural break location method is mistakenly applied to the series, it may report a number of breaks where no breaks exist. These spurious breaks will yield a number of partitions of differing lengths, but these partitions will only be subsamples of a single population. The subsamples will therefore have the same statistical properties as the full series, and any estimated differences will be the result of randomness and long-range serial correlation, not of differences between the samples. If a time series has one or more structural breaks, then the series has one or more discontinuities in the DGP. In this case a structural break method will report a number of breaks that divide the series into regimes belonging to different subpopulations. The statistical properties of the subpopulations within these regimes then need to be estimated, and the estimated differences will be the result of actual differences between the samples.
The estimated values of the fractional differencing parameter, along with the ARFIMA(p, d, q) model for each data set, are displayed in Table 3. From Table 3, we can observe that all the rainfall series can be described in terms of fractional integration, and thus belong to the class of long memory models. In order to show how the strategy described above can help in discriminating between long memory and structural breaks, we analysed the series for which an ARFIMA(0, d, 0) model was identified. The estimate of the fractional parameter d was obtained by means of the GPH log-periodogram regression. Table 3 indicates the presence of long memory in all data sets. At the same time, we examined the fluctuation process using the OLS-based CUSUM test together with the F-statistic for a break. The break dates found in these data sets differ from station to station, although those of Sitiawan and Subang fall within the same year, as do those of Ipoh and Bayan Lepas. Figure 3 below gives the fluctuation process and the plots of the F-statistic for each data set considered. The series were partitioned according to their break dates and all subseries were tested for long memory. Table 4 gives the break date along with the estimates of the differencing parameters $d_1$ and $d_2$ for the series before and after the break, respectively.
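A minimal sketch of an OLS-based CUSUM fluctuation process for a constant-mean model is given below. It assumes the common form in which the cumulative sums of the OLS residuals (here the deviations from the sample mean) are scaled by $\hat{\sigma}\sqrt{n}$, and the maximum absolute value is compared with 1.358, the asymptotic 5% critical value of the supremum of a Brownian bridge. The function name and this specific setup are assumptions for illustration; the exact settings used in the study are not given in the text.

```python
import math


def ols_cusum(x):
    """OLS-based CUSUM process for a constant-mean model.

    Returns (stat, path): path[t] is the scaled cumulative sum of residuals
    up to observation t, and stat is the maximum absolute value of path.
    """
    n = len(x)
    mean = sum(x) / n
    resid = [v - mean for v in x]
    sigma = math.sqrt(sum(r * r for r in resid) / (n - 1))

    path = []
    running = 0.0
    for r in resid:
        running += r
        path.append(running / (sigma * math.sqrt(n)))
    stat = max(abs(p) for p in path)
    return stat, path


CRITICAL_VALUE_5PCT = 1.358  # sup of |Brownian bridge|, asymptotic 5% level
```

If `stat` exceeds the critical value, the hypothesis of a constant mean is rejected and the location of the largest excursion of `path` suggests a break date. The series can then be sliced at that point and the estimate of d recomputed on each subseries, which is the partition step described above.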

Conclusions
It is now well established that long memory and structural change are easily confused; however, most researchers choose to ignore the problem of structural breaks when testing for long memory. It is a known fact that a short memory process with structural breaks may exhibit the properties of long memory. The main contribution of the paper was to determine whether the daily rainfall series at some locations across Malaysia are generated by a true long memory process. An approach based on the fractionally integrated FI(d) process was applied to characterize each series. The findings indicate that all the data sets exhibit long memory, but is it true long memory? To answer this, we employed a method that detects a structural break in each series. The series were partitioned according to the break dates identified, and the same test was applied to the subseries; all subseries displayed the same properties as the original series. Therefore we conclude that all the rainfall data sets considered were generated by a true long memory process.

References
Granger, C. W. J. and Ding, Z.: Varieties of long memory models, J. Econometrics, 73, 61-77, 1996.
Granger, C. W. J. and Hyung, N.: Occasional structural breaks and long memory with an application to the S&P 500 absolute stock returns, J. Empir. Financ., 11, 213-228, 2004.
Granger, C. W. J. and Joyeux, R.: An introduction to long-range time series models and frac-

Table 1. Selected weather stations in Malaysia and their general geographic information.

Table 2. Summary statistics for the rainfall data sets.

Table 4. CUSUM test with corresponding break dates and estimates of the differencing parameters $d_1$ and $d_2$.