Multi-step-ahead predictor design for effective long-term forecast of hydrological signals using a novel wavelet neural network hybrid model

To increase the accuracy of serial-propagated long-range multi-step-ahead (MSA) prediction, which has high practical value but is difficult to implement because of large error accumulation, a novel wavelet neural network hybrid model, CDW-NN, combining continuous and discrete wavelet transforms (CWT and DWT) with neural networks (NNs), is designed as the MSA predictor for effective long-term forecasting of hydrological signals. By applying 12 types of hybrid and pure models to 1096-day forecasts of estuarine river stages, the different forecast performances and the advantages of the CDW-NN model, with the corresponding driving mechanisms, are discussed. One type of CDW-NN model, CDW-NF, which uses neuro-fuzzy as the forecast submodel, proved to be the most effective MSA predictor, with a prominent accuracy enhancement over the whole 1096-day forecast. The particular advantage of the CDW-NF model lies in the CWT-based methodology, which determines the 15-day and 28-day prior data series as model inputs by revealing the significant short-term periodicities in estuarine river stage signals. Compared with conventional single-step-ahead-based long-term forecast models, the CWT-based hybrid models broaden the prediction range in each forecast step from 1 day to 15 days, thereby reducing the overall number of forecasting iterations from 1096 to 74 and significantly decreasing error accumulation. In addition, combining the advantages of the DWT method and the neuro-fuzzy system helps filter the noisy dynamics in the model inputs and enhances the simulation and forecast ability for the complex hydro-system.


Introduction
Forecasting hydrological signals, especially over the long term, is important for the study and guidance of water resource management. Nevertheless, hydrological signals arise from highly complex nonlinear systems and vary severely in time and space, which makes accurate forecasts difficult. Generally, hydrological time series have been predicted with models based on physical considerations or other numerical theories, such as the LR (linear regressive) analysis methods (Salas et al., 1980) based on stochastic theory, the grey models (Deng, 1992) based on grey information theory, the chaos models (Jayawardena and Lai, 1994; Islam and Sivakumar, 2002) based on the local similarity of signals, the fuzzy prediction models (Jang, 1993; Jang et al., 1997; Chen, 2005) based on fuzzy theory, the TAR (threshold auto-regression), BL (bilinear time series), and SVM (support vector machine) models (Tong, 1990; Liong and Sivapragasm, 2002; Zou et al., 2010) based on nonlinear time-series analysis, the ANN (artificial neural network) models (Raman and Sunlikumar, 1995; Yu et al., 2008; Yang et al., 2009) based on black-box theory, and the NNB (nearest neighbour bootstrapping) regressive models (Wang et al., 2001) based on nonparametric prediction theory. However, these models were generally not successful enough in producing accurate predictions because of inaccurate initial conditions, parameterisation schemes of sub-scale phenomena, and limited spatial resolution (Olson et al., 1995).
Many hybrid models have been proposed as predictors to improve the accuracy of hydrological time-series forecasts, such as the wavelet artificial neural network (ANN) model (Anctil and Tape, 2004; Krishna et al., 2011; Nayak et al., 2013), the periodic ANN (PANN) model (Wang et al., 2006), the chaotic ANN model (Karunasinghe and Liong, 2006), the hybrid fuzzy-ANN model (Nayak et al., 2007), the wavelet-based grey model (Chou, 2007), the wavelet-based NF (neuro-fuzzy) model (Partal and Kisi, 2007; Engin et al., 2007; El-Shafie et al., 2007), the non-supervised ANN-EA (evolutionary algorithms) model (Cao and Park, 2007; Chang et al., 2007), the fuzzy-SVM model (Hua et al., 2008), the wavelet-based multi-layer perceptron model (Kisi, 2008), the wavelet-regression (WR) model (Kisi, 2011), and the wavelet-based fuzzy logic model (Ozger et al., 2012). These hybrid models have shown different advantages for accurate predictions due to their capabilities of utilising present information effectively. Among these hybrid models, the neural network (NN) models, such as the NF (neuro-fuzzy) and ANN, are the most popularly utilised sub-models for signal forecast due to their capabilities of effectively learning complex and nonlinear relationships (Maier et al., 2010). The ANN model has been commonly used in hydrological signal forecasts by a number of researchers (French et al., 1992; Jain et al., 1999; ASCE, 2000; Cigizoglu, 2005; Marzano et al., 2006; Zou et al., 2010). The NF model has been introduced and successfully used in the hydrological sciences in recent years (Nayak et al., 2004, 2005; Kisi, 2005; Chang and Chang, 2006). In addition, the wavelet transform, as a strong mathematical tool that provides a good local representation of a signal in both the time and frequency domains, has become a useful method for analysing variations, periodicities, and trends in time series (Daubechies, 1994; Torrence and Compo, 1998; Coulibaly and Burn, 2004; Partal and Kucuk, 2006). Among the various types of wavelet transforms, the discrete wavelet transform (DWT) is popularly used as the data preprocessing method in a hybrid model to decompose the original signal inputs due to its capabilities of effectively classifying a hydro-meteorological time series into distinct time and frequency domains (Smith et al., 1998; Kim and Valdes, 2003; Labat, 2005).
Because of the common Markovian property (Bolch et al., 2006) embedded in hydro-meteorological time series, most recent pure and hybrid models use data series at different previous time points as model inputs to forecast the original data series at the current time point. For daily time series, the data series from 1 day prior to a few days prior are usually used as model inputs, namely using data series $S_{t-1}$, $S_{t-2}$, ... as inputs to forecast $S_t$. The data series at 1 day prior is always selected as one of the inputs because of the usually high lag-1 autocorrelation (Kisi, 2008, 2011; Zhou et al., 2008). This selection principle denotes a type of commonly used single-step-ahead (SSA) prediction (Parlos et al., 2000), in which each single forecasting step of the Markovian property-based model can only predict the next 1-day datum (Fig. 1a). However, the SSA prediction may not provide enough information, especially in situations in which it is desirable to understand the behaviour of multiple steps in the future, such as signal processing and time-series prediction. Given this issue, the serial-propagated multi-step-ahead (MSA) prediction (Fig. 1b), which attempts to make predictions several time steps into the future without the availability of output measurements, has attracted an increasing number of scientific studies (Su et al., 1992; Schenker and Agarwal, 1995; Coulibaly et al., 2000; Gao et al., 2002; Chang et al., 2007; Yong et al., 2010; Chang et al., 2012). However, MSA predictors, especially long-range MSA predictors, are difficult to develop, because the lack of measurements in the prediction horizon necessitates the recursive use of SSA predictors to reach the end point on the horizon. Even small SSA prediction errors at the beginning of the horizon accumulate and propagate, often resulting in poor prediction accuracy. Over the last 20 yr, MSA predictor design to increase the MSA prediction accuracy has received much attention, and different types of neural networks have been used successfully for some short-range MSA predictions (Su et al., 1992; Parlos et al., 2000; Chang et al., 2007).
As mentioned above, the crucial and most difficult point in a "true" long-term forecast of hydrological signals (Yu et al., 2013) is the development of effective models to reduce the error accumulation and increase the accuracy of the long-range serial-propagated MSA prediction. In view of this, the present study designed a novel hybrid model, CDW-NN, combining continuous and discrete wavelet transforms and neural networks, as the MSA predictor for effective long-term forecast of hydrological signals by broadening the prediction range in each forecast step and reducing the total iteration steps in the long-term forecasting process. In the remainder of this paper, the long-term forecast methodologies of the MSA predictor CDW-NN are presented. In the next section, the details of the daily river stage data series at different hydro-stations in the Yangtze River estuary, China, are presented, and the CDW-NN hybrid models are applied to the long-term forecasts of different river stage signals. The results are discussed by comparing with the performances of other pure and hybrid models in the subsequent section, and finally conclusions are drawn.

Continuous wavelet transform (CWT) and discrete wavelet transform (DWT)
Wavelet transform is a mathematical tool that allows the decomposition of the signal $f(t)$ in terms of elementary contributions called wavelets (Sadowskey, 1996; Labat et al., 2005). For the time series $f(t) \in L^2(R)$, or finite energy signal, the CWT of the signal $f(t)$ with the analysing wavelet $\phi$ is the convolution of $f(t)$ with a set of dilated and translated wavelets:

$$W_f(a,b) = |a|^{-1/2} \int_R f(t)\,\bar{\phi}\!\left(\frac{t-b}{a}\right)\mathrm{d}t, \tag{1}$$

where $\bar{\phi}(t)$ is the complex conjugate function of $\phi(t)$, $a$ the dilation (scale or frequency) parameter, $b$ the translation (position or time) parameter, $R$ the domain of real numbers, and $\delta t$ the time interval of the data series. In this paper, the time interval of the data series equals 1.0 day, and the popularly used Morlet wavelet is selected as $\phi$ (Mallat, 1989; Daubechies, 1994; Torrence and Compo, 1998). The Morlet wavelet, which is a complex wavelet consisting of a plane wave modulated by a Gaussian function, is defined by

$$\phi(t) = \pi^{-1/4}\, e^{i\omega_0 t}\, e^{-t^2/2}, \tag{2}$$

where $\omega_0$ is the non-dimensional frequency (usually taken to be 6 to satisfy the admissibility condition) (Farge, 1992). The global wavelet power spectrum is defined as the power density at different timescales $a$, which is calculated by

$$E_a = \frac{1}{N}\sum_{b=1}^{N} \left|W_f(a,b)\right|^2, \tag{3}$$

where $N$ is the length of the data. The signal's periodicity is indicated by the timescale at which a crest of the wavelet power spectrum is observed. The significance of the global wavelet power spectrum is tested using a white or red noise model by comparing it with the theoretical global wavelet power spectrum ($P$). $P$ is given as (Torrence and Compo, 1998)

$$P = \sigma^2 P_a \frac{\chi_v^2(p)}{v}, \tag{4}$$

where $\sigma^2$ is the variance of the data series, $\chi_v^2(p)$ the inverse of the chi-squared cumulative distribution with $v$ degrees of freedom at the requested confidence level $1 - p$, and $p$ the distribution fraction. For a lag-1 autocorrelation $r(1) < 0.1$, $P_a$ is the white noise spectrum, and for $r(1) > 0.1$, $P_a$ is the red noise spectrum. For the Morlet wavelet, $P_a$ is given as Eq. (5), and $v$ is given as Eq. (6):

$$P_a = \frac{1 - r(1)^2}{1 + r(1)^2 - 2\,r(1)\cos\!\left(2\pi\,\delta t / 1.033a\right)}, \tag{5}$$

$$v = 2\sqrt{1 + \left(\frac{N\,\delta t}{2.32a}\right)^2}. \tag{6}$$

In this study, a significance level of 0.005 was selected (e.g. $\chi_2^2(99.5\,\%) = 10.597$).
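The significance test described above can be illustrated with a small sketch. It assumes the Torrence and Compo (1998) forms of the noise spectrum for the Morlet wavelet (Fourier period of about 1.033 times the scale) and handles only the special case of 2 degrees of freedom, for which the inverse chi-squared distribution has a closed form. The function names are hypothetical and are not taken from the study.

```python
import math

def chi2_inv_2dof(confidence):
    """Inverse chi-squared CDF for exactly 2 degrees of freedom.

    For v = 2 the CDF is 1 - exp(-x / 2), so the inverse is closed-form.
    """
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    return -2.0 * math.log(1.0 - confidence)

def background_spectrum(r1, scale, dt=1.0):
    """Normalised noise spectrum P_a at wavelet scale `scale` (same unit as dt)."""
    if r1 < 0.1:                       # threshold quoted in the text
        return 1.0                     # white noise: flat spectrum
    period = 1.033 * scale             # assumed Morlet (omega_0 = 6) Fourier period
    return (1.0 - r1 ** 2) / (
        1.0 + r1 ** 2 - 2.0 * r1 * math.cos(2.0 * math.pi * dt / period)
    )

def threshold_2dof(variance, r1, scale, confidence=0.995, dt=1.0):
    """Power a spectral peak must exceed: sigma^2 * P_a * chi2_v / v, with v = 2."""
    return variance * background_spectrum(r1, scale, dt) * chi2_inv_2dof(confidence) / 2.0

print(round(chi2_inv_2dof(0.995), 3))  # 10.597
```

The printed value reproduces the $\chi_2^2(99.5\,\%)$ figure quoted in the text. For other degrees of freedom a statistical library would be needed in place of the closed form.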
The continuous wavelet transform (Eq. 1) is often discretised in real applications. When $a = a_0^j$ and $b = k b_0 a_0^j$, where $k$ and $j$ are integers, the DWT of $f(t)$ can be written as

$$W_f(j,k) = a_0^{-j/2} \int_R f(t)\,\bar{\phi}\!\left(a_0^{-j} t - k b_0\right)\mathrm{d}t. \tag{7}$$

Based on the commonly used Mallat algorithm for calculating discrete wavelet coefficients, the most common and simplest choice for the parameters $a_0$ and $b_0$ is 2 and 1 time steps, respectively, and the Daubechies wavelet, which has no explicit mathematical expression and can be calculated only numerically, is commonly used in the DWT (Mallat, 1989; Daubechies, 1994; Partal and Kucuk, 2006; Kisi, 2011). For a discrete time series $f(t)$ occurring at different times $t$ (integer time steps are used herein), the DWT can be defined as

$$W_f(j,k) = 2^{-j/2} \sum_{t=0}^{N-1} f(t)\,\bar{\phi}\!\left(2^{-j} t - k\right), \tag{8}$$

where $N$ is the number of discrete time steps, and $W_f(j,k)$ is the wavelet coefficient for the discrete wavelet of scale $a = 2^j$ and time $b = 2^j k$.

Neuro-fuzzy (NF) and BP-ANN
The popular neural network (NN) model neuro-fuzzy, based on an adaptive neuro-fuzzy inference system (ANFIS) (Jang et al., 1997; Partal and Kisi, 2007), is utilised as the sub-model for the different hydro-meteorological signal forecasts in this paper. The ANFIS, first introduced by Jang (1993), is a universal approximator and, as such, is capable of approximating any real continuous function on a compact set to any degree of accuracy. The ANFIS is functionally equivalent to the Sugeno first-order fuzzy model (Jang et al., 1997; Drake, 2000), and its typical architecture with five learning layers is shown in Fig. 2. The crucial point in determining the optimal NF model structure is the selection of the transfer function and the number of rules in Layer 1 of the ANFIS architecture (Engin et al., 2007). Gauss and Bell functions are two commonly used transfer functions, and at least two rules are necessary for each node in Layer 1 (Jang et al., 1997). Since there is no theory yet about the suitable selection for any case, the optimal NF model structure is usually determined by many trials (i.e. trying different transfer functions and different rule numbers). Because more rules increase the complexity and computational difficulty of the ANFIS, rule numbers from 2 to 5 with a step size of 1 in each trial are used in testing the optimal NF model structure in the presented case study. The significant advantage of the NF model stems from the hybrid learning algorithm in ANFIS, which combines gradient descent, back-propagation, and the least-squares method and can rapidly train and adapt the ANFIS. Each learning epoch of the ANFIS is composed of a forward pass and a backward pass, and more information on neuro-fuzzy and ANFIS can be found in Jang's papers (Jang, 1993; Jang et al., 1997).
Another popular NN model, BP-ANN (back-propagation artificial neural network), is utilised in our case to compare the forecast performance with the NF model. Based on the back-propagation algorithm, a common three-layer feed-forward type of BP-ANN is considered; the Levenberg-Marquardt methodology, which is more powerful than conventional gradient descent techniques (Hagan and Menhaj, 1994; Kisi, 2011), is used to adjust the weights of the ANN model, and the tangent sigmoid and linear activation functions are used for the hidden and output node(s), respectively. As with the NF model structure determination, since there is no theory yet to determine how many hidden layer nodes in the BP-ANN are needed to approximate any given function, the hidden layer node number in BP-ANN is commonly determined by the trial-and-error approach (Nayak et al., 2013). Since each type of ANN model in the presented case study has two input layer nodes, the trial-and-error procedure starts with one hidden layer node, and the hidden layer node number is increased up to 10 with a step size of 1 in each trial. Over these trials, the optimal BP-ANN structure is determined by selecting the hidden layer node number with the best model training efficiency and simulating performance.

Architecture of the long-term forecasting based on the MSA predictor CDW-NN
In order to reduce the error accumulation and increase the accuracy of the long-range serial-propagated MSA prediction, the present study designs a novel hybrid model, CDW-NN, combining CWT, DWT, and NN, as the MSA predictor for effective long-term forecast of hydrological signals. The architecture of the CDW-NN hybrid model is shown in Fig. 3. Firstly, for the original given daily data series $x(1) \sim x(t)$, the CWT method is utilised to reveal its short-term periodicities, i.e. the periods at $a_1 \sim a_i$ days in Fig. 3 ($a_1 < a_2 < \ldots < a_i$). Meanwhile, the DWT decomposition of the original signal is carried out to get the new data series $TD(1) \sim TD(t)$, which is constructed by selecting and combining the optimal DWT decomposition components. Then, by combining the CWT and DWT results, the new TD series at $a_1 \sim a_i$ days ahead ($TD(t - a_1) \sim TD(t - a_i)$) are selected as the NN model inputs for model training to forecast the datum $x(t)$. According to the serial-propagated prediction principle, using $TD(t - a_1 + 1) \sim TD(t - a_i + 1)$ as model inputs predicts the first future day datum $y(t + 1)$, and using $TD(t) \sim TD(t - a_i + a_1)$ as inputs predicts $y(t + a_1)$. Here, the first batch of outputs $y(t + 1) \sim y(t + a_1)$ is predicted in the first forecasting step and can then be used as new observations for the second step of DWT decomposing and NN forecasting to predict the second batch of outputs $y(t + a_1 + 1) \sim y(t + 2a_1)$. Whereas each SSA forecast in a conventional MSA forecast predicts 1 day, each CDW-NN forecasting step outputs $a_1$ days of predictions. So, after about $n/a_1$ forecasting steps, the final long-term prediction series $y(t + 1) \sim y(t + n)$ is obtained.
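The batch-wise propagation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `predict` stands for a hypothetical trained sub-model that maps the lagged TD values to one forecast, and `transform` is a hypothetical stand-in for the DWT filtering step (the identity by default).

```python
def blockwise_forecast(history, predict, lags=(15, 28), horizon=1096, transform=None):
    """Serial-propagated forecast that yields min(lags) new values per step.

    history   : observed daily values x(1)..x(t)
    predict   : hypothetical trained sub-model, maps [TD(t-a1), TD(t-a2), ...] to one value
    transform : hypothetical filter standing in for the DWT step (identity if None)
    """
    if len(history) < max(lags):
        raise ValueError("history must cover the longest lag")
    transform = transform or list
    series = list(history)
    block = min(lags)                       # a1: days predicted per step
    forecast, steps = [], 0
    while len(forecast) < horizon:
        td = transform(series)              # re-filter observations plus earlier predictions
        known = len(series)
        size = min(block, horizon - len(forecast))
        batch = []
        for k in range(size):
            target = known + k              # 0-based index of the day being predicted
            batch.append(predict([td[target - lag] for lag in lags]))
        series.extend(batch)                # predictions act as new "observations"
        forecast.extend(batch)
        steps += 1
    return forecast, steps
```

Because the shortest lag is at least as long as the batch, every input inside a batch is already available when the batch starts, so no value predicted within the batch feeds back into the same batch. Passing `lags=(1, 2)` reproduces the conventional SSA-based recursion with one value per step.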

Studied area and data
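The high-tide level data at two time points each day during 13 yr (4748 days) covering 1998-2010, supported by the Water Resources Department of Jiangsu Province, were observed and collected from the estuarine Santiao Port hydrologic station (31.721° N, 121.698° E) and Qinglong Port hydrologic station (31.862° N, 121.239° E), located about 19.0 and 70.0 km, respectively, upstream of the entrance of the Yangtze River into the East China Sea (Fig. 4). The daily river stage data series at each station are obtained as the average of the high-tide levels at the two time points each day. The first 10 yr of river stage data (3652 days) are used for training and establishing the hybrid models (i.e. t = 3652 in Fig. 3). The remaining 3 yr of river stage data (1096 days) are used for testing the long-term forecasting performance of the hybrid models (i.e. n = 1096 in Fig. 3).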

Short-term periodic features of estuarine daily river stage series by CWT
The Morlet wavelet transform coefficients of the training data series at relatively fine timescales (from 1-day to 50-day … By further calculating the global wavelet power spectra and the corresponding theoretical power spectra using the white noise model, the significance of the wavelet power densities of the daily river stage series at different timescales at Santiao and Qinglong stations is obtained and shown in Fig. 5b and d. Results show that the Morlet wavelet transform coefficients of the daily river stage series at Santiao station exhibit two distinct quasi-periodic oscillations (QPOs), at the 12-day and 23-day timescales, and both of their global wavelet power spectra are prominent at the 99.5 % confidence level. The QPO of the estuarine daily river stage at a fine timescale is often nested in a broader timescale. At the 12-day timescale, the average changing periodicity ($T$) of the river stage time series (i.e. the average number of days per cycle between two successive time domains with positive wavelet coefficients) is 15 days, obtained by calculating and averaging the day numbers of each two neighbouring high and low river stage periods. At the 23-day scale, the average $T$ is 28 days. At Qinglong station, the Morlet wavelet transform coefficients of the daily river stage series exhibit two distinct QPOs at the 12-day and 22-day timescales, for which the corresponding $T$ values are 15 days and 28 days, the same as at Santiao station, and both of their global wavelet power spectra are prominent at the 99.5 % confidence level. Based on these prominent short-term periodic features of the estuarine daily river stage time series, the estuarine daily river stages at 15 days prior and 28 days prior are chosen to simulate and forecast the river stage at the current day (i.e. $a_1$ and $a_2$ in Fig. 3 equal 15 and 28, respectively).

Decomposition of daily river stage time series and optimal DWT components combination
The decomposition process of DWT consists of a number of filtering steps following the Mallat algorithm. The original signal of the training data series is first decomposed into an approximation ($A_1$) and a detail ($D_1$), and $A_1$ is then broken down further into lower-resolution components ($A_i$ and $D_i$). The details are the low-scale, high-frequency components of the signal, while the approximations are the high-scale, low-frequency components. The higher scales consist of the extended version of a wavelet, and the corresponding coefficients refer to the slowly changing coarse features of the low-frequency components. The lower scales present the condensed wavelet and follow the rapidly changing details (high-frequency components) of the signal (Mallat, 1989).
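The approximation/detail cascade can be made concrete with the Haar wavelet, the simplest member of the Daubechies family. This is an assumption made for illustration only: the text does not state which Daubechies order was used. With Haar, the reconstructed approximation at level j is the mean over dyadic blocks of length 2^j, and each detail is the difference between successive approximations. The function names are hypothetical.

```python
def haar_components(signal, levels):
    """Split `signal` into additive Haar components ([D_1, ..., D_levels], A_levels)."""
    n = len(signal)
    if n % (2 ** levels) != 0:
        raise ValueError("length must be a multiple of 2**levels")
    approx = list(signal)                   # A_0 is the signal itself
    details = []
    for j in range(1, levels + 1):
        width = 2 ** j
        coarser = []
        for start in range(0, n, width):
            mean = sum(signal[start:start + width]) / width
            coarser.extend([mean] * width)  # A_j: block means over 2**j samples
        details.append([a - c for a, c in zip(approx, coarser)])  # D_j = A_{j-1} - A_j
        approx = coarser
    return details, approx

def combine(details, picks):
    """Add the chosen detail levels (1-based), e.g. picks=(3, 4, 8)."""
    return [sum(details[j - 1][t] for j in picks) for t in range(len(details[0]))]
```

By construction the details and the final approximation sum back to the original series, so a series built from a subset of details, as with `combine`, is a filtered version of the signal.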
The squared correlation coefficients ($R^2$) between each discrete wavelet component ($A_{10}$, $D_1$-$D_{10}$) at 15 days and 28 days prior ($t - 15$ and $t - 28$) and the original time series at time $t$ ($S_t$) are computed and presented in Table 1. In both the single-factor and double-factor analyses, at both Santiao and Qinglong stations, the $D_3$, $D_4$ and $D_8$ components at 15 days and 28 days prior show prominently higher $R^2$ with $S_t$ than the other DWT components, most markedly in the double-factor analysis. Instead of using each DWT component individually as a model input, using the sum of suitable DWT components is more useful and can greatly increase the forecast performance. Based on the revealed dominant DWT components of the different hydrological series, the new series (TD) obtained by adding $D_3$, $D_4$ and $D_8$ at 15 days and 28 days prior are selected as the two NN model inputs for the daily river stage forecast at both Santiao and Qinglong stations. Compared with the lagged original series ($S$), the lagged new series (TD) show slightly higher correlations with $S_t$ at both the 15-day and 28-day delays, which indicates that the new TD series keeps the main information of the original signal dynamics despite the filtering of much weakly correlated information by DWT.
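The single-factor screening can be sketched as a lagged squared Pearson correlation between a candidate input series and the target series. The function name is hypothetical, and treating $R^2$ as the squared Pearson coefficient is an assumption about how the values in Table 1 were computed.

```python
import math

def lagged_r2(component, target, lag):
    """Squared Pearson correlation between component(t - lag) and target(t)."""
    if lag <= 0 or lag >= len(target) or len(component) != len(target):
        raise ValueError("need equal-length series and 0 < lag < length")
    x = component[:-lag]                    # values at t - lag
    y = target[lag:]                        # values at t
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0                          # a constant series carries no information
    return sxy ** 2 / (sxx * syy)
```

For a series with an exact 15-day cycle, `lagged_r2(series, series, 15)` is 1, which is the behaviour that makes a lag equal to the dominant period attractive as a model input.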

CDW-NN model training and long-term forecasting of daily river stage signals
According to the above CWT and DWT results, two new daily series, $TD(1) \sim TD(t - 28)$ and $TD(14) \sim TD(t - 15)$, extracted from the training data series are used as the NN model inputs to simulate and forecast the original series $x(29) \sim x(t)$. Program codes were written in MATLAB for training the neuro-fuzzy and BP-ANN submodels and determining their optimal model structures. At Santiao station, after many trials the optimal CDW-NF hybrid model structure is determined as CDW-NF(5-Bell), which denotes the Bell-type transfer function and five rules for each input in Layer 1 of ANFIS, and the optimal CDW-ANN hybrid model structure is determined as CDW-ANN(2-3-1), which denotes two input layer nodes, three hidden layer nodes and one output layer node in the BP-ANN submodel. At Qinglong station, the optimal CDW-NF(4-Gauss) model and the optimal CDW-ANN(2-4-1) model are obtained. After … where $n$ is the number of data sets, and $Y_i$ is the daily river stage.
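The three performance measures used in this study (RMSE, MAE and $R^2$) can be computed as in the sketch below. It assumes the standard textbook definitions, with $R^2$ taken as the squared Pearson correlation between observed and predicted river stages; the function names are hypothetical.

```python
import math

def rmse(observed, predicted):
    """Root mean square error over n paired values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def mae(observed, predicted):
    """Mean absolute error over n paired values."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n

def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mo, mp = sum(observed) / n, sum(predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)

print(rmse([1, 2, 3, 4], [1, 2, 3, 6]), mae([1, 2, 3, 4], [1, 2, 3, 6]))  # 1.0 0.5
```

Note that a squared correlation measures only linear association: a forecast that is biased or wrongly scaled can still score a high $R^2$, which is one reason to report RMSE and MAE alongside it.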
The forecast performances of the CDW-NF and CDW-ANN models at Santiao and Qinglong stations are shown in Fig. 6. Results show that the CDW-NF models yield significantly better correlations between the observed and predicted river stage data during 2008-2010, with higher $R^2$ of 0.284 and 0.173 at Santiao and Qinglong stations, respectively, whereas the CDW-ANN models yield lower $R^2$ of 0.020 and 0.030, respectively. The CDW-ANN hybrid model shows better forecast performance during the first year than in the last two years. … in each forecasting step, the CDW-NF hybrid model shows the best performance among all 12 models, especially at Santiao station. In addition, owing to the prominent ability of the DWT decomposition approach to filter weakly correlated details from the original signal, each DWT-based hybrid model performs better than the corresponding model without DWT in both the training and test periods.

Driving mechanism of the advantages of CDW-NN models in long-range MSA predictions
Prediction performance details in terms of $R$ for the 12 types of hybrid and pure models at different forecasting steps during the overall 1096-day river stage forecasting are calculated and shown in Fig. 7. According to the serial-propagated MSA prediction theory, the error accumulation increases with the number of iteration steps in a long-term forecast. Therefore, the prediction performances of all 12 types of models show overall decreasing trends as the predicted data length increases from 1 day to 1096 days. Nevertheless, during approximately the first 200-600 days of river stage forecasting, all six types of CWT-based models performed better than the other six types of SSA-based models. In particular, at both Santiao and Qinglong stations the CDW-NF models show significantly better performance than the other models over the whole 3 yr (1096 days) of river stage forecasting. The main explanation is that the CWT-based models reduce the overall number of forecasting iterations to 74 by using the 15-day prior data series as the first model input, while the conventional SSA-based models need 1096 steps by using the 1-day prior data series as the first model input. The prominent decrease in forecasting steps consequently brings a significant reduction of error accumulation in the long-range MSA prediction. In addition, combining the advantages of the DWT method and the neuro-fuzzy system also helps weaken the noisy dynamics of the model inputs and enhances the simulation and forecast ability for the complex hydro-system.
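The step counts quoted above follow directly from the batch size of each forecasting step. As a worked check, with $n = 1096$ forecast days and $a_1$ days produced per step:

```latex
\[
\left\lceil \frac{n}{a_1} \right\rceil
= \left\lceil \frac{1096}{15} \right\rceil
= 74
\quad (73 \times 15 = 1095 < 1096),
\qquad
\left\lceil \frac{1096}{1} \right\rceil = 1096 .
\]
```

The last CWT-based step therefore only needs to supply the single remaining day, and the recursion depth is about 15 times smaller than in the SSA-based case.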
In view of the above discussion, the CDW-NF hybrid model is proven to be the most effective MSA predictor for the long-term forecasts of estuarine daily river stage signals.
In addition, the CDW-ANN hybrid model can be taken as the second choice for the short-term and mid-term forecasts of estuarine daily river stage signals because of its high performance during the first year of the forecast. It should be noted that the methodology of accurate MSA predictor design by reducing iteration steps and error accumulation is developed in this study by revealing the short-term periodic features of estuarine river stage dynamics, which are mainly caused by the half-month periodicity of the astronomical tidal fluctuation in the river estuary. With respect to other kinds of hydro-meteorological signals, which have non-significant short-term periodic features, other types of algorithms and models for reducing error accumulation in long-term forecasting steps need to be studied in future research. In addition, although the novel CDW-NF hybrid model succeeds in prominently increasing the long-range MSA prediction accuracy in terms of $R^2$, RMSE and MAE, the overall relatively poor performance of most of the models in the "true" long-term forecast makes accurate peak value estimation difficult. Thus, improving the modelling of hydro-signal peak values and of the underlying physical processes by developing more advanced algorithms still needs further research.

Conclusions
Studies on the long-term forecast of hydrological signals have high practical value.However, the accurate long-range MSA predictor design is very difficult, especially in conventional SSA-based MSA predictions because of the huge error accumulation in the serial-propagated long-term forecast.In this study, we design a novel hybrid model CDW-NN, combining continuous and discrete wavelet transforms and neural networks, as the MSA predictor for effective long-term forecasts of hydrological signals.By the application of CDW-NN hybrid models and the other 10 types of hybrid and pure models in estuarine daily river stage series long-term forecasts, the 1096-day estuarine river stage data are forecasted, and the superiorities of CDW-NN models with corresponding driving mechanisms are proven as follows.
1. Compared with the conventional SSA-based models, the CWT-based hybrid models broaden the prediction range in each forecast step from 1 day to 15 days and reduce the overall number of forecasting iterations from 1096 to 74 by using the 15-day and 28-day prior data series as model inputs, which are determined by revealing the hydro-signal's significant short-term periodicities using CWT. The prominent reduction of forecast steps significantly decreases error accumulation and increases long-term forecast performance in the CWT-based hybrid models.
2. Among the CWT-based models, one type of CDW-NN model, CDW-NF, has been proven to be the most effective MSA predictor for the prominent accuracy enhancement during the overall 1096-day long-term forecasts of estuarine hydro-signals.The other type of CDW-NN model, CDW-ANN, has been proven to be the second selection for short-term and mid-term forecasts of estuarine hydro-signals.The main explanation is the combination of the advantages of the CWT and DWT methods and neuro-fuzzy system in reducing the error accumulation, filtering weakly correlated details from original signal and weakening the noisy dynamics of the model inputs, and enhancing the simulation and forecast ability for the complex hydro-system.
3. It should be noted that because the successful application of the novel CDW-NF model in long-term forecasts of hydro-signals largely depends on the significant short-term periodicities in estuarine hydro-signals, other innovative algorithms and models still need to be studied in future research for other kinds of hydro-meteorological signals without significant short-term periodic features.

Fig. 5. Real parts of the Morlet wavelet transform coefficients at Santiao station (a) and Qinglong station (c), and their global wavelet power spectra and corresponding confidence tests using white noise models (b, d), based on the 3652-day river stage data during 1998-2007.

Fig. 6. Observed and predicted values of the daily river stages from 2008 to 2010 using the CDW-ANN hybrid model and CDW-NF hybrid model at Santiao station (a, b) and Qinglong station (c, d).

Fig. 7. Forecast performances in terms of the correlation coefficients (R) of 12 types of hybrid models during the 1096-day river stage forecasting at Santiao station (a) and Qinglong station (b).

Table 1. Squared correlation coefficients ($R^2$) between each discrete wavelet component series at 15 days and 28 days prior ($t - 15$ and $t - 28$) and the original river stage data series on the current day ($S_t$).

Table 2. Comparison of the performances of 12 types of river stage forecasting models in terms of root mean square errors (RMSEs), mean absolute errors (MAEs) and R square ($R^2$) in the training and test periods.

Forecast performance comparison among CDW-NN models and the other 10 types of hybrid and pure models
predictors (i.e. generally using $S_{t-1}$ and $S_{t-2}$ as model inputs to forecast $S_t$). In view of this, six types of conventional SSA-based long-term forecast hybrid and pure models are established for comparison with the CWT-based hybrid models. Among the six types of models, the DW-R, DW-ANN and DW-NF hybrid models utilise the optimum decomposition component combinations ($TD_{t-1}$ and $TD_{t-2}$), determined by DWT, as model inputs to forecast $S_t$. With respect to the pure LR, BP-ANN and neuro-fuzzy models, the original daily river stage series at 1 day prior and 2 days prior ($S_{t-1}$ and $S_{t-2}$) are used as model inputs to forecast $S_t$. By the model training processes, the opti-