A New Carbon Price Forecasting Method Combining Random Forest and Deep Learning: The CEEMDAN-SE-LSTM-RF Model

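"This paper proposes a new forecasting model for nonstationary and nonlinear data, the CEEMDAN-SE-LSTM-RF model, for carbon price forecasting. The model combines improved feature extraction with deep learning and aims to improve prediction accuracy and generalization. Its effectiveness was tested on several Chinese carbon trading markets and compared with four benchmark methods, and the results show that the proposed model performs better."

Against the background of global concern about climate change and sustainable development, carbon price forecasting has become an important research field. The paper introduces a forecasting method built on a random forest-based nonlinear ensemble paradigm that combines improved feature extraction with deep learning, called the CEEMDAN-SE-LSTM-RF model. The model is designed for nonstationary and nonlinear carbon price data, with the aim of improving prediction accuracy and robustness.

CEEMDAN (complete ensemble empirical mode decomposition with adaptive noise) is a signal decomposition technique that splits a complex time series into a set of intrinsic mode functions (IMFs), which capture the nonlinear and nonstationary characteristics of the data more clearly. This step helps reveal patterns hidden in the raw data and so supplies more useful information for feature extraction.

SE (sample entropy) measures the complexity of each decomposed mode. Modes with similar complexity are recombined into a smaller number of reconstructed components, which improves computational efficiency and accuracy. LSTM (long short-term memory), a deep learning network with strong long- and short-term memory, then builds a separate prediction model for each reconstructed component.

Finally, RF (random forest) is an ensemble learning method that builds many decision trees and averages their outputs, which lowers the risk of overfitting and improves generalization. In the CEEMDAN-SE-LSTM-RF model, RF serves as the final nonlinear ensemble that aggregates the component forecasts into the carbon price forecast.

In the empirical part, the model was applied to different carbon trading markets in China. The results indicate that it achieves higher prediction accuracy than the four benchmark methods, which suggests that it is both accurate and robust on complex, volatile carbon price data.

In summary, the paper proposes a carbon price forecasting model that combines deep learning with random forest and uses improved feature extraction to adapt to nonlinear, nonstationary data. It offers policymakers and market participants a tool for carbon trading decisions and for understanding and responding to uncertainty in the carbon market.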

(2011) applied wavelet packet transforms to carbon price forecasting and reported good performance. Sun et al. (2018) also used WT as the base decomposition model in research on the China Emissions Trading Scheme. However, the effectiveness of these methods depends mainly on the wavelet basis function, which researchers select subjectively without a specific theoretical foundation. Empirical mode decomposition (EMD) is an adaptive method that overcomes this reliance on subjective experience in setting a basis function beforehand. Zhu et al. (2017) showed that EMD can capture several components with different features well. Gao and Jian (2014) proposed a hybrid model comprising particle swarm optimization (PSO), SVM and EMD. In the EMD part, several stationary intrinsic mode functions (IMFs) and a residual series are fed into a neural network for training. Gilles (2013) built a new self-adaptive signal decomposition method named empirical wavelet transform (EWT) by combining EMD with WT, which showed better performance. However, decomposition through EMD is prone to the modal mixing problem, and its physical meaning is lacking (Tian and Hao, 2020). To tackle the problem, Wu and Huang (2009) improved EMD and named the result ensemble empirical mode decomposition (EEMD). Qin et al. (2015) utilized EEMD as a data preprocessing method to improve carbon price prediction. Wu et al. (2019) also combined EEMD with LSTM to predict the spot price of West Texas Intermediate crude oil. EEMD is thus an enhanced EMD that effectively mitigates modal mixing by offsetting and restraining the effects of noise over many trials. Despite the robustness and effectiveness of forecasting based on EEMD, it still has a drawback. Increasing the number of ensemble trials reduces the reconstruction error, but it enlarges the computational cost and still leaves residual noise of a certain amplitude. Besides, Wu and Huang (2009) noted that the problem of modal splitting may occur. To overcome this defect, complete ensemble empirical mode decomposition (CEEMD) is applied as an improved version of EEMD. Zhang et al. (2018) showed that complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN), as a signal processing technique, can not only solve the modal aliasing problem but also lessen white noise interference and save computing time. Moreover, Cao et al. (2019) combined CEEMDAN with LSTM, indicating that CEEMDAN can exploit more hidden information than EMD and that the hybrid model surpasses the single one. Therefore, CEEMDAN can be regarded as a relatively advanced decomposition method at present, and this paper adopts it because adding adaptive white noise at each stage makes its reconstruction error nearly zero.
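The EEMD trade-off noted in this paragraph can be illustrated numerically. Because each EMD trial is complete (its IMFs and residue sum back to the noisy input), an EEMD reconstruction equals the original series plus the mean of the added noise realizations. The sketch below skips the sifting itself and only averages the noisy copies, so it shows the residual reconstruction error shrinking roughly as 1/sqrt(N) while the number of decompositions grows linearly with N. The toy signal, the noise level and the function name are assumptions made for illustration, not taken from the paper.

```python
import math
import random

def eemd_reconstruction_error(signal, n_trials, noise_std, seed=0):
    """RMS error of an EEMD-style reconstruction (hypothetical helper).

    Each trial would decompose signal + white noise. Since EMD is complete,
    the components of a trial sum back to that noisy copy, so the ensemble
    reconstruction is simply the average of the noisy copies.
    """
    rng = random.Random(seed)
    n = len(signal)
    recon = [0.0] * n
    for _ in range(n_trials):
        for i in range(n):
            noisy = signal[i] + rng.gauss(0.0, noise_std)
            recon[i] += noisy / n_trials
    return math.sqrt(sum((r - s) ** 2 for r, s in zip(recon, signal)) / n)

# Toy series: a slow trend plus an oscillation (illustrative only).
signal = [0.01 * t + math.sin(0.2 * t) for t in range(500)]

for n_trials in (1, 10, 100, 1000):
    err = eemd_reconstruction_error(signal, n_trials, noise_std=0.2)
    print(f"trials={n_trials:5d}  rms_error={err:.4f}")
```

The error never reaches zero for a finite number of trials, which is the residual noise the text refers to.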

Extant studies have shown that AI prediction models with a feature extraction stage can not only preprocess the data and improve computational efficiency but also establish an appropriate prediction model for the time series. However, several major drawbacks remain. First, after the carbon price series is decomposed, each sub-sequence is fed into a prediction model separately, without considering the similar complexity and correlation among sub-sequences, which lowers efficiency and accuracy. Second, the same prediction model is used for every sub-sequence, ignoring that each mode has its own features and frequency, so establishing a separate model with more suitable parameters for each is of vital importance (Che, 2015). Third, after the prediction results of each sub-sequence are obtained, existing final ensemble models are mainly limited to a linear form, such as obtaining the final forecast by combining the prediction values of all the decomposed modes (Zhu et al., 2018). Because a linear ensemble is not applicable to all cases, it may reduce prediction accuracy (Liao and Tsao, 2006). There are two main types of nonlinear integration methods. One is serial methods, with strong dependencies among individual learners; the other is parallel methods, in which individual learners are generated simultaneously without strong dependencies. Boosting is representative of the former and bagging of the latter, and they underlie extreme gradient boosting (XGBoost) and random forest (RF), respectively. Comparing the two, XGBoost is more sensitive to overfitting when the data are noisy, and it often takes longer to train because its learners are built in sequence (Fan et al., 2020). Moreover, RF is more adjustable.
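The structural difference between the two ensemble families can be sketched in a few lines. The example below is a simplified stand-in, not the paper's implementation: `bagging` averages small regression trees fitted independently on bootstrap samples (a random forest would additionally draw a random feature subset at each split), while `boosting` fits each tree to the residuals left by the previous ones (plain residual fitting, not XGBoost). The toy data, in which two noisy component forecasts are mapped to a target price, and all function names are assumptions for illustration.

```python
import math
import random

def sse(values):
    """Sum of squared deviations from the mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def fit_tree(X, y, depth, min_leaf=5):
    """Greedy regression tree: a float leaf or (feature, threshold, left, right)."""
    if depth == 0 or len(y) < 2 * min_leaf:
        return sum(y) / len(y)
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            left = [t for row, t in zip(X, y) if row[j] <= thr]
            right = [t for row, t in zip(X, y) if row[j] > thr]
            if len(left) >= min_leaf and len(right) >= min_leaf:
                score = sse(left) + sse(right)
                if best is None or score < best[0]:
                    best = (score, j, thr)
    if best is None:
        return sum(y) / len(y)
    _, j, thr = best
    lo = [i for i, row in enumerate(X) if row[j] <= thr]
    hi = [i for i, row in enumerate(X) if row[j] > thr]
    return (j, thr,
            fit_tree([X[i] for i in lo], [y[i] for i in lo], depth - 1, min_leaf),
            fit_tree([X[i] for i in hi], [y[i] for i in hi], depth - 1, min_leaf))

def predict_tree(node, row):
    while isinstance(node, tuple):
        j, thr, left, right = node
        node = left if row[j] <= thr else right
    return node

def bagging(X, y, n_trees=15, depth=3, seed=0):
    """Parallel style: every tree sees its own bootstrap sample of (X, y)."""
    rng = random.Random(seed)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(y)) for _ in y]
        trees.append(fit_tree([X[i] for i in idx], [y[i] for i in idx], depth))
    return lambda row: sum(predict_tree(t, row) for t in trees) / len(trees)

def boosting(X, y, n_trees=15, depth=1, rate=0.3):
    """Sequential style: each tree fits the residuals left by the previous ones."""
    base = sum(y) / len(y)
    trees, resid = [], [t - base for t in y]
    for _ in range(n_trees):
        tree = fit_tree(X, resid, depth)
        trees.append(tree)
        resid = [r - rate * predict_tree(tree, row) for r, row in zip(resid, X)]
    return lambda row: base + rate * sum(predict_tree(t, row) for t in trees)

def mse(model, X, y):
    return sum((model(row) - t) ** 2 for row, t in zip(X, y)) / len(y)

# Toy data (assumed): noisy forecasts of a slow and a fast component.
rng = random.Random(1)
X, y = [], []
for t in range(120):
    trend = math.sin(0.05 * t)
    cycle = 0.3 * math.sin(0.9 * t)
    X.append([trend + rng.gauss(0, 0.05), cycle + rng.gauss(0, 0.05)])
    y.append(trend + cycle)

bagged = bagging(X, y)
boosted = boosting(X, y)
print("bagging  train MSE:", round(mse(bagged, X, y), 4))
print("boosting train MSE:", round(mse(boosted, X, y), 4))
```

The trees inside `bagging` could be fitted in any order or concurrently, whereas the loop in `boosting` cannot be reordered because each step depends on the residuals of the previous one. That dependency is the reason sequential methods tend to take longer to build.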

To solve these problems in carbon price forecasting, a novel hybrid model incorporating CEEMDAN, sample entropy (SE), LSTM and random forest (RF) is put forward. Methodologically, it develops an innovative random forest-based nonlinear ensemble paradigm of improved feature extraction and deep learning for higher accuracy in forecasting nonstationary and nonlinear carbon prices. First, the original carbon price series is decomposed into several simple stationary modes with the CEEMDAN algorithm. Then, the modes with similar complexity are recombined according to the SE algorithm to boost computational efficiency and accuracy. Because different modes have their own frequencies and characteristics, LSTM is then applied to build an appropriate prediction model for each reconstructed component, owing to its strong long- and short-term memory. Finally, once the forecasts of the reconstructed components have been obtained through the deep learning algorithm, RF, a nonlinear bagging ensemble learning model, aggregates them into the final carbon price forecast to further improve prediction accuracy.
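The recombination step can be sketched as follows. `sample_entropy` follows the commonly used definition SampEn(m, r) = -ln(A / B), where B and A count template pairs of length m and m + 1 whose Chebyshev distance is within r. The grouping rule in `group_by_entropy` (merge modes whose entropies differ by less than a tolerance) is a hypothetical choice, since this excerpt does not state the paper's exact criterion. The defaults m = 2 and r = 0.2 times the standard deviation are conventional values assumed here, and the toy modes stand in for CEEMDAN output.

```python
import math
import random

def sample_entropy(series, m=2, r_factor=0.2):
    """SampEn(m, r) = -ln(A / B), with tolerance r = r_factor * std(series)."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    r = r_factor * std

    def count_matches(length):
        # The same n - m template start points are used for both lengths.
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                if max(abs(a - b) for a, b in zip(templates[i], templates[j])) <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined with no matches; treated as maximally complex
    return -math.log(a / b)

def group_by_entropy(modes, tol=0.5):
    """Merge modes with close sample entropy (hypothetical rule, not the paper's)."""
    scored = sorted((sample_entropy(mode), k) for k, mode in enumerate(modes))
    groups = [[scored[0][1]]]
    for (prev_se, _), (se, k) in zip(scored, scored[1:]):
        if se - prev_se < tol:
            groups[-1].append(k)
        else:
            groups.append([k])
    # Each reconstructed component is the pointwise sum of its member modes.
    length = len(modes[0])
    components = [[sum(modes[k][t] for k in g) for t in range(length)] for g in groups]
    return components, groups

# Toy modes standing in for CEEMDAN output (assumed for illustration).
rng = random.Random(0)
n = 200
modes = [
    [rng.gauss(0, 1) for _ in range(n)],                 # noise-like mode
    [rng.gauss(0, 1) for _ in range(n)],                 # noise-like mode
    [math.sin(2 * math.pi * t / 50) for t in range(n)],  # slow oscillation
    [0.01 * t for t in range(n)],                        # trend
]

components, groups = group_by_entropy(modes, tol=0.5)
for k, mode in enumerate(modes):
    print(f"mode {k}: SampEn = {sample_entropy(mode):.3f}")
print("groups of mode indices:", groups)
```

Summing the members of each group keeps the decomposition complete, so the reconstructed components still add up to the original series while fewer prediction models need to be trained.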

From the above, the main innovations and contributions of this research compared with the findings in the literature are the following four points:

a. Because existing work neglects the similar complexity and correlation among decomposed modes, an improved feature extraction incorporating CEEMDAN and SE is adopted to screen different features effectively from the original carbon price series, for higher efficiency and accuracy.

b. Recognizing that establishing a separate model for each component is of vital importance, and in order to capture more complicated features, LSTM replaces RNN as the core prediction model.

c. Because nonlinear ensemble learning can achieve smaller errors and greater stability than a linear approach, this research applies RF as the integration algorithm to improve forecast accuracy.

d. The novel hybrid model for carbon price forecasting, set up as an adaptive nonlinear ensemble learning paradigm, is proposed for the first time; it outperforms single models and shows distinctive robustness.

The rest of this paper is structured as follows: the methodologies and a brief outline of the proposed model structure are given in Section 2. The case study, including data collection, preprocessing and the relevant measurement indices, is elaborated in Section 3. Section 4 describes the forecast results and discusses them in more detail. Finally, Section 5 draws a conclusion.

2. Methodology

2.1. Complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN)

EMD, proposed by Huang et al. (1998), is an adaptive signal decomposition method that makes no assumptions about the data and has been widely used in many fields. However, modal aliasing causes the decomposed intrinsic mode functions to affect each other, which deprives the IMFs of their physical meaning. To solve this problem, Wu and Huang (2009) proposed EEMD, which offsets the effects of noise during decomposition by repeating the experiment many times. Unfortunately, residual noise remains in the components, which lowers efficiency. Overall,

J. Wang, X. Sun, Q. Cheng et al. Science of the Total Environment 762 (2021) 143099
