FEDformer: A Transformer for Long-Term Series Forecasting that Incorporates the Fourier Transform


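"FEDformer.pdf is an academic paper on time series forecasting. It proposes a new method, FEDformer, which combines the Transformer model with seasonal-trend decomposition to address two problems in long-term series forecasting: high computational complexity and a weak grasp of the global view. It introduces a frequency-enhanced Transformer that uses the Fourier transform to exploit sparse representations, which improves long-term forecasting performance."

In time series analysis and forecasting, Transformer models have markedly improved prediction accuracy, particularly on complex sequence tasks. Their attention mechanism captures local detail in a sequence well, but in long-term series forecasting Transformers face two main problems: low computational efficiency and difficulty capturing the global trend.

FEDformer (Frequency Enhanced Decomposed Transformer) was proposed to address these problems. The method incorporates seasonal-trend decomposition, which splits a time series into seasonal, trend, and other components and so gives a better view of the overall trend of the data. FEDformer also introduces a frequency-enhanced Transformer component that exploits the sparsity of time series in the Fourier basis. This lets the model learn and represent the periodic patterns in a time series more efficiently.

The Fourier transform is a mathematical tool that converts a signal from the time domain to the frequency domain, which helps identify its components at different frequencies. In FEDformer it is used to expose the periodic structure of a time series, which may be hard to see in the raw data. This frequency-enhancement strategy lets the model capture and use those periodic features more accurately in long-term forecasting.

Another advantage of FEDformer is efficiency. A standard Transformer has cost that grows quadratically with sequence length, whereas FEDformer has linear complexity, so it processes large-scale time series data faster and at lower computational cost. FEDformer therefore improves forecasting accuracy and is also more practical to deploy, especially where real-time or near-real-time forecasts are needed, such as energy demand forecasting, financial market analysis, and weather forecasting.

By combining classical time series decomposition with a frequency-enhanced Transformer, FEDformer offers a more efficient and more accurate approach to long-term series forecasting, and it may become an active research topic in time series analysis.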

Submission and Formatting Instructions for ICML 2022

(FEA) connecting encoder and decoder, and the Mixture Of Experts Decomposition block (MOEDecomp). The detailed description of FEB, FEA, and MOEDecomp blocks will be given in the following Sections 3.2, 3.3, and 3.4 respectively.
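The decomposition step can be illustrated with a minimal sketch. It assumes a single centered moving average as the trend extractor, which is a simplification and not the paper's MOEDecomp block (that block is described in Section 3.4, outside this excerpt). The names `moving_average` and `decompose`, the window size, and the toy series are illustrative assumptions.

```python
def moving_average(series, window):
    """Centered moving average; the edge values are repeated as padding."""
    half = window // 2
    padded = [series[0]] * half + list(series) + [series[-1]] * (window - 1 - half)
    return [sum(padded[i:i + window]) / window for i in range(len(series))]


def decompose(series, window=5):
    """Split a series into (seasonal, trend), with seasonal = series - trend."""
    trend = moving_average(series, window)
    seasonal = [x - t for x, t in zip(series, trend)]
    return seasonal, trend


# Toy series: a linear trend plus an alternating (period-2) pattern.
series = [0.1 * t + (1.0 if t % 2 == 0 else -1.0) for t in range(12)]
seasonal, trend = decompose(series, window=4)
```

By construction the two parts always sum back to the input, so nothing is lost by the split; the window size only controls how much of the signal is assigned to the trend.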

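The encoder adopts a multilayer structure as: $\mathcal{X}_{en}^{l} = \mathrm{Encoder}(\mathcal{X}_{en}^{l-1})$, where $l \in \{1, \cdots, N\}$ denotes the output of $l$-th encoder layer and $\mathcal{X}_{en}^{0} \in \mathbb{R}^{I \times D}$ is the embedded historical series. The $\mathrm{Encoder}(\cdot)$ is formalized as

$$
\begin{aligned}
\mathcal{S}_{en}^{l,1}, - &= \mathrm{MOEDecomp}\left(\mathrm{FEB}\left(\mathcal{X}_{en}^{l-1}\right) + \mathcal{X}_{en}^{l-1}\right), \\
\mathcal{S}_{en}^{l,2}, - &= \mathrm{MOEDecomp}\left(\mathrm{FeedForward}\left(\mathcal{S}_{en}^{l,1}\right) + \mathcal{S}_{en}^{l,1}\right), \\
\mathcal{X}_{en}^{l} &= \mathcal{S}_{en}^{l,2},
\end{aligned}
\tag{1}
$$

where $\mathcal{S}_{en}^{l,i}$, $i \in \{1, 2\}$ represents the seasonal component after the $i$-th decomposition block in the $l$-th layer respectively. For FEB module, it has two different versions (FEB-f & FEB-w) which are implemented through Discrete Fourier transform (DFT) and Discrete Wavelet transform (DWT) mechanism respectively and can seamlessly replace the self-attention block.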
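The layer structure in Eq. (1) can be sketched as plain function composition. This is a structural sketch only: it assumes a single-channel series stored as a Python list, a moving-average stand-in named `decomp` in place of MOEDecomp, and caller-supplied placeholder callables for FEB and FeedForward. All of these names are hypothetical and none of them reproduces the paper's actual blocks.

```python
def decomp(x, window=3):
    """Stand-in for MOEDecomp: returns (seasonal, trend) via a moving average."""
    half = window // 2
    padded = [x[0]] * half + list(x) + [x[-1]] * (window - 1 - half)
    trend = [sum(padded[i:i + window]) / window for i in range(len(x))]
    return [a - t for a, t in zip(x, trend)], trend


def add(a, b):
    """Element-wise sum, used for the residual connections."""
    return [u + v for u, v in zip(a, b)]


def encoder_layer(x_prev, feb, feed_forward):
    """One encoder layer following Eq. (1); both trend outputs are discarded."""
    s1, _ = decomp(add(feb(x_prev), x_prev))
    s2, _ = decomp(add(feed_forward(s1), s1))
    return s2


def encoder(x0, layers):
    """Apply the layers in order; `layers` is a list of (feb, feed_forward) pairs."""
    x = x0
    for feb, feed_forward in layers:
        x = encoder_layer(x, feb, feed_forward)
    return x


# Hypothetical placeholder block: halves every value.
halve = lambda x: [0.5 * v for v in x]
x0 = [1.0, 2.0, 4.0, 3.0, 5.0, 4.0]
encoded = encoder(x0, [(halve, halve), (halve, halve)])
```

The point of the sketch is that only the seasonal part is passed from layer to layer in the encoder; the trend extracted at each decomposition is dropped.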

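The decoder also adopts a multilayer structure as: $\mathcal{X}_{de}^{l}, \mathcal{T}_{de}^{l} = \mathrm{Decoder}(\mathcal{X}_{de}^{l-1}, \mathcal{T}_{de}^{l-1})$, where $l \in \{1, \cdots, M\}$ denotes the output of $l$-th decoder layer. The $\mathrm{Decoder}(\cdot)$ is formalized as

$$
\begin{aligned}
\mathcal{S}_{de}^{l,1}, \mathcal{T}_{de}^{l,1} &= \mathrm{MOEDecomp}\left(\mathrm{FEB}\left(\mathcal{X}_{de}^{l-1}\right) + \mathcal{X}_{de}^{l-1}\right), \\
\mathcal{S}_{de}^{l,2}, \mathcal{T}_{de}^{l,2} &= \mathrm{MOEDecomp}\left(\mathrm{FEA}\left(\mathcal{S}_{de}^{l,1}, \mathcal{X}_{en}^{N}\right) + \mathcal{S}_{de}^{l,1}\right), \\
\mathcal{S}_{de}^{l,3}, \mathcal{T}_{de}^{l,3} &= \mathrm{MOEDecomp}\left(\mathrm{FeedForward}\left(\mathcal{S}_{de}^{l,2}\right) + \mathcal{S}_{de}^{l,2}\right), \\
\mathcal{X}_{de}^{l} &= \mathcal{S}_{de}^{l,3}, \\
\mathcal{T}_{de}^{l} &= \mathcal{T}_{de}^{l-1} + \mathcal{W}_{l,1} \cdot \mathcal{T}_{de}^{l,1} + \mathcal{W}_{l,2} \cdot \mathcal{T}_{de}^{l,2} + \mathcal{W}_{l,3} \cdot \mathcal{T}_{de}^{l,3},
\end{aligned}
\tag{2}
$$

where $\mathcal{S}_{de}^{l,i}$, $\mathcal{T}_{de}^{l,i}$, $i \in \{1, 2, 3\}$ represent the seasonal and trend component after the $i$-th decomposition block in the $l$-th layer respectively. $\mathcal{W}_{l,i}$, $i \in \{1, 2, 3\}$ represents the projector for the $i$-th extracted trend $\mathcal{T}_{de}^{l,i}$. Similar to FEB, FEA has two different versions (FEA-f & FEA-w) which are implemented through DFT and DWT projection respectively with attention design, and can replace the cross-attention block. The detailed description of $\mathrm{FEA}(\cdot)$ will be given in the following Section 3.3.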

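The final prediction is the sum of the two refined decomposed components as $\mathcal{W}_{\mathcal{S}} \cdot \mathcal{X}_{de}^{M} + \mathcal{T}_{de}^{M}$, where $\mathcal{W}_{\mathcal{S}}$ is to project the deep transformed seasonal component $\mathcal{X}_{de}^{M}$ to the target dimension.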
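The decoder recursion in Eq. (2) and the final sum can be sketched in the same structural way. The sketch assumes a single-channel series stored as a list, a moving-average stand-in `decomp` for MOEDecomp, placeholder callables for FEB, FEA and FeedForward, and scalar weights in place of the projectors. Every name here (`decoder_layer`, `predict`, `w`, `w_s`) is hypothetical.

```python
def decomp(x, window=3):
    """Stand-in for MOEDecomp: returns (seasonal, trend) via a moving average."""
    half = window // 2
    padded = [x[0]] * half + list(x) + [x[-1]] * (window - 1 - half)
    trend = [sum(padded[i:i + window]) / window for i in range(len(x))]
    return [a - t for a, t in zip(x, trend)], trend


def add(a, b):
    """Element-wise sum, used for the residual connections."""
    return [u + v for u, v in zip(a, b)]


def decoder_layer(x_prev, t_prev, enc_out, feb, fea, feed_forward, w):
    """One decoder layer following Eq. (2); w holds three scalar trend weights."""
    s1, t1 = decomp(add(feb(x_prev), x_prev))
    s2, t2 = decomp(add(fea(s1, enc_out), s1))
    s3, t3 = decomp(add(feed_forward(s2), s2))
    t_new = [tp + w[0] * a + w[1] * b + w[2] * c
             for tp, a, b, c in zip(t_prev, t1, t2, t3)]
    return s3, t_new


def predict(x_init, t_init, enc_out, layers, w_s):
    """Run all decoder layers, then return w_s * seasonal + trend."""
    x, t = x_init, t_init
    for feb, fea, feed_forward, w in layers:
        x, t = decoder_layer(x, t, enc_out, feb, fea, feed_forward, w)
    return [w_s * xs + ts for xs, ts in zip(x, t)]


# Hypothetical placeholder blocks.
halve = lambda x: [0.5 * v for v in x]
cross = lambda s, enc: [v + sum(enc) / len(enc) for v in s]
enc_out = [0.2, -0.1, 0.3, 0.0]
x_init = [1.0, 2.0, 4.0, 3.0, 5.0, 4.0]
t_init = [0.0] * len(x_init)
forecast = predict(x_init, t_init, enc_out,
                   [(halve, cross, halve, (1.0, 1.0, 1.0))], w_s=1.0)
```

Unlike the encoder, the decoder keeps every extracted trend and accumulates it, so the final output adds the accumulated trend back to the refined seasonal part.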

3.2. Fourier Enhanced Structure

Discrete Fourier Transform (DFT) The proposed Fourier Enhanced Structures use the discrete Fourier transform (DFT). Let $\mathcal{F}$ denote the Fourier transform and $\mathcal{F}^{-1}$ denote the inverse Fourier transform. Given a sequence of real numbers $x_n$ in the time domain, where $n = 0, 1, \dots, N-1$, the DFT is defined as $X_l = \sum_{n=0}^{N-1} x_n e^{-i\omega l n}$, where $i$ is the imaginary unit and $X_l$, $l = 0, 1, \dots, L-1$, is a sequence of complex numbers in the frequency domain. Similarly, the inverse DFT is defined as $x_n = \sum_{l=0}^{L-1} X_l e^{i\omega l n}$. The complexity of DFT is $O(N^2)$. With fast Fourier transform (FFT), the computation complexity can be reduced to $O(N \log N)$. Here a random subset of the Fourier basis is used and the scale of the subset is bounded by a scalar. When we choose the mode index before DFT and reverse DFT operations, the computation complexity can be further reduced to $O(N)$.

Figure 3. Frequency Enhanced Block with Fourier transform (FEB-f) structure.

Figure 4. Frequency Enhanced Attention with Fourier transform (FEA-f) structure, σ(·) is the activation function.
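The complexity argument can be made concrete with a small sketch that evaluates the DFT sum only at a fixed set of mode indices chosen in advance. It assumes $\omega = 2\pi/N$ (the text does not state $\omega$), and the function name `dft_selected`, the test signal, and the chosen modes are illustrative.

```python
import cmath
import math


def dft_selected(x, modes):
    """Evaluate X_l = sum_n x_n * exp(-i*omega*l*n) only for l in `modes`.

    Assumes omega = 2*pi/N. The cost is len(x) * len(modes) multiply-adds,
    which is linear in N when the number of modes is a fixed constant.
    """
    n_len = len(x)
    omega = 2 * math.pi / n_len
    return [sum(x[n] * cmath.exp(-1j * omega * l * n) for n in range(n_len))
            for l in modes]


# A pure cosine at frequency index 3 is sparse in the Fourier basis.
N = 64
x = [math.cos(2 * math.pi * 3 * n / N) for n in range(N)]
modes = [1, 3, 5]
X_sel = dft_selected(x, modes)
magnitudes = [abs(v) for v in X_sel]
```

For this signal only the coefficient at mode 3 is non-negligible (its magnitude is N/2), which is the kind of sparsity that makes keeping a small number of modes reasonable.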

Frequency Enhanced Block with Fourier Transform (FEB-f) The FEB-f is used in both encoder and decoder as shown in Figure 2. The input ($x \in \mathbb{R}^{N \times D}$) of the FEB-f block is first linearly projected with $w \in \mathbb{R}^{D \times D}$, so $q = x \cdot w$. Then $q$ is converted from the time domain to the frequency domain. The Fourier transform of $q$ is denoted as $Q \in \mathbb{C}^{N \times D}$. In frequency domain, only the randomly selected $M$ modes are kept so we use a select operator as

$$
\tilde{Q} = \mathrm{Select}(Q) = \mathrm{Select}(\mathcal{F}(q)), \tag{3}
$$

where $\tilde{Q} \in \mathbb{C}^{M \times D}$ and $M \ll N$. Then, the FEB-f is defined as

$$
\text{FEB-f}(q) = \mathcal{F}^{-1}(\mathrm{Padding}(\tilde{Q} \odot R)), \tag{4}
$$

where $R \in \mathbb{C}^{D \times D \times M}$ is a parameterized kernel initialized randomly. Let $Y = \tilde{Q} \odot R$, with $Y \in \mathbb{C}^{M \times D}$. The production operator $\odot$ is defined as: $Y_{m,d_o} = \sum_{d_i=1}^{D} \tilde{Q}_{m,d_i} \cdot R_{d_i,d_o,m}$, where $d_i = 1, 2, \dots, D$ is the input channel and $d_o = 1, 2, \dots, D$ is the output channel. The result of $\tilde{Q} \odot R$ is then zero-padded to $\mathbb{C}^{N \times D}$ before performing inverse Fourier transform back to the time domain. The structure is shown in Figure
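The FEB-f steps (project, transform, select modes, multiply by the kernel, zero-pad, invert) can be traced in a minimal pure-Python sketch. It rests on several assumptions that the text does not fix: $\omega = 2\pi/N$, a $1/N$ factor in the inverse transform, padded values written back at their original frequency indices, and only the real part of the inverse kept. The function name `feb_f` and all sizes are hypothetical, and the naive DFT sums are used for readability rather than speed.

```python
import cmath
import math
import random


def feb_f(x, w, kernel, modes):
    """Sketch of FEB-f. x: N x D real, w: D x D real,
    kernel[d_in][d_out][m]: complex, modes: the M kept frequency indices."""
    N, D = len(x), len(w)
    omega = 2 * math.pi / N  # assumed definition of omega
    # Linear projection q = x . w
    q = [[sum(x[n][k] * w[k][d] for k in range(D)) for d in range(D)]
         for n in range(N)]
    # Select(F(q)): evaluate the DFT only at the kept modes (M x D)
    Q = [[sum(q[n][d] * cmath.exp(-1j * omega * l * n) for n in range(N))
          for d in range(D)] for l in modes]
    # Y[m][d_out] = sum over d_in of Q[m][d_in] * R[d_in][d_out][m]
    Y = [[sum(Q[m][di] * kernel[di][do][m] for di in range(D))
          for do in range(D)] for m in range(len(modes))]
    # Zero-pad back to N x D (assumption: each mode returns to its own index)
    padded = [[0j] * D for _ in range(N)]
    for m, l in enumerate(modes):
        padded[l] = Y[m]
    # Inverse DFT with an assumed 1/N factor; keep the real part only
    return [[(sum(padded[l][d] * cmath.exp(1j * omega * l * n)
                  for l in range(N)) / N).real for d in range(D)]
            for n in range(N)]


random.seed(0)
N, D, M = 16, 2, 3
x = [[random.gauss(0, 1) for _ in range(D)] for _ in range(N)]
w = [[random.gauss(0, 1) for _ in range(D)] for _ in range(D)]
modes = sorted(random.sample(range(N // 2), M))  # randomly selected modes
kernel = [[[complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(M)]
           for _ in range(D)] for _ in range(D)]
y = feb_f(x, w, kernel, modes)  # N x D, same shape as the input
```

Keeping only the real part is a shortcut: when a mode is kept without its conjugate partner the exact inverse is complex, and practical implementations usually handle this with a real-input FFT instead.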
