(i) Convolutional Neural Networks
Traditionally designed for image datasets, convolutional neural networks (CNNs) extract local relationships that are invariant across spatial dimensions [10, 22]. To adapt CNNs to time series datasets, researchers utilise multiple layers of causal convolutions [23, 24, 25] – i.e. convolutional filters designed to ensure only past information is used for forecasting. For an intermediate feature at hidden layer $l$, each causal convolutional filter takes the form below:
h^{l+1}_t = A\big( (W \ast h)(l, t) \big),   (2.4)

(W \ast h)(l, t) = \sum_{\tau=0}^{k} W(l, \tau)\, h^{l}_{t-\tau},   (2.5)
where $h^{l}_t \in \mathbb{R}^{H_{in}}$ is an intermediate state at layer $l$ at time $t$, $\ast$ is the convolution operator, $W(l, \tau) \in \mathbb{R}^{H_{out} \times H_{in}}$ is a fixed filter weight at layer $l$, and $A(\cdot)$ is an activation function, such as a sigmoid function, representing any architecture-specific non-linear processing.
Considering the 1-D case, we can see that Equation (2.5) bears a strong resemblance to finite impulse response (FIR) filters in digital signal processing [26]. This leads to two key implications for temporal relationships learnt by CNNs. Firstly, in line with the spatial invariance assumptions for standard CNNs, temporal CNNs assume that relationships are time-invariant – using the same set of filter weights at each time step and across all time. In addition, CNNs are only able to use inputs within their defined lookback window, or receptive field, to make forecasts. As such, the receptive field size $k$ needs to be tuned carefully to ensure that the model can make use of all relevant historical information. It is worth noting that a single causal CNN layer with a linear activation function is equivalent to an auto-regressive (AR) model.
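As a concrete illustration of Equations (2.4)–(2.5), the sketch below implements a single causal convolutional layer directly in NumPy; the function name, loop structure and toy dimensions are illustrative choices rather than part of any particular published architecture.

```python
import numpy as np

def causal_conv_layer(h, W, activation=np.tanh):
    """One causal convolutional layer in the spirit of Eqs. (2.4)-(2.5).

    h : (T, H_in) array of lower-layer features h^l_t.
    W : (k + 1, H_out, H_in) array of filter weights W(l, tau), tau = 0..k.
    Each output h^{l+1}_t depends only on h^l_{t - tau} with tau >= 0,
    i.e. on present and past inputs.
    """
    T, _ = h.shape
    n_taps, H_out, _ = W.shape
    out = np.zeros((T, H_out))
    for t in range(T):
        for tau in range(n_taps):
            if t - tau >= 0:                   # drop terms before the series starts
                out[t] += W[tau] @ h[t - tau]  # accumulate (W * h)(l, t)
    return activation(out)                     # h^{l+1}_t = A((W * h)(l, t))

# Toy usage: T = 10 steps, H_in = 3 channels, H_out = 2, receptive field k = 2.
rng = np.random.default_rng(0)
h = rng.standard_normal((10, 3))
W = rng.standard_normal((3, 2, 3))
print(causal_conv_layer(h, W).shape)  # (10, 2)
```

With `activation` set to the identity and a single input and output channel, this layer reduces to an AR($k$) model, which is the equivalence noted above.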
Dilated Convolutions
Using standard convolutional layers can be computationally challenging where long-term dependencies are significant, as the number of parameters scales directly with the size of the receptive field. To alleviate this, modern architectures frequently make use of dilated convolutional layers [23, 24], which extend Equation (2.5) as below:
(W \ast h)(l, t, d_l) = \sum_{\tau=0}^{\lfloor k / d_l \rfloor} W(l, \tau)\, h^{l}_{t - d_l \tau},   (2.6)
where $\lfloor \cdot \rfloor$ is the floor operator and $d_l$ is a layer-specific dilation rate. Dilated convolutions can hence be interpreted as convolutions of a down-sampled version of the lower-layer features – reducing resolution to incorporate information from the distant past. As such, by increasing the dilation rate with each layer, dilated convolutions can gradually aggregate information at different time blocks, allowing for more history to be used in an efficient manner. With the WaveNet architecture of [23], for instance, dilation rates are increased in powers of 2, with adjacent time blocks aggregated in each layer – allowing for $2^l$ time steps to be used at layer $l$, as shown in Figure 1a.
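The sketch below is a rough illustration of Equation (2.6) rather than WaveNet itself: it adds a dilation rate to the causal layer above and stacks layers with dilation rates 1, 2, 4, 8, so that a 2-tap filter covers exponentially more history with depth; names and dimensions are again illustrative.

```python
import numpy as np

def dilated_causal_conv(h, W, dilation, activation=np.tanh):
    """Dilated causal convolution in the spirit of Eq. (2.6): the output at
    time t mixes h^l_{t - d_l * tau} for tau = 0, 1, ..., skipping
    d_l - 1 steps between successive filter taps."""
    T, _ = h.shape
    n_taps, H_out, _ = W.shape
    out = np.zeros((T, H_out))
    for t in range(T):
        for tau in range(n_taps):
            idx = t - dilation * tau          # reach d_l * tau steps into the past
            if idx >= 0:
                out[t] += W[tau] @ h[idx]
    return activation(out)

# WaveNet-style stack: doubling the dilation rate at each layer lets a
# 2-tap filter reach roughly 2^l time steps into the past by layer l.
rng = np.random.default_rng(0)
h = rng.standard_normal((64, 1))              # toy univariate series, T = 64
width = 8
for dilation in [1, 2, 4, 8]:
    W = 0.1 * rng.standard_normal((2, width, h.shape[1]))
    h = dilated_causal_conv(h, W, dilation)
print(h.shape)  # (64, 8)
```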
(ii) Recurrent Neural Networks
Recurrent neural networks (RNNs) have historically been used in sequence modelling [22], with strong results on a variety of natural language processing tasks [27]. Given the natural interpretation of time series data as sequences of inputs and targets, many RNN-based architectures have been developed for temporal forecasting applications [28, 29, 30, 31]. At their core, RNN cells contain an internal memory state which acts as a compact summary of past information. The memory state is recursively updated with new observations at each time step, as shown in Figure 1b, i.e.:
z_t = \nu\left( z_{t-1}, \tilde{x}_t \right),   (2.7)
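To make the recursion in Equation (2.7) concrete, the sketch below unrolls a vanilla (Elman-style) cell, where $\nu$ is taken to be a single learnt affine map followed by a $\tanh$ non-linearity; this particular choice of $\nu$, together with all names and dimensions, is purely illustrative.

```python
import numpy as np

def rnn_step(z_prev, x_t, W_z, W_x, b, activation=np.tanh):
    """One application of Eq. (2.7), z_t = nu(z_{t-1}, x_t), with the simple
    Elman-style choice nu(z, x) = A(W_z z + W_x x + b)."""
    return activation(W_z @ z_prev + W_x @ x_t + b)

# Unroll over a toy series: z_t acts as a compact summary of all past inputs.
rng = np.random.default_rng(0)
T, H, D = 20, 4, 3                        # series length, state size, input size
W_z = 0.1 * rng.standard_normal((H, H))
W_x = 0.1 * rng.standard_normal((H, D))
b = np.zeros(H)
z = np.zeros(H)                           # initial memory state z_0
for t in range(T):
    x_t = rng.standard_normal(D)          # stand-in for the observed inputs at step t
    z = rnn_step(z, x_t, W_z, W_x, b)
print(z)                                  # final state summarising the whole history
```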