Yi ZHENG et al. Exploiting Multi-Channels Deep Convolutional Neural Networks for Multivariate Time Series Classification
where W should satisfy the three constraints above. Going one step further, for two multivariate time series X and Y, DTW between X and Y can be defined, similarly to Euclidean distance, as follows:
$$\mathrm{DTW}(X, Y) = \sum_{i=1}^{l} \mathrm{DTW}(x_i, y_i)$$
where $l$ denotes the number of components in the multivariate time series, and $x_i$ and $y_i$ represent the $i$-th univariate time series of $X$ and $Y$, respectively.
It is common to apply dynamic programming to compute DTW(Q, C) (or DTW(X, Y)), which is efficient and has a time complexity of $O(n^2)$ in this context. However, when the data set grows large and the time series become long, computing DTW combined with the k-NN method is very time consuming. Hence, to reduce the time cost, window constraint DTW has been widely adopted instead of full DTW in much previous work [10, 18–20]. Moreover, intuitively, the warping path is unlikely to stray far from the diagonal of the distance matrix [10]. In other words, for any element $w_k = d(q_i, c_j)$ in the warping path, the difference between $i$ and $j$ should not be too large. By limiting the warping path to a warping window, some previous works [10, 19] showed that relatively tight warping windows actually improve the classification accuracy.
According to the above discussion, we consider both Euclidean distance and window constraint DTW as the default distance measures in the following.
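To make the window constraint concrete, the following is a minimal sketch of window-constrained DTW with a Sakoe–Chiba-style band, extended to the multivariate case by summing per-component DTW distances as in the equation above. The function names, the squared-difference local cost, and the final square root are our illustrative assumptions, not details fixed by the text:

```python
import numpy as np

def dtw_window(q, c, w):
    """Window-constrained DTW between univariate series q and c,
    with a Sakoe-Chiba band of half-width w around the diagonal."""
    n, m = len(q), len(c)
    w = max(w, abs(n - m))            # the band must cover the length difference
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        # only cells with |i - j| <= w are ever filled in
        for j in range(max(1, i - w), min(m, i + w) + 1):
            d = (q[i - 1] - c[j - 1]) ** 2            # local squared cost
            D[i, j] = d + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return np.sqrt(D[n, m])

def dtw_multivariate(X, Y, w):
    """DTW between multivariate series: the sum of per-component DTWs."""
    return sum(dtw_window(x_i, y_i, w) for x_i, y_i in zip(X, Y))
```

Restricting `j` to the band reduces the work per row from `m` cells to at most `2w + 1`, which is where the speedup over full DTW comes from.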
3 Multi-Channels Deep Convolutional Neural
Networks
In this section, we will introduce a deep learning
framework for multivariate time series classification: Multi-
Channels Deep Convolutional Neural Networks (MC-
DCNN). Traditional Convolutional Neural Networks (CNN)
usually include two parts. One is a feature extractor,
which learns features from raw data automatically. The
other is a trainable fully-connected MLP, which performs
classification based on the learned features from the previous
part. Generally, the feature extractor is composed of multiple
similar stages, and each stage is made up of three cascading
layers: filter layer, activation layer and pooling layer. The
input and output of each layer are called feature maps [13].
In previous work on CNN [13], the feature extractor usually contains one, two or three such 3-layer stages. In the remainder of this section, we first briefly introduce the components of CNN; more details can be found in [13, 21]. Then, we present the gradient-based learning of our model. Finally, the related unsupervised pretraining is described at the end of this section.
3.1 Architecture
In contrast to image classification, the inputs of multivariate time series classification are multiple 1D subsequences rather than 2D image pixels. We modify the traditional CNN and apply it to the multivariate time series classification task as follows: we separate the multivariate time series into univariate ones and perform feature learning on each univariate series individually, and then a traditional MLP is concatenated at the end of the feature learning to perform the classification. For ease of understanding, we illustrate the architecture of MC-DCNN in Fig. 3. Specifically, this is an example of a 2-stage MC-DCNN with pretraining for activity classification. Once the pretraining is completed, the initial weights of the network are obtained. Then, the inputs of the 3 channels are fed into a 2-stage feature extractor, which learns hierarchical features through filter, activation and pooling layers. At the end of the feature extractor, the feature maps of each channel are flattened and combined as the input of the subsequent MLP for classification. Note that in Fig. 3, the activation layer is embedded into the filter layer in the form of a non-linear operation on each feature map. We describe how each layer works in the following subsections.
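The per-channel pipeline described above can be sketched numerically. The sketch below assumes, for brevity, a single filter + activation + average-pooling stage per channel, sigmoid activations, and a single fully-connected output layer; all function names, sizes and these simplifications are illustrative assumptions, not the paper's exact configuration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_stage(x, kernels, biases, pool=2):
    """One filter + activation + average-pooling stage on a univariate series."""
    maps = np.stack([sigmoid(np.convolve(x, k, mode="valid") + b)
                     for k, b in zip(kernels, biases)])
    n = maps.shape[1] // pool * pool          # drop the ragged tail before pooling
    return maps[:, :n].reshape(len(maps), -1, pool).mean(axis=2)

def mcdcnn_forward(channels, channel_params, W, b):
    """Per-channel feature learning, then flatten and combine into an MLP."""
    feats = [channel_stage(x, ks, bs).ravel()
             for x, (ks, bs) in zip(channels, channel_params)]
    z = np.concatenate(feats)                 # combined features of all channels
    return sigmoid(W @ z + b)                 # one fully-connected output layer
```

A 2-stage extractor would simply apply a second `channel_stage`-like step to the feature maps of the first, before flattening.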
3.1.1 Filter Layer
The input of each filter is a univariate time series, denoted $x_i^l \in \mathbb{R}^{n_2^l}$, $1 \le i \le n_1^l$, where $l$ denotes the layer from which the time series comes, and $n_1^l$ and $n_2^l$ are the number and the length of the input time series, respectively. To capture local temporal information, we restrict each trainable filter $k_{ij}$ to a small size, denoted $m_2^l$; the number of filters at layer $l$ is denoted $m_1^l$. Recalling the example described in Fig. 3, in the first stage of channel 1 we have $n_1^l = 1$, $n_2^l = 256$, $m_2^l = 5$ and $m_1^l = 8$. We compute the output of each filter according to

$$\sum_i x_i^{l-1} \ast k_{ij}^l + b_j^l$$

where $\ast$ is the convolution operator and $b_j^l$ is the bias term. To determine the size of each filter $k_{ij}$, we follow the earlier studies [22] and set it to 5 ($m_2 = 5$), as they suggested.
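The per-filter computation above maps directly onto a valid-mode 1D convolution; a minimal sketch follows, using the Fig. 3 example sizes ($n_1 = 1$, $n_2 = 256$, $m_2 = 5$, $m_1 = 8$). The function name and array layout are our assumptions:

```python
import numpy as np

def filter_layer(x_prev, kernels, bias):
    """Output feature maps: sum_i x_i^{l-1} * k_ij^l + b_j^l (valid convolution).
    x_prev: (n1, n2) input maps; kernels: (n1, m1, m2); bias: (m1,)."""
    n1, n2 = x_prev.shape
    _, m1, m2 = kernels.shape
    out = np.zeros((m1, n2 - m2 + 1))         # valid convolution shrinks the map
    for j in range(m1):                       # each of the m1 output maps
        for i in range(n1):                   # sum contributions of all input maps
            out[j] += np.convolve(x_prev[i], kernels[i, j], mode="valid")
        out[j] += bias[j]
    return out
```

With the Fig. 3 sizes, a (1, 256) input and (1, 8, 5) kernels produce 8 feature maps of length 252 each.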
3.1.2 Activation Layer
The activation function introduces non-linearity into neural networks and allows them to learn more complex models.
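To see why this non-linearity matters: without it, stacked filter layers collapse into a single linear map. A small numeric illustration (sigmoid is used here only as an example activation):

```python
import numpy as np

# Without an activation, two stacked linear layers equal one linear layer:
W1 = np.array([[2.0, 0.0], [0.0, 3.0]])
W2 = np.array([[1.0, 1.0], [0.0, 1.0]])
x = np.array([1.0, -1.0])
assert np.allclose(W2 @ (W1 @ x), (W2 @ W1) @ x)   # composition stays linear

# Inserting a non-linearity between the layers breaks this equivalence:
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
assert not np.allclose(W2 @ sigmoid(W1 @ x), (W2 @ W1) @ x)
```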