A Hybrid Time Series Matching Algorithm Combining Feature Points and DTW


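This research paper proposes a hybrid time series matching algorithm based on feature points and DTW (Dynamic Time Warping). It targets two weaknesses: traditional feature-point methods overlook detail, and DTW is computationally expensive. The authors, from Ocean University of China, the University of Washington and Chongqing University, combine feature-point extraction with DTW distance in a two-step matching process. First, feature points are extracted from each time series as a coarse-grained representation, and the DTW distance between the feature points is computed. Then the segments between feature points are uniformly sampled as a fine-grained representation, the Euclidean distance between corresponding segments is computed, and the two distances are summed to give the final distance.

Time series analysis is a key area of data science, particularly for monitoring, forecasting and pattern recognition. DTW compares two time series by finding the best alignment between them, so it measures their similarity even when they evolve at different speeds. Its main drawback is cost: the standard dynamic-programming formulation takes time proportional to the product of the two series lengths, which can be impractical on large data sets. Feature-point methods are a second family of techniques. They capture the main trend of a series but may miss important local detail.

The hybrid algorithm tries to combine the strengths of both. Extracting feature points yields a simplified representation that keeps the main change points and lowers the computational cost. Feature points are typically chosen at salient positions such as abrupt changes, peaks or valleys. DTW is then applied to the feature points only. Because DTW lets a series stretch or compress elastically during matching, similar parts of two series are still found when they are out of step in time. Finally, uniform sampling of the segments between feature points refines the match, restoring sensitivity to detail while the expensive DTW step stays confined to the short feature-point sequences.

The novelty of the algorithm lies in balancing computational efficiency against matching accuracy: feature points simplify the problem, DTW aligns the simplified series, and a segment-level comparison adds the detail. The approach is aimed at applications that process large volumes of time series data in real time or near real time, such as trading-signal detection in financial markets, biomedical signal analysis and motion recognition in video. Future work could refine the method further, for example by improving the feature-point extraction strategy or exploring more efficient sampling techniques, to handle more complex and diverse time series data.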

A Hybrid Time Series Matching Algorithm Based on Feature-Points and DTW

Xi Wang∗, Mingxing Jiang†, Sheng Chen‡, Chao Yang, Wei Jing∗ and Zhongwen Guo∗

∗Department of Computer Science and Technology, Ocean University of China, Qingdao, China
Email: guozhw@ouc.edu.cn

†Department of Computer Foundation, Ocean University of China, Qingdao, China
Email: jiangmx@ouc.edu.cn

‡University of Washington, Seattle, WA, USA
Email: shengc5@uw.edu

College of Communication Engineering, Chongqing University, Chongqing, China
Email: yang249@163.com

Abstract—Feature-points based approximation representation of time series utilizes the tendency information but neglects the details. DTW (Dynamic Time Warping) based similarity measurement eliminates warping along the time line, but its computation complexity is high. Based on feature-points and DTW, this paper proposes a hybrid time series matching algorithm. First, we extract the feature-points of each time series as the coarse-grained representation and calculate the DTW distance between the feature-points. Then we apply uniform sampling to the feature-point segments as the fine-grained representation and calculate the Euclidean distance between corresponding segments. Finally, we sum the two distances to obtain the final distance. The algorithm achieves high matching accuracy while lowering the computation overhead. Experiments on several time series data sets from UCR verify the effectiveness of the proposed algorithm.
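The three steps in the abstract can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: feature points are taken to be the endpoints plus strict local extrema, segments are paired by position, and each segment is resampled to `k` points by linear interpolation. The names `feature_points`, `resample`, `hybrid_distance` and the parameter `k` are hypothetical.

```python
import math

def feature_points(series):
    """Indices of the endpoints and of every strict local extremum.
    Assumption: the paper's exact feature-point rule is not shown in this excerpt."""
    idx = [0]
    for i in range(1, len(series) - 1):
        slope_in = series[i] - series[i - 1]
        slope_out = series[i + 1] - series[i]
        if slope_in * slope_out < 0:  # slope changes sign: peak or valley
            idx.append(i)
    idx.append(len(series) - 1)
    return idx

def dtw(a, b):
    """Dynamic-programming DTW with |x - y| as local cost, kept one row at a time."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)
    for x in a:
        cur = [inf]
        for j, y in enumerate(b, start=1):
            cur.append(abs(x - y) + min(prev[j], prev[j - 1], cur[j - 1]))
        prev = cur
    return prev[-1]

def resample(segment, k):
    """k uniformly spaced samples (k >= 2) of a segment, by linear interpolation."""
    if len(segment) == 1:
        return [float(segment[0])] * k
    out = []
    for s in range(k):
        pos = s * (len(segment) - 1) / (k - 1)
        lo = int(pos)
        hi = min(lo + 1, len(segment) - 1)
        frac = pos - lo
        out.append(segment[lo] * (1 - frac) + segment[hi] * frac)
    return out

def hybrid_distance(x, y, k=4):
    """Coarse DTW on feature points plus fine Euclidean distance on sampled segments."""
    fx, fy = feature_points(x), feature_points(y)
    coarse = dtw([x[i] for i in fx], [y[i] for i in fy])
    fine = 0.0
    # Assumption: the i-th segment of x corresponds to the i-th segment of y.
    for (a0, a1), (b0, b1) in zip(zip(fx, fx[1:]), zip(fy, fy[1:])):
        sa = resample(x[a0:a1 + 1], k)
        sb = resample(y[b0:b1 + 1], k)
        fine += math.sqrt(sum((u - v) ** 2 for u, v in zip(sa, sb)))
    return coarse + fine

x = [0, 1, 3, 2, 0, 1, 4, 2]
y = [0, 0, 1, 3, 2, 0, 1, 4, 2]  # same shape, delayed by one step
print(hybrid_distance(x, x))     # identical series give 0.0
print(hybrid_distance(x, y))
```

In this sketch DTW only ever sees the feature-point values, so its quadratic cost depends on the number of feature points rather than on the full series length. The segment-level term is what picks up differences that the feature points alone do not show.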

Keywords-Time Series; DTW; Feature Points; Hybrid Matching

I. INTRODUCTION

A time series is a sequence of data points listed in time order, for example a temperature curve or a stock price curve. Time series are applied in many areas, such as finance [1], sensor networks [2] and the energy industry [3]. With the development of the social economy and of IoT and big data technologies, the volume of time series data is increasing rapidly, and time series mining is gaining more and more attention from researchers. Time series matching is the basis of time series mining and has become one of the most studied areas in the time series mining literature [12].

Time series matching includes approximation representation and similarity measure [7]. Approximation representation extracts features by some means, mainly to reduce the dimension of the original data. Common methods for representation are piecewise linear representation [4], frequency domain representation [5], symbolic representation [6], etc. Among them, piecewise linear representation approximates the original data better: it separates a time series into several parts, each expressed by a linear function.

Based on the main idea of piecewise linear representation, some feature-points based methods have been proposed [8] [9]. These approaches use the feature-points as the cut-off points and approximate the original data between them. Piecewise linear representation reflects the overall trend, but some details are lost because of the sampling.
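To make the loss of detail concrete, the sketch below rebuilds a series from a handful of cut-off points by linear interpolation and measures the largest deviation from the original. The cut-off indices are chosen by hand here (the two endpoints and the peak) and are an assumption for illustration; the function names `piecewise_linear` and `max_abs_error` are hypothetical.

```python
def piecewise_linear(series, cuts):
    """Rebuild a series by joining the points at the cut-off indices with straight lines.
    Assumes cuts is increasing, starts at 0 and ends at len(series) - 1."""
    approx = [float(series[cuts[0]])]
    for a, b in zip(cuts, cuts[1:]):
        for i in range(a + 1, b + 1):
            w = (i - a) / (b - a)  # relative position inside the segment
            approx.append(series[a] * (1 - w) + series[b] * w)
    return approx

def max_abs_error(series, approx):
    """Largest pointwise deviation between the original and its approximation."""
    return max(abs(s - a) for s, a in zip(series, approx))

series = [0, 1, 2.5, 3, 2, 0]
cuts = [0, 3, 5]  # hand-picked feature points: start, peak, end
approx = piecewise_linear(series, cuts)
print(approx)
print(max_abs_error(series, approx))
```

Three points are enough to reproduce the rise and fall of the curve, but the bulge at index 2 and the slower descent at index 4 are flattened onto the straight lines. This is the detail that a feature-point representation alone does not carry.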

For the similarity measure, the most commonly used measures are Euclidean distance and dynamic time warping (DTW) distance [10]. Euclidean distance is relatively simple, but it does not consider shifts along the time line. DTW solves the shifting problem, but it relies on dynamic programming, which leads to a high computation complexity.
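The contrast between the two measures can be seen on a small example. The sketch below uses the textbook dynamic-programming formulation of DTW with absolute difference as the local cost, which is an assumption since this excerpt does not state the paper's cost function. The two input series are made up for illustration.

```python
import math

def euclidean(a, b):
    """Point-by-point distance; both series must have the same length."""
    if len(a) != len(b):
        raise ValueError("Euclidean distance needs series of equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw(a, b):
    """DTW distance via the full (len(a) + 1) x (len(b) + 1) cost table."""
    n, m = len(a), len(b)
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]

a = [0, 0, 1, 2, 1, 0, 0]
b = [0, 1, 2, 1, 0, 0, 0]  # the same bump, one step earlier

print(euclidean(a, b))  # 2.0
print(dtw(a, b))        # 0.0
```

Euclidean distance penalizes the one-step shift at four positions, while DTW warps the time line and aligns the two bumps exactly. The price is the nested loop, which fills one table cell for every pair of points.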

This paper is organized as follows. Section II presents the problem statement. Section III presents the approximation representation. Section IV gives the similarity measure used in this paper. Section V proposes our hybrid matching algorithm. Section VI investigates the effectiveness of the proposed algorithm. Finally, Section VII draws some conclusions.

II. PROBLEM STATEMENT

Definition 1 (Time Series): A time series $T$ is a set of observation data listed in time order:

$$T = \{(p_1, t_1), (p_2, t_2), \ldots, (p_n, t_n)\}, \quad t_1 < t_2 < \cdots < t_n \tag{1}$$

where $p_i$ is the $i$th data point, $t_i$ is the corresponding time stamp, $i = 1, 2, \ldots, n$, and $n$ is called the length of time series $T$. Fig. 1 shows an example of a temperature time series.

Definition 2 (Time Series Matching): Given two time series $T_1$ and $T_2$ with approximation representations $T_1'$ and $T_2'$, a distance $D$ between $T_1'$ and $T_2'$ is computed by some similarity measure. If $D < \epsilon$, where $\epsilon$ is a predefined distance threshold, then we consider $T_1$ and $T_2$ to be matched.
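Definition 2 leaves both the representation and the similarity measure open, so it can be read as a predicate parameterized by those two choices. The sketch below shows that reading. The stand-ins `every_other` and `mean_abs_diff` are hypothetical and were picked only to make the example runnable; they are not the representation or measure used in the paper.

```python
def is_match(t1, t2, represent, distance, eps):
    """True when the distance between the two representations is below eps."""
    return distance(represent(t1), represent(t2)) < eps

# Hypothetical representation: keep every second data point.
def every_other(series):
    return series[::2]

# Hypothetical similarity measure: mean absolute difference.
def mean_abs_diff(a, b):
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

t1 = [20.0, 21.0, 23.0, 22.0, 20.0, 19.0]
t2 = [20.5, 21.0, 22.5, 22.0, 20.5, 19.0]

print(is_match(t1, t2, every_other, mean_abs_diff, eps=1.0))  # True
print(is_match(t1, t2, every_other, mean_abs_diff, eps=0.5))  # False
```

The comparison is strict, as in the definition: here the distance is exactly 0.5, so a threshold of 0.5 does not produce a match.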

III. APPROXIMATION REPRESENTATION

Time series data are by nature large in size and high in dimensionality. If matching is carried out on the

2016 9th International Symposium on Computational Intelligence and Design

DOI 10.1109/ISCID.2016.153
