Dynamic time warping based on cubic spline interpolation for time series data mining
Hailin Li
College of Business Administration
Huaqiao University
QuanZhou, China 362021
Email: hailin@mail.dlut.edu.cn
Xiaoji Wan, Ye Liang and Shile Gao
College of Business Administration
Huaqiao University
QuanZhou, China 362021
Email: WanXiaoji@hqu.edu.cn
Abstract—Dynamic time warping (DTW) and derivative dy-
namic time warping (DDTW) are two robust distance measures
for time series, which allow similar shapes to match even
if they are out of phase along the time axis. In this paper,
we propose a novel dynamic time warping method based on cubic
spline interpolation (SIDTW) to improve on their performance.
The derivative at every point of a time series is calculated by
cubic spline interpolation and replaces the estimated
derivatives used in DDTW. After interpolation, we use
derivative-based sequences to represent the original time series;
these sequences describe its trend better and are more
reasonable to warp. Meanwhile, we show empirically that
the quality of the similarity measure for the three warping methods
has nothing to do with the amount of warping. We evaluate
the proposed method experimentally and compare it with the existing
ones. The results demonstrate that in most cases our approach not
only produces far fewer singularities and obtains the best
warping path with a shorter length, but also serves as an alternative
to DTW when time series datasets are not suitable for
measurement by DTW.
Keywords-dynamic time warping, time series data mining,
cubic spline interpolation, similarity measure
I. INTRODUCTION
Time series are a common type of data in daily
life. Valuable information and knowledge are hidden in large
time series databases from fields such as bioinformatics, engineering,
financial markets and medicine. Recently, more and more
attention has been paid to time series mining, and many models
and algorithms have been applied to this field for tasks such as
clustering [2], [23], classification [9], [21], motif finding
[13] and indexing [8].
In most cases, we must compare one time series with
another and obtain their similarity before clustering,
classification and other time series data mining tasks. The
Euclidean distance is a common way to calculate
the similarity between two time series and is widely
used [11], [6]. However, the Euclidean distance
is very brittle as a similarity measure for time series [20], [5], [8].
Moreover, time series often have approximately the same overall
component shapes, but these shapes do not line up along the time
axis. Therefore, we need a way to align the
time axis so that these shapes line up.
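The brittleness described above can be illustrated with a minimal Python sketch. The toy sequences and the function name `euclidean_distance` are assumptions made for this illustration and do not come from the cited works. The sketch compares a small bump with a copy of itself shifted by two time steps, and with a flat line.

```python
import math


def euclidean_distance(q, c):
    """Point-by-point Euclidean distance between two equal-length series."""
    if len(q) != len(c):
        raise ValueError("Euclidean distance needs series of equal length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(q, c)))


# Hypothetical toy data: one bump, the same bump two steps later, a flat line.
bump = [0, 0, 1, 2, 1, 0, 0, 0, 0]
shifted = [0, 0, 0, 0, 1, 2, 1, 0, 0]
flat = [0] * 9

print(euclidean_distance(bump, shifted))  # distance to the same shape, out of phase
print(euclidean_distance(bump, flat))     # distance to a series with no bump at all
```

On this toy data the shifted copy ends up farther from the bump than the flat line does, although it has exactly the same shape. This is the kind of result that motivates aligning the time axis before comparing.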
An available solution to this issue is dynamic
time warping (DTW) [18], [4], [7], [2], which is often
used to measure the similarity of signals or time series with
equal length [16]. It measures the similarity
between two time series by warping the time axis of one
sequence (or of both sequences simultaneously). In the past
decade, DTW has been used successfully in different fields, such
as DNA expression data [1], time series pattern discovery
[4] and articulated motion recognition [15]. Although it is
more effective than the Euclidean distance and is applied
widely in various domains, it has some defects. The most
obvious one is that it can produce abnormal results, as
found by Keogh and Pazzani [7]. They stated that when DTW
tries to explain variability in the Y-axis by warping the
time axis (X-axis), it causes non-intuitive alignments in which
a single point on one time series maps onto a large subsection
of another time series. To overcome this drawback, they
modified DTW and proposed derivative dynamic time warping
(DDTW), which is also used in many fields. For example,
Muscillo et al. used it to classify accelerometer data [14], and
Zhou and Wong used it for time scaling searching [24].
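The warping idea can be sketched with the classic dynamic programming recurrence for DTW. This is a minimal illustration under stated assumptions, not the implementation of any cited work: the squared-difference local cost, the three-neighbour step pattern, the absence of a warping window, and the name `dtw` are all choices made for the example.

```python
def dtw(q, c, cost=lambda a, b: (a - b) ** 2):
    """Return (cumulative cost, warping path) for two non-empty series.

    The local cost (squared difference) is an assumption for this sketch.
    """
    n, m = len(q), len(c)
    inf = float("inf")
    # acc[i][j] = minimum cumulative cost of aligning q[:i] with c[:j].
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_previous = min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
            acc[i][j] = cost(q[i - 1], c[j - 1]) + best_previous

    # Backtrack from the last cell to the first to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (acc[i - 1][j - 1], i - 1, j - 1),  # diagonal step
            (acc[i - 1][j], i - 1, j),          # advance in q only
            (acc[i][j - 1], i, j - 1),          # advance in c only
        ]
        _, i, j = min(candidates)
    path.reverse()
    return acc[n][m], path


# Hypothetical toy data: the same bump, two time steps out of phase.
bump = [0, 0, 1, 2, 1, 0, 0, 0, 0]
shifted = [0, 0, 0, 0, 1, 2, 1, 0, 0]
distance, path = dtw(bump, shifted)
print(distance)  # 0.0
print(path)
```

The two toy series are aligned at zero cost because the warping absorbs the phase shift. Wherever the returned path repeats the same index of one series, a single point of that series is matched to several points of the other, which is the kind of alignment that grows into a "singularity" when the Y-values differ.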
DDTW [7] produces fewer “singularities” (less warping) than DTW
when measuring time series. However, it still produces
many “singularities” and needs a large amount of warping
in some cases. In this paper, for such cases
we propose a novel method called spline-interpolation-based
dynamic time warping (SIDTW) and present it as a template
for improving DTW. We also suggest that any method
more efficient and effective than cubic spline interpolation
can be used in its place. The authors of [15] also combine cubic
spline interpolation with DTW to match normalized
sequences, but they focus on the values of the re-sampled
sequences rather than the derivatives at interpolated points.
SIDTW uses cubic spline interpolation to calculate a
much more accurate derivative at every point of a time series.
These derivatives then replace those of DDTW,
which are computed by a rough estimate.
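The idea of replacing estimated derivatives with spline derivatives can be sketched as follows. This is a hedged illustration and not the authors' implementation: it assumes equally spaced samples, natural boundary conditions for the spline (zero second derivative at both ends), and a squared-difference local cost. The names `spline_derivatives`, `ddtw_estimate`, `dtw_distance` and `sidtw_distance` are hypothetical. The three-point formula in `ddtw_estimate` is the estimate usually quoted for DDTW [7], included only for comparison.

```python
def spline_derivatives(y, h=1.0):
    """First derivatives at the knots of a natural cubic spline through y.

    Assumes equal spacing h and natural boundary conditions.
    """
    n = len(y)
    if n < 3:
        raise ValueError("need at least three points")
    # Interior second derivatives M[1..n-2] solve the tridiagonal system
    #   M[i-1] + 4*M[i] + M[i+1] = 6*(y[i-1] - 2*y[i] + y[i+1]) / h**2
    size = n - 2
    diag = [4.0] * size
    rhs = [6.0 * (y[i - 1] - 2.0 * y[i] + y[i + 1]) / h ** 2
           for i in range(1, n - 1)]
    # Thomas algorithm, forward elimination (both off-diagonals equal 1).
    for k in range(1, size):
        w = 1.0 / diag[k - 1]
        diag[k] -= w
        rhs[k] -= w * rhs[k - 1]
    # Back substitution.
    sol = [0.0] * size
    sol[-1] = rhs[-1] / diag[-1]
    for k in range(size - 2, -1, -1):
        sol[k] = (rhs[k] - sol[k + 1]) / diag[k]
    m = [0.0] + sol + [0.0]  # natural ends: M[0] = M[n-1] = 0
    # Slope at the left end of each interval, then at the final knot.
    d = [(y[i + 1] - y[i]) / h - h * (2.0 * m[i] + m[i + 1]) / 6.0
         for i in range(n - 1)]
    d.append((y[-1] - y[-2]) / h + h * (m[-2] + 2.0 * m[-1]) / 6.0)
    return d


def ddtw_estimate(y):
    """Three-point derivative estimate quoted for DDTW (needs len(y) >= 3)."""
    inner = [((y[i] - y[i - 1]) + (y[i + 1] - y[i - 1]) / 2.0) / 2.0
             for i in range(1, len(y) - 1)]
    # End points reuse the estimate of their nearest interior neighbour.
    return [inner[0]] + inner + [inner[-1]]


def dtw_distance(q, c):
    """Cumulative DTW cost with a squared-difference local cost (assumed)."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(c)
    for a in q:
        cur = [inf]
        for j, b in enumerate(c, start=1):
            cur.append((a - b) ** 2 + min(prev[j], cur[j - 1], prev[j - 1]))
        prev = cur
    return prev[-1]


def sidtw_distance(q, c):
    """DTW applied to spline-derivative sequences instead of raw values."""
    return dtw_distance(spline_derivatives(q), spline_derivatives(c))


peak = [0.0, 1.0, 0.0]
print(spline_derivatives(peak))
print(ddtw_estimate(peak))
```

For the symmetric three-point peak, the spline slope at the middle point is zero, as expected at a maximum, while the three-point estimate gives 0.5 there. The choice of boundary condition affects the slopes near the ends of a series, so a different condition would give different values at those points.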
The main motivation of our approach is to obtain more
accurate derivatives that reflect the trend of time series and
improve the effectiveness of the similarity measure for the
2014 IEEE International Conference on Data Mining Workshop
978-1-4799-4274-9/14 $31.00 © 2014 IEEE
DOI 10.1109/ICDMW.2014.21