Dynamic Time Warping (DTW) [35,45] was proposed to automatically handle the time deformations and varying speeds associated with time-dependent data (e.g., in speech recognition). However, clustering time series data with the DTW distance is computationally expensive; Begum et al. therefore proposed an accelerated DTW clustering method with an admissible pruning strategy [4]. To cluster short time series data sets, Möller-Levet et al. proposed computing the distance between two short time series as the sum of the squared differences of their corresponding slopes [33].
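As a concrete illustration of these two distances, the minimal Python sketch below (our own illustrative code, not the implementations from the cited papers) computes a basic dynamic-programming DTW distance and the slope-based short-time-series distance, assuming NumPy arrays and evenly spaced time stamps.

```python
import numpy as np

def dtw_distance(x, y):
    """Basic O(len(x)*len(y)) dynamic-programming DTW, without pruning or a warping window."""
    n, m = len(x), len(y)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (x[i - 1] - y[j - 1]) ** 2
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return np.sqrt(cost[n, m])

def sts_distance(x, y):
    """Slope-based short-time-series distance: sum of squared differences of
    corresponding slopes, assuming unit spacing between time stamps."""
    dx = np.diff(x)   # slopes of x between consecutive time stamps
    dy = np.diff(y)   # slopes of y
    return np.sqrt(np.sum((dx - dy) ** 2))
```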
Another group of time series clustering algorithms has been developed based on edit distance, mainly including the longest common subsequence (LCSS) model [40,41] and the edit distance on real sequence (EDR) model [10]. Chen et al. proposed the Spatial Assembling Distance (SpADe) method [12], whose main idea is to discover matching time segments, called patterns, within an entire time series while allowing shifting and scaling in both the temporal and amplitude dimensions. Bahadori et al. proposed a Functional
Subspace Clustering (FSC) algorithm [3] which extended the power and flexibility of subspace clustering to time series
data by permitting the deformations that underlie many popular functional similarity measures. To overcome the issues of
high dimensionality, contextual constraints, and temporal smoothness, Cai et al. proposed a comprehensive method named
FACETS [6] to simultaneously capture all these aspects by using tensor factorization and performing careful regularizations
to tackle both contextual and temporal issues. Ferreira and Zhao [17] transformed a set of time series objects into a network
by using different distance functions. More specifically, every time series object is represented by a vertex and the most
similar vertices are connected to uncover clusters. A fast clustering method for large-scale time series data, named YADING, was developed in [16]; in this method, time series objects are allocated to clusters that are initially induced from sampled subsets of the input data. Overall, these existing algorithms do not take into account the possible subspaces of a time series data set.
2.2. Subsequence clustering
Prior to 2003, subsequence time series clustering was generally accepted as a valid technique for time series analysis
[14,18,25]. For example, Fu et al. discovered patterns from stock data by using a subsequence time series analysis technique [18]. However, in 2003 Keogh et al. claimed that subsequence time series clustering was meaningless, because the centroids produced by subsequence time series clustering become sinusoidal pseudo-patterns for almost all kinds of time series data [30]. Further work [9,24,38] was subsequently published to explain this problem; for example, Idé theoretically explained why the centroids of subsequence time series produced by the k-means clustering method naturally form sinusoidal patterns, relating them to the Fourier basis [24].
Nevertheless, some researchers continued to develop new methods that produce meaningful subsequence clustering results [7,8]. Chen claimed that subsequence clustering can indeed be meaningful if distances are measured correctly in a delay space [9]. However, the sliding-window technique adopted for constructing the delay space is generally suboptimal, because the points that represent the series dynamics in the delay space tend to align closely along its bisectrix.
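To make the delay-space idea concrete, the following minimal Python sketch (our own illustrative naming and parameters, not the construction used in [9]) builds a delay space by sliding a window of a given width over a series; for a slowly varying series the resulting points have nearly equal coordinates and therefore lie close to the bisectrix, which is the alignment problem noted above.

```python
import numpy as np

def delay_embed(series, width, step=1):
    """Map a 1-D series into delay space: each sliding window of length
    `width` becomes one point in R^width."""
    series = np.asarray(series, dtype=float)
    n_windows = (len(series) - width) // step + 1
    return np.stack([series[i * step : i * step + width] for i in range(n_windows)])

# A slowly varying series yields delay-space points whose coordinates are
# almost equal, i.e. points that lie close to the bisectrix (main diagonal).
t = np.linspace(0, 10, 200)
points = delay_embed(np.sin(0.3 * t), width=3)
spread_from_bisectrix = np.std(points - points.mean(axis=1, keepdims=True))
print(points.shape, round(float(spread_from_bisectrix), 4))
```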
Since similarity computation is the bottleneck for clustering large-scale time series data, Rakthanmanon et al. proposed the UCR-DTW method [36], which is capable of searching and mining trillions of time series subsequences under dynamic time warping. Most of the existing subsequence time series clustering approaches can only handle problems in which certain parameters are predefined and the range of subsequence-width variability is small; however, such assumptions turn out to be unrealistic for many real-world applications. Therefore, Madicar et al. proposed an enhanced
parameter-free subsequence time series clustering algorithm for high-variability-width data [31] . Zolhavarieh et al. offered
a solution to perform online pattern recognition for subsequence time series clustering [46] . Agarwal et al. discovered that
high-quality subsequence clusters could be uncovered from vehicular sensor data by using bounded spherical clustering, and the proposed method is characterized by linear time complexity [1]. Based on the encouraging results of this work [1], the authors found that the problem of generating meaningless subsequence clusters [30] could be resolved by using bounded spherical clustering.
2.3. Characteristics of the proposed TSkmeans algorithm
Unlike the traditional clustering methods that operate on the whole sequence, our proposed TSkmeans algorithm iteratively discovers the subspaces of the whole sequence and then clusters objects based on the uncovered subspaces instead of the whole sequence. Following this idea, we propose a new k-means type clustering framework that can smoothly assign weights to different time stamps for clustering time series data.
Although some weighted k -means type algorithms have been proposed by using different weighting methods
[21,23,26,39] , these algorithms aim to cluster data whose features do not have a chronological order. To effectively explore
the temporal sequence information associated with time series data, the proposed TSkmeans algorithm tries to smooth the
weights of adjacent time stamps, and hence the uncovered subspaces become more meaningful for clustering time series
data.
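As a rough illustration of this idea only (this is not the TSkmeans objective or update rules derived later in the paper; the weight update, the smoothing scheme, and the beta exponent are our own illustrative choices), the following Python sketch runs a weighted k-means in which each time stamp carries a weight, weights are re-estimated from within-cluster dispersion, and a three-point moving average smooths the weights of adjacent time stamps.

```python
import numpy as np

def smoothed_weighted_kmeans(X, k, n_iter=20, beta=2.0, seed=None):
    """Illustrative weighted k-means for time series stored as rows of X.
    One weight per time stamp; weights are derived from within-cluster
    dispersion and then smoothed over adjacent time stamps."""
    rng = np.random.default_rng(seed)
    n, T = X.shape
    centers = X[rng.choice(n, size=k, replace=False)].copy()
    w = np.full(T, 1.0 / T)                       # one weight per time stamp
    for _ in range(n_iter):
        # Assign each series to its closest center under the weighted distance.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2 * w ** beta).sum(axis=2)
        labels = d.argmin(axis=1)
        # Update centers as per-cluster means.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
        # Time stamps with low within-cluster dispersion get higher weights.
        disp = ((X - centers[labels]) ** 2).sum(axis=0)
        w = 1.0 / (disp + 1e-12)
        # Smooth the weights of adjacent time stamps (3-point moving average).
        w = np.convolve(w, np.ones(3) / 3.0, mode="same")
        w /= w.sum()
    return labels, centers, w

# Toy usage: the last 10 time stamps separate the two groups and are less
# noisy, so they should receive larger smoothed weights.
rng = np.random.default_rng(0)
A = np.hstack([rng.normal(0, 3, (20, 10)), rng.normal(+5, 0.3, (20, 10))])
B = np.hstack([rng.normal(0, 3, (20, 10)), rng.normal(-5, 0.3, (20, 10))])
labels, centers, w = smoothed_weighted_kmeans(np.vstack([A, B]), k=2, seed=1)
print(labels)
print(np.round(w, 3))
```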