Multi-sensor Activity Recognition Using 2DPCA
and K-means Clustering Based on Dual-measure
Distance
Hong He*, Member, IEEE, Jifeng Huang
College of Information, Mechanical and Electrical
Engineering
Shanghai Normal University
Shanghai 200234, China
*Corresponding author, Email:
heh@shnu.edu.cn; honghe_aca2@163.com
Wuxiong Zhang
Key Laboratory of Wireless Sensor Network and
Communication
SIMIT, Chinese Academy of Sciences
Shanghai, China
Abstract—Activity recognition based on multiple wearable
sensors remains a challenging task due to the diversity of
human activities. Unsupervised classification helps to
discover new activity classes and to improve the activity
classification model. Therefore, a new
multi-sensor activity recognition scheme using the two-
dimensional principal component analysis (2DPCA) and the k-
means clustering with dual-measure distance (DMk-means) is
proposed in this paper. Multiple activity signals are first
decomposed by wavelet packet decomposition. Then the
2DPCA is applied to wavelet feature matrices of the activity
samples without changing the inherent data structure. In the
DMk-means, different activities are grouped into clusters
through measuring their feature vectors with both Euclidean
distance and Pearson correlation distance. The recognition
performance of the proposed scheme is verified on the public
dataset WARD. Clustering results show that 2DPCA captures
more useful wavelet features than PCA. The dual-measure
distance accounts for both the shape variance and the
magnitude difference of feature vectors. The clustering
indices of 2DPCA_DMk-means are superior to those of
2DPCA_k-means for activity recognition.
Keywords—Activity recognition; 2DPCA; K-means clustering;
wavelet packet decomposition; dual-measure distance
I. INTRODUCTION
With the prevalence of mobile devices, activity
recognition (AR) based on multiple wearable devices has
become an important and feasible solution for automatically
identifying human behaviors in real time [1]-[3]. Compared
to vision-based AR, multi-sensor based AR offers
unobtrusive measurement and lower memory, computation
and power requirements [3]-[5]. Since there
is no common definition of human activities, recognition
approaches for human activity face a number of challenges
brought by intraclass variability and interclass similarity of
activities [2],[3]. Hence, in order to accurately identify the
diverse activities, a variety of classification approaches have
been applied to the sensor-based AR systems, including
Naive Bayesian classifiers [6], artificial neural networks
[7]-[9], support vector machines (SVM) [10],[11], hidden
Markov models (HMM) [12] and decision trees [7],[13]. Some
of them have even achieved high recognition rates for
complex activities [2].
However, motion data recorded from an accelerometer or
gyroscope is often more difficult to interpret than data from
cameras. Labeling data in the training phase is a tedious and
complex part of building supervised AR systems [3]. Moreover,
since the practical sensing environment of a sensor-based AR
system is complex, the classification model must be updated
periodically with new feature data in order to achieve
high recognition accuracy. Generally, if only a
few labeled training samples are available, it is better to use
semi-supervised or unsupervised approaches, e.g. clustering
algorithms, for recognizing activities [2]. In addition,
automatically identifying unknown activities with a
clustering algorithm helps to simplify the labeling process
and improve the AR classification model. Nevertheless, to
date only a few human activity recognition (HAR) works have
used clustering methods for multi-sensor activity
recognition [2],[3].
Therefore, we develop an unsupervised classification
strategy in this paper for multi-sensor activity recognition.
To extract more useful information, the signals from
multiple wearable sensors are first decomposed by the
wavelet packet transform (WPT). Furthermore, without
changing the inherent structure of data, the two-dimensional
Principal Component Analysis (2DPCA) is applied to
wavelet feature matrices so as to reduce the dimensionality
of the data and decrease the complexity of computation.
Finally, in order to partition the feature vectors of activity
samples, a k-means clustering algorithm with dual-measure
distance (DMk-means) is proposed. By combining the
Euclidean distance with the Pearson correlation distance,
DMk-means measures the difference between two activity
samples in terms of both the magnitude and the variation
tendency of their feature vectors. The schematic diagram of
the proposed activity recognition scheme based on 2DPCA and
DMk-means is shown in Fig. 1. Apart from the preprocessing of
sensor data, the procedure of activity recognition involves
four stages, i.e. segmentation, feature extraction, dimension
reduction and clustering.
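
As a rough illustration of the dual-measure idea described above, the following Python sketch combines a Euclidean term (magnitude) with a Pearson correlation distance (variation tendency) and uses the result to assign a feature vector to its nearest cluster centre. The weighted-sum form, the weight `alpha`, the handling of constant vectors, and all function names are assumptions made for illustration only; this section does not state how the two measures are combined in DMk-means.

```python
import math


def euclidean_distance(x, y):
    """Magnitude difference between two equal-length feature vectors."""
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def pearson_distance(x, y):
    """Shape difference: 1 - Pearson correlation, in the range [0, 2]."""
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        # Correlation is undefined for a constant vector; treating it as
        # uncorrelated (distance 1) is an assumption of this sketch.
        return 1.0
    return 1.0 - cov / (norm_x * norm_y)


def dual_measure_distance(x, y, alpha=0.5):
    """Hypothetical combination: convex weighted sum of both measures.

    `alpha` is an illustrative parameter, not a value taken from the paper.
    """
    return alpha * euclidean_distance(x, y) + (1.0 - alpha) * pearson_distance(x, y)


def assign_to_nearest(sample, centroids, alpha=0.5):
    """k-means assignment step using the dual-measure distance."""
    distances = [dual_measure_distance(sample, c, alpha) for c in centroids]
    return distances.index(min(distances))
```

For example, the vectors [1, 2, 3] and [2, 4, 6] have the same variation tendency, so their Pearson correlation distance is approximately zero, while their Euclidean distance is not; a measure built from both terms can therefore separate samples that either term alone would treat as similar. In practice the two terms may need to be brought to comparable scales before they are combined.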