ORIGINAL ARTICLE
An efficient KPCA algorithm based on feature correlation evaluation
Zizhu Fan · Jinghua Wang · Baogen Xu · Pengzhi Tang
Received: 4 December 2012 / Accepted: 20 April 2013
© Springer-Verlag London 2013
Abstract Classic kernel principal component analysis (KPCA) is computationally inefficient when extracting features from large data sets. In this paper, we propose an algorithm, efficient KPCA (EKPCA), that enhances the computational efficiency of KPCA by using a linear combination of a small portion of the training samples, referred to as basic patterns, to approximately express the KPCA feature extractor, that is, the eigenvectors of the covariance matrix used in feature extraction. We show that the feature correlation (i.e., the correlation between different feature components) can be evaluated by the cosine distance between the kernel vectors, which are the column vectors of the kernel matrix. The proposed algorithm is easy to implement: it first uses feature correlation evaluation to determine the basic patterns and then uses these basic patterns to reconstruct the KPCA model, perform feature extraction, and classify the test samples. Since there are usually far fewer basic patterns than training samples, EKPCA feature extraction is much more computationally efficient than that of KPCA. Experimental results on several benchmark data sets show that EKPCA is much faster than KPCA while achieving similar classification performance.
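The basic-pattern selection summarized above can be sketched in a few lines of code. The snippet below is illustrative only: the greedy scan order and the fixed similarity threshold are assumptions made for exposition, not the paper's exact selection criterion, which is developed later in the paper. It measures the correlation between kernel vectors, i.e., the columns of the kernel matrix, by cosine similarity and keeps only weakly correlated columns as basic patterns.

import numpy as np

def select_basic_patterns(K, threshold=0.95):
    """Illustrative basic-pattern selection: keep a training sample only if
    its kernel vector (a column of the kernel matrix K) is not too close, in
    the cosine sense, to the kernel vectors already kept.  The threshold and
    the greedy scan order are assumptions, not the paper's exact criterion."""
    n = K.shape[1]
    # Normalize each column so that dot products become cosine similarities.
    cols = K / np.linalg.norm(K, axis=0, keepdims=True)
    kept = [0]
    for j in range(1, n):
        cos_sim = cols[:, kept].T @ cols[:, j]   # similarity to kept columns
        if np.max(cos_sim) < threshold:
            kept.append(j)
    return kept                                  # indices of basic patterns

A lower threshold keeps fewer, less correlated kernel vectors, trading the fidelity of the approximation for speed.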
Keywords Kernel principal component analysis (KPCA) · Feature extraction · Feature correlation · Cosine distance
1 Introduction
Kernel principal component analysis (KPCA) [1–7] is an effective approach to feature extraction and has been used with success in a variety of pattern recognition problems as well as in numerous image-related machine learning applications [8–16]. KPCA is a nonlinear generalization of classical principal component analysis (PCA), which seeks to extract a certain number of the most representative features from data [17–20]. In KPCA, the input data are first transformed into a high- or even infinite-dimensional feature space F through a nonlinear mapping, and classic PCA is then performed in the feature space F. KPCA employs an appropriately chosen kernel function to represent inner products of sample vectors in the feature space, which means that it does not need to carry out the nonlinear mapping explicitly [21, 22]. KPCA often outperforms classical PCA because it can effectively extract the nonlinear features of a sample, but KPCA-based feature extraction is computationally expensive since its cost grows with the number of training samples [23]. Inevitably, then, the greater the number of training samples, the lower the efficiency of feature extraction.
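To make this cost concrete, the following minimal numpy sketch of standard KPCA (not the proposed EKPCA) assumes an RBF kernel and the usual centering of the kernel matrix; it is a sketch under those assumptions rather than the paper's implementation. The point to notice is that projecting a single test sample requires kernel evaluations against every one of the n training samples.

import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Pairwise RBF kernel values between the rows of A and the rows of B.
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

def kpca_fit(X, n_components=2, gamma=1.0):
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)                          # n x n kernel matrix
    one_n = np.ones((n, n)) / n
    Kc = K - one_n @ K - K @ one_n + one_n @ K @ one_n   # center in feature space
    w, V = np.linalg.eigh(Kc)                            # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:n_components]             # pick the leading components
    alphas = V[:, idx] / np.sqrt(w[idx])                 # scale expansion coefficients
    return alphas

def kpca_transform(x_new, X_train, alphas, gamma=1.0):
    # Projection needs kernel values against *all* n training samples, so the
    # per-sample cost of feature extraction grows with the training set size.
    # (Centering of the test kernel vector is omitted to keep the sketch short.)
    k = rbf_kernel(np.atleast_2d(x_new), X_train, gamma)  # 1 x n kernel vector
    return k @ alphas

With n training samples, kpca_transform performs n kernel evaluations per test sample; this is precisely the bottleneck that EKPCA targets by replacing the full training set with a much smaller set of basic patterns.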
A number of reformulated algorithms [24–32] have recently been proposed to improve the computational efficiency of KPCA. Rosipal and Girolami [24] proposed to improve the training efficiency of KPCA using an expectation maximization approach. Zheng et al. [25] enhanced the training efficiency of KPCA by grouping the training samples. Schraudolph et al. [26] improved the convergence of the kernel Hebbian algorithm (KHA) for iterative kernel PCA. Moerland described how an expectation–maximization algorithm for classical PCA could be adapted to kernel PCA without having to store the kernel matrix [27]. These methods, however, do not consider the feature extraction
Z. Fan (✉) · B. Xu · P. Tang
School of Basic Science, East China Jiaotong University, Nanchang 330013, China
e-mail: zzfan3@yahoo.com.cn

J. Wang
Department of Computing, The Hong Kong Polytechnic University, Kowloon, Hong Kong, China
Neural Computing & Applications. DOI 10.1007/s00521-013-1424-9