
Feature extraction based on fuzzy 2DLDA

Wankou Yang a,*, Xiaoyong Yan b, Lei Zhang c, Changyin Sun a

a School of Automation, Southeast University, Nanjing 210096, People’s Republic of China
b School of Information Technology, Jinling Institute of Technology, Nanjing 210001, People’s Republic of China
c Dept. of Computing, The Hong Kong Polytechnic University, Hong Kong

* Corresponding author. E-mail address: wankou_yang@yahoo.com.cn (W. Yang)
Article info
Available online 12 March 2010

Keywords: Fisher; LDA; 2DLDA; Fuzzy; Feature extraction; Face recognition
Abstract
In this paper, fuzzy Fisherface is extended to the image matrix, namely, the fuzzy 2DLDA (F2DLDA). In the proposed method, we compute the membership degree matrix by fuzzy K-nearest neighbor (FKNN) and then incorporate the membership degrees into the definitions of the between-class scatter matrix and the within-class scatter matrix, obtaining the fuzzy between-class scatter matrix and the fuzzy within-class scatter matrix. Under these definitions, the fuzzy information is exploited more effectively than in fuzzy Fisherface. Experiments on the Yale, ORL and FERET face databases show that the new method works well.
© 2010 Elsevier B.V. All rights reserved.
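As a rough illustration of the FKNN step described in the abstract (our sketch in Python, not the authors' code; the 0.51/0.49 weighting follows the standard Keller-style FKNN membership scheme, and all names are illustrative):

import numpy as np

def fknn_memberships(X, y, K=5):
    # Standard fuzzy K-nearest-neighbor membership degrees (a sketch).
    # X: (N, D) training samples, y: (N,) class labels.
    # Returns U of shape (c, N): membership of sample i in class j.
    N = X.shape[0]
    classes = np.unique(y)
    U = np.zeros((len(classes), N))
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    np.fill_diagonal(D2, np.inf)                          # exclude each sample itself
    for i in range(N):
        knn = np.argsort(D2[i])[:K]                       # K nearest neighbors of sample i
        for j, cls in enumerate(classes):
            n_ij = np.sum(y[knn] == cls)                  # neighbors belonging to class cls
            if y[i] == cls:
                U[j, i] = 0.51 + 0.49 * n_ij / K
            else:
                U[j, i] = 0.49 * n_ij / K
    return U

These membership degrees can then be used to weight the samples when the fuzzy scatter matrices are formed.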
1. Introduction
Feature extraction by dimensionality reduction is an important research topic in computer vision and pattern recognition. The curse of dimensionality is a major limitation in many practical applications, and a large number of features may even degrade classifier performance when the training set is small relative to the number of features [1]. Over the past several decades, many feature extraction methods have been proposed; the most well-known are principal component analysis (PCA) and linear discriminant analysis (LDA) [2].
Unsupervised learning cannot properly model the underlying structure and characteristics of different classes, so discriminant features are usually obtained by supervised learning. LDA [2] is the traditional approach to learning a discriminant subspace. Unfortunately, it cannot be applied directly to small sample size (SSS) problems [3] because the within-class scatter matrix is singular, and face recognition is a typical SSS problem. Many works have applied LDA to face recognition. The most popular method, called Fisherface, was proposed by Swets et al. [4] and Belhumeur et al. [5]. In their methods, PCA is first used to reduce the dimension of the original feature space to N − c (N is the total number of training samples and c is the number of classes), and classical Fisher linear discriminant analysis (FLDA) is then applied to reduce the dimension further to d (d ≤ c). Since the smallest projection components are discarded in the PCA step, some useful discriminatory information may be lost. Moreover, the PCA step cannot guarantee that the transformed within-class scatter matrix is non-singular. More discussions about PCA and LDA can be found in [6].
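To make the two-stage Fisherface pipeline concrete, here is a minimal sketch (ours, assuming scikit-learn; it is not the code of [4,5]):

import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def fisherface_features(X, y):
    # X: (N, D) row-vectorized face images, y: (N,) class labels.
    N, c = X.shape[0], len(np.unique(y))
    Z = PCA(n_components=N - c).fit_transform(X)   # PCA step: reduce to N - c dims
    # FLDA step: at most c - 1 discriminant features
    return LinearDiscriminantAnalysis(n_components=c - 1).fit_transform(Z, y)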
To solve the singularity problem, a singular value perturbation can be added to the within-class scatter matrix [7]. A more systematic method is regularized discriminant analysis (RDA) [8], which obtains more reliable eigenvalue estimates by correcting the eigenvalue distortion with a ridge-type regularization. Penalized discriminant analysis (PDA) is another regularized version of LDA [9,10]; its goals are not only to overcome the SSS problem but also to smooth the coefficients of the discriminant vectors for better interpretation. The main problem with RDA and PDA is that they do not scale well: in applications such as face recognition the dimensionality is often more than ten thousand, and it is impractical for RDA and PDA to process such a large covariance matrix.
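A minimal sketch of the ridge-type fix (our illustration; the perturbation parameter eps is an assumption): the singular $S_w$ is replaced by $S_w + \varepsilon I$ before the generalized eigenproblem is solved.

import numpy as np
from scipy.linalg import eigh

def regularized_lda(S_b, S_w, d, eps=1e-3):
    # Perturb S_w so that it is positive definite, then solve the
    # generalized eigenproblem S_b w = lambda (S_w + eps * I) w.
    S_w_reg = S_w + eps * np.eye(S_w.shape[0])
    evals, evecs = eigh(S_b, S_w_reg)       # eigenvalues in ascending order
    return evecs[:, ::-1][:, :d]            # top-d discriminant directions

Note, however, that forming and factorizing the full D x D scatter matrices is exactly what becomes impractical at face-image dimensionality, as discussed above.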
A well-known null subspace method is the LDA+PCA method [11]. When the within-class scatter matrix $S_w$ is of full rank, LDA+PCA only calculates the maximal eigenvectors of $S_w^{-1} S_b$ to form the transformation matrix. Otherwise, a two-stage procedure is employed: first, the data are transformed into the null space $V_0$ of $S_w$; second, the between-class scatter is maximized within $V_0$ (a code sketch of this two-stage step is given at the end of this section). LDA+PCA could be sub-optimal because it maximizes the between-class scatter in the null space of $S_w$ instead of the original input space. Direct LDA is another null space method, which discards the null space of $S_b$ [12]. It is achieved by diagonalizing first $S_b$ and then $S_w$, the reverse of the conventional simultaneous diagonalization procedure. If the total scatter matrix $S_t$ is used instead of $S_w$ in direct LDA, it is actually equivalent to PCA+LDA. Gao et al. [13] proposed a singular value decomposition (SVD) based LDA approach to solve the single training sample per person problem in face recognition. Dai et al. [14,15] developed an
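As promised above, a sketch of the two-stage null-space step of LDA+PCA (our code, numpy only; the tolerance is an assumption, and dense eigendecomposition assumes a moderate input dimension):

import numpy as np

def null_space_lda(S_w, S_b, d, tol=1e-10):
    # Stage 1: null space V0 of S_w (eigenvectors with near-zero eigenvalues).
    evals, evecs = np.linalg.eigh(S_w)
    V0 = evecs[:, evals < tol * evals.max()]
    # Stage 2: maximize the between-class scatter inside V0.
    w, V = np.linalg.eigh(V0.T @ S_b @ V0)
    return V0 @ V[:, ::-1][:, :d]           # top-d directions, mapped back to input space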