linear discriminant analysis (LDA) [1]. PCA is an unsupervised method: it projects the original data into a low-dimensional subspace spanned by the eigenvectors associated with the largest eigenvalues of the covariance matrix of all the samples. LDA is supervised: it searches for projective axes on which the data points of different classes are far from each other while the data points of the same class remain close to each other. Intrinsically, both PCA and LDA estimate global statistics, i.e., the mean and the covariance. However, they fail to discover the underlying structure if the data lie on or close to a sub-manifold of the ambient space.
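To make the PCA projection concrete, the following sketch (using NumPy; the function name and variable names are ours) projects centered samples onto the eigenvectors associated with the k largest eigenvalues of the sample covariance matrix:

import numpy as np

def pca_project(X, k):
    # Project the rows of X onto the k eigenvectors of the sample
    # covariance matrix associated with the k largest eigenvalues.
    X_centered = X - X.mean(axis=0)
    cov = np.cov(X_centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh: the covariance matrix is symmetric
    top = np.argsort(eigvals)[::-1][:k]      # indices of the k largest eigenvalues
    W = eigvecs[:, top]                      # d x k projection matrix
    return X_centered @ W                    # n x k low-dimensional representation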
In the past few years, more and more researchers have turned to nonlinear techniques to learn the local geometric structure [3–13]. Isometric feature mapping (ISOMAP) [3], locally linear embedding (LLE) [4], local tangent space alignment [5] and Laplacian Eigenmaps (LE) [6] are four representative algorithms, and experiments have shown that they can yield impressive results. However, the common goal of these manifold learning algorithms is visualization: all of them suffer from the out-of-sample problem and are therefore not well suited to recognition tasks. To this end, some linear extensions have been proposed. He and Niyogi proposed locality preserving projections (LPP) [7,8], the linear extension of LE. LPP builds a graph incorporating the neighborhood information of the data set and provides a way to project novel test data points. Inspired by LPP, Yang et al. proposed unsupervised discriminant projection (UDP) [9]. UDP takes both local and nonlocal quantities into account: it characterizes the local scatter as well as the nonlocal scatter, and seeks a projection that simultaneously maximizes the nonlocal scatter and minimizes the local scatter. This characteristic makes UDP more intuitive and more powerful than LPP.
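To make this criterion concrete, a sketch in the notation commonly used for UDP (with H_{ij} an indicator equal to 1 if x_i and x_j are neighbors and 0 otherwise, and n the number of samples): the local scatter S_L, the nonlocal scatter S_N and the sought projection w can be written as

S_L = \frac{1}{2n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} H_{ij}\,(x_i - x_j)(x_i - x_j)^{\top}, \qquad
S_N = \frac{1}{2n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} (1 - H_{ij})\,(x_i - x_j)(x_i - x_j)^{\top},

w^{*} = \arg\max_{w} \frac{w^{\top} S_N w}{w^{\top} S_L w}.

Maximizing this ratio drives non-neighboring points apart in the projected space while pulling neighboring points together, which is exactly the intuition stated above.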
More recently, the concept of the “fuzzy set” [14] has attracted increasing attention in the fields of image processing and pattern recognition. Many studies have been carried out on fuzzy image filtering [15–19], fuzzy image segmentation [20–24], fuzzy edge detection [25–34] and fuzzy classification [35–37]. Among the many fuzzy classifiers, the fuzzy K-nearest neighbor (FKNN) classifier [38], proposed by Keller et al. in 1985, is particularly noteworthy; since then, the “fuzzification” of class assignment has received lasting attention. FKNN assigns class memberships to a sample rather than assigning the sample to a particular class. That is, FKNN specifies the degree to which an object belongs to each class, and this information is highly useful. FKNN records this information in the membership degree matrix, which serves as a useful bridge connecting the samples and the categories; a minimal sketch of this membership assignment is given below.
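The following sketch (our own function and variable names; Euclidean distance assumed) computes Keller-style FKNN membership degrees for a labeled training set, where each row of the returned matrix holds one sample's degrees of membership in all classes:

import numpy as np

def fknn_membership_matrix(X, y, K=5):
    # U[j, i] is the degree to which sample j belongs to class i,
    # following Keller et al.'s initialization scheme.
    classes = np.unique(y)
    n = X.shape[0]
    U = np.zeros((n, classes.size))
    # Pairwise Euclidean distances; a sample is never its own neighbor.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(D, np.inf)
    for j in range(n):
        neighbors = np.argsort(D[j])[:K]          # K nearest neighbors of x_j
        for i, c in enumerate(classes):
            n_c = np.count_nonzero(y[neighbors] == c)
            if y[j] == c:
                U[j, i] = 0.51 + 0.49 * n_c / K   # boosted for the labeled class
            else:
                U[j, i] = 0.49 * n_c / K
    return U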
Inspired by this, the mechanisms of the “fuzzy set” have been incorporated into recognition techniques, giving rise to many fuzzy extensions [39,42–49]. Fuzzy Fisherface [39–41] is one of them: the class memberships of the binary-labeled faces are incorporated into the construction of the fuzzy within-class and between-class scatter matrices. Subsequently, a regularized version of Fuzzy Fisherface [42] was proposed, and Li et al. [43] combined nearest-line subspace learning with “fuzzy set” theory to propose FLNFL for facial expression recognition. Ye et al. introduced non-negative matrix factorization based on the fuzzy K-nearest-neighbour graph [44], in which the membership degree matrix is used to define the intra-class and inter-class fuzzy K-nearest-neighbour graphs. Wan et al. [45] proposed fuzzy local discriminant embedding (FLDE), which can, to some extent, reduce the effect of environmental conditions and obtain the correct local distribution information. In addition, fuzzy 2DLDA [47,48] and F2DLGEDA [49] have been developed. All these algorithms share a common characteristic: they redefine the fuzzy class means and the fuzzy mean of the samples using the membership degree matrices, so that the constructed scatter matrices of the samples contain more discriminative information than before. The reported experimental results show that the fuzzy extensions indeed make the algorithms more robust.
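As an illustration of this common construction (a sketch of a typical formulation, not the exact definition used by any one of the cited methods), let u_{ij} denote the membership degree of sample x_j in class i and p a fuzzification exponent; the fuzzy class mean and the fuzzy within-class scatter then take forms such as

\bar{x}_i = \frac{\sum_{j=1}^{n} u_{ij}^{p}\, x_j}{\sum_{j=1}^{n} u_{ij}^{p}}, \qquad
S_{FW} = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{p}\,(x_j - \bar{x}_i)(x_j - \bar{x}_i)^{\top},

so that every sample contributes to every class mean and scatter in proportion to its membership degree, instead of contributing only to its crisp class.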