Robust Sparse Coding for Face Recognition
Meng Yang Lei Zhang
∗
Hong Kong Polytechnic Univ.
Jian Yang
Nanjing Univ. of Sci. & Tech.
David Zhang
Hong Kong Polytechnic Univ.
Abstract
Recently the sparse representation (or coding) based
classification (SRC) has been successfully used in face
recognition. In SRC, the testing image is represented as
a sparse linear combination of the training samples, and
the representation fidelity is measured by the 𝑙
2
-norm or
𝑙
1
-norm of coding residual. Such a sparse coding model
actually assumes that the coding residual follows Gaus-
sian or Laplacian distribution, which may not be accurate
enough to describe the coding errors in practice. In this
paper, we propose a new scheme, namely the robust sparse
coding (RSC), by modeling the sparse coding as a sparsity-
constrained robust regression problem. The RSC seeks for
the MLE (maximum likelihood estimation) solution of the
sparse coding problem, and it is much more robust to out-
liers (e.g., occlusions, corruptions, etc.) than SRC. An
efficient iteratively reweighted sparse coding algorithm is
proposed to solve the RSC model. Extensive experiments
on representative face databases demonstrate that the RSC
scheme is much more effective than state-of-the-art meth-
ods in dealing with face occlusion, corruption, lighting and
expression changes, etc.
1. Introduction
As a powerful tool for statistical signal modeling, sparse
representation (or sparse coding) has been successfully used
in image processing applications [16], and recently has led
to promising results in face recognition [24, 25, 27] and
texture classification [15]. Based on the findings that nat-
ural images can be generally coded by structural primitives
(e.g., edges and line segments) that are qualitatively similar
in form to simple cell receptive fields [18], sparse coding
techniques represent a natural image using a small number
of atoms parsimoniously chosen out of an over-complete
dictionary. Intuitively, the sparsity of the coding coefficient
vector can be measured by the 𝑙
0
-norm of it (𝑙
0
-norm counts
the number of nonzero entries in a vector). Since the com-
binatorial 𝑙
0
-norm minimization is an NP-hard problem, the
∗
Corresponding author. This research is supported by the Hong Kong
General Research Fund (PolyU 5351/08E).
𝑙
1
-norm minimization, as the closest convex function to 𝑙
0
-
norm minimization, is widely employed in sparse coding,
and it was shown that 𝑙
0
-norm and 𝑙
1
-norm minimizations
are equivalent if the solution is sufficiently sparse [3]. In
general, the sparse coding problem can be formulated as
min
𝜶
∥𝜶∥
1
s.t. ∥𝒚 − 𝐷𝜶∥
2
2
≤ 𝜀, (1)
where 𝒚 is a given signal, 𝐷 is the dictionary of coding
atoms, 𝜶 is the coding vector of 𝒚 over 𝐷, and 𝜀>0 is a
constant.
Face recognition (FR) is among the most visible and
challenging research topics in computer vision and pattern
recognition [29], and many methods, such as Eigenfaces
[21], Fisherfaces [2] and SVM [7], have been proposed in
the past two decades. Recently, Wright et al. [25] applied
sparse coding to FR and proposed the sparse representation
based classification (SRC) scheme, which achieves impres-
sive FR performance. By coding a query image 𝒚 as a
sparse linear combination of the training samples via the
𝑙
1
-norm minimization in Eq. (1), SRC classifies the query
image 𝒚 by evaluating which class of training samples could
result in the minimal reconstruction error of it with the as-
sociated coding coefficients. In addition, by introducing an
identity matrix 𝐼 as a dictionary to code the outlier pixels
(e.g., corrupted or occluded pixels):
min
𝜶,𝜷
∥[𝜶; 𝜷]∥
1
s.t. 𝒚 =[𝐷, 𝐼] ⋅ [𝜶; 𝜷] , (2)
the SRC method shows high robustness to face occlusion
and corruption. In [9], Huang et al. proposed a sparse rep-
resentation recovery method which is invariant to image-
plane transformation to deal with the misalignment and
pose variation in FR, while in [22] Wagner et al. proposed
a sparse representation based method that could deal with
face misalignment and illumination variation. Instead of di-
rectly using original facial features, Yang and Zhang [27]
used Gabor features in SRC to reduce greatly the size of
occlusion dictionary and improve a lot the FR accuracy.
The sparse coding model in Eq. (1)iswidelyusedin
literature. There are mainly two issues in this model. The
first one is that whether the 𝑙
1
-norm constraint ∥𝜶∥
1
is good
enough to characterize the signal sparsity. The second one is
625