Robust Principal Component Analysis Based On L1-2 Metric
Fanlong Zhang, Zhangjing Yang, Minghua Wan, Guowei Yang
School of Technology
Nanjing Audit University
Nanjing, China
csfzhang@126.com, yzj@nau.edu.cn, wmh36@sina.com, ygw_ustb@163.com
Abstract—Robust principal component analysis (RPCA) is a newly emerged method for the exact recovery of corrupted low-rank matrices. Given a data matrix, RPCA can decompose it exactly into the sum of a low-rank matrix and a sparse matrix by minimizing a weighted combination of the nuclear norm and the L1 norm. It assumes that the error matrix is sparse and measures it by the L1 norm. However, the L1 norm often leads to biased estimates, and the solution is not as accurate as desired. Recently the difference of the L1 and L2 norms, called the L1-2 metric, was proposed as an approximation to the L0 norm. Motivated by the L1-2 metric's better approximation to the L0 norm than the L1 norm, this paper presents a method called robust principal component analysis based on the L1-2 metric (RPCA-L1-2) for recovering corrupted data. This method measures the data error by the L1-2 metric. Moreover, RPCA-L1-2 is solved by DC (difference of convex functions) programming. Extensive experiments on removing occlusion from face images and background modeling from surveillance videos demonstrate the effectiveness of the proposed method.
Keywords—robust principal component analysis; low-rank; L1-2 metric; DC programming
I. INTRODUCTION
Principal component analysis (PCA) [1] is widely investigated and applied in pattern recognition and machine learning for subspace learning and feature extraction. PCA, however, is sensitive to outliers. To overcome this limitation, a surge of robust principal component analysis methods has been proposed. Wright et al. recently established the robust principal component analysis (RPCA) [2] [3] method, which assumes that the error matrix is sparse and the clean data matrix is low rank. Under the restricted isometry property (RIP) condition, RPCA can decompose the corrupted data exactly into the sum of a low-rank matrix and a sparse matrix by minimizing a weighted combination of the nuclear norm and the L1 norm. As an important extension of RPCA, low-rank representation (LRR) [4], [5] was presented to segment subspaces from a union of multiple linear subspaces. LRR seeks the lowest-rank representation among all the candidates that represent all vectors as linear combinations of the basis vectors in a dictionary. Like RPCA, LRR also assumes that the error term is sparse.
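As a concrete illustration of the RPCA decomposition described above, the program min ||D||_* + λ||E||_1 subject to X = D + E can be solved by alternating singular value thresholding and soft thresholding. The sketch below uses common inexact-ALM defaults (λ = 1/√max(m, n), a geometric μ schedule); it is our illustration, not code or parameter settings from this paper:

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: the proximal operator of tau * ||.||_*."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def rpca_ialm(X, lam=None, rho=1.5, n_iter=200):
    """Inexact-ALM-style loop for min ||D||_* + lam*||E||_1  s.t.  X = D + E.

    lam = 1/sqrt(max(m, n)) and the mu schedule are common defaults,
    assumed here rather than taken from the paper.
    """
    m, n = X.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))
    mu = 1.25 / np.linalg.norm(X, 2)        # initial penalty parameter
    D = np.zeros_like(X)
    E = np.zeros_like(X)
    Y = np.zeros_like(X)                    # Lagrange multiplier
    for _ in range(n_iter):
        D = svt(X - E + Y / mu, 1.0 / mu)   # low-rank update
        T = X - D + Y / mu
        E = np.sign(T) * np.maximum(np.abs(T) - lam / mu, 0.0)  # sparse update
        Y = Y + mu * (X - D - E)            # dual ascent on the constraint
        mu = min(mu * rho, 1e7)
    return D, E
```

On synthetic data (a random rank-one matrix plus a few large sparse errors), this loop typically separates the two components to high accuracy.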
Most of the abovementioned methods characterize the error via the L1 or L2 norm, both of which are convex regularizers. However, convex regularizers often lead to inaccurate solutions [6]. As a result, many nonconvex regularizers have been designed, such as the capped-L1 norm [6], the Lp norm [7], and the log-sum penalty [8]. Recently the difference of the L1 and L2 norms [9] [10], called the L1-2 metric, was proposed as a nonconvex approximation to the L0 norm. The L1-2 metric is nonconvex yet Lipschitz continuous. The computational results in [10] show that even when the RIP condition is not satisfied, the L1-2 metric can work better than existing nonconvex regularizers.
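As a small numerical illustration (ours, not from the paper), the L1-2 metric of a vector is zero exactly when the vector has at most one nonzero entry, mirroring the L0 norm's preference for sparsity:

```python
import numpy as np

def l1_minus_l2(x):
    """The L1-2 metric ||x||_1 - ||x||_2 of a vector x."""
    return np.abs(x).sum() - np.sqrt((x ** 2).sum())

# A 1-sparse vector attains the minimum value 0, like the L0 norm
# rewarding sparsity; a dense vector with the same energy scores higher.
print(l1_minus_l2(np.array([3.0, 0.0, 0.0, 0.0])))  # 0.0
print(l1_minus_l2(np.array([1.5, 1.5, 1.5, 1.5])))  # 6.0 - 3.0 = 3.0
```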
Inspired by the L1-2 metric's better approximation to the L0 norm, this paper presents a method called robust principal component analysis based on the L1-2 metric (RPCA-L1-2) for recovering corrupted data. RPCA-L1-2 measures the data error by the L1-2 metric instead of the L1 norm used in RPCA.
Although the L1-2 metric is nonconvex, it can be decomposed into the difference of two convex functions. Then DC programming [11] [12] can be employed to solve our model. Here "DC" means "difference of convex functions": DC programming is a special kind of optimization method whose objective function can be decomposed into the difference of two convex functions. In [10], DC programming is also employed to solve compressed sensing based on the L1-2 metric. Furthermore, Lou and Yan [17] derive an analytical solution for the proximal operator of the L1-2 metric, which makes some fast L1 solvers applicable to L1-2.
The contributions include two aspects. (1) A robust data recovery model, called RPCA-L1-2, is proposed. The motivation is that the L1-2 metric is a better approximation to the L0 norm than the L1 norm. (2) DC programming is employed to solve the proposed model. The DC algorithm decomposes the original problem into a series of RPCA problems, which can be solved efficiently by the inexact augmented Lagrange multiplier (inexact ALM) algorithm [11].
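To make the DC iteration concrete, the following sketch applies a DCA loop to a vector-valued L1-2 regularized least-squares problem: at each outer step the concave term -λ||x||_2 is linearized at the current iterate, and the resulting convex L1 subproblem (the analogue of the inner RPCA subproblems) is solved by ISTA. This is an illustrative analogue under our own problem setup, not the paper's matrix algorithm:

```python
import numpy as np

def soft(v, t):
    """Soft thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista_l1(A, b, lin, lam, n_iter=500):
    """Solve the convex subproblem min_x 0.5||Ax-b||^2 + <lin, x> + lam||x||_1."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b) + lin
        x = soft(x - step * grad, step * lam)
    return x

def dca_l1_minus_l2(A, b, lam, n_outer=20):
    """DCA loop: linearize the concave part -lam*||x||_2 at x_k, then solve
    the resulting convex L1 problem (analogue of the inner RPCA steps)."""
    x = np.zeros(A.shape[1])
    for _ in range(n_outer):
        nx = np.linalg.norm(x)
        lin = -lam * x / nx if nx > 0 else np.zeros_like(x)
        x = ista_l1(A, b, lin, lam)
    return x
```

For example, with A the identity and a 1-sparse b, the linearized term cancels the soft-thresholding bias on the surviving entry, so the DCA iterates recover b exactly, whereas plain L1 shrinks it.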
The rest of this paper is organized as follows. Section
II reviews the related work. Section III presents our model
and corresponding algorithm. Section IV reports
experimental results. Section V offers conclusions.
II. RELATED WORKS
Given a data set $X = [x_1, x_2, \ldots, x_n]$, where each $x_i$ is a sample, the nuclear norm of the matrix $X$ is defined by $\|X\|_* = \sum_i \sigma_i$, which is the sum of the singular values of $X$. Besides, the Frobenius (L2) and L1 norms of a matrix $X$ are defined by $\|X\|_F = (\sum_{i,j} X_{ij}^2)^{1/2}$ and $\|X\|_1 = \sum_{i,j} |X_{ij}|$, respectively, where $X_{ij}$ denotes the $(i,j)$-th entry.
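The three matrix norms above can be checked numerically; this is a small NumPy sketch of ours, not code from the paper:

```python
import numpy as np

X = np.array([[3.0, 0.0],
              [4.0, 5.0]])

nuclear = np.linalg.svd(X, compute_uv=False).sum()  # sum of singular values
frobenius = np.sqrt((X ** 2).sum())                 # ||X||_F
l1 = np.abs(X).sum()                                # ||X||_1 (entrywise)

# The Frobenius norm equals the L2 norm of the singular values,
# so ||X||_F <= ||X||_* always holds.
print(nuclear, frobenius, l1)
```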
A. RPCA
The data X is usually corrupted. RPCA tries to
decompose X into two matrices D and E, where the
2017 4th IAPR Asian Conference on Pattern Recognition
2327-0985/17 $31.00 © 2017 IEEE
DOI 10.1109/ACPR.2017.8