Tensor RPCA by Bayesian CP Factorization with Complex Noise

Qiong Luo^{1,2}, Zhi Han^{1,*}, Xi’ai Chen^{1,2}, Yao Wang^{3}, Deyu Meng^{3}, Dong Liang^{3}, Yandong Tang^{1}

^{1} State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences;
^{2} University of Chinese Academy of Sciences; ^{3} Xi’an Jiaotong University

{luoqiong, hanzhi, chenxiai, ytang}@sia.cn, {dymeng, liangdong}@mail.xjtu.edu.cn, yao.s.wang@gmail.com
Abstract
The RPCA model has achieved good performance in various applications. However, two defects limit its effectiveness. First, it is designed for data in matrix form, and thus fails to exploit the structural information of higher-order tensor data in many practical situations. Second, it adopts the L1-norm to model the noise part, which makes it valid only for sparse noise. In this paper, we propose a tensor RPCA model based on CP decomposition and model the data noise by a Mixture of Gaussians (MoG). Keeping the raw data in tensor form allows us to make full use of its inherent structural priors, and MoG is a universal approximator to any mixture of continuous distributions, which makes our approach capable of recovering the low-dimensional linear subspace under a wide range of noises or their mixtures. The model is solved by a newly proposed algorithm derived under a variational Bayesian framework. The superiority of our approach over existing state-of-the-art approaches is demonstrated by extensive experiments on both synthetic and real data.
1. Introduction
In the field of data analysis, principal component analysis (PCA) has been a classical and prevalent tool with extensive applications [16]. Originally, PCA aims to find the best L2-norm low-rank approximation of a given matrix; owing to the smoothness of this objective, many fast numerical solvers exist [9, 24, 25, 26, 35, 41]. However, the L2-norm is only suitable for Gaussian noise and is too susceptible to outliers and gross noise. To increase the robustness of PCA, a series of works has been conducted in recent years [12, 17, 13, 19].
Inspired by the progress of low-rank matrix analysis [4, 5, 30], robust principal component analysis (RPCA) [40] has been proposed to remedy the deficiency of traditional PCA. In RPCA, a high-dimensional observation matrix is assumed to consist of a low-rank component and a sparse component. Specifically, let $Y \in \mathbb{R}^{m \times n}$ be the observation data matrix, $X \in \mathbb{R}^{m \times n}$ the low-rank matrix, and $E \in \mathbb{R}^{m \times n}$ the sparse noise matrix; then RPCA can be described as the following optimization problem:

$$\min_{X,E} \; \|X\|_* + \lambda \|E\|_1 \quad \text{s.t.} \quad Y = X + E, \qquad (1)$$

where $\|X\|_* = \sum_r \sigma_r(X)$ denotes the nuclear norm of $X$, $\sigma_r(X)$ $(r = 1, 2, \ldots, \min(m, n))$ is the $r$-th singular value of $X$, $\|E\|_1 = \sum_{ij} |e_{ij}|$ denotes the L1-norm of $E$, and $e_{ij}$ is the element in the $i$-th row and $j$-th column of $E$. It has been proved that if $X$ and $E$ satisfy a certain incoherence condition, RPCA can uniquely extract $X$ and $E$ from $Y$ [6]. RPCA has played an important role in handling various problems, including robust matrix recovery [40], face alignment [27], subspace segmentation [21], and so forth.

* Corresponding author.
Recently, it has been noticed that more and more modern applications contain data with a higher-order tensor structure, such as background extraction [7], face recognition and representation [40, 34, 38, 2], structure from motion [36], object recognition [37] and motion segmentation [39]. Matrices can be viewed as second-order tensors; however, moving from matrices to higher-order tensors presents significant new challenges. A direct way to address these challenges is to unfold tensors into matrices and then apply the matrix RPCA model. Unfortunately, as recently pointed out by [7], the multilinear structure is lost in such matricization, and as a result, methods built on these techniques often lead to suboptimal results. It is therefore helpful to handle such raw data with a direct tensor representation, and several studies along this line have appeared in the literature [11, 20].
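The matricization step criticized above can be made concrete with a short sketch. The mode-n unfolding below arranges the mode-n fibers of a tensor as matrix columns (one common convention; exact column ordering varies across libraries): every unfolding keeps one mode and flattens all the others, which is precisely where the multilinear structure is discarded.

```python
import numpy as np

def unfold(tensor, mode):
    # Mode-n matricization: bring the chosen mode to the front,
    # then flatten the remaining modes into the columns.
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

T = np.arange(24).reshape(2, 3, 4)   # a 2x3x4 third-order tensor
print(unfold(T, 0).shape)            # (2, 12)
print(unfold(T, 1).shape)            # (3, 8)
print(unfold(T, 2).shape)            # (4, 6)
```

Matrix RPCA applied to any single unfolding sees only that one mode's low-rank structure; the CP-based model of this paper instead treats all modes jointly.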
Moreover, the L1-norm and L2-norm characterize the specific Laplace and Gaussian distributions, respectively, but real noise generally does not follow any one particular noise configuration, as already shown in [42]. A Mixture of Gaussians (MoG) can approximate a much wider range of distributions owing to its universal approximation capability, and both the Laplacian and the Gaussian can be regarded as special cases of MoG [3]. It has been demonstrated that MoG
2017 IEEE International Conference on Computer Vision
DOI 10.1109/ICCV.2017.537