B. Relations Between Segmentation and Row Space
Let X_0, with skinny SVD U_0Σ_0V_0^T, be a collection of data samples strictly drawn from a union of multiple subspaces (i.e., X_0 is clean); then the subspace membership of the samples is determined by the row space of X_0. Indeed, as shown in [12], when the subspaces are independent, V_0V_0^T forms a block-diagonal matrix: the (i, j)-th entry of V_0V_0^T can be nonzero only if the i-th and j-th samples are from the same subspace. Hence, this matrix, termed the Shape Interaction Matrix (SIM) [12], has been widely used for subspace segmentation. Previous approaches simply compute the SVD of the data matrix X = U_X Σ_X V_X^T and then use |V_X V_X^T|¹ for subspace segmentation. However, in the presence of outliers and corruptions, V_X can be far from V_0, and thus the segmentation produced by such approaches is inaccurate. In contrast, we show that LRR can recover V_0V_0^T even when the data matrix X is contaminated by outliers.
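As a quick numerical illustration (a sketch, not part of the paper's method), the block-diagonal structure of the SIM can be verified on synthetic clean data drawn from two independent subspaces:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two 2-dimensional subspaces in R^10 (random bases are independent
# with probability one), with 5 clean samples drawn from each.
n1, n2, r = 5, 5, 2
B1, B2 = rng.standard_normal((10, r)), rng.standard_normal((10, r))
X0 = np.hstack([B1 @ rng.standard_normal((r, n1)),
                B2 @ rng.standard_normal((r, n2))])

# Skinny SVD of the clean data: keep only the nonzero singular values.
U, s, Vt = np.linalg.svd(X0, full_matrices=False)
rank = int((s > 1e-10 * s[0]).sum())
V0 = Vt[:rank].T                      # n x rank

sim = V0 @ V0.T                       # Shape Interaction Matrix V0 V0^T

# Entries linking samples from different subspaces vanish.
print(np.abs(sim[:n1, n1:]).max())    # numerically ~0
```

The off-diagonal block of `sim` is zero up to floating-point error, while the two diagonal blocks carry the segmentation.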
If the subspaces are not independent, V_0V_0^T may not be strictly block-diagonal. This is indeed well expected, since when the subspaces have nonzero (nonempty) intersections, some samples may belong to multiple subspaces simultaneously. When the subspaces are pairwise disjoint (but not independent), our extensive numerical experiments show that V_0V_0^T may still be close to block-diagonal, as exemplified in Fig. 3. Hence, recovering V_0V_0^T is still of interest for subspace segmentation.
C. Problem Statement
Problem 1.1 only roughly describes what we want to study.
More precisely, this paper addresses the following problem.
Problem 3.1 (Subspace Clustering): Let X_0 ∈ R^{d×n}, with skinny SVD U_0Σ_0V_0^T, store a set of n d-dimensional samples (vectors) strictly drawn from a union of k subspaces {S_i}_{i=1}^{k} of unknown dimensions (k is also unknown). Given a set of observation vectors X generated by

X = X_0 + E_0,

the goal is to recover the row space of X_0, or, equivalently, to recover the true SIM V_0V_0^T.
The recovery of the row space can guarantee high segmentation accuracy, as analyzed in Section III-B. Also, the recovery of the row space naturally implies success in error correction. So it is sufficient to set the goal of subspace clustering as the recovery of the row space identified by V_0V_0^T. For ease of exploration, we consider the problem under three assumptions of increasing practicality and difficulty.
Assumption 1: The data is clean, i.e., E_0 = 0.
Assumption 2: A fraction of the data samples are grossly corrupted and the others are clean, i.e., E_0 has sparse column supports, as shown in Fig. 2(c).
Assumption 3: A fraction of the data samples are grossly corrupted and the others are contaminated by small Gaussian noise, i.e., E_0 is characterized by a combination of the models shown in Fig. 2(a) and Fig. 2(c).
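For concreteness, the error model of Assumption 2 can be simulated as follows (a sketch with made-up sizes and corruption fraction):

```python
import numpy as np

rng = np.random.default_rng(1)
d, n, r = 20, 50, 3

# Clean samples from a single r-dimensional subspace (for simplicity).
X0 = rng.standard_normal((d, r)) @ rng.standard_normal((r, n))

# Assumption 2: grossly corrupt a few whole columns; E0 then has
# sparse column supports (most of its columns are exactly zero).
E0 = np.zeros((d, n))
corrupted = rng.choice(n, size=5, replace=False)
E0[:, corrupted] = 10 * rng.standard_normal((d, len(corrupted)))

X = X0 + E0
print((np.abs(E0).sum(axis=0) > 0).sum())   # 5 nonzero columns
```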
¹For a matrix M, |M| denotes the matrix whose (i, j)-th entry is the absolute value of [M]_ij.
Unlike [14], the independence assumption on the subspaces is not highlighted in this paper, because the analysis in this work focuses on recovering V_0V_0^T rather than pursuing a block-diagonal matrix.
IV. LOW-RANK REPRESENTATION FOR MATRIX RECOVERY
In this section we abstractly present the LRR method
for recovering a matrix from corrupted observations. The
basic theorems and optimization algorithms will be presented.
The specific methods and theories for handling the subspace
clustering problem are deferred until Section V.
A. Low-Rank Representation
In order to recover the low-rank matrix X_0 from the given observation matrix X corrupted by errors E_0 (X = X_0 + E_0), it is straightforward to consider the following regularized rank minimization problem:

min_{D,E} rank(D) + λ‖E‖_ℓ,  s.t.  X = D + E,   (2)
where λ > 0 is a parameter and ‖·‖_ℓ indicates a certain regularization strategy, such as the squared Frobenius norm (i.e., ‖·‖_F²) used for modeling the noise as shown in Fig. 2(a) [6], the ℓ_0 norm adopted by [7] for characterizing the random corruptions as shown in Fig. 2(b), and the ℓ_{2,0} norm adopted by [14], [16] for dealing with sample-specific corruptions and outliers. Suppose D* is a minimizer with respect to the variable D; then it gives a low-rank recovery of the original data X_0.
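Problem (2) itself is intractable in general; in practice, RPCA [7] replaces rank(·) and the ℓ_0 norm with their convex surrogates, the nuclear norm and the ℓ_1 norm. A minimal sketch of a solver for that convex surrogate via the inexact augmented Lagrange multiplier scheme (the parameter choices below are common defaults, not prescribed by this paper):

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: prox of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ (np.maximum(s - tau, 0)[:, None] * Vt)

def shrink(M, tau):
    """Soft thresholding: prox of tau * l1 norm."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0)

def rpca(X, lam=None, iters=500, tol=1e-7):
    """min ||D||_* + lam * ||E||_1  s.t.  X = D + E  (inexact ALM)."""
    m, n = X.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    mu = 1.25 / np.linalg.norm(X, 2)   # penalty weight, grown each step
    Y = np.zeros_like(X)               # Lagrange multiplier
    E = np.zeros_like(X)
    for _ in range(iters):
        D = svt(X - E + Y / mu, 1.0 / mu)
        E = shrink(X - D + Y / mu, lam / mu)
        Y = Y + mu * (X - D - E)
        mu *= 1.5
        if np.linalg.norm(X - D - E) < tol * np.linalg.norm(X):
            break
    return D, E
```

On an easy synthetic instance (low rank, a few percent of entries corrupted), this sketch recovers the low-rank component to high accuracy.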
The above formulation is adopted by the recently established Robust PCA (RPCA) method [7], which has been used to achieve state-of-the-art performance in several applications (e.g., [35]). However, this formulation implicitly assumes that the underlying data structure is a single low-rank subspace. When the data is drawn from a union of multiple subspaces, denoted as S_1, S_2, ⋯, S_k, it actually treats the data as being sampled from a single subspace defined by S = Σ_{i=1}^{k} S_i. Since the sum Σ_{i=1}^{k} S_i can be much larger than the union ∪_{i=1}^{k} S_i, the specifics of the individual subspaces are not well considered, and so the recovery may be inaccurate.
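The gap between the sum and the union is easy to see numerically (a toy check with made-up dimensions): every sample below lies in some 2-dimensional subspace, yet the smallest single subspace containing all of them has dimension 4:

```python
import numpy as np

rng = np.random.default_rng(3)

# 6 samples from each of two generic 2-dimensional subspaces of R^10.
X1 = rng.standard_normal((10, 2)) @ rng.standard_normal((2, 6))
X2 = rng.standard_normal((10, 2)) @ rng.standard_normal((2, 6))

# Each column lies in a 2-dimensional subspace, but treating the data
# as a single subspace (as RPCA implicitly does) requires dim(S1 + S2):
print(np.linalg.matrix_rank(np.hstack([X1, X2])))   # 4
```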
To better handle the mixed data, here we suggest a more general rank minimization problem defined as follows:

min_{Z,E} rank(Z) + λ‖E‖_ℓ,  s.t.  X = AZ + E,   (3)
where A is a “dictionary” that linearly spans the data space. We call the minimizer Z* (with regard to the variable Z) the “lowest-rank representation” of data X with respect to a dictionary A. After obtaining an optimal solution (Z*, E*), we could recover the original data by using AZ* (or X − E*). Since rank(AZ*) ≤ rank(Z*), AZ* is also a low-rank recovery of the original data X_0. By setting A = I, formulation (3) falls back to (2). So LRR could be regarded as a generalization of RPCA that essentially uses the standard bases as the dictionary. By choosing an appropriate dictionary A, as we will see, the lowest-rank representation can recover the underlying row space so as to reveal the true segmentation of the data. So, LRR could handle well data drawn from a union of multiple subspaces.
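A quick numerical check of the last claim in the clean case (a sketch; the supporting theory appears later in the paper): take A = X itself and E = 0. Then one minimum-rank solution of X = XZ is obtained from the pseudo-inverse, and it coincides exactly with the true SIM V_0V_0^T:

```python
import numpy as np

rng = np.random.default_rng(4)
d, n, r = 20, 15, 3

# Clean rank-r data, so X = X0.
X = rng.standard_normal((d, r)) @ rng.standard_normal((r, n))

# Row-space projector V0 V0^T from the skinny SVD.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
V0 = Vt[:r].T
sim = V0 @ V0.T

# With dictionary A = X and clean data, the pseudo-inverse solution of
# X = XZ is Z = pinv(X) X = V0 V0^T, i.e., the row space is recovered.
Z = np.linalg.pinv(X) @ X
print(np.abs(Z - sim).max())        # numerically ~0
```

This is only the trivial clean setting; the point of LRR is that, via (3), the same object V_0V_0^T can still be recovered when X is corrupted.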