IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, VOL. XX, NO. XX, XX 2015 3
A. Stiefel and Grassmann manifold
The space of $d \times m$ matrices ($\mathbb{R}^{d \times m}$, $m < d$) with orthonormal columns is a special type of Riemannian manifold known as a Stiefel manifold $S(m, d)$, where the notion of normality is formulated by an orthonormality condition,
$$S(m, d) \triangleq \{\, Y \in \mathbb{R}^{d \times m} : Y^T Y = I_m \,\} \tag{1}$$
for the $m \times m$ identity matrix $I_m$. The Grassmann manifold $G(m, d)$ is a mathematical object with several similarities to the Stiefel manifold. It can be defined as a quotient manifold of $S(m, d)$ with the equivalence relation
$$Y_1 \sim Y_2 \ \text{ if and only if } \ \mathrm{Span}(Y_1) = \mathrm{Span}(Y_2) \tag{2}$$
where $\mathrm{Span}(Y)$ denotes the subspace spanned by the columns of $Y \in S(m, d)$. Thus, a Grassmann manifold $G(m, d)$ is the set of $m$-dimensional linear subspaces of $\mathbb{R}^d$.
$G(m, d)$ is an $m(d - m)$-dimensional compact Riemannian manifold. An element of $G(m, d)$ can be represented by a $d \times m$ matrix $Y$ with orthonormal columns, i.e., $Y^T Y = I_m$. What matters is therefore $\mathrm{Span}(Y)$ rather than the specific entries of $Y$. Since the matrix representation of a point in $G(m, d)$ is not unique, two matrices $Y_1$ and $Y_2$ are considered the same if and only if $\mathrm{Span}(Y_1) = \mathrm{Span}(Y_2)$.
Proposition 1. For the Grassmann manifold $G(m, d)$, $Y_1 \sim Y_2$ if and only if $Y_1 = Y_2 R_m$ for some orthogonal matrix $R_m \in O(m)$, where $O(\cdot)$ denotes the orthogonal group.
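The equivalence in Proposition 1 can be checked numerically. The following is a minimal sketch, assuming NumPy; the helper names `random_stiefel`, `random_orthogonal`, and `same_subspace` are hypothetical and not part of the paper. Spans are compared through the projector $Y Y^T$, which does not depend on the basis chosen for the subspace.

```python
import numpy as np

def random_stiefel(d, m, rng):
    """Draw a d x m matrix with orthonormal columns via a reduced QR factorization."""
    Q, _ = np.linalg.qr(rng.standard_normal((d, m)))  # Q has shape (d, m)
    return Q

def random_orthogonal(m, rng):
    """Draw an m x m orthogonal matrix, i.e. an element of O(m)."""
    Q, _ = np.linalg.qr(rng.standard_normal((m, m)))
    return Q

def same_subspace(Y1, Y2, tol=1e-10):
    """Compare Span(Y1) and Span(Y2) through the projectors Y Y^T."""
    return np.allclose(Y1 @ Y1.T, Y2 @ Y2.T, atol=tol)

rng = np.random.default_rng(0)
d, m = 6, 2
Y1 = random_stiefel(d, m, rng)
Y2 = Y1 @ random_orthogonal(m, rng)                # another representation of the same point

on_stiefel = np.allclose(Y2.T @ Y2, np.eye(m))     # Y2 still satisfies condition (1)
equivalent = same_subspace(Y1, Y2)                 # Y1 ~ Y2 in the sense of (2)
different = same_subspace(Y1, random_stiefel(d, m, rng))  # an unrelated random subspace
```

Here `on_stiefel` and `equivalent` should be true, while `different` should be false, since an independently drawn subspace almost surely does not coincide with $\mathrm{Span}(Y_1)$.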
A canonical distance between two subspaces is the Riemannian distance, i.e., the length of the geodesic path connecting the two corresponding points on the Grassmann manifold. A more intuitive and computationally attractive way to define the geodesic distance derives from the principal angles. Let $Y_1$, $Y_2$ be two $d \times m$ matrices with orthonormal columns. The principal angles $0 \le \theta_1 \le \cdots \le \theta_m \le \pi/2$ between the two subspaces $\mathrm{Span}(Y_1)$ and $\mathrm{Span}(Y_2)$ are defined recursively by
$$\begin{aligned}
\cos\theta_k = \max_{u_k \in \mathrm{Span}(Y_1)} \ \max_{v_k \in \mathrm{Span}(Y_2)} \ & u_k^T v_k \\
\text{subject to} \ \ & u_k^T u_k = 1, \ v_k^T v_k = 1, \\
& u_k^T u_j = 0, \ v_k^T v_j = 0, \ \text{for } j = 1, \ldots, k - 1.
\end{aligned} \tag{3}$$
The first principal angle $\theta_1$ is the smallest angle between a pair of unit vectors, one from each subspace, and its cosine is the first canonical correlation. Similarly, the $k$-th principal angle and canonical correlation are obtained recursively.
A Riemannian manifold is a differentiable manifold endowed with a smooth inner product (Riemannian metric) [28]. For two tangent vectors $\Delta_1$, $\Delta_2$ at a point $Y$, the Riemannian metric is defined as
$$\langle \Delta_1, \Delta_2 \rangle_Y = \mathrm{tr}\!\left(\Delta_1^T \left(I_d - \tfrac{1}{2} Y Y^T\right) \Delta_2\right) = \mathrm{tr}\!\left(\Delta_1^T \Delta_2\right). \tag{4}$$
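The second equality in (4) is stated without justification. One way to see it, assuming the common convention (not spelled out in the text) that tangent vectors to the Grassmann manifold at $Y$ are represented by matrices satisfying $Y^T \Delta = 0$, is the following short derivation:

```latex
\begin{align*}
\mathrm{tr}\!\left(\Delta_1^T \left(I_d - \tfrac{1}{2} Y Y^T\right) \Delta_2\right)
  &= \mathrm{tr}\!\left(\Delta_1^T \Delta_2\right)
     - \tfrac{1}{2}\, \mathrm{tr}\!\left((Y^T \Delta_1)^T (Y^T \Delta_2)\right) \\
  &= \mathrm{tr}\!\left(\Delta_1^T \Delta_2\right)
     \qquad \text{since } Y^T \Delta_1 = Y^T \Delta_2 = 0 .
\end{align*}
```

Under that assumption the correction term vanishes, so the metric reduces to the ordinary trace inner product.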
The metric (4) induces a geodesic distance, namely the arc length: the length of the shortest curve connecting two points ($m$-dimensional subspaces). It is known that this geodesic distance is determined by the principal angles [20],
$$\delta_{\mathrm{Arc}}^2(Y_1, Y_2) = \sum_j \theta_j^2 = \|\Theta\|_2^2 \tag{5}$$
for $\Theta = [\theta_1, \theta_2, \ldots, \theta_m]$.
There is no need to solve the maximization problem (3) to obtain the principal angles. Instead, they can be derived from the singular value decomposition of the product of the two matrices,
$$Y_1^T Y_2 = U S V^T \tag{6}$$
for unitary matrices $U = [u_1, \ldots, u_m]$, $V = [v_1, \ldots, v_m]$, and the diagonal matrix $S = \mathrm{diag}(\cos\theta_1, \ldots, \cos\theta_m)$.
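The SVD route of (6) and the arc length of (5) can be sketched in a few lines. This is an illustrative sketch assuming NumPy and inputs with orthonormal columns; the function names `principal_angles` and `arc_length` are hypothetical.

```python
import numpy as np

def principal_angles(Y1, Y2):
    """Principal angles (ascending) between Span(Y1) and Span(Y2).

    Assumes both inputs are d x m with orthonormal columns.
    """
    # Singular values of Y1^T Y2 are cos(theta_k), returned in descending order,
    # so the resulting angles come out in ascending order.
    s = np.linalg.svd(Y1.T @ Y2, compute_uv=False)
    # Rounding can push values slightly outside [0, 1]; clip before arccos.
    return np.arccos(np.clip(s, 0.0, 1.0))

def arc_length(Y1, Y2):
    """Geodesic (arc-length) distance: the 2-norm of the principal angle vector, as in (5)."""
    return np.linalg.norm(principal_angles(Y1, Y2))

# Two planes in R^3 that share the x-axis and are tilted by 30 degrees about it.
phi = np.pi / 6
Y1 = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
Y2 = np.array([[1.0, 0.0], [0.0, np.cos(phi)], [0.0, np.sin(phi)]])
theta = principal_angles(Y1, Y2)   # approximately [0, pi/6]
```

For this pair the shared direction gives $\theta_1 = 0$ and the tilt gives $\theta_2 = \pi/6$, so the arc length is $\pi/6$. Note that `arccos` is poorly conditioned for cosines close to 1, so very small angles recovered this way carry limited accuracy.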
B. Distances for subspaces
In this subsection, a distance refers to any assignment of a nonnegative value to a pair of points in a space $\Omega$. A valid metric is a distance that satisfies the following additional axioms.
Proposition 2 (Metric). A real-valued function $\delta : \Omega \times \Omega \to \mathbb{R}$ is called a metric if
• $\delta(x_1, x_2) \ge 0$;
• $\delta(x_1, x_2) = 0$ if and only if $x_1 = x_2$;
• $\delta(x_1, x_2) = \delta(x_2, x_1)$;
• $\delta(x_1, x_2) + \delta(x_2, x_3) \ge \delta(x_1, x_3)$;
for all $x_1, x_2, x_3 \in \Omega$.
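The four axioms can be spot-checked on a finite sample of points. The sketch below is a hypothetical helper (`violates_metric_axioms` is not from the paper); passing it is only evidence, not a proof, that a function is a metric, and it assumes the sample points are pairwise distinct.

```python
import itertools

def violates_metric_axioms(delta, points, tol=1e-12):
    """Return the name of the first axiom that fails on `points`, or None.

    Assumes the entries of `points` are pairwise distinct.
    """
    n = len(points)
    for i, j in itertools.product(range(n), repeat=2):
        d12 = delta(points[i], points[j])
        if d12 < -tol:
            return "nonnegativity"
        # The distance must vanish exactly when the two points coincide.
        if (i == j) != (abs(d12) <= tol):
            return "identity of indiscernibles"
        if abs(d12 - delta(points[j], points[i])) > tol:
            return "symmetry"
    for x1, x2, x3 in itertools.product(points, repeat=3):
        if delta(x1, x2) + delta(x2, x3) < delta(x1, x3) - tol:
            return "triangle inequality"
    return None

sample = [0.0, 1.0, 3.5]
abs_result = violates_metric_axioms(lambda a, b: abs(a - b), sample)
sq_result = violates_metric_axioms(lambda a, b: (a - b) ** 2, sample)
```

On this sample the absolute difference passes every check, while the squared difference fails the triangle inequality (for the triple 0, 1, 3.5: 1 + 6.25 < 12.25), which shows that a nonnegative, symmetric distance need not be a metric.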
Because a point on the Grassmann manifold has many matrix representations, a Grassmann distance should be invariant under the choice of representation.
Proposition 3. A distance function $\delta(\cdot, \cdot) : \mathbb{R}^{d \times m} \times \mathbb{R}^{d \times m} \to \mathbb{R}$ is a Grassmann distance if $\delta(Y_1, Y_2) = \delta(Y_1 R_1, Y_2 R_2)$ for all $R_1, R_2 \in O(m)$.
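The invariance condition of Proposition 3 can be tested empirically by applying random elements of $O(m)$ to both arguments. The following sketch assumes NumPy; `is_representation_invariant` is a hypothetical helper. It contrasts the projection distance of (7), written as $m - \|Y_1^T Y_2\|_F^2$, with the plain Frobenius distance between the matrices, which depends on the chosen bases.

```python
import numpy as np

def projection_distance_sq(Y1, Y2):
    """Squared projection distance, m - ||Y1^T Y2||_F^2, as in (7)."""
    return Y1.shape[1] - np.linalg.norm(Y1.T @ Y2, "fro") ** 2

def frobenius_distance_sq(Y1, Y2):
    """Squared Frobenius distance between the representing matrices themselves."""
    return np.linalg.norm(Y1 - Y2, "fro") ** 2

def is_representation_invariant(delta, Y1, Y2, rng, trials=20, tol=1e-10):
    """Check delta(Y1, Y2) == delta(Y1 R1, Y2 R2) for random R1, R2 in O(m)."""
    base = delta(Y1, Y2)
    m = Y1.shape[1]
    for _ in range(trials):
        R1, _ = np.linalg.qr(rng.standard_normal((m, m)))
        R2, _ = np.linalg.qr(rng.standard_normal((m, m)))
        if abs(delta(Y1 @ R1, Y2 @ R2) - base) > tol:
            return False
    return True

rng = np.random.default_rng(0)
Y1, _ = np.linalg.qr(rng.standard_normal((5, 2)))
Y2, _ = np.linalg.qr(rng.standard_normal((5, 2)))

proj_invariant = is_representation_invariant(projection_distance_sq, Y1, Y2, rng)
frob_invariant = is_representation_invariant(frobenius_distance_sq, Y1, Y2, rng)
```

The projection distance should pass, because $\|R_1^T Y_1^T Y_2 R_2\|_F = \|Y_1^T Y_2\|_F$ for orthogonal $R_1$, $R_2$, whereas the Frobenius distance between the matrices should fail and is therefore not a Grassmann distance in the sense of Proposition 3.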
In the following, several distances resulting from the principal angles for subspaces are provided [20], [23].
1) Projection distance. The projection distance is defined as the 2-norm of the sines of the principal angles,
$$\delta_{Pj}^2(Y_1, Y_2) = \sum_{j=1}^{m} \sin^2(\theta_j) = m - \sum_{j=1}^{m} \cos^2(\theta_j) = m - \|Y_1^T Y_2\|_F^2 = \frac{1}{2} \|Y_1 Y_1^T - Y_2 Y_2^T\|_F^2 \tag{7}$$
where $\|\cdot\|_F$ denotes the matrix Frobenius norm. It is a Grassmann distance and also a metric.
2) Binet-Cauchy distance. The Binet-Cauchy distance is defined through the product of the canonical correlations,
$$\delta_{B}^2(Y_1, Y_2) = 1 - \prod_{j=1}^{m} \cos^2(\theta_j) = 1 - \det(Y_1^T Y_2)^2. \tag{8}$$
It is invariant under different representations, and furthermore is a metric.
3) Max correlation. The max correlation distance derives from the largest canonical correlation (or, equivalently, the smallest principal angle),
$$\delta_{\mathrm{Max}}^2(Y_1, Y_2) = 1 - \cos^2(\theta_1) = \sin^2(\theta_1). \tag{9}$$
It is a Grassmann distance, yet not a metric.
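The three distances above can be computed from the singular values of $Y_1^T Y_2$ and compared against their closed forms. The following is a sketch assuming NumPy and inputs with orthonormal columns; all function names are hypothetical. For the Binet-Cauchy closed form it uses the identity $\prod_j \cos^2\theta_j = \det(Y_1^T Y_2)^2$, which follows from (6) because the determinants of the unitary factors are $\pm 1$.

```python
import numpy as np

def cosines(Y1, Y2):
    """Canonical correlations cos(theta_j), largest first."""
    s = np.linalg.svd(Y1.T @ Y2, compute_uv=False)
    return np.clip(s, 0.0, 1.0)

def projection_sq(Y1, Y2):
    """Sum of squared sines of the principal angles."""
    return np.sum(1.0 - cosines(Y1, Y2) ** 2)

def binet_cauchy_sq(Y1, Y2):
    """One minus the product of squared canonical correlations."""
    return 1.0 - np.prod(cosines(Y1, Y2) ** 2)

def max_correlation_sq(Y1, Y2):
    """One minus the squared largest canonical correlation."""
    return 1.0 - cosines(Y1, Y2)[0] ** 2

rng = np.random.default_rng(0)
Y1, _ = np.linalg.qr(rng.standard_normal((5, 2)))
Y2, _ = np.linalg.qr(rng.standard_normal((5, 2)))

# Closed forms in terms of the matrices themselves.
proj_closed = 0.5 * np.linalg.norm(Y1 @ Y1.T - Y2 @ Y2.T, "fro") ** 2
bc_closed = 1.0 - np.linalg.det(Y1.T @ Y2) ** 2

# Two distinct planes in R^3 that share the x-axis, so theta_1 = 0.
A = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
B = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]])
max_AB = max_correlation_sq(A, B)   # zero although Span(A) != Span(B)
proj_AB = projection_sq(A, B)       # one: the planes differ in one direction
```

The pair `A`, `B` illustrates why the max correlation distance is not a metric: the two planes are different, yet they share a direction, so $\theta_1 = 0$ and the distance vanishes, violating the second axiom of Proposition 2. The projection distance separates the same pair.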