LETTER Communicated by Joshua B. Tenenbaum
Laplacian Eigenmaps for Dimensionality Reduction and Data Representation
Mikhail Belkin
misha@math.uchicago.edu
Department of Mathematics, University of Chicago, Chicago, IL 60637, U.S.A.
Partha Niyogi
niyogi@cs.uchicago.edu
Department of Computer Science and Statistics, University of Chicago, Chicago, IL 60637, U.S.A.
Neural Computation 15, 1373–1396 (2003). © 2003 Massachusetts Institute of Technology
One of the central problems in machine learning and pattern recognition is to develop appropriate representations for complex data. We consider the problem of constructing a representation for data lying on a low-dimensional manifold embedded in a high-dimensional space. Drawing on the correspondence between the graph Laplacian, the Laplace Beltrami operator on the manifold, and the connections to the heat equation, we propose a geometrically motivated algorithm for representing the high-dimensional data. The algorithm provides a computationally efficient approach to nonlinear dimensionality reduction that has locality-preserving properties and a natural connection to clustering. Some potential applications and illustrative examples are discussed.
1 Introduction
In many areas of artificial intelligence, information retrieval, and data mining, one is often confronted with intrinsically low-dimensional data lying in a very high-dimensional space. Consider, for example, gray-scale images of an object taken under fixed lighting conditions with a moving camera. Each such image would typically be represented by a brightness value at each pixel. If there were $n^2$ pixels in all (corresponding to an $n \times n$ image), then each image yields a data point in $\mathbb{R}^{n^2}$. However, the intrinsic dimensionality of the space of all images of the same object is the number of degrees of freedom of the camera. In this case, the space under consideration has the natural structure of a low-dimensional manifold embedded in $\mathbb{R}^{n^2}$.
Recently, there has been some renewed interest (Tenenbaum, de Silva, & Langford, 2000; Roweis & Saul, 2000) in the problem of developing low-dimensional representations when data arise from sampling a probability distribution on a manifold. In this letter, we present a geometrically motivated algorithm and an accompanying framework of analysis for this problem.
The general problem of dimensionality reduction has a long history. Classical approaches include principal components analysis (PCA) and multidimensional scaling. Various methods that generate nonlinear maps have also been considered. Most of them, such as self-organizing maps and other neural network–based approaches (e.g., Haykin, 1999), set up a nonlinear optimization problem whose solution is typically obtained by gradient descent that is guaranteed only to produce a local optimum; global optima are difficult to attain by efficient means. Note, however, that the recent approach of generalizing PCA through kernel-based techniques (Schölkopf, Smola, & Müller, 1998) does not have this shortcoming. Most of these methods do not explicitly consider the structure of the manifold on which the data may possibly reside.
In this letter, we explore an approach that builds a graph incorporating
neighborhood information of the data set. Using the notion of the Laplacian
of the graph, we then compute a low-dimensional representation of the data
set that optimally preserves local neighborhood information in a certain
sense. The representation map generated by the algorithm may be viewed
as a discrete approximation to a continuous map that naturally arises from
the geometry of the manifold.
It is worthwhile to highlight several aspects of the algorithm and the
framework of analysis presented here:
• The core algorithm is very simple. It has a few local computations and one sparse eigenvalue problem. The solution reflects the intrinsic geometric structure of the manifold. It does, however, require a search for neighboring points in a high-dimensional space. We note that there are several efficient approximate techniques for finding nearest neighbors (e.g., Indyk, 2000).
• The justification for the algorithm comes from the role of the Laplace Beltrami operator in providing an optimal embedding for the manifold. The manifold is approximated by the adjacency graph computed from the data points. The Laplace Beltrami operator is approximated by the weighted Laplacian of the adjacency graph with weights chosen appropriately. The key role of the Laplace Beltrami operator in the heat equation enables us to use the heat kernel to choose the weight decay function in a principled manner. Thus, the embedding maps for the data approximate the eigenmaps of the Laplace Beltrami operator, which are maps intrinsically defined on the entire manifold.
• The framework of analysis presented here makes explicit use of these connections to interpret dimensionality-reduction algorithms in a geometric fashion. In addition to the algorithms presented in this letter, we are also able to reinterpret the recently proposed locally linear embedding (LLE) algorithm of Roweis and Saul (2000) within this framework.

The graph Laplacian has been widely used for different clustering and partition problems (Shi & Malik, 1997; Simon, 1991; Ng, Jordan, & Weiss, 2002). Although the connections between the Laplace Beltrami operator and the graph Laplacian are well known to geometers and specialists in spectral graph theory (Chung, 1997; Chung, Grigor'yan, & Yau, 2000), so far we are not aware of any application to dimensionality reduction or data representation. We note, however, recent work on using diffusion kernels on graphs and other discrete structures (Kondor & Lafferty, 2002).
• The locality-preserving character of the Laplacian eigenmap algorithm makes it relatively insensitive to outliers and noise. It is also not prone to short circuiting, as only the local distances are used. We show that by trying to preserve local information in the embedding, the algorithm implicitly emphasizes the natural clusters in the data. Close connections to spectral clustering algorithms developed in learning and computer vision (in particular, the approach of Shi & Malik, 1997) then become very clear. In this sense, dimensionality reduction and clustering are two sides of the same coin, and we explore this connection in some detail. In contrast, global methods like that in Tenenbaum et al. (2000) do not show any tendency to cluster, as an attempt is made to preserve all pairwise geodesic distances between points.

However, not all data sets necessarily have meaningful clusters. Other methods, such as PCA or Isomap, might be more appropriate in that case. We will demonstrate, however, that at least in one example of such a data set (the "swiss roll"), our method produces reasonable results.
• Since much of the discussion of Seung and Lee (2000), Roweis and Saul (2000), and Tenenbaum et al. (2000) is motivated by the role that nonlinear dimensionality reduction may play in human perception and learning, it is worthwhile to consider the implication of the previous remark in this context. The biological perceptual apparatus is confronted with high-dimensional stimuli from which it must recover low-dimensional structure. If the approach to recovering such low-dimensional structure is inherently local (as in the algorithm proposed here), then a natural clustering will emerge and may serve as the basis for the emergence of categories in biological perception.
• Since our approach is based on the intrinsic geometric structure of the manifold, it exhibits stability with respect to the embedding. As long as the embedding is isometric, the representation will not change. In the example with the moving camera, different resolutions of the camera (i.e., different choices of $n$ in the $n \times n$ image grid) should lead to embeddings of the same underlying manifold into spaces of very different dimension. Our algorithm will produce similar representations independent of the resolution.
The generic problem of dimensionality reduction is the following. Given a set $x_1, \ldots, x_k$ of $k$ points in $\mathbb{R}^l$, find a set of points $y_1, \ldots, y_k$ in $\mathbb{R}^m$ ($m \ll l$) such that $y_i$ "represents" $x_i$. In this letter, we consider the special case where $x_1, \ldots, x_k \in \mathcal{M}$ and $\mathcal{M}$ is a manifold embedded in $\mathbb{R}^l$.

We now consider an algorithm to construct representative $y_i$'s for this special case. The sense in which such a representation is optimal will become clear later in this letter.
2 The Algorithm
Given $k$ points $x_1, \ldots, x_k$ in $\mathbb{R}^l$, we construct a weighted graph with $k$ nodes, one for each point, and a set of edges connecting neighboring points. The embedding map is now provided by computing the eigenvectors of the graph Laplacian. The algorithmic procedure is formally stated below; an illustrative code sketch follows the three steps.
1. Step 1 (constructing the adjacency graph). We put an edge between nodes $i$ and $j$ if $x_i$ and $x_j$ are "close." There are two variations:

(a) $\epsilon$-neighborhoods (parameter $\epsilon \in \mathbb{R}$). Nodes $i$ and $j$ are connected by an edge if $\|x_i - x_j\|^2 < \epsilon$, where the norm is the usual Euclidean norm in $\mathbb{R}^l$. Advantages: Geometrically motivated, the relationship is naturally symmetric. Disadvantages: Often leads to graphs with several connected components; difficult to choose $\epsilon$.

(b) $n$ nearest neighbors (parameter $n \in \mathbb{N}$). Nodes $i$ and $j$ are connected by an edge if $i$ is among the $n$ nearest neighbors of $j$ or $j$ is among the $n$ nearest neighbors of $i$. Note that this relation is symmetric. Advantages: Easier to choose; does not tend to lead to disconnected graphs. Disadvantages: Less geometrically intuitive.
2. Step 2 (choosing the weights).¹ Here as well, we have two variations for weighting the edges:

(a) Heat kernel (parameter $t \in \mathbb{R}$). If nodes $i$ and $j$ are connected, put

$$W_{ij} = e^{-\frac{\|x_i - x_j\|^2}{t}};$$

otherwise, put $W_{ij} = 0$. The justification for this choice of weights will be provided later.
(b) Simple-minded (no parameters; $t = \infty$). $W_{ij} = 1$ if vertices $i$ and $j$ are connected by an edge, and $W_{ij} = 0$ otherwise. This simplification avoids the need to choose $t$.

¹ In a computer implementation of the algorithm, steps 1 and 2 are executed simultaneously.
3. Step 3 (eigenmaps). Assume the graph $G$, constructed above, is connected; otherwise, proceed with step 3 for each connected component. Compute eigenvalues and eigenvectors for the generalized eigenvector problem,

$$Lf = \lambda Df, \qquad (2.1)$$

where $D$ is the diagonal weight matrix whose entries are column (or row, since $W$ is symmetric) sums of $W$, $D_{ii} = \sum_j W_{ji}$, and $L = D - W$ is the Laplacian matrix. The Laplacian is a symmetric, positive semidefinite matrix that can be thought of as an operator on functions defined on the vertices of $G$.

Let $f_0, \ldots, f_{k-1}$ be the solutions of equation 2.1, ordered according to their eigenvalues:

$$Lf_i = \lambda_i Df_i, \quad i = 0, \ldots, k-1, \qquad 0 = \lambda_0 \le \lambda_1 \le \cdots \le \lambda_{k-1}.$$

We leave out the eigenvector $f_0$ corresponding to eigenvalue 0 and use the next $m$ eigenvectors for embedding in $m$-dimensional Euclidean space:

$$x_i \mapsto (f_1(i), \ldots, f_m(i)).$$
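As a concrete illustration of the three steps, the following is a minimal Python sketch, not the authors' implementation. It assumes NumPy, SciPy, and scikit-learn; it uses the $n$-nearest-neighbors variant of step 1 together with the heat-kernel weights of step 2; and the function name and default parameter values are illustrative choices only.

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import eigsh
from sklearn.neighbors import kneighbors_graph

def laplacian_eigenmap(X, m=2, n_neighbors=10, t=1.0):
    """Embed the k rows of X (points in R^l) into R^m."""
    # Step 1: adjacency graph, n-nearest-neighbors variant.
    # kneighbors_graph is directed, so take the elementwise maximum
    # with its transpose: an edge survives if i is among the
    # neighbors of j or j is among the neighbors of i.
    G = kneighbors_graph(X, n_neighbors, mode="distance")
    G = G.maximum(G.T)

    # Step 2: heat-kernel weights W_ij = exp(-||x_i - x_j||^2 / t)
    # on existing edges; W_ij = 0 elsewhere (sparsity is preserved).
    W = G.copy()
    W.data = np.exp(-(W.data ** 2) / t)

    # Step 3: generalized eigenproblem L f = lambda D f, with
    # D_ii = sum_j W_ji and L = D - W. The graph is assumed
    # connected, so D is positive definite.
    d = np.asarray(W.sum(axis=1)).ravel()
    D = diags(d).tocsc()
    L = (D - W).tocsc()

    # Smallest m+1 eigenpairs; drop f_0 (eigenvalue 0, the constant
    # eigenvector) and embed x_i -> (f_1(i), ..., f_m(i)).
    vals, vecs = eigsh(L, k=m + 1, M=D, which="SM")
    order = np.argsort(vals)
    return vecs[:, order[1 : m + 1]]
```

On a sampled "swiss roll," for instance, one might call `laplacian_eigenmap(X, m=2, n_neighbors=10, t=5.0)` and plot the two embedding coordinates; both `n_neighbors` and `t` are data dependent and typically need tuning.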
3 Justification

3.1 Optimal Embeddings. Let us first show that the embedding provided by the Laplacian eigenmap algorithm preserves local information optimally in a certain sense. The following section is based on standard spectral graph theory. (See Chung, 1997, for a comprehensive reference.)
Recall that given a data set, we construct a weighted graph $G = (V, E)$ with edges connecting nearby points to each other. For the purposes of this discussion, assume the graph is connected. Consider the problem of mapping the weighted graph $G$ to a line so that connected points stay as close together as possible. Let $\mathbf{y} = (y_1, y_2, \ldots, y_n)^T$ be such a map. A reasonable criterion for choosing a "good" map is to minimize, under appropriate constraints, the objective function $\sum_{i,j} (y_i - y_j)^2 W_{ij}$, which incurs a heavy penalty if neighboring points $x_i$ and $x_j$ are mapped far apart.
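The reduction of this criterion to the eigenvalue problem of equation 2.1 rests on a standard identity of spectral graph theory, stated here for reference with $D$ and $L$ as defined in step 3: for any map $\mathbf{y}$,

$$
\frac{1}{2} \sum_{i,j} (y_i - y_j)^2 W_{ij}
= \frac{1}{2} \sum_{i,j} \bigl( y_i^2 + y_j^2 - 2 y_i y_j \bigr) W_{ij}
= \mathbf{y}^T D \mathbf{y} - \mathbf{y}^T W \mathbf{y}
= \mathbf{y}^T L \mathbf{y},
$$

so minimizing the objective subject to a scale constraint such as $\mathbf{y}^T D \mathbf{y} = 1$ leads directly back to $Lf = \lambda Df$.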