A Novel Landmark Point Selection Method for L-ISOMAP
Hao Shi, Baoqun Yin, Yizhao Bao and Yingke Lei
Abstract— Isometric feature mapping (ISOMAP) presents
remarkable performance for nonlinear dimensionality reduc-
tion in diversified research domains. Landmark-ISOMAP(L-
ISOMAP) has been proposed to improve the scalability of
ISOMAP by performing the most complicated computations
on a subset of points referred as to landmarks. In this paper,
we present a novel landmark point selection method for L-
ISOMAP. The approach first attempts to find a minimum set
cover of the neighbourhood sets and get the corresponding data
points, referred as to landmark candidate points. After that,
it removes the points which belong to neighbour sets of other
points from the candidate point set and then the remaining can-
didate points are the landmarks. We run several experiments
on synthetic and physical data sets and the experiment results
validate the effectiveness of our proposed method.
I. INTRODUCTION
Nowadays nonlinear dimensionality reduction (NLDR)
has been paid extensive attention, since most emerging
applications, such as chemoinformatics [1], global climate
patterns [2], and face recognition [3] are concerned with
the high-dimensional attributes. Manifold learning is one of
the NLDR techniques[4]. Currently, many manifold learning
methods are available, such as ISOMAP [5], locally linear
embedding (LLE) [6], local tangent space alignment (LTSA)
[7], laplacian eigenmaps (LE) [8], and self-organizing maps
(SOMs) [9]. Among the above methods, ISOMAP a global
manifold learning method. It attempts to recover original
global geometry of observed data drawn from the nonlin-
ear high-dimensional manifold. So far, ISOMAP has been
applied in many different fields, such as stellar spectra [10],
protein interaction prediction [11] and damage localization
[12]. When the number of sample data points increases,
ISOMAP faces two computational bottlenecks: computing
shortest path matrix (SPM) and metric multidimensional
scaling (MDS) eigenvalue calculation [5]. To reduce the
computational complexity, Silva et al. proposed L-ISOMAP
[13]. L-ISOMAP only constructs the shortest path graph
This work is supported in part by the National Natural Science Foun-
dation of China under grant Nos. 61174124,61233003,61272333, in part
by Research Fund for the Doctoral Program of Higher Education of China
under grant No. 20123402110029 and in part by Natural Science Research
Program of the Anhui High Education Bureau of China under grant No.
KJ2012A286.
Hao Shi is with Department of Automation, University
of Science and Technology of China, 230027, Hefei, China
haoshi@mail.ustc.edu.cn
Baoqun Yin is with Department of Automation, University of Science and
Technology of China, 230027, Hefei, China
bqyin@ustc.edu.cn
Yizhao Bao is with Department of Automation, University
of Science and Technology of China, 230027, Hefei, China
zhxf325@mail.ustc.edu.cn
Yingke lei is Electronic Engineering Institute, 230027, Hefei, China
leiyingke@163.com
between each pair of data point and landmarks instead of
between all pairwise data points. In this way, L-ISOMAP
addresses both of these inefficiencies of ISOMAP.
How to designate landmarks for L-ISOMAP is still an
open question since the number and distribution of the
landmark points are uncertain. More landmarks mean more
stability [13]. Poorly distributed landmarks result in poor
embedding result. So far, several landmark selection ap-
proaches have been proposed. In [14], landmarks were
selected using the “minimum spanning tree cut” method,
which tends to choose the set of boundary samples as
landmarks. The “maxmin” method, was proposed to reduce
uncertainty and improve landmarks for classification [15,16].
In [17], landmarks were selected based on mixed-integer
optimization. In our previous work [18], an approach was
developed which is based on Minimum Set Cover problem
which is called Fast-ISOMAP. It tries to find a minimum
set cover of the neighbourhood sets by a greedy heuristic
algorithm and the corresponding points are determined to
landmark points. But the landmark points selected by this
mean may be neighbours of each other. Thus in this paper,
we further optimize the landmark point selection method.
The points whose neighbourhoods constitute the minimum
set cover of the whole neighbourhood graph are referred as to
landmark candidate points. We delete the ones which belong
to neighbour sets of other points from the set of candidate
points and the remaining candidate points are the landmarks.
The paper is organized as follows. Section II briefly
reviews the ISOMAP and L-ISOMAP. Section III gives the
landmark selection method. Experiment results on synthetic
and physical data sets are given in Section IV, in order to
show the performance of our method. Finally, conclusions
and future extensions are discussed in Section V.
II. ISOMAP
AND L-ISOMAP
In this section, we briefly review the algorithm for I-
SOMAP and L-ISOMAP [4,13].
A. ISOMAP
Let X = {x
1
,x
2
, ··· ,x
N
}⊂M⊂R
D
, where M is a
manifold in D-dimensioanl Euclidean space R
D
. ISOMAP
attempts to unravel the d-dimensional embedding of X,
where d D. ISOMAP consists of three steps. First, create
the neighborhood graph G by either using k nearest neighbor
(k-NN) or ε-ball rule. Secondly, compute the SPM,
D
N
by implementing Floyds algorithm or Dijkstras algorithm.
Finally, find the d-dimensional embedding by means of MDS
algorithm. In practice, ISOMAP scales poorly to large data
12th IEEE International Conference on Control & Automation (ICCA)
Kathmandu, Nepal, June 1-3, 2016
978-1-5090-1738-6/16/$31.00 ©2016 IEEE 621