Image clustering by hyper-graph regularized non-negative matrix factorization
Kun Zeng a, Jun Yu b,c,*, Cuihua Li a, Jane You c, Taisong Jin a

a Computer Science Department, School of Information Science and Engineering, Xiamen University, Xiamen, China
b School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China
c Department of Computing, The Hong Kong Polytechnic University, Kowloon, Hong Kong

* Corresponding author. E-mail address: zju.yujun@gmail.com (J. Yu).
Article info

Article history:
Received 14 September 2013
Received in revised form 30 November 2013
Accepted 26 January 2014
Communicated by Mingli Song
Available online 17 February 2014
Keywords:
Non-negative matrix factorization
Hyper-graph Laplacian
Image clustering
Dimension reduction
Manifold regularization
Abstract
Image clustering is a critical step in applications such as content-based image retrieval, image annotation and other high-level image processing. To achieve these tasks, it is essential to obtain a proper representation of the images. Non-negative Matrix Factorization (NMF) learns a part-based representation of the data, which is in accordance with how the brain recognizes objects. Owing to this psychological and physiological interpretation, NMF has been successfully applied to a wide range of applications such as pattern recognition, image processing and computer vision. On the other hand, manifold learning methods discover the intrinsic geometrical structure of the high-dimensional data space. Incorporating a manifold regularizer into the standard NMF framework leads to improved performance. In this paper, we propose a novel algorithm, called Hyper-graph regularized Non-negative Matrix Factorization (HNMF), for this purpose. HNMF captures the intrinsic geometrical structure by constructing a hyper-graph instead of a simple graph. The hyper-graph model considers high-order relationships among samples and outperforms the simple graph model. Empirical experiments demonstrate the effectiveness of the proposed algorithm in comparison to state-of-the-art algorithms, especially related works based on NMF.
© 2014 Elsevier B.V. All rights reserved.
1. Introduction
In order to improve the performance of applications such as image retrieval, image annotation and image indexing, image clustering is used to better represent and browse images. Finding a suitable data representation is a critical problem in many data analysis tasks [1,2,5–11]. Researchers have long sought appropriate data representations that make the intrinsic structure of the data explicit, so that further processing, such as clustering [5,6,10,11,40,43] and classification [11,39,41,42,44], can be applied. One of the most popular data representation techniques is matrix factorization. There are a number of matrix factorization methods, including Singular Value Decomposition (SVD) and Non-negative Matrix Factorization (NMF) [5].
NMF learns a part-based representation of the data, which is in accordance with how the brain recognizes objects [12–14], and it has attracted much attention since being popularized by Lee and Seung [5] through their simple and effective procedure. NMF seeks two non-negative matrices whose product best approximates the original matrix. The non-negativity constraint leads NMF to a part-based and sparse representation of the object, because it allows only additive combinations of the original data.
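As an illustrative sketch (not part of the original paper), the standard NMF objective min_{W,H>=0} ||X - WH||_F^2 can be minimized with Lee and Seung's multiplicative update rules [5]; the rank k, iteration count and random initialization below are assumed, arbitrary choices.

```python
import numpy as np

def nmf(X, k, n_iter=200, eps=1e-9):
    """Basic NMF via Lee-Seung multiplicative updates for min ||X - W H||_F^2, W, H >= 0."""
    m, n = X.shape
    rng = np.random.default_rng(0)
    W = rng.random((m, k)) + eps                 # non-negative random initialization
    H = rng.random((k, n)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)     # update the encoding matrix
        W *= (X @ H.T) / (W @ H @ H.T + eps)     # update the basis matrix
    return W, H

# Hypothetical usage: assign each column (sample) to the cluster of its largest encoding entry.
X = np.abs(np.random.randn(100, 50))             # 100-dimensional features, 50 samples
W, H = nmf(X, k=5)
labels = H.argmax(axis=0)
```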
Manifold learning methods such as ISOMAP [15], Locally Linear Embedding (LLE) [16], Laplacian Eigenmap (LE) [17], Local Tangent Space Alignment (LTSA) [37], Discriminative Locality Alignment (DLA) [36], etc. are proposed to detect the underlying intrinsic structure of data according to the local invariance assumption that nearby points in the original space are likely to have similar embeddings. ISOMAP [15] preserves the global geodesic distances between all pairs of measurements. LLE [16] assumes that each sample can be represented by a linear combination of its neighbors and that the relationship between the sample and its neighbors is preserved after dimensionality reduction. LE [17] assumes that the representations of two nearby data points in the intrinsic structure are also close to each other in the embedding space, and it preserves proximity relationships by constructing an undirected weighted graph that encodes the neighbor relations of pairwise measurements. LTSA [37] exploits local tangent information and aligns it to provide a global coordinate system. DLA [36] imposes discriminative information in the part optimization stage and preserves discriminative ability. High performance can be achieved if the geometrical structure is exploited and local invariance is considered. Laplacian regularization [6] and Hessian regularization [33–35] are usually used to discover the geometrical structure.
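As a concrete sketch (not drawn from the paper), the unnormalized graph Laplacian behind such regularizers can be assembled from a k-nearest-neighbor affinity graph with a heat kernel; the neighborhood size k and kernel width sigma below are assumed, illustrative values.

```python
import numpy as np

def knn_graph_laplacian(X, k=5, sigma=1.0):
    """Build a symmetric k-NN heat-kernel affinity matrix W and return L = D - W."""
    n = X.shape[0]                                          # X: one sample per row
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)     # pairwise squared distances
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:k + 1]                   # k nearest neighbors, excluding i
        W[i, nbrs] = np.exp(-d2[i, nbrs] / (2 * sigma ** 2))
    W = np.maximum(W, W.T)                                  # symmetrize the affinity graph
    D = np.diag(W.sum(axis=1))                              # degree matrix
    return D - W                                            # unnormalized graph Laplacian
```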
In [34,35], Hessian regularization is presented for image annotation. A Hessian regularized support vector machine is presented in [34] for image annotation on the cloud and shows its effectiveness for large-scale image annotation. Multi-view Hessian regularization (mHR) [35] optimally combines multiple