Evolutionary Multi-objective Optimization for
Multi-view Clustering
Bo Jiang∗, Feiyue Qiu∗, Shipin Yang† and Liping Wang‡
∗College of Education Science and Technology, Zhejiang University of Technology, Hangzhou, China 310023
Email: bjiang, qfy@zjut.edu.cn
†College of Electrical Engineering and Control Science, Nanjing Tech University, Nanjing, China 211800
Email: spyang@njtech.edu.cn
‡College of Administration and Management, Zhejiang University of Technology, Hangzhou, China 310023
Email: wlp@zjut.edu.cn
Abstract—In some real-world applications, multiple measuring methods are employed to extract multiple feature groups from the data, yielding multi-view data. The main challenge of multi-view clustering is to find a suitable way of simultaneously exploiting the complementary information of all views, while accounting for the view conflicts arising from the different measures. From the perspective of optimization, previous multi-view clustering studies applied the weighted sum method to represent the degree of conflict and treated clustering as a weighted-sum single-objective optimization problem. In this work, we formulate multi-view clustering as a multi-objective optimization problem, in which each view is regarded as a completely independent feature subset and the clustering objective function of each view is one of the multiple objectives. Five popular multi-objective evolutionary algorithms (MOEAs), i.e., NSGA-II, SPEA2, MOEA/D, SMS-EMOA and NSGA-III, were used to solve the induced multi-objective problem. Six real-world multi-view datasets were used to evaluate the proposed method, and the experimental results showed that SPEA2 significantly outperformed the other MOEAs according to three performance evaluation indices.
I. INTRODUCTION
Multi-view data is common in many real-world and scientific fields in the big data era. For example, in massive open online courses (MOOCs), both students' course registration data (view 1) and online behavioral data (view 2) are used to predict student dropout [1]; in image analysis, each image can be represented by several different visual descriptors, such as RGB color histograms, HSV color histograms and Haralick texture features [2]. Each view captures a distinct perspective of the data. For example, in MOOCs data, students' course registration data describe their demographic information and past academic grades, while online behavioral data record their interaction behaviors during learning, such as posting in forums, viewing lectures, browsing forums and submitting quizzes, which reflect the level of learning engagement. Therefore, it is crucial to integrate these heterogeneous views to generate more accurate and robust clustering results, rather than relying on a single view.
The goal of multi-view clustering is to find clusters that
are consistent across different views. According to how the
multiple views are utilized, existing work in multi-view clus-
tering can be broadly classified into centralized methods and
distributed methods. Algorithms in the first category utilize
all views simultaneously to discover hidden patterns [3], [4],
[5], [6], [7], [8], [9], [10], [11]. In contrast, approaches from
the second category first cluster each view independently and
then combine the individual clustering results to produce a
final partition [12], [13], [14]. Although centralized multi-view clustering methods have gained increasing attention in the past decade due to their good performance, most of them are based on spectral clustering, which requires the heavy computation of kernel construction and eigenvector decomposition, so these methods cannot handle large-scale datasets. Therefore, several multi-view clustering algorithms based on K-means equivalences were proposed to solve large-scale multi-view clustering problems [15], [16], [17]. In these studies, handling disagreement among views is a key issue. Since the multiple views are derived from different types of measurements, they have very different statistical properties and produce different partitions; it is therefore very hard to find a pattern that is completely consistent across all views. In contrast, a more realistic option is to first discriminate among the views and then conduct clustering on the discriminative feature space. A simple yet efficient method is view weighting. Some typical examples of this kind of method include view-weighted nonnegative matrix factorization [7], view-weighted spectral clustering [18], [9], view-weighted K-means [15], [11] and the two-level weighted K-means algorithm [16], [19].
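As an illustration of this family of methods (a generic sketch of the common form, not the exact objective of any one cited algorithm), a view-weighted K-means formulation typically takes the shape

\[
\min_{U,\,Z,\,w}\; \sum_{v=1}^{m} w_v^{\beta} \sum_{i=1}^{n} \sum_{k=1}^{K} u_{ik}\, \bigl\| x_i^{(v)} - z_k^{(v)} \bigr\|^2
\quad \text{s.t.}\;\; \sum_{v=1}^{m} w_v = 1,\;\; w_v \ge 0,
\]

where \(m\) is the number of views, \(u_{ik}\in\{0,1\}\) assigns sample \(i\) to cluster \(k\), \(z_k^{(v)}\) is the centroid of cluster \(k\) in view \(v\), and the exponent \(\beta\) controls how sharply the weights \(w_v\) concentrate on the most consistent views.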
Previous discriminative multi-view clustering methods formulated the problem as a weighted-sum single-objective optimization problem that must solve for the best view weights and the partition simultaneously. The key advantage of this kind of method is that it does not make the restrictive assumption that all views are compatible with each other, so it is usually more robust and flexible. However, from an optimization point of view, the objective function of weighted multi-view clustering is in general very hard to solve efficiently. On one hand, the high feature dimensionality of multi-view datasets makes the optimization problem large-scale in its variables. For example, the famous handwritten digit dataset has 649 dimensions¹ and the 3-Sources news dataset has more than 10⁴ dimensions². On the other hand, and more importantly, the commonly used weighting methods, including fuzzy weighting [9], [18], [15], negative entropy weighting [16], [17] and sparse regularization [11], [20], [21], make the objective function non-convex and non-smooth.
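In contrast, the multi-objective reformulation studied in this work keeps the per-view objectives separate instead of weighting them:

\[
\min_{U}\; F(U) = \bigl( f_1(U; X^{(1)}),\, \dots,\, f_m(U; X^{(m)}) \bigr),
\]

where \(f_v\) is the clustering objective computed on view \(v\) alone and an MOEA searches for Pareto-optimal partitions. The evaluation step an MOEA needs can be sketched as follows (a minimal illustration assuming hard cluster labels and the within-cluster sum-of-squares objective; the function name `per_view_sse` is ours, not from the paper):

```python
import numpy as np

def per_view_sse(labels, views, n_clusters):
    """Return the objective vector of a candidate partition: the
    within-cluster sum of squares computed independently on each view."""
    objectives = []
    for X in views:                      # one feature matrix per view
        sse = 0.0
        for k in range(n_clusters):
            members = X[labels == k]     # samples assigned to cluster k
            if len(members) > 0:
                centroid = members.mean(axis=0)
                sse += ((members - centroid) ** 2).sum()
        objectives.append(float(sse))
    return objectives

# Toy example: 4 samples, 2 views, 2 clusters. The partition is perfect
# in view 1 but leaves residual scatter in view 2, so the two objectives
# disagree -- exactly the conflict an MOEA trades off.
view1 = np.array([[0.0], [0.0], [10.0], [10.0]])
view2 = np.array([[1.0], [3.0], [1.0], [3.0]])
labels = np.array([0, 0, 1, 1])
print(per_view_sse(labels, [view1, view2], 2))  # [0.0, 4.0]
```

An MOEA such as SPEA2 can then use this objective vector directly for Pareto ranking, with no view weights to tune.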
¹http://archive.ics.uci.edu/ml/datasets/
²http://mlg.ucd.ie/datasets/3sources.html
978-1-5090-0623-6/16/$31.00 © 2016 IEEE