Mixture Self-paced Learning for Multi-view
K-means Clustering
Hong Yu
School of Software
Dalian University of Technology Dalian, China
hongyu@dlut.edu.cn
Yahong Lian
School of Software
Dalian University of Technology Dalian, China
lianyahong1@163.com
Xiujuan Xu
School of Software
Dalian University of Technology Dalian, China
xjxu@dlut.edu.cn
Xiaowei Zhao
School of Software
Dalian University of Technology Dalian, China
xiaowei.zhao@dlut.edu.cn
Abstract—In our daily life, more and more data are characterized by multiple features. In the multi-view setting, clusters estimated from a single view have limitations, and the quality of single-view clustering can be improved by means of multi-view clustering. Self-paced learning simulates the human learning process, gradually incorporating the information of each view into the clustering task from easy to complex. In this paper, we first propose a new mixture self-paced learning regularizer. To demonstrate the effectiveness of the regularizer, we combine it with robust multi-view k-means clustering and propose a new self-paced learning based multi-view k-means (SPLMKM) clustering method. As a non-trivial contribution, we present a solution based on an alternating minimization strategy. Comparative experiments reveal the benefit of our proposed method.
Index Terms—multi-view clustering; self-paced learning; k-
means
I. INTRODUCTION
Due to the rapid development of data acquisition technology, massive amounts of data are generated in many domains, and most of these data are characterized by multiple features. For example, a web page can be described by its text or by the pictures included in the page; multi-lingual corpora represent the same article in different languages; and in personal identification scenarios, a person can be recognized by a facial picture, a fingerprint or a signature. In the multi-view setting, clusters estimated from a single view have limitations, and the quality of single-view clustering can be improved by means of multi-view clustering. Multi-view clustering [1], [2] combines information from multiple views to boost clustering performance, and in recent years it has attracted a great deal of research [3], [4].
K-means is a common and widely-used method with many advantages. It has been prevalent in the unsupervised learning domain because of its mathematical simplicity and ease of implementation. Extending it to the multi-view scenario, Bickel et al. [1] propose a multi-view spherical k-means clustering. In [5], the authors show that non-negative matrix factorization is equivalent to relaxed k-means, which uses the Frobenius norm to measure the reconstruction error; as a consequence, k-means is sensitive to noises and outliers. The l2,1 norm, a sparsity-inducing norm that combines the l1 norm and the l2 norm, yields more robust results when used to measure the k-means reconstruction error [6], [7]. It can also be used to impose structured sparsity on the learned weight matrix and boost multi-view clustering performance [8], [9].
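To make the robustness argument concrete, the two objectives can be contrasted as follows, under the NMF-style relaxation of [5]. The notation here (X for the data matrix, F for the centroids, G for the cluster indicators) is assumed for illustration and need not match the notation used later in this paper:

```latex
% Standard relaxed k-means: squared Frobenius norm, so a single
% outlier column contributes its *squared* distance to the loss.
\min_{F,G}\; \|X - FG^{\top}\|_F^2
  = \sum_{i=1}^{n} \|x_i - F g_i\|_2^2

% Robust variant: the l_{2,1} norm sums the *unsquared* column-wise
% l_2 distances, so outliers are penalized linearly rather than
% quadratically and distort the centroids far less.
\min_{F,G}\; \|X - FG^{\top}\|_{2,1}
  = \sum_{i=1}^{n} \|x_i - F g_i\|_2
```

Because each data point enters the l2,1 objective through its unsquared residual norm, points far from every centroid exert bounded leverage on the solution, which is the intuition behind the robust k-means variants of [6], [7].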
Most off-the-shelf multi-view k-means clustering methods have to solve non-convex objective functions. This deficiency often makes them get stuck in local minima, especially under the interference of noises and outliers. Self-paced learning [10] simulates the process of human learning: 'easy' samples are chosen first to train a model, and 'complex' samples are then gradually faded into the learning process (Figure 1 gives a brief example of self-paced learning). Several multi-view clustering methods that incorporate the self-paced learning scheme have been shown to help avoid bad local minima and to improve the final results [11]–[14].
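The easy-to-complex idea can be sketched with the classic hard-weighting rule of self-paced learning [10]: a sample (or view) participates in the current round of training only if its loss falls below a pace threshold, which is relaxed as training proceeds. This is a minimal illustration of the standard SPL scheme, not the mixture regularizer proposed in this paper; the function name and toy losses are our own:

```python
import numpy as np

def spl_weights(losses, lam):
    """Hard self-paced weights (Kumar et al. [10]).

    A sample is selected (weight 1) if its loss is below the pace
    threshold 1/lam; otherwise it is excluded (weight 0). Decreasing
    lam over iterations raises the threshold, admitting harder samples.
    """
    return (losses < 1.0 / lam).astype(float)

# Toy per-sample losses: small loss = 'easy' sample.
losses = np.array([0.1, 0.3, 0.9, 1.5, 2.0])

# Early pace: threshold 1/2.0 = 0.5, only the two easiest samples train.
print(spl_weights(losses, lam=2.0))  # [1. 1. 0. 0. 0.]

# Later pace: threshold 1/0.6 ≈ 1.67, harder samples are faded in.
print(spl_weights(losses, lam=0.6))  # [1. 1. 1. 1. 0.]
```

In a full SPL algorithm this weight update alternates with a model update (here, the k-means centroid step) restricted to the currently selected samples, which is exactly the alternating structure exploited in Section on optimization.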
Fig. 1. An example of the self-paced learning process. For the clustering task, it is instructive to note that the greater the variance of the clusters, the more complex the clustering (thus, in our example, view 3 is the most complex view and view 1 is the easiest). Initially, the three views have equal weights. Then the weights of view 1 and view 2 are increased and the weight of view 3 is decreased (learn from the easy views first). In the third group of pictures, the weight of view 2, which is relatively 'easier' than view 3, is increased. Through this process, self-paced learning helps the model learn from the 'easy' views first and then gradually include the other views in the learning task.
2017 IEEE 15th Intl Conf on Dependable, Autonomic and Secure Computing, 15th Intl Conf on Pervasive Intelligence
and Computing, 3rd Intl Conf on Big Data Intelligence and Computing and Cyber Science and Technology Congress
978-1-5386-1956-8/17 $31.00 © 2017 IEEE
DOI 10.1109/DASC-PICom-DataCom-CyberSciTec.2017.193