A Cluster-Weighted Kernel K-Means Method for Multi-View Clustering
Jing Liu,1,2 Fuyuan Cao,1 Xiao-Zhi Gao,3 Liqin Yu,1 Jiye Liang1,∗
1School of Computer and Information Technology, Shanxi University, Taiyuan 030006, P.R. China
2School of Software, Shanxi Agricultural University, Taigu 030801, P.R. China
3School of Computing, University of Eastern Finland, Kuopio 70211, Finland
jingliu_sxu@hotmail.com, cfy@sxu.edu.cn, xiao-zhi.gao@uef.fi, liqinyu_sxu@hotmail.com, ljy@sxu.edu.cn
∗Corresponding author: Jiye Liang. Email: ljy@sxu.edu.cn.
Abstract
Clustering by jointly exploiting information from multiple views can yield better performance than clustering on a single view. Some existing multi-view clustering methods aim to learn a weight for each view to determine its contribution to the final solution. However, the view-weighted scheme can only indicate the overall importance of a view; it fails to recognize the importance of each inner cluster of a view. A view with a higher weight does not guarantee that all clusters in this view are more important than their corresponding clusters in other views. In this paper, we propose a cluster-weighted kernel k-means method for multi-view clustering. Each inner cluster of each view is assigned a weight, which is learned from the intra-cluster similarity of the cluster relative to all its corresponding clusters in the other views, so that the cluster with higher intra-cluster similarity receives a higher weight among the corresponding clusters. The cluster labels are learned simultaneously with the cluster weights in an alternating fashion, by minimizing the weighted sum-of-squared error of kernel k-means. Compared with the view-weighted scheme, the cluster-weighted scheme enhances the interpretability of the clustering results. Experimental results on both synthetic and real data sets demonstrate the effectiveness of the proposed method.
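To make the alternating scheme described above concrete, the following Python sketch implements one plausible reading of it: each (view, cluster) pair gets a weight from the cluster's average intra-cluster kernel similarity, normalized over its corresponding clusters across views, and labels are reassigned to minimize the weighted sum of per-view kernel k-means distances. The weight rule, the normalization, and the function names are illustrative assumptions, not the paper's exact update equations.

```python
import numpy as np

def point_to_cluster_dist(K, labels, k):
    """Kernel k-means distances d(i, c) = K_ii - 2*avg_j K_ij + avg_{j,l} K_jl,
    where j and l range over the points currently assigned to cluster c."""
    n = K.shape[0]
    D = np.full((n, k), np.nan)          # NaN marks currently-empty clusters
    for c in range(k):
        idx = np.flatnonzero(labels == c)
        if idx.size == 0:
            continue
        D[:, c] = (np.diag(K) - 2.0 * K[:, idx].mean(axis=1)
                   + K[np.ix_(idx, idx)].mean())
    return D

def cluster_weighted_kernel_kmeans(kernels, k, n_iter=30, seed=0):
    """Alternate between (1) setting one weight per (view, cluster) from the
    cluster's average intra-cluster kernel similarity, normalized across the
    corresponding clusters of all views, and (2) reassigning labels to
    minimize the weighted sum of per-view kernel k-means distances.
    Illustrative sketch only; the paper derives its own update rules."""
    rng = np.random.default_rng(seed)
    n = kernels[0].shape[0]
    labels = rng.integers(0, k, size=n)
    weights = np.full((len(kernels), k), 1.0 / len(kernels))
    for _ in range(n_iter):
        # (1) cluster weights: higher intra-cluster similarity -> higher weight
        sims = np.zeros((len(kernels), k))
        for v, K in enumerate(kernels):
            for c in range(k):
                idx = np.flatnonzero(labels == c)
                if idx.size:
                    sims[v, c] = K[np.ix_(idx, idx)].mean()
        weights = sims / np.maximum(sims.sum(axis=0, keepdims=True), 1e-12)
        # (2) labels: minimize the cluster-weighted sum of distances over views
        D = sum(weights[v] * point_to_cluster_dist(K, labels, k)
                for v, K in enumerate(kernels))
        D = np.nan_to_num(D, nan=np.inf)  # never assign to an empty cluster
        new_labels = D.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, weights
```

The input is one precomputed kernel matrix per view (e.g., RBF kernels over each view's features); the returned weight matrix makes the per-cluster importance of each view directly readable, which is the interpretability advantage the abstract refers to.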
1 Introduction
Multi-view data widely exist in real-world applications, where the same set of instances is represented by multiple distinct feature sets from different perspectives. For example, images can be described by different visual descriptors, documents may be translated into various languages, and patients are diagnosed by several types of medical examinations. These heterogeneous views usually contain consistent as well as complementary information, which can be exploited simultaneously to obtain better performance than learning from one single view.
Multi-view clustering has gained much attention in recent years (Chao, Sun, and Bi 2017). It assumes that different views share a common clustering partition, which means that the corresponding instances in different views belong to
the same cluster. Simply concatenating the features from different views into a single representation and clustering it with a traditional algorithm often results in poor performance, since this ignores the heterogeneity of the different feature spaces and may suffer from the curse of dimensionality. Most existing multi-view algorithms instead obtain a common clustering partition by jointly exploiting the information of multiple views without breaking the inherent structure of each view. The general idea of these algorithms is to guarantee the consistency among different views through common cluster discrimination information, which can be expressed as a common eigenvector matrix for multi-view spectral clustering (Kumar and Daumé 2011; Kumar, Rai, and Daumé 2011; Li et al. 2015; Nie, Li, and Li 2016), a common coefficient matrix for multi-view subspace clustering (Yin et al. 2015; Gao et al. 2015; Wang et al. 2016), and a common indicator matrix for multi-view nonnegative matrix factorization clustering (Akata, Thurau, and Bauckhage 2011; Liu et al. 2013; Qian et al. 2016) and multi-view k-type clustering (Tzortzis and Likas 2012; Cai, Nie, and Huang 2013; Xu, Wang, and Lai 2016).
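A toy illustration of the concatenation failure mode described above follows; the data, scales, and clustering setup here are assumptions for demonstration only, not from the paper. A noisy view with a much larger feature scale dominates the Euclidean metric of the stacked representation.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(0)
y = np.repeat([0, 1], 50)                                  # ground-truth labels
view1 = y[:, None] * 5.0 + rng.normal(size=(100, 2))       # informative, small scale
view2 = rng.normal(scale=100.0, size=(100, 20))            # noisy, large scale

# Naive baseline: concatenate heterogeneous views and run plain k-means.
X = np.hstack([view1, view2])
pred = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print("ARI, concatenated views:", adjusted_rand_score(y, pred))

# Clustering the informative view alone recovers the partition almost exactly,
# showing that the noisy high-variance view swamped the concatenation above.
pred1 = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(view1)
print("ARI, informative view only:", adjusted_rand_score(y, pred1))
```

The concatenated run typically scores an ARI near 0 while the single informative view scores near 1, which is why multi-view methods treat each view's feature space separately.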
In some cases, low-quality views (views with high clustering loss under the common clustering partition) may degrade the performance when all available views are used equally. To determine the contribution of different views to the final clustering, many view-weighted methods that learn a weight for each view have been proposed. Some methods (Tzortzis and Likas 2012; Xia et al. 2010; Li et al. 2015) multiply each view by a weight factor, where the distribution of the weights is controlled by an extra hyper-parameter. Other methods (Nie, Li, and Li 2016; 2017; Huang, Kang, and Xu 2018) use a self-weighted scheme to automatically learn a weight for each view without introducing any extra hyper-parameter. Xu, Wang, and Lai (2016) proposed a method to jointly learn the view weights as well as the feature weights for high-dimensional feature selection. Xu, Tao, and Xu (2015) proposed a self-paced smoothed weighting scheme that dynamically assigns weights to views during the clustering process, gradually training from 'easy' to 'complex' views. In general, most existing view-weighted methods determine the weight of each view according to its clustering loss, so that views with lower clustering loss receive higher weights.
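As a minimal sketch of this generic loss-driven scheme, the inverse square-root rule below is one concrete instance in the spirit of the self-weighted methods cited above (it is the closed-form weight that arises when minimizing the sum of square roots of per-view losses); the cited papers differ in their exact objectives, so treat this as an assumption for illustration.

```python
import numpy as np

def view_weights_from_losses(losses):
    """Given the current clustering loss of each view, return one weight per
    view: lower loss -> higher weight, with no extra hyper-parameter."""
    losses = np.asarray(losses, dtype=float)
    w = 1.0 / (2.0 * np.sqrt(losses + 1e-12))   # self-weighted update rule
    return w / w.sum()                          # normalized for readability

# The view with the smallest loss receives the largest weight.
print(view_weights_from_losses([0.2, 0.8, 3.0]))
```

Such a weight applies uniformly to every cluster of a view, which is exactly the limitation the cluster-weighted scheme of this paper is designed to remove.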