A Cluster-Weighted Kernel K-Means Method for Multi-View Clustering
Jing Liu,1,2 Fuyuan Cao,1 Xiao-Zhi Gao,3 Liqin Yu,1 Jiye Liang1,∗
1School of Computer and Information Technology, Shanxi University, Taiyuan 030006, P.R. China
2School of Software, Shanxi Agricultural University, Taigu 030801, P.R. China
3School of Computing, University of Eastern Finland, Kuopio 70211, Finland
jingliu_sxu@hotmail.com, cfy@sxu.edu.cn, xiao-zhi.gao@uef.fi, liqinyu_sxu@hotmail.com, ljy@sxu.edu.cn
∗Corresponding author: Jiye Liang. Email: ljy@sxu.edu.cn.
Abstract
Clustering by jointly exploiting information from multiple views can yield better performance than clustering on a single view. Some existing multi-view clustering methods aim to learn a weight for each view to determine its contribution to the final solution. However, the view-weighted scheme can only indicate the overall importance of a view; it fails to recognize the importance of each inner cluster of a view. A view with a higher weight does not guarantee that all clusters in this view are more important than their corresponding clusters in other views. In this paper, we propose a cluster-weighted kernel k-means method for multi-view clustering. Each inner cluster of each view is assigned a weight, which is learned from the intra-cluster similarity of the cluster relative to all its corresponding clusters in the other views, so that the cluster with higher intra-cluster similarity receives a higher weight among the corresponding clusters. The cluster labels are learned simultaneously with the cluster weights in an alternating fashion, by minimizing the weighted sum-of-squared error of kernel k-means. Compared with the view-weighted scheme, the cluster-weighted scheme enhances the interpretability of the clustering results. Experimental results on both synthetic and real data sets demonstrate the effectiveness of the proposed method.
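To make the alternating scheme described above concrete, the following Python sketch implements one plausible reading of it: each (view, cluster) pair gets a weight from the cluster's average intra-cluster kernel similarity, normalized over its corresponding clusters across views, and labels are reassigned to minimize the weighted sum of per-view kernel k-means distances. The weight rule, the normalization, and the function names are illustrative assumptions, not the paper's exact update equations.

```python
import numpy as np

def point_to_cluster_dist(K, labels, k):
    """Kernel k-means distances d(i, c) = K_ii - 2*avg_j K_ij + avg_{j,l} K_jl,
    where j and l range over the points currently assigned to cluster c."""
    n = K.shape[0]
    D = np.full((n, k), np.nan)          # NaN marks currently-empty clusters
    for c in range(k):
        idx = np.flatnonzero(labels == c)
        if idx.size == 0:
            continue
        D[:, c] = (np.diag(K) - 2.0 * K[:, idx].mean(axis=1)
                   + K[np.ix_(idx, idx)].mean())
    return D

def cluster_weighted_kernel_kmeans(kernels, k, n_iter=30, seed=0):
    """Alternate between (1) setting one weight per (view, cluster) from the
    cluster's average intra-cluster kernel similarity, normalized across the
    corresponding clusters of all views, and (2) reassigning labels to
    minimize the weighted sum of per-view kernel k-means distances.
    Illustrative sketch only; the paper derives its own update rules."""
    rng = np.random.default_rng(seed)
    n = kernels[0].shape[0]
    labels = rng.integers(0, k, size=n)
    weights = np.full((len(kernels), k), 1.0 / len(kernels))
    for _ in range(n_iter):
        # (1) cluster weights: higher intra-cluster similarity -> higher weight
        sims = np.zeros((len(kernels), k))
        for v, K in enumerate(kernels):
            for c in range(k):
                idx = np.flatnonzero(labels == c)
                if idx.size:
                    sims[v, c] = K[np.ix_(idx, idx)].mean()
        weights = sims / np.maximum(sims.sum(axis=0, keepdims=True), 1e-12)
        # (2) labels: minimize the cluster-weighted sum of distances over views
        D = sum(weights[v] * point_to_cluster_dist(K, labels, k)
                for v, K in enumerate(kernels))
        D = np.nan_to_num(D, nan=np.inf)  # never assign to an empty cluster
        new_labels = D.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, weights
```

The input is one precomputed kernel matrix per view (e.g., RBF kernels over each view's features); the returned weight matrix makes the per-cluster importance of each view directly readable, which is the interpretability advantage the abstract refers to.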
1 Introduction
Multi-view data widely exist in real-world applications, where the same set of instances is represented by multiple distinct feature sets from different perspectives. For example, images can be described by different visual descriptors, documents may be translated into various languages, and patients are diagnosed by several types of medical examinations. These heterogeneous views usually contain consistent as well as complementary information, which can be exploited simultaneously to obtain better performance than learning from one single view.
Multi-view clustering has gained much attention in recent years (Chao, Sun, and Bi 2017). It assumes that different views share a common clustering partition, which means that the corresponding instances in different views belong to
the same cluster. Simply concatenating the features from different views into a single representation and clustering it with a traditional algorithm often results in poor performance, since this ignores the heterogeneity of the different feature spaces and may suffer from the curse of dimensionality. Most existing multi-view algorithms instead obtain a common clustering partition by jointly exploiting the information of multiple views without breaking the inherent structure of each view. The general idea of these algorithms is to guarantee the consistency among different views through common cluster discrimination information, which can be expressed as a common eigenvector matrix for multi-view spectral clustering (Kumar and Daumé 2011; Kumar, Rai, and Daumé 2011; Li et al. 2015; Nie, Li, and Li 2016), a common coefficient matrix for multi-view subspace clustering (Yin et al. 2015; Gao et al. 2015; Wang et al. 2016), and a common indicator matrix for multi-view nonnegative matrix factorization clustering (Akata, Thurau, and Bauckhage 2011; Liu et al. 2013; Qian et al. 2016) and multi-view k-type clustering (Tzortzis and Likas 2012; Cai, Nie, and Huang 2013; Xu, Wang, and Lai 2016).
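A toy illustration of the concatenation failure mode described above follows; the data, scales, and clustering setup here are assumptions for demonstration only, not from the paper. A noisy view with a much larger feature scale dominates the Euclidean metric of the stacked representation.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(0)
y = np.repeat([0, 1], 50)                                  # ground-truth labels
view1 = y[:, None] * 5.0 + rng.normal(size=(100, 2))       # informative, small scale
view2 = rng.normal(scale=100.0, size=(100, 20))            # noisy, large scale

# Naive baseline: concatenate heterogeneous views and run plain k-means.
X = np.hstack([view1, view2])
pred = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print("ARI, concatenated views:", adjusted_rand_score(y, pred))

# Clustering the informative view alone recovers the partition almost exactly,
# showing that the noisy high-variance view swamped the concatenation above.
pred1 = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(view1)
print("ARI, informative view only:", adjusted_rand_score(y, pred1))
```

The concatenated run typically scores an ARI near 0 while the single informative view scores near 1, which is why multi-view methods treat each view's feature space separately.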
In some cases, low-quality views (views with high clustering loss under the common clustering partition) may degrade the performance when all available views are used equally. To determine the contribution of different views to the final clustering, many view-weighted methods that learn a weight for each view have been proposed. Some methods (Tzortzis and Likas 2012; Xia et al. 2010; Li et al. 2015) multiply each view by a weight factor, where the distribution of the weights is controlled by an extra hyper-parameter. Other methods (Nie, Li, and Li 2016; 2017; Huang, Kang, and Xu 2018) use a self-weighted scheme to automatically learn a weight for each view without introducing any extra hyper-parameter. Xu, Wang, and Lai (2016) proposed a method to jointly learn the view weights as well as the feature weights for high-dimensional feature selection. Xu, Tao, and Xu (2015) proposed a self-paced smoothed weighting scheme that dynamically assigns weights to views during the clustering process, gradually training from 'easy' to 'complex' views. In general, most existing view-weighted methods determine the weight of each view according to its clustering loss, so that views with lower clustering loss receive higher weights.
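As a minimal sketch of this generic loss-driven scheme, the inverse square-root rule below is one concrete instance in the spirit of the self-weighted methods cited above (it is the closed-form weight that arises when minimizing the sum of square roots of per-view losses); the cited papers differ in their exact objectives, so treat this as an assumption for illustration.

```python
import numpy as np

def view_weights_from_losses(losses):
    """Given the current clustering loss of each view, return one weight per
    view: lower loss -> higher weight, with no extra hyper-parameter."""
    losses = np.asarray(losses, dtype=float)
    w = 1.0 / (2.0 * np.sqrt(losses + 1e-12))   # self-weighted update rule
    return w / w.sum()                          # normalized for readability

# The view with the smallest loss receives the largest weight.
print(view_weights_from_losses([0.2, 0.8, 3.0]))
```

Such a weight applies uniformly to every cluster of a view, which is exactly the limitation the cluster-weighted scheme of this paper is designed to remove.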