NMF-Based Multi-View Unsupervised Clustering: Fusing Consistency and Comparability
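"Multi-View Clustering via Joint Nonnegative Matrix Factorization" is a research paper that proposes a novel multi-view clustering method for the many real-world datasets that come in multiple representations, or views. In such datasets the different views usually provide complementary information. The goal of multi-view clustering is to integrate that information while producing consistent clustering results across views, revealing the latent structure the views share.

The core technique is nonnegative matrix factorization (NMF). Standard NMF factors a data matrix into two nonnegative factors. For the multi-view case, the authors design a joint factorization with a constraint that pushes each view's clustering solution toward a common consensus instead of fixing it directly. This lets the algorithm adapt to multi-view data and improves the robustness and consistency of the clustering.

A key challenge is keeping the clustering solutions of different views both meaningful and comparable. To address it, the authors draw on the connection between NMF and Probabilistic Latent Semantic Analysis (PLSA) to develop a normalization strategy. The strategy balances the contribution of each view so that the clustering results are consistent across views, which makes them easier to interpret.

The paper validates the method with experiments on several datasets. According to the reported results, the algorithm improves clustering accuracy over traditional single-view methods on complex multi-view data and uncovers relationships in the data that single views miss. The paper thus offers a useful tool for exploiting multi-source information in unsupervised learning.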
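The consensus idea described above can be written as a worked objective. The form below is only an illustrative sketch, not the paper's exact formulation, which this excerpt does not reproduce. All symbols are assumptions introduced here: $X^{(v)}$ is the nonnegative data matrix of view $v$, $U^{(v)}$ and $V^{(v)}$ are its nonnegative basis and coefficient factors, $V^{*}$ is the shared consensus matrix, $n_v$ is the number of views, and $\lambda_v \ge 0$ weights the pull of view $v$ toward the consensus.

```latex
\min_{\{U^{(v)}\},\, \{V^{(v)}\},\, V^{*} \,\ge\, 0}
\;\sum_{v=1}^{n_v} \left\| X^{(v)} - U^{(v)} \bigl(V^{(v)}\bigr)^{\top} \right\|_F^{2}
\;+\; \sum_{v=1}^{n_v} \lambda_v \left\| V^{(v)} - V^{*} \right\|_F^{2}
```

The first sum is the usual per-view NMF reconstruction error. The second is a soft penalty: each $V^{(v)}$ may differ from $V^{*}$ but pays for doing so, which is one way to read "pushing toward a consensus instead of fixing it directly". Without some normalization the factors $U^{(v)}$ and $V^{(v)}$ can be rescaled against each other, so the $V^{(v)}$ of different views are not directly comparable. That is the difficulty the normalization strategy is meant to address, and it is deliberately left out of this sketch.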
Multi-View Clustering via Joint Nonnegative Matrix Factorization
Jialu Liu¹, Chi Wang¹, Jing Gao², and Jiawei Han¹
¹University of Illinois at Urbana-Champaign
²University at Buffalo
Abstract
Many real-world datasets consist of different representations or views which often provide information complementary to each other. To integrate information from multiple views in the unsupervised setting, multi-view clustering algorithms have been developed to cluster multiple views simultaneously to derive a solution which uncovers the common latent structure shared by multiple views. In this paper, we propose a novel NMF-based multi-view clustering algorithm by searching for a factorization that gives compatible clustering solutions across multiple views. The key idea is to formulate a joint matrix factorization process with the constraint that pushes the clustering solution of each view towards a common consensus instead of fixing it directly. The main challenge is how to keep clustering solutions across different views meaningful and comparable. To tackle this challenge, we design a novel and effective normalization strategy inspired by the connection between NMF and PLSA. Experimental results on synthetic and several real datasets demonstrate the effectiveness of our approach.
1 Introduction
Many datasets in the real world naturally comprise different representations or views [5]. For example, the same story can be told in articles from different news sources, one document may be translated into multiple languages, research communities are formed based on research topics as well as co-authorship links, web pages can be classified based on both their content and the anchor text of hyperlinks leading to them, and so on. In these applications, each data set is represented by attributes that can naturally be split into different subsets, any of which suffices for mining knowledge. Since these multiple representations often provide compatible and complementary information, it is natural to integrate them to obtain better performance than relying on a single view. The key to learning from multiple views (multi-view) is to leverage each view’s own knowledge base in order to outperform simply concatenating views.
As unlabeled data are plentiful in real life and increasing quantities of them come in multiple views from diverse sources, the problem of unsupervised learning from multiple views of unlabeled data, referred to as multi-view clustering, has attracted attention [3, 17]. The goal of multi-view clustering is to partition objects into clusters based on multiple representations of the objects. Existing multi-view clustering algorithms can be roughly classified into three categories. Algorithms in the first category [3, 17] incorporate multi-view integration into the clustering process directly through optimizing certain loss functions. In contrast, algorithms in the second category, such as the ones based on Canonical Correlation Analysis [8, 4], first project multi-view data into a common lower dimensional subspace and then apply any clustering algorithm such as k-means to learn the partition. The third category is called late integration or late fusion, in which a clustering solution is derived from each individual view and then all the solutions are fused based on consensus [7, 13].
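To make the third category concrete, here is a minimal sketch of late fusion using a co-association matrix. This is one common consensus scheme and is not claimed to be the method of [7, 13]. The function names, the threshold of 0.5, and the toy labelings are all assumptions chosen for illustration. Two objects are linked when they share a cluster in more than half of the views, and the connected components of those links become the consensus clusters.

```python
def co_association(labelings):
    """Fraction of views in which each pair of objects shares a cluster."""
    n = len(labelings[0])
    counts = [[0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(labelings) for c in row] for row in counts]


def consensus_clusters(labelings, threshold=0.5):
    """Link objects co-clustered in more than `threshold` of the views and
    return the connected components as consensus labels."""
    C = co_association(labelings)
    n = len(C)
    consensus = [-1] * n  # -1 marks "not yet assigned"
    next_label = 0
    for start in range(n):
        if consensus[start] != -1:
            continue
        consensus[start] = next_label
        stack = [start]
        while stack:  # depth-first search over the link graph
            i = stack.pop()
            for j in range(n):
                if consensus[j] == -1 and C[i][j] > threshold:
                    consensus[j] = next_label
                    stack.append(j)
        next_label += 1
    return consensus


# Hypothetical per-view clusterings of six objects.
view_a = [0, 0, 0, 1, 1, 1]
view_b = [1, 1, 1, 0, 0, 0]  # same partition as view_a, label ids swapped
view_c = [0, 0, 1, 1, 1, 1]  # object 2 lands in the other cluster
fused = consensus_clusters([view_a, view_b, view_c])
print(fused)  # [0, 0, 0, 1, 1, 1]
```

Working on pairwise co-membership rather than on raw label ids sidesteps the fact that cluster ids are arbitrary in each view, as view_b shows. The views interact only after each has been clustered independently, which is what distinguishes late fusion from the first two categories.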
In this paper, we propose a new multi-view clustering approach based on a highly effective technique in single-view clustering, i.e., non-negative matrix factorization (NMF) [18]. NMF, which was originally introduced as a dimensionality reduction technique [18], has been shown to be useful in many research areas such as information retrieval [20] and pattern recognition [18]. NMF has received much attention because of its straightforward interpretability for applications, i.e., we can explain each observation as an additive linear combination of nonnegative basis vectors. Recently, NMF has become a popular technique for data clustering, and it is reported to achieve competitive performance compared with most of the state-of-the-art unsupervised algorithms. For example, Xu et al. [20] applied NMF to text clustering and gained superior performance, and Brunet et al. [6] achieved similar success on biological data clustering. Recent studies [9, 11] show that NMF is closely related to Probabilistic Latent Semantic Anal-
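As an illustration of the single-view NMF clustering discussed in this section, the following is a minimal sketch using the standard multiplicative updates for the squared Frobenius loss. It assumes rows of X are objects and columns are features, so that X is approximated by W H, where each row of W holds the nonnegative weights of one object on the k basis vectors in H. Plain Python lists keep it dependency-free. The function names, toy matrix, iteration count, and seed are assumptions, and this is not the multi-view algorithm proposed in the paper.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def squared_error(X, W, H):
    """Squared Frobenius norm of X - W H."""
    R = matmul(W, H)
    return sum((x - r) ** 2 for xr, rr in zip(X, R) for x, r in zip(xr, rr))


def nmf(X, k, iters=300, seed=0, eps=1e-9):
    """Approximate a nonnegative n x m matrix X by W (n x k) times H (k x m)."""
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    W = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n)]
    H = [[rng.uniform(0.1, 1.0) for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T X) / (W^T W H), elementwise
        Wt = transpose(W)
        num, den = matmul(Wt, X), matmul(matmul(Wt, W), H)
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # W <- W * (X H^T) / (W H H^T), elementwise
        Ht = transpose(H)
        num, den = matmul(X, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return W, H


def cluster_labels(W):
    """Assign each object to the basis vector with the largest weight."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in W]


# Toy data: objects 0-2 use features 0-1, objects 3-5 use features 2-3.
X = [[5, 4, 0, 0], [4, 5, 0, 0], [5, 5, 0, 0],
     [0, 0, 3, 4], [0, 0, 4, 3], [0, 0, 4, 4]]
W, H = nmf(X, k=2)
labels = cluster_labels(W)
print(labels)
```

Because the updates only multiply by nonnegative ratios, W and H stay nonnegative throughout, which is what makes the "additive combination of basis vectors" reading possible. Which cluster receives id 0 depends on the random initialization, so only the grouping of the labels is meaningful.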