Improving Discriminative Spectral Clustering with a Deep Dual Autoencoder

Updated 2024-08-28 · PDF, 1.7 MB

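This page summarizes the paper "Deep Spectral Clustering using Dual Autoencoder Network". As deep learning has become central to computer vision and machine learning, conventional clustering methods struggle to analyze complex datasets efficiently. Deep clustering addresses this by combining embedding and clustering, optimizing the embedding space to improve clustering performance.

The authors propose a joint learning framework built around a dual autoencoder network. An autoencoder is an unsupervised model consisting of an encoder and a decoder: the encoder compresses the input into a low-dimensional latent space, and the decoder reconstructs the input from it while preserving as much of the original information as possible. The dual autoencoder enforces the reconstruction constraint on both the latent representations and their noisy versions, which makes the latent representations more robust to noise and more discriminative. Mutual information estimation is then used to retain more discriminative information from the inputs in the embedding, which helps clustering accuracy. Learned jointly, these components preserve the relationships between inputs while capturing deeper feature dependencies.

Spectral clustering is applied on top of the learned representations. It is a graph-based unsupervised algorithm: it builds a similarity matrix between data points and uses the spectrum of that matrix to uncover the natural structure of the data. The latent representations are embedded into the eigenspace and clustered there, which exploits the relationships between inputs to produce more accurate clusters.

According to the paper, experiments on benchmark datasets show that the method significantly outperforms existing clustering approaches, both on noisy data and on complex datasets. The work thus offers a new technique for deep clustering, with potential use in data preprocessing and high-dimensional data analysis.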
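The classical spectral clustering step described above (similarity matrix, then eigenvector, then partition) can be sketched in a few lines. This is a minimal pure-Python illustration for the two-cluster case only, not the paper's deep variant: the Gaussian affinity, the `sigma` parameter, the power-iteration solver, and all function names are assumptions chosen for the example. In practice a numerical library would compute several eigenvectors and run K-means on them.

```python
import math


def gaussian_affinity(points, sigma=1.0):
    """Similarity matrix W with W[i][j] = exp(-||xi - xj||^2 / (2 sigma^2))."""
    n = len(points)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            W[i][j] = W[j][i] = math.exp(-d2 / (2.0 * sigma ** 2))
    return W


def fiedler_vector(W, iters=200):
    """Eigenvector for the second-smallest eigenvalue of the normalized
    Laplacian L = I - D^{-1/2} W D^{-1/2}, found by power iteration on
    M = 2I - L with the trivial eigenvector D^{1/2} 1 projected out."""
    n = len(W)
    root_deg = [math.sqrt(sum(row)) for row in W]
    norm = math.sqrt(sum(r * r for r in root_deg))
    trivial = [r / norm for r in root_deg]
    v = [math.sin(i + 1.0) for i in range(n)]  # fixed, non-symmetric start
    for _ in range(iters):
        # Remove the component along the trivial eigenvector.
        dot = sum(a * b for a, b in zip(v, trivial))
        v = [a - dot * b for a, b in zip(v, trivial)]
        # Apply M = I + D^{-1/2} W D^{-1/2}.
        s = [v[j] / root_deg[j] for j in range(n)]
        v = [v[i] + sum(W[i][j] * s[j] for j in range(n)) / root_deg[i]
             for i in range(n)]
        length = math.sqrt(sum(a * a for a in v))
        v = [a / length for a in v]
    return v


def spectral_bipartition(points, sigma=1.0):
    """Split points into two clusters by the sign of the Fiedler vector."""
    v = fiedler_vector(gaussian_affinity(points, sigma))
    return [1 if a > 0 else 0 for a in v]


# Two well-separated groups of 2-D points (made-up data).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
labels = spectral_bipartition(points)
print(labels)  # the first three points share one label, the last three the other
```

Which group receives label 0 or 1 depends on the sign of the eigenvector, so only the grouping is meaningful, not the label values.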

Deep Spectral Clustering using Dual Autoencoder Network

Xu Yang, Cheng Deng¹∗, Feng Zheng, Junchi Yan, Wei Liu⁴∗

School of Electronic Engineering, Xidian University, Xian 710071, China

Department of Computer Science and Engineering, Southern University of Science and Technology

Department of CSE, and MoE Key Lab of Artificial Intelligence, Shanghai Jiao Tong University

Tencent AI Lab, Shenzhen, China

{xuyang.xd, chdeng.xd}@gmail.com, zhengf@sustc.edu.cn, yanjunchi@sjtu.edu.cn, wl2223@columbia.edu

Abstract

Clustering methods have recently attracted ever-increasing attention in learning and vision. Deep clustering combines embedding and clustering to obtain an optimal embedding subspace for clustering, which can be more effective than conventional clustering methods. In this paper, we propose a joint learning framework for discriminative embedding and spectral clustering. We first devise a dual autoencoder network, which enforces the reconstruction constraint for the latent representations and their noisy versions, to embed the inputs into a latent space for clustering. As such, the learned latent representations can be more robust to noise. Mutual information estimation is then utilized to provide more discriminative information from the inputs. Furthermore, a deep spectral clustering method is applied to embed the latent representations into the eigenspace and subsequently cluster them, which can fully exploit the relationship between inputs to achieve optimal clustering results. Experimental results on benchmark datasets show that our method can significantly outperform state-of-the-art clustering approaches.
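The reconstruction constraint on latent representations and their noisy versions can be written down in one plausible form. This excerpt does not show the paper's actual objective, so everything here is an assumption made for illustration: the encoder $f$, the decoder $g$, additive Gaussian noise with scale $\sigma$, and the weight $\delta$.

```latex
\begin{aligned}
z &= f(x), \qquad \tilde{z} = z + \epsilon, \quad \epsilon \sim \mathcal{N}(0, \sigma^2 I), \\
\mathcal{L}_{\mathrm{rec}} &= \lVert x - g(z) \rVert_2^2 + \delta \, \lVert x - g(\tilde{z}) \rVert_2^2 .
\end{aligned}
```

Under this reading, the second term penalizes the decoder whenever a small perturbation of the latent code changes the reconstruction, which is how such a constraint would encourage noise-robust latent representations.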

1. Introduction

As an important task in the unsupervised learning [39, 8, 20] and vision communities, clustering has been widely used in image segmentation [33], image categorization [41], and digital media analysis [1]. The goal of clustering is to find a partition that keeps similar data points in the same cluster and dissimilar ones in different clusters. In recent years, many clustering methods have been proposed, such as K-means clustering [24], spectral clustering [27, 42], and non-negative matrix factorization clustering [37], among which K-means and spectral clustering are two well-known conventional algorithms applicable to a wide range of tasks. However, these shallow clustering methods depend on low-level features of the inputs such as raw pixels, SIFT [28], or HOG [7]. Their distance metrics only describe local relationships in the data space and are limited in representing the latent dependencies among the inputs [3].

Figure 1. Visualizing the discriminative embedding capability on MNIST-test with the t-SNE algorithm. (a) Raw data: the space of raw data; (b) ConvAE: data points in the latent subspace of a convolutional autoencoder; (c) Our method: data points in the latent subspace of the proposed autoencoder network. Our method can provide a more discriminative embedding subspace.

∗ Corresponding author.

This paper presents a novel deep-learning-based unsupervised clustering approach. Deep clustering, which integrates the embedding and clustering processes to obtain an optimal embedding subspace for clustering, can be more effective than shallow clustering methods. The main reason is that deep clustering methods can effectively model the distribution of the inputs and capture their non-linear properties, making them more suitable for real-world clustering scenarios.

Recently, many clustering methods have been built on deep generative approaches, such as the autoencoder network [25]. The popularity of the autoencoder network lies in its powerful ability to capture high-dimensional probability distributions of the inputs without supervised information. The encoder model projects the inputs into the latent
