Adaptive Semi-supervised Dimensionality Reduction: Pairwise Constraints and Graph Optimization


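"This research paper examines an adaptive semi-supervised dimensionality reduction method (Adaptive Semi-supervised Dimensionality Reduction, ASSDR) based on pairwise constraint weighting and graph optimization, aimed at classification and analysis tasks on high-dimensional data. The method uses must-link constraints and cannot-link constraints to express class relationships between instances; by adaptively adjusting the weights of these constraints and optimizing the graph structure, it obtains an optimized low-dimensional representation of the data."

As high-dimensional data keeps growing, dimensionality reduction plays an increasingly important role in practical data processing and analysis. Semi-supervised learning sits between unsupervised and supervised learning: it trains a model from a small amount of labeled data (the supervision) together with a large amount of unlabeled data. ASSDR is designed for this setting.

The core of ASSDR is its use of pairwise constraints. These can be domain knowledge supplied by experts, indicating whether a pair of instances belongs to the same class (must-link constraint) or to different classes (cannot-link constraint). Through these constraints the algorithm captures the intrinsic structure and class information of the data, so it can reduce dimensionality even when only part of the data carries supervision.

The key innovation in ASSDR is that the constraint weights are adjusted adaptively: the algorithm changes the importance of each constraint according to the data, so that the constraints better reflect the true distribution. At the same time, ASSDR optimizes the graph structure. Each instance is a node in the graph, and each edge weight is determined by the similarity between the two instances it joins. Minimizing an objective built on the graph Laplacian yields the low-dimensional projection, in which instances joined by must-link constraints lie as close together as possible and instances joined by cannot-link constraints lie as far apart as possible, so that class information is preserved while the dimensionality is reduced.

Because ASSDR also uses the information in unlabeled data, it may have an advantage over methods that rely entirely on labeled data, especially when labels are hard or expensive to obtain. The method has potential applications in pattern recognition, image classification, and social network analysis.

In summary, the paper introduces a new adaptive semi-supervised dimensionality reduction algorithm that combines pairwise constraint weighting with graph optimization to reduce the dimensionality of high-dimensional data while maintaining good classification performance under limited supervision. Its theoretical and practical contributions are relevant to further research on semi-supervised learning and high-dimensional data processing.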

dimensionality reduction process into a unified framework, which results in an optimized graph rather than a predefined one.

In this paper, we first propose a semi-supervised dimensionality reduction method called weighted pairwise constraints based semi-supervised dimensionality reduction (WPCSSDR). We then propose a novel semi-supervised dimensionality reduction method called adaptive semi-supervised dimensionality reduction (ASSDR), which uses WPCSSDR as a subroutine. ASSDR first initializes all the pairwise constraints with equal weights and constructs a neighborhood graph with an initial adjacency weight matrix. The following procedure is then repeated until the stop condition is satisfied: (1) reduce the dimensionality of the original space with the current weighted pairwise constraints and the current adjacency weight matrix using WPCSSDR; (2) cluster in the reduced subspace; (3) update the weights of the pairwise constraints according to the clustering result; (4) update the adjacency weight matrix. As a result, we obtain the optimized weights of the pairwise constraints and the optimized adjacency weight matrix of the neighborhood graph, as well as the projection matrix.
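The four-step loop above can be sketched as a small driver. This is only a skeleton under stated assumptions: the excerpt does not give the WPCSSDR solver, the clustering method, the two update rules, or the stop condition, so all four steps are passed in as hypothetical callables, and the stop test used here (weights unchanged, or an iteration cap) is an assumption rather than the paper's criterion.

```python
from typing import Any, Callable, Dict, List, Tuple

Pair = Tuple[int, int]
Weights = Dict[Pair, float]


def assdr_loop(
    X: Any,
    constraints: List[Pair],
    graph: Any,
    reduce_dim: Callable[[Any, Weights, Any], Any],   # stands in for WPCSSDR
    cluster: Callable[[Any, Any], List[int]],
    update_weights: Callable[[Weights, List[int]], Weights],
    update_graph: Callable[[Any, Any, Any], Any],
    max_iter: int = 20,                               # assumed iteration cap
) -> Tuple[Any, Weights, Any]:
    # Initialization: every pairwise constraint starts with the same weight.
    weights: Weights = {pair: 1.0 for pair in constraints}
    projection = None
    for _ in range(max_iter):
        projection = reduce_dim(X, weights, graph)      # step (1)
        labels = cluster(X, projection)                 # step (2)
        new_weights = update_weights(weights, labels)   # step (3)
        graph = update_graph(X, projection, graph)      # step (4)
        converged = new_weights == weights              # assumed stop test
        weights = new_weights
        if converged:
            break
    return projection, weights, graph


# Toy stand-ins that only record the call order; none of them is the paper's rule.
calls: List[str] = []
X_toy = [0.0, 0.1, 5.0, 5.2]


def toy_reduce(X, weights, graph):
    calls.append("reduce")
    return 1.0


def toy_cluster(X, projection):
    calls.append("cluster")
    return [0 if projection * x < 2.5 else 1 for x in X]


def toy_update_weights(weights, labels):
    calls.append("weights")
    return dict(weights)


def toy_update_graph(X, projection, graph):
    calls.append("graph")
    return graph


projection, weights, graph = assdr_loop(
    X_toy, [(0, 1), (0, 2)], "initial-graph",
    toy_reduce, toy_cluster, toy_update_weights, toy_update_graph,
)
```

Passing the steps in as callables keeps the skeleton honest about what the excerpt does and does not specify: the control flow comes from the text, while every update rule remains a placeholder.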

2 Adaptive semi-supervised dimensionality reduction algorithm (ASSDR)

2.1 The problem

Here we define the weighted pairwise constraints based semi-supervised dimensionality reduction problem as follows. Suppose we have a set of D-dimensional data samples $X = \{x_1, x_2, \ldots, x_n\} \subset \mathbb{R}^D$ together with some pairwise must-link constraints (M) and cannot-link constraints (C) as domain knowledge: $(x_i, x_j) \in M$ if $x_i$ and $x_j$ belong to the same class; $(x_i, x_j) \in C$ if $x_i$ and $x_j$ belong to different classes. In addition, each pairwise constraint $(x_i, x_j)$ has a weight $S_{ij}$ indicating the importance of the information it carries: the larger $S_{ij}$ is, the more attention should be paid to the constraint $(x_i, x_j)$. Our goal is to find a set of linear projection vectors $W = [w_1, w_2, \ldots, w_d] \in \mathbb{R}^{D \times d}$, where $d < D$, such that the transformed low-dimensional projections $Y = \{y_1, y_2, \ldots, y_n\} \subset \mathbb{R}^d$, where $y_i = W^T x_i$, preserve some properties of the original dataset as well as the pairwise constraints in M and C. For convenience, the one-dimensional case, namely $y_i = w^T x_i$, is discussed below; it is easily extended to the higher-dimensional case.
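As a quick illustration of the notation, the snippet below applies a linear projection to a small random dataset. It is a sketch only: the function name `project`, the use of NumPy, and the choice to store samples as the columns of a D-by-n array are assumptions made here for illustration, and the random `W` is not a learned projection.

```python
import numpy as np


def project(X, W):
    """Return Y = W^T X, where X is D x n (samples as columns) and W is D x d."""
    X = np.asarray(X, dtype=float)
    W = np.asarray(W, dtype=float)
    if W.ndim == 1:
        # One-dimensional case: a single projection vector w, so y_i = w^T x_i.
        W = W[:, None]
    if X.shape[0] != W.shape[0]:
        raise ValueError("X and W must share the input dimension D")
    return W.T @ X


rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))   # D = 5 features, n = 8 samples
W = rng.normal(size=(5, 2))   # d = 2 projection vectors (random, not learned)

Y = project(X, W)             # d x n array of projections y_i = W^T x_i
y = project(X, W[:, 0])       # one-dimensional case using w = w_1
```

The first row of `Y` coincides with `y`, which is why the one-dimensional derivation extends to d projection vectors one column of `W` at a time.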

2.2 Weighted pairwise constraints based semi-supervised dimensionality reduction (WPCSSDR)

To make use of the pairwise constraints, the paired points in M should end up close to each other while the paired points in C should end up far from each other. That is, instances belonging to the same class in the original space should be close to each other in the reduced subspace, and instances belonging to different classes in the original space should be far from each other in the reduced subspace. In addition, if $(x_i, x_j) \in M$ and $S_{ij}$ is large, the Euclidean distance between $x_i$ and $x_j$ in the low-dimensional space should be smaller than it would be with a small weight; if $(x_i, x_j) \in C$ and $S_{ij}$ is large, the Euclidean distance between $x_i$ and $x_j$ in the low-dimensional space should be larger than it would be with a small weight.

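As for the weighted must-link constraints M, the intraclass compactness is characterized by the following term:

$$
\begin{aligned}
J_M(w) &= \sum_{(x_i, x_j) \in M \text{ or } (x_j, x_i) \in M} (w^T x_i - w^T x_j)^2 S^M_{ij} \\
&= 2 \sum_{i} w^T x_i D^M_{ii} x_i^T w - 2 \sum_{i,j} w^T x_i S^M_{ij} x_j^T w \\
&= 2 w^T X (D^M - S^M) X^T w = 2 w^T X L^M X^T w
\end{aligned}
\tag{1}
$$

$$
S^M_{ij} =
\begin{cases}
S_{ij} & (x_i, x_j) \in M \text{ or } (x_j, x_i) \in M \\
0 & \text{else}
\end{cases}
\tag{2}
$$

where $D^M$ is a diagonal matrix whose entries are the column sums of $S^M$ (or row sums, since $S^M$ is symmetric), $D^M_{ii} = \sum_j S^M_{ij}$, and $L^M = D^M - S^M$ is the Laplacian matrix [26]. $J_M(w)$ should be as small as possible, which means that the weighted sum of distances in the transformed low-dimensional subspace between instances involved in the must-link constraints M should be small.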
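The step from the pairwise sum to the Laplacian form is easy to check numerically. The sketch below evaluates both sides for a random symmetric weight matrix; the function names, the NumPy implementation, and the three example constraints with their weights are illustrative assumptions, not values from the paper.

```python
import numpy as np


def pairwise_sum(w, X, S):
    """Direct double sum: sum over i, j of (w^T x_i - w^T x_j)^2 * S_ij."""
    y = w @ X                         # one-dimensional projections, shape (n,)
    diff = y[:, None] - y[None, :]    # diff[i, j] = y_i - y_j
    return float(np.sum(diff ** 2 * S))


def laplacian_form(w, X, S):
    """Matrix form: 2 * w^T X (D - S) X^T w, with D = diag(row sums of S)."""
    L = np.diag(S.sum(axis=1)) - S
    return float(2.0 * w @ X @ L @ X.T @ w)


rng = np.random.default_rng(1)
n, dim = 6, 4
X = rng.normal(size=(dim, n))         # samples stored as columns

# Hypothetical weighted constraints (i, j, weight); S is filled symmetrically
# because a pair counts whichever way round it is listed.
S = np.zeros((n, n))
for i, j, weight in [(0, 1, 1.0), (2, 3, 0.5), (1, 4, 2.0)]:
    S[i, j] = S[j, i] = weight

w = rng.normal(size=dim)
direct = pairwise_sum(w, X, S)
matrix = laplacian_form(w, X, S)
```

The two values agree up to floating-point error. The agreement relies on the symmetry of S: expanding the square gives two identical diagonal terms, which is where the factor of 2 and the degree matrix come from.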

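On the other hand, the interclass separability of the weighted cannot-link constraints C can be characterized by the term:

$$
\begin{aligned}
J_C(w) &= \sum_{(x_i, x_j) \in C \text{ or } (x_j, x_i) \in C} (w^T x_i - w^T x_j)^2 S^C_{ij} \\
&= 2 \sum_{i} w^T x_i D^C_{ii} x_i^T w - 2 \sum_{i,j} w^T x_i S^C_{ij} x_j^T w \\
&= 2 w^T X (D^C - S^C) X^T w = 2 w^T X L^C X^T w
\end{aligned}
\tag{3}
$$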
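To make the two terms concrete, the following sketch builds symmetric weight matrices from lists of must-link and cannot-link pairs and evaluates the compactness and separability terms for two candidate directions. The toy data, the unit weights, and the helper names `constraint_matrix` and `laplacian_term` are assumptions made for illustration; how the paper combines the two terms into a single objective is not shown in this excerpt and is not assumed here.

```python
import numpy as np


def constraint_matrix(n, weighted_pairs):
    """Symmetric n x n matrix holding each constraint weight, zero elsewhere."""
    S = np.zeros((n, n))
    for (i, j), weight in weighted_pairs.items():
        S[i, j] = S[j, i] = weight
    return S


def laplacian_term(w, X, S):
    """Evaluate 2 * w^T X (D - S) X^T w for a symmetric weight matrix S."""
    L = np.diag(S.sum(axis=1)) - S
    return float(2.0 * w @ X @ L @ X.T @ w)


# Toy data (columns are samples): two groups separated along the first axis.
X = np.array([[0.0, 0.2, 4.0, 4.1],
              [1.0, -1.0, 1.0, -1.0]])

must_link = {(0, 1): 1.0, (2, 3): 1.0}     # same-group pairs
cannot_link = {(0, 2): 1.0, (1, 3): 1.0}   # cross-group pairs
S_M = constraint_matrix(4, must_link)
S_C = constraint_matrix(4, cannot_link)

w_first = np.array([1.0, 0.0])    # projects onto the axis that separates the groups
w_second = np.array([0.0, 1.0])   # projects onto the axis that mixes the groups

compact_first = laplacian_term(w_first, X, S_M)
compact_second = laplacian_term(w_second, X, S_M)
separate_first = laplacian_term(w_first, X, S_C)
separate_second = laplacian_term(w_second, X, S_C)
```

On this toy data the first direction gives a smaller must-link term and a larger cannot-link term than the second, which is the behavior the two terms are meant to reward.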

Int. J. Mach. Learn. & Cyber. (2017) 8:793–805 795


