features. Suppose $X_2$, $X_3$, $X_4$, $X_5$ are four samples from the local neighborhood of $X_1$; then $X_1$ can be represented as follows:
$$ X_1 = W_2 X_2 + W_3 X_3 + W_4 X_4 + W_5 X_5 \tag{1} $$
where $W = \{W_1, W_2, \ldots, W_n\}$ is the weight matrix, whose dimension is the same as that of the object matrix $X$. The key idea of LLE is that this locally linear structure within each local neighborhood does not change after dimension reduction. Therefore, we obtain the relationship after data mapping and dimension reduction as follows:
$$ X'_1 = W_2 X'_2 + W_3 X'_3 + W_4 X'_4 + W_5 X'_5 \tag{2} $$
where $X' = \{X'_1, X'_2, \ldots, X'_n\}$ is the object matrix after mapping. The linear structure only affects the relationship near the current sample; it has no influence on objects far away from the sample. The objective function of LLE is partitioned into two parts:
$$ J(W) = \sum_{i=1}^{n} \Big\| X_i - \sum_{j=1}^{k} W_j X_j \Big\|_2^2 \tag{3} $$
$$ J(X') = \sum_{i=1}^{n} \Big\| X'_i - \sum_{j=1}^{k} W_j X'_j \Big\|_2^2 \tag{4} $$
Eq. (3) is used to obtain the weight matrix $W$ of each object, which captures the relationship between the objects and their neighborhoods. Eq. (4) then exploits the matrix obtained from Eq. (3) to compute the result of dimension reduction, a new object matrix $X'$.
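Both objectives admit closed-form solutions: the weights in Eq. (3) come from a small local Gram system per sample, and the embedding in Eq. (4) from the bottom eigenvectors of $(I - W)^{\mathsf T}(I - W)$. The following NumPy sketch illustrates this standard two-stage procedure; the function name, neighborhood size, and regularization term are illustrative choices of ours, not taken from the paper.

```python
import numpy as np
from scipy.spatial.distance import cdist

def lle(X, n_neighbors=5, n_components=2, reg=1e-3):
    """Minimal LLE sketch following Eqs. (3)-(4): local weights, then embedding."""
    n = X.shape[0]

    # Find the k nearest neighbors of each sample.
    dist = cdist(X, X)
    np.fill_diagonal(dist, np.inf)
    neighbors = np.argsort(dist, axis=1)[:, :n_neighbors]

    # Eq. (3): reconstruction weights from each local Gram system.
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[neighbors[i]] - X[i]                     # centered neighborhood
        G = Z @ Z.T                                    # local Gram matrix
        G += reg * np.trace(G) * np.eye(n_neighbors)   # regularize for stability
        w = np.linalg.solve(G, np.ones(n_neighbors))
        W[i, neighbors[i]] = w / w.sum()               # weights sum to one

    # Eq. (4): embed with the bottom eigenvectors of M = (I - W)^T (I - W).
    M = (np.eye(n) - W).T @ (np.eye(n) - W)
    eigvals, eigvecs = np.linalg.eigh(M)
    # Discard the constant eigenvector, keep the next n_components.
    return eigvecs[:, 1:n_components + 1]
```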
2.2. Laplacian Eigenmaps (LE)
Laplacian Eigenmaps (LE) is similar to LLE: it also utilizes locality to construct the relationship between objects and preserve the manifold structure of the data [40]. The main idea of LE is that if object $i$ and object $j$ are similar in the original space, they will remain very similar after dimension reduction. Assume that there are $n$ objects with $m_1$ attributes, $X = \{X_1, X_2, \ldots, X_n\} \in \mathbb{R}^{n \times m_1}$ is the object matrix, and $Y = \{Y_1, Y_2, \ldots, Y_i, \ldots, Y_n\} \in \mathbb{R}^{n \times m_2}$ is the matrix with $m_2$ attributes after dimension reduction, where $i = 1, 2, \ldots, n$ and $m_1 > m_2$. $W_i = \{W_{i,1}, W_{i,2}, \ldots, W_{i,j}, \ldots, W_{i,k}\}$ represents the adjacency weights between the $i$th object and its $k$ neighboring samples, and $W_{i,j}$ is the $j$th entry of $W_i$. LE consists of three steps: graph construction, weight decision and eigenmapping.
The first step, graph construction, is achieved by simple methods such as the k-nearest-neighbors (KNN) algorithm. Secondly, LE decides the weights between objects by exploiting the relationship constructed in the first step and applying a fundamental function, such as the heat kernel function, to obtain the adjacency matrices. Finally, it computes the eigenvalues and eigenvectors of the Laplacian matrix $L$ by utilizing the adjacency matrices from the second step to achieve eigenmapping, where the Laplacian matrix is $L_i = D_i - W_i$, $D_i = \sum_j W_{i,j}$ defines the diagonal degree matrix, and $W = \{W_1, W_2, \ldots, W_n\}$ collects the adjacency weights of all objects.
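As an illustration of these three steps, the sketch below builds a symmetrized KNN graph, assigns heat-kernel weights $W_{i,j} = \exp(-\|x_i - x_j\|^2/\sigma^2)$, and solves the generalized eigenproblem $Ly = \lambda D y$ with $L = D - W$. The parameter values and the use of the unnormalized Laplacian are our own assumptions for this sketch.

```python
import numpy as np
from scipy.linalg import eigh
from scipy.spatial.distance import cdist

def laplacian_eigenmaps(X, n_neighbors=5, n_components=2, sigma=1.0):
    """Minimal LE sketch: graph construction, weight decision, eigenmapping."""
    n = X.shape[0]
    dist = cdist(X, X)
    np.fill_diagonal(dist, np.inf)

    # Step 1: graph construction with k-nearest neighbors (symmetrized).
    knn = np.argsort(dist, axis=1)[:, :n_neighbors]
    A = np.zeros((n, n), dtype=bool)
    A[np.repeat(np.arange(n), n_neighbors), knn.ravel()] = True
    A = A | A.T

    # Step 2: weight decision with the heat kernel function.
    W = np.where(A, np.exp(-dist ** 2 / sigma ** 2), 0.0)

    # Step 3: eigenmapping with L = D - W (generalized problem L y = lambda D y).
    D = np.diag(W.sum(axis=1))
    L = D - W
    eigvals, eigvecs = eigh(L, D)          # eigenvalues in ascending order
    # Skip the trivial constant eigenvector, keep the next n_components.
    return eigvecs[:, 1:n_components + 1]
```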
In summary, the objective function of LE is shown as follows:
$$ J(Y) = \min \sum_{i,j} \|Y_i - Y_j\|^2 \, W_{i,j} \tag{5} $$
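In the standard LE formulation, Eq. (5) is minimized under the constraint $Y^{\mathsf T} D Y = I$ (a detail not restated here), and it reduces to the eigenmapping step above because
$$ \sum_{i,j} \|Y_i - Y_j\|^2 \, W_{i,j} = 2\,\mathrm{tr}\big(Y^{\mathsf T} L Y\big), $$
so the optimal embedding is given by the eigenvectors of $L y = \lambda D y$ associated with the smallest non-zero eigenvalues.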
The difference between LE and LLE is that LLE uses the neighbors to reconstruct a test sample and transforms the original data from the high-dimensional space into a separable low-dimensional space, while LE aims to keep similar distances from the original space to the new feature space, making related samples as close as possible in the reduced space. Both preserve locality, achieve dimension reduction effectively, and play a vital role in clustering or classifying high-dimensional data. However, neither of them can perform dimension reduction twice in succession, because the weight matrix or adjacency matrix does not change, so a second reduction with the same matrix contributes nothing new. Therefore, we combine the two algorithms to solve multi-task multi-view clustering.
Two rounds of dimension reduction are necessary because of the complexity of the original data. For multi-task multi-view problems, especially in heterogeneous situations, various tasks and views may have different structures. The first dimension reduction transforms the original data from multiple views into a common intermediate space (the view space) and yields more separable data structures; LLE is suitable for this step because it keeps the locally linear structures while making the data separable. The second dimension reduction transforms the data from the intermediate space (view space) into the task space and extracts the shared and complementary features among the various tasks. We want samples that are related to each other to stay close, so that sufficient shared and complementary features are learned, and LE is more appropriate for this step.
3. Methodology
For multi-task multi-view clustering, we usually assume that a similar relationship exists among the several views of each task. However, it is difficult to learn knowledge directly from the original data. To facilitate information extraction, we collectively learn the feature transformation matrices for all views of each task by LLE. Besides, since some similar features exist across multiple tasks, we want to preserve the distances between tasks during the mapping and solve the multi-task problem by LE. In this section, we introduce the details of our proposed method, L3E-M2VC.
As previously mentioned, LLE and LE are used to learn the structures within the views of each task and across multiple tasks, respectively. Therefore, the process of multi-task multi-view clustering for knowledge sharing is partitioned into two steps. The first step maps the views of each task to a common space called the view space. The second step extracts the features across multiple tasks and maps them to a discriminant space called the task space, as shown in Fig. 1.
Firstly, through the first transformation step, the data samples of the $v$th view of the $t$th task in the original space are transformed into the view space, which makes the samples more separable; each transformation matrix depends on the particular task and view. The view space can be seen as a common intermediate latent space that is shared by all the views of each task [38]. Then, through the transformation in the second step, the samples are mapped from the view space to the task space. Assume the second transformation matrix is $R_t$, which consists of two parts: the shared feature $R$, which is common to multiple tasks, and the complementary feature $R^c_t$, which belongs to only one task. Of course, the second transformation step is identical for the different views of the same task, because they are mapped to the same space in the first step. In other words, the view space offers a common plane for the various views in each task, which unifies the features of different views; the task space then considers the shared and complementary features across multiple tasks in order to facilitate knowledge sharing in multi-task multi-view clustering.
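To make the two-step mapping concrete, the sketch below applies a per-view transformation into the view space and then a per-task transformation into the task space. All dimensions, the random matrices, and the way $R$ and $R^c_t$ are combined (concatenation here) are illustrative assumptions; this section only states that $R_t$ consists of the two parts.

```python
import numpy as np

# Illustrative shapes only; not specified in the paper.
n_t, d_view, d_view_space, d_shared, d_comp = 100, 50, 20, 8, 4

X_tv = np.random.randn(n_t, d_view)             # vth view of the tth task (original space)
P_tv = np.random.randn(d_view, d_view_space)    # first transform: original space -> view space
Y_tv = X_tv @ P_tv                              # samples in the common view space

R_shared = np.random.randn(d_view_space, d_shared)  # R: shared across all tasks
Rc_t = np.random.randn(d_view_space, d_comp)         # Rc_t: complementary, task-specific
R_t = np.hstack([R_shared, Rc_t])               # second transform: view space -> task space
Z_tv = Y_tv @ R_t                               # samples in the task space
```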
3.1. Problem definition
Assume that there are $T$ clustering tasks, each with $V_t$ views, and $X^v_t = \{x^v_{t,1}, x^v_{t,2}, \ldots, x^v_{t,i}, \ldots, x^v_{t,n_t}\} \in \mathbb{R}^{n_t \times d^v_t}$, where $i = 1, 2, \ldots, n_t$, $t = 1, 2, \ldots, T$, and $v = 1, 2, \ldots, V_t$; $n_t$ represents the number of objects in the $t$th task, and $d^v_t$ is the feature dimension of the $v$th view in the $t$th task. Each task is clustered into $c_t$ classes, and the label sets differ across tasks because of the heterogeneity. The structures of data in the view space $Y^v_t$ and