Multi-View Missing Data Completion: The ILCA Method and Low-Rank Sparse Modeling


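Multi-view data are increasingly common in applications such as medical diagnosis, web page classification, and multimedia analysis. A key challenge is that not every instance has complete data in every view, which gives rise to the missing-view problem. This paper addresses feature-level completion of missing data in multi-view datasets.

First, to capture the semantic complementarity and the identical distribution shared across views, the authors propose Isomorphic Linear Correlation Analysis (ILCA). The core idea of ILCA is to learn a set of excellent isomorphic features that linearly map the multi-view data into a feature-isomorphic subspace, where the information shared by different views can be mined effectively. The method emphasizes the intrinsic relationships among features, so that even when a view is missing, the structure of the other views can be used to infer the missing part.

Next, the paper assumes that the missing-view data follow a normal distribution, which supports a more precise model. Under this assumption, the missing-view data matrix is decomposed into a low-rank component and a sparse contribution. The decomposition helps separate underlying patterns from outliers: most of the data are expected to follow a regular structure, while the part that departs from it may be random or anomalous.

To complete the missing view, the authors combine the two components. The learned low-rank structure estimates the global trend of the missing data, and the sparse component captures possible local characteristics. The goal is to preserve the distribution of the data as far as possible, so that the completed data are statistically plausible and consistent in content with the original data.

Finally, the study may also cover optimization algorithms and evaluation metrics to ensure that the completion procedure is effective and robust. According to the reported experiments, ILCA significantly improves learning performance on multi-view data, reduces the information loss caused by missing data, and shows good scalability and adaptability in practical applications.

In summary, the paper studies how to handle missing data in multi-view settings. It relies on ILCA and a low-rank plus sparse decomposition to discover and exploit cross-view structure when completing the missing data, which matters for multi-source data fusion and cross-modal learning.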
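To make the low-rank plus sparse idea concrete, the following is a minimal sketch of a generic decomposition M = L + S, using the well-known principal component pursuit scheme solved with a basic ADMM loop. It is not the completion model of the paper, whose objective is not given in this summary. The function names, the default weight `lam`, and the step size `mu` are assumptions chosen for illustration.

```python
import numpy as np

def soft_threshold(M, tau):
    """Entrywise shrinkage: proximal operator of tau * (sum of absolute values)."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def singular_value_threshold(M, tau):
    """Proximal operator of tau * nuclear norm: shrink the singular values."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def low_rank_plus_sparse(M, lam=None, mu=None, n_iter=500, tol=1e-7):
    """Split M into L (low rank) + S (sparse) with a basic ADMM loop.

    lam weighs the sparse term; mu is the penalty parameter. Both defaults
    are common heuristics and are assumptions of this sketch.
    """
    M = np.asarray(M, dtype=float)
    n, d = M.shape
    lam = 1.0 / np.sqrt(max(n, d)) if lam is None else lam
    mu = n * d / (4.0 * np.abs(M).sum()) if mu is None else mu
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    Y = np.zeros_like(M)  # Lagrange multiplier for the constraint M = L + S
    for _ in range(n_iter):
        L = singular_value_threshold(M - S + Y / mu, 1.0 / mu)
        S = soft_threshold(M - L + Y / mu, lam / mu)
        R = M - L - S          # constraint residual
        Y = Y + mu * R
        if np.linalg.norm(R) <= tol * np.linalg.norm(M):
            break
    return L, S
```

Calling `low_rank_plus_sparse(M)` on a matrix that is a low-rank signal plus a few large spikes should place the smooth, global structure in `L` and the isolated deviations in `S`, which mirrors the "global trend" and "local characteristics" roles described above.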

Extensive experiments on four multi-view datasets are conducted to demonstrate the effectiveness of the proposed framework.

1.2 Organization

The remainder of this paper is organized as follows: We present a general feature-level framework for completing missing view to obtain the integrated representations for multi-view data in Section 2.1. In Section 2.2, a novel Isomorphic Linear Correlation Analysis model is developed for correlating different views through learning a set of excellent isomorphic features. We build a new Identical Distribution Pursuit Completion model to recover missing view of multi-view data under both semantic complementarity and identical distribution restraints in Section 2.3. Furthermore, Section 3 provides an efficient algorithm to solve the proposed framework and analyzes the computational complexities and convergence rates of the proposed algorithms. Section 4 gives a broad overview of some related work. Experimental results and analyses are reported in Section 5. Section 6 concludes this paper.

1.3 Notations

Here we establish some notations to be used throughout this paper. Assume $V_x$ and $V_y$ are two different views. Let the data matrices $X_E = [x_1, \ldots, x_{n_1}]^T \in \mathbb{R}^{n_1 \times d_x}$ and $Y_E = [y_1, \ldots, y_{n_1}]^T \in \mathbb{R}^{n_1 \times d_y}$ be two sets of existing heterogeneous representations from the $V_x$ and $V_y$, respectively, where $x_i \in \mathbb{R}^{d_x}$ is the $i$th sample from $V_x$, $y_i \in \mathbb{R}^{d_y}$ is the $i$th sample from $V_y$, $n_1$ is the number of available samples, and $d_x$ and $d_y$ are the dimensionalities of the heterogeneous low-level feature spaces $V_x$ and $V_y$. Note that for $i = 1, \ldots, n_1$, $(x_i, y_i)$ represents the $i$th couple of heterogeneous representations. We assume that both $\{x_i\}_{i=1}^{n_1}$ and $\{y_i\}_{i=1}^{n_1}$ are centered, i.e., $\sum_{i=1}^{n_1} x_i = 0$ and $\sum_{i=1}^{n_1} y_i = 0$. Let the data matrix $X_M = [x_{n_1+1}, \ldots, x_{n_1+n_2}]^T \in \mathbb{R}^{n_2 \times d_x}$ be a set of missing representations from the $V_x$ and the data matrix $Y_M = [y_{n_1+1}, \ldots, y_{n_1+n_2}]^T \in \mathbb{R}^{n_2 \times d_y}$ be a set of existing heterogeneous representations from the $V_y$ corresponding to the missing representations $X_M$.

We use $\|A\|_* = \sum_{i=1}^{r} \sigma_i$ to denote the trace (nuclear) norm of a matrix $A = [a_{ij}] \in \mathbb{R}^{p \times q}$, where $r = \mathrm{rank}(A)$ denotes the rank of $A$ and $\{\sigma_i\}_{i=1}^{r}$ is the set of singular values of $A$ in a non-increasing order. $\|A\|_F = \sqrt{\sum_{i=1}^{p} \sum_{j=1}^{q} a_{ij}^2}$ is the Frobenius norm of $A$. If $A$ is a square matrix, then let $\mathrm{tr}(A) = \sum_{i=1}^{p} a_{ii}$ be the trace of $A$. For two matrices $A$ and $B$, $\langle A, B \rangle = \mathrm{tr}(A^T B)$ denotes the matrix inner product. For a vector $b \in \mathbb{R}^{p}$, let $\|b\|_2 = \sqrt{\sum_{i=1}^{p} b_i^2}$ be the $\ell_2$-norm of $b$.

Additionally, let $|H|$ be the number of elements in the set $H$; $\nabla f(C)$ denotes the gradient of any smooth function $f(\cdot)$ at the point $C$; for $w \in \mathbb{R}^{p}$, we denote by $\mathrm{diag}(w)$ the diagonal matrix having the components of the vector $w$ on the diagonal; let $D$ be a set of representations, $\mathrm{mean}(D)$ denotes the average value of $D$. $I_k \in \mathbb{R}^{k \times k}$ is an identity matrix.
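The matrix quantities introduced in these notations (trace or nuclear norm, Frobenius norm, trace inner product, and the diag operator) can be checked numerically. The sketch below computes them with NumPy on a small example matrix; the helper names and the example values are assumptions made for illustration only.

```python
import numpy as np

def nuclear_norm(A):
    """Trace (nuclear) norm: the sum of the singular values of A."""
    return np.linalg.svd(A, compute_uv=False).sum()

def frobenius_norm(A):
    """Frobenius norm: square root of the sum of squared entries."""
    return np.sqrt((A ** 2).sum())

def inner_product(A, B):
    """Matrix inner product <A, B> = tr(A^T B)."""
    return np.trace(A.T @ B)

def l2_norm(b):
    """Euclidean norm of a vector b."""
    return np.sqrt((b ** 2).sum())

# Example matrix with singular values 4 and 3.
A = np.array([[3.0, 0.0],
              [0.0, 4.0]])
w = np.array([3.0, 4.0])

nuc = nuclear_norm(A)        # 3 + 4 = 7, up to rounding
fro = frobenius_norm(A)      # sqrt(9 + 16) = 5
ip = inner_product(A, A)     # equals fro ** 2 = 25
D = np.diag(w)               # diag(w) reproduces A in this example
```

Note that $\langle A, A \rangle = \|A\|_F^2$, which is why `ip` equals `fro ** 2` here.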

2 THE PROPOSED FORMULATION

We propose a general feature-level framework to complete the missing view of multi-view data. A graphical illustration of the proposed formulation is given in Fig. 4 to facilitate understanding of the proposed formulations and algorithms.

2.1 Overview of the Proposed Formulations

We provide an overview of the proposed formulations by using the example in Fig. 4. In this example, a set of multi-view data consists of the views MRI and PET. However, the MRI view is missing: for example, all attributes in three of the MRI representations $x_i$ are totally absent.

To recover the missing view of multi-view data, a feature-isomorphic subspace is learned by the ILCA model to build a bridge between multiple heterogeneous low-level feature spaces in the proposed framework, in which the same dimension and attributes are used to represent the same semantic concept. Specifically, to fully exploit both semantic complementarity and similar distributions among different views as shown in Fig. 3, multiple linear transformations A and B are learned using the existing multi-view data X and Y to eliminate the heterogeneity across them. Thus, a feature-isomorphic subspace is obtained by a set of learned excellent isomorphic features, in which the correlated representations from different views are coupled together to capture the commonality among the heterogeneous representations from different views. Consequently, some maximum neighbourhoods are established among different categories, such as the maximum neighbourhoods of Class 1 and Class 2 in Fig. 4. We can measure the correlation among the multi-view data in the feature-isomorphic

Fig. 4. The proposed framework for completing missing view of multi-view data.
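The ILCA objective itself is not given in this excerpt. As a stand-in that shows what "learning two linear transformations A and B that map two views into a common subspace" can look like, the sketch below uses classical canonical correlation analysis (CCA), not ILCA. The function names, the regularization value `reg`, and the subspace dimension `k` are assumptions for illustration.

```python
import numpy as np

def inv_sqrt(C, eps=1e-8):
    """Inverse square root of a symmetric positive semi-definite matrix."""
    w, V = np.linalg.eigh(C)
    return (V / np.sqrt(np.maximum(w, eps))) @ V.T

def cca_maps(X, Y, k, reg=1e-6):
    """Classical CCA (a stand-in, not ILCA).

    X: (n, d_x) and Y: (n, d_y) hold paired, centered samples as rows.
    Returns A (d_x, k), B (d_y, k) and the top-k canonical correlations.
    """
    n = X.shape[0]
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])  # regularized covariances
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n                             # cross-covariance
    Wx, Wy = inv_sqrt(Cxx), inv_sqrt(Cyy)         # whitening transforms
    U, s, Vt = np.linalg.svd(Wx @ Cxy @ Wy)
    A = Wx @ U[:, :k]
    B = Wy @ Vt[:k].T
    return A, B, s[:k]
```

With such maps, `X @ A` and `Y @ B` live in the same k-dimensional space, so a sample observed in only one view can still be projected into the shared subspace and compared with samples from the other view.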

1298 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 30, NO. 7, JULY 2018

