Figure 1. Relationships among several probability distributions.
2.2. Conjugate Priors
Let x be a random vector with the parameter vector z, and X = {x_1, x_2, ..., x_N} a collection of N observed samples. In the presence of latent variables, they are also absorbed into z. For given z, the conditional probability density/mass function of x is denoted by p(x|z). Thus, we can construct the likelihood function:

L(z|X) = p(X|z) = \prod_{i=1}^{N} p(x_i|z). \qquad (6)
In variational Bayesian methods, the parameter vector z is usually assumed to be stochastic. Here, the prior distribution of z is expressed as p(z).
To simplify Bayesian analysis, we hope that the posterior distribution p(z|X) has the same functional form as the prior p(z). Under this circumstance, the prior and the posterior are called conjugate distributions, and the prior is also called a conjugate prior for the likelihood function L(z|X) [54,66]. In the following, we provide the three most commonly used examples of conjugate priors.
Example 3. Assume that the random variable x obeys the Bernoulli distribution with parameter µ. We have the likelihood function for x:

L(\mu|X) = \prod_{i=1}^{N} \mathrm{Bern}(x_i|\mu) = \prod_{i=1}^{N} \mu^{x_i} (1 - \mu)^{1 - x_i} = \mu^{\sum_{i=1}^{N} x_i} (1 - \mu)^{N - \sum_{i=1}^{N} x_i}, \qquad (7)
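Equation (7) shows that the likelihood depends on the data only through the count \sum_{i=1}^{N} x_i, a sufficient statistic. A minimal sketch, with illustrative sample values (the helper name is ours, not from the text):

import numpy as np

def bernoulli_likelihood(X, mu):
    # L(mu|X) = mu^s * (1 - mu)^(N - s), where s = sum_i x_i as in (7).
    X = np.asarray(X)
    s = X.sum()
    return mu**s * (1.0 - mu)**(len(X) - s)

print(bernoulli_likelihood([1, 0, 1, 1], 0.6))  # = 0.6**3 * 0.4 = 0.0864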
where the observations x_i ∈ {0, 1}. Given the form of L(µ|X), we stipulate the prior distribution of µ to be the Beta distribution with parameters a and b:

p(\mu) = \mathrm{Beta}(\mu|a, b) = \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)} \mu^{a-1} (1 - \mu)^{b-1}. \qquad (8)
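A quick sanity check of (8) is to evaluate the Gamma-function form directly and compare it against a library implementation; a minimal sketch with arbitrary illustrative values of µ, a, and b:

from scipy.special import gamma
from scipy.stats import beta

def beta_pdf(mu, a, b):
    # Equation (8): Gamma(a+b) / (Gamma(a) * Gamma(b)) * mu^(a-1) * (1-mu)^(b-1)
    return gamma(a + b) / (gamma(a) * gamma(b)) * mu**(a - 1) * (1 - mu)**(b - 1)

print(beta_pdf(0.3, 2.0, 5.0))  # direct evaluation of (8)
print(beta.pdf(0.3, 2.0, 5.0))  # scipy's Beta density; the two should agree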
We then obtain the posterior distribution of µ via Bayes' rule:

p(\mu|X) = \frac{p(\mu)\, p(X|\mu)}{p(X)} \propto p(\mu)\, p(X|\mu). \qquad (9)
Because

p(\mu)\, p(X|\mu) = p(\mu)\, L(\mu|X) \propto \mu^{a-1} (1 - \mu)^{b-1} \mu^{\sum_{i=1}^{N} x_i} (1 - \mu)^{N - \sum_{i=1}^{N} x_i} \propto \mu^{a + \sum_{i=1}^{N} x_i - 1} (1 - \mu)^{b + N - \sum_{i=1}^{N} x_i - 1}, \qquad (10)
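the posterior is again a Beta distribution, p(\mu|X) = Beta(µ | a + \sum_{i=1}^{N} x_i, b + N - \sum_{i=1}^{N} x_i); the prior (8) is therefore conjugate to the Bernoulli likelihood (7). A minimal numerical sketch of this update (the hyperparameters, sample size, and true parameter below are illustrative):

import numpy as np
from scipy.stats import beta

rng = np.random.default_rng(1)
a, b = 2.0, 3.0                    # prior hyperparameters
X = rng.binomial(1, 0.7, size=50)  # N Bernoulli(0.7) observations
N, s = len(X), X.sum()             # s = sum_i x_i

# Closed-form conjugate update implied by (10): Beta(a + s, b + N - s).
posterior = beta(a + s, b + N - s)

# Numerical check: renormalize the unnormalized posterior in (10) on a grid.
mu = np.linspace(1e-6, 1.0 - 1e-6, 10001)
unnorm = beta.pdf(mu, a, b) * mu**s * (1.0 - mu)**(N - s)
numeric = unnorm / (unnorm.sum() * (mu[1] - mu[0]))
print(np.max(np.abs(numeric - posterior.pdf(mu))))  # close to zero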