Extending Nonnegative Matrix Factorization with Bregman Divergences

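This document is a 2005 UTCS Technical Report by Inderjit S. Dhillon and Suvrit Sra, titled 'Generalized Nonnegative Matrix Approximations with Bregman Divergences'. It explores how to extend nonnegative matrix factorization (NMF) using Bregman divergences.

Non-negative matrix factorization (NMF) is a technique for dimensionality reduction and data analysis that represents nonnegative input data in terms of sparse, nonnegative components. It is widely used in text analysis, document clustering, face recognition, language modeling, speech processing, and other areas. Despite this range of applications, the development of algorithms for computing the factors has lagged behind.

The main contribution of the report is a new, generalized formulation of the NMF problem that minimizes a Bregman divergence between the input matrix and its low-rank approximation. A Bregman divergence measures the dissimilarity between two vectors (or distributions). It is more flexible than the squared Euclidean distance, which it contains as a special case, and it can be adapted to different optimization objectives and data characteristics, although it is in general not symmetric and therefore not a metric. Dhillon and Sra solve the generalized problems with multiplicative update formulae; the multiplicative updates from the pioneering work of Lee and Seung arise as a special case of these algorithms. Choosing a suitable Bregman divergence can help the algorithm capture the structure of the data, particularly for naturally nonnegative data such as word frequencies in text.

Beyond the optimization framework, the authors may also discuss the convergence and efficiency of the algorithms, and may report experiments comparing the new methods with traditional NMF. The report offers theoretical and practical guidance on Bregman-divergence-based NMF and is a useful resource for researchers and engineers working on NMF or on specific problems that call for it.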

3 Algorithms

In this section we present algorithms that seek to optimize (2.2) and (2.3). Our algorithms are iterative in nature, and are directly inspired by the efficient algorithms of Lee and Seung [20]. Appealing properties include ease of implementation and computational efficiency.

Note that the problems (2.2) and (2.3) are not jointly convex in B and C, so it is not easy to obtain globally optimal solutions in polynomial time. Our iterative procedures start by initializing B and C randomly or otherwise. Then, B and C are alternately updated until there is no further appreciable change in the objective function value.
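The alternating scheme described above can be sketched in Python for one familiar instance. This is not the report's general algorithm: the sketch assumes the Frobenius-norm objective and swaps in the multiplicative updates of Lee and Seung [20], a known special case. The function name `nnma_frobenius`, the stopping tolerance `tol`, and the small constant `eps` are illustrative assumptions.

```python
import numpy as np

def nnma_frobenius(A, rank, n_iter=200, tol=1e-9, eps=1e-12, seed=0):
    """Alternate multiplicative updates of C and B for 0.5 * ||A - BC||_F^2.

    Illustrative sketch only (Lee-Seung updates for the Frobenius case).
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    # Random strictly positive initialization of both factors.
    B = rng.random((m, rank)) + 0.1
    C = rng.random((rank, n)) + 0.1
    history = [0.5 * np.linalg.norm(A - B @ C) ** 2]
    for _ in range(n_iter):
        # Update C with B fixed, then B with C fixed.
        C *= (B.T @ A) / (B.T @ B @ C + eps)
        B *= (A @ C.T) / (B @ C @ C.T + eps)
        history.append(0.5 * np.linalg.norm(A - B @ C) ** 2)
        # Stop when the objective no longer changes appreciably.
        if history[-2] - history[-1] <= tol * max(history[-2], 1.0):
            break
    return B, C, history

rng = np.random.default_rng(1)
A = rng.random((20, 15))          # hypothetical nonnegative input matrix
B, C, history = nnma_frobenius(A, rank=4)
```

Because the updates only multiply by nonnegative ratios, B and C stay nonnegative without any explicit projection, and `history` records the objective after each sweep so the monotone decrease can be inspected.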

3.1 Algorithms for (2.2)

We utilize the concept of auxiliary functions [20] for our derivations. It is sufficient to illustrate our methods using a single column of C (or row of B), since our divergences are separable.

Definition 3.1 (Auxiliary function). A function $G(c, c')$ is called an auxiliary function for $F(c)$ if:

1. $G(c, c) = F(c)$, and

2. $G(c, c') \geq F(c)$ for all $c'$.

Auxiliary functions turn out to be useful due to the following lemma.

Lemma 3.2 (Iterative minimization). If $G(c, c')$ is an auxiliary function for $F(c)$, then $F$ is non-increasing under the update

$$c^{t+1} = \operatorname*{argmin}_{c} \, G(c, c^{t}).$$

Proof. $F(c^{t+1}) \leq G(c^{t+1}, c^{t}) \leq G(c^{t}, c^{t}) = F(c^{t})$.
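The iterative minimization in Lemma 3.2 can be illustrated on a one-dimensional toy problem. Everything in this sketch is hypothetical and unrelated to the report's objectives: `F` is a smooth function whose second derivative is at most 1/4, and `G` is the quadratic upper bound built from that curvature bound, which satisfies both conditions of Definition 3.1.

```python
import math

def F(c):
    # Hypothetical objective; F''(c) = s(c) * (1 - s(c)) <= 1/4 for the sigmoid s.
    return math.log1p(math.exp(c)) - 0.3 * c

def F_prime(c):
    return 1.0 / (1.0 + math.exp(-c)) - 0.3

L = 0.25  # upper bound on the second derivative of F

def G(c, c_prev):
    # Quadratic majorizer: G(c, c) = F(c) and G(c, c_prev) >= F(c).
    d = c - c_prev
    return F(c_prev) + F_prime(c_prev) * d + 0.5 * L * d * d

def mm_step(c_prev):
    # Closed-form argmin over c of G(c, c_prev).
    return c_prev - F_prime(c_prev) / L

c = 3.0
values = [F(c)]
for _ in range(25):
    c = mm_step(c)
    values.append(F(c))
```

The list `values` is non-increasing, as the lemma predicts, and the iterate `c` approaches the minimizer of `F`, which for this toy function is `log(0.3 / 0.7)`.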

As can be observed, the sequence formed by the iterative application of Lemma 3.2 leads to a monotonic decrease in the objective function value $F(c)$. For an algorithm that iteratively updates $c$ in its quest to minimize $F(c)$, the method for proving convergence boils down to the construction of an appropriate auxiliary function. Auxiliary functions have been used in many places before; see, for example, [5, 20].

We now construct simple auxiliary functions for (2.2) that yield multiplicative updates. To avoid clutter we drop the functions α and β from (2.2), noting that our methods can easily be extended to incorporate these functions.

Suppose B is fixed and we wish to compute an updated column of C. We wish to minimize

$$F(c) = D_{\phi}(Bc, a), \tag{3.1}$$

where $a$ is the column of A corresponding to the column $c$ of C. The lemma below shows how to construct an auxiliary function for (3.1). For convenience of notation we use $\psi$ to denote $\nabla\phi$ for the rest of this section.
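As a concrete reference point for the objective (3.1), the following sketch evaluates a separable Bregman divergence for two choices of the convex function. It assumes the standard definition D_phi(x, y) = sum_i [phi(x_i) - phi(y_i) - psi(y_i)(x_i - y_i)], which this section does not restate; the function names and the sample vectors are illustrative.

```python
import numpy as np

def bregman_divergence(phi, psi, x, y):
    """Separable Bregman divergence D_phi(x, y), with psi the derivative of phi."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sum(phi(x) - phi(y) - psi(y) * (x - y)))

# phi(t) = t^2 / 2 yields half the squared Euclidean distance.
def half_square(t):
    return 0.5 * t ** 2

def half_square_grad(t):
    return t

# phi(t) = t log t yields the generalized KL divergence (requires t > 0).
def neg_entropy(t):
    return t * np.log(t)

def neg_entropy_grad(t):
    return np.log(t) + 1.0

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 2.0, 1.0])

d_euclid = bregman_divergence(half_square, half_square_grad, x, y)
d_kl = bregman_divergence(neg_entropy, neg_entropy_grad, x, y)
d_kl_rev = bregman_divergence(neg_entropy, neg_entropy_grad, y, x)
```

Comparing `d_kl` with `d_kl_rev` shows that the divergence is not symmetric in its arguments, which is why the order of `Bc` and `a` in (3.1) matters.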

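Lemma 3.3 (Auxiliary function). The function

$$G(c, c') = \sum_{ij} \lambda_{ij}\, \phi\!\left(\frac{b_{ij} c_j}{\lambda_{ij}}\right) - \sum_{i} \phi(a_i) - \sum_{i} \psi(a_i)\big((Bc)_i - a_i\big) \tag{3.2}$$

with $\lambda_{ij} = (b_{ij} c'_j) / \big(\sum_{l} b_{il} c'_l\big)$, is an auxiliary function for (3.1). Note that by definition $\sum_j \lambda_{ij} = 1$, and as both $b_{ij}$ and $c'_j$ are nonnegative, $\lambda_{ij} \geq 0$.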

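Proof. It is easy to verify that $G(c, c) = F(c)$, since $\sum_j \lambda_{ij} = 1$. Using the convexity of $\phi$, we conclude that if $\sum_j \lambda_{ij} = 1$ and $\lambda_{ij} \geq 0$, then

$$\begin{aligned}
F(c) &= \sum_{i} \Big[ \phi\Big(\sum_{j} b_{ij} c_j\Big) - \phi(a_i) - \psi(a_i)\big((Bc)_i - a_i\big) \Big] \\
&\leq \sum_{ij} \lambda_{ij}\, \phi\!\left(\frac{b_{ij} c_j}{\lambda_{ij}}\right) - \sum_{i} \phi(a_i) - \sum_{i} \psi(a_i)\big((Bc)_i - a_i\big) \\
&= G(c, c').
\end{aligned}$$

The inequality is Jensen's inequality applied row by row: $\sum_j b_{ij} c_j = \sum_j \lambda_{ij} \, (b_{ij} c_j / \lambda_{ij})$ is a convex combination, so $\phi$ of the sum is at most the $\lambda$-weighted sum of the values of $\phi$.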
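The two properties established in the proof can be checked numerically. The sketch below is only a sanity check under stated assumptions: it picks phi(t) = t log t as one convex example, uses strictly positive random data so that every weight lambda_ij is positive, and the helper names `F` and `G` simply mirror (3.1) and (3.2).

```python
import numpy as np

def phi(t):
    return t * np.log(t)      # one convex choice, valid for t > 0

def psi(t):
    return np.log(t) + 1.0    # derivative of phi

def F(B, c, a):
    # Objective (3.1): D_phi(Bc, a) written out entrywise.
    Bc = B @ c
    return float(np.sum(phi(Bc) - phi(a) - psi(a) * (Bc - a)))

def G(B, c, c_prev, a):
    # Auxiliary function (3.2) with weights lambda_ij built from c_prev.
    weighted = B * c_prev                                   # entries b_ij * c'_j
    lam = weighted / weighted.sum(axis=1, keepdims=True)    # each row sums to 1
    first = np.sum(lam * phi(B * c / lam))
    Bc = B @ c
    return float(first - np.sum(phi(a)) - np.sum(psi(a) * (Bc - a)))

rng = np.random.default_rng(0)
B = rng.random((6, 3)) + 0.1
a = rng.random(6) + 0.1

gaps = []
for _ in range(1000):
    c = rng.random(3) + 0.1
    c_prev = rng.random(3) + 0.1
    gaps.append(G(B, c, c_prev, a) - F(B, c, a))
min_gap = min(gaps)           # should be >= 0 up to rounding

c0 = rng.random(3) + 0.1
tightness = abs(G(B, c0, c0, a) - F(B, c0, a))   # should be ~0
```

A nonnegative `min_gap` over random pairs is consistent with condition 2 of Definition 3.1, and a vanishing `tightness` with condition 1; neither is a substitute for the proof.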

