At each step, we only need to compute an SVD and perform elementary matrix
operations. With the help of a standard numerical linear algebra package, the whole
algorithm can be coded in just a few lines. As we will see later, the iteration (2.7) is
the linearized Bregman iteration, which is a special instance of Uzawa’s algorithm.
Before addressing further computational issues, we would like to make explicit the
relationship between this iteration and the original problem (1.1). In Section 4, we
will show that the sequence $\{X^k\}$ converges to the unique solution of an optimization problem closely related to (1.1), namely,
\begin{equation}
\begin{array}{ll}
\text{minimize} & \tau \|X\|_* + \frac{1}{2} \|X\|_F^2\\
\text{subject to} & \mathcal{P}_\Omega(X) = \mathcal{P}_\Omega(M).
\end{array}
\tag{2.8}
\end{equation}
Furthermore, it is intuitive that the solution to this modified problem converges to that of (1.4) as $\tau \to \infty$, as shown in Section 3. Thus, by selecting a large value of the parameter $\tau$, the sequence of iterates converges to a matrix which nearly minimizes (1.1).
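For concreteness, here is a minimal Python/NumPy sketch of this scheme, written under the assumption that iteration (2.7) takes the shrinkage/update form $X^k = \mathcal{D}_\tau(Y^{k-1})$, $Y^k = Y^{k-1} + \delta\, \mathcal{P}_\Omega(M - X^k)$ with $Y^0 = 0$; the function names, the fixed step size, and the stopping rule below are illustrative choices rather than part of the algorithm's specification.
\begin{verbatim}
import numpy as np

def shrink(Y, tau):
    # Singular value thresholding operator D_tau: soft-threshold the
    # singular values of Y at level tau.
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def svt(M, mask, tau, delta, n_iter=500, tol=1e-4):
    # mask is a 0/1 array encoding the sampled set Omega, so that
    # mask * A computes P_Omega(A).  Start from Y^0 = 0.
    Y = np.zeros_like(M, dtype=float)
    X = np.zeros_like(M, dtype=float)
    for _ in range(n_iter):
        X = shrink(Y, tau)            # X^k = D_tau(Y^{k-1})
        R = mask * (M - X)            # P_Omega(M - X^k)
        Y += delta * R                # Y^k = Y^{k-1} + delta * R
        if np.linalg.norm(R) <= tol * np.linalg.norm(mask * M):
            break
    return X
\end{verbatim}
As the text says, the whole method is indeed a few lines: one SVD and elementary matrix operations per step.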
As mentioned earlier, there are two crucial properties which make this algorithm
ideally suited for matrix completion.
• Low-rank property. A remarkable empirical fact is that the matrices in the
sequence $\{X^k\}$ have low rank (provided, of course, that the solution to (2.8)
has low rank). We use the word “empirical” because all of our numerical ex-
periments have produced low-rank sequences but we cannot rigorously prove
that this is true in general. The reason for this phenomenon is, however,
simple: because we are interested in large values of $\tau$ (so as to better approximate the solution to (1.1)), the thresholding step happens to ‘kill’ most of the
small singular values and produces a low-rank output. In fact, our numerical
results show that the rank of $X^k$ is nondecreasing with $k$, and the maximum rank is reached in the last steps of the algorithm, see Section 5.
Thus, when the rank of the solution is substantially smaller than either di-
mension of the matrix, the storage requirement is low since we could store
each $X^k$ in its SVD form (note that we only need to keep the current iterate and may discard earlier values); the first sketch following this list illustrates this factored storage.
• Sparsity. Another important property of the SVT algorithm is that the it-
eration matrix $Y^k$ is sparse. Since $Y^0 = 0$, we have by induction that $Y^k$ vanishes outside of $\Omega$. The fewer entries available, the sparser $Y^k$. Because the sparsity pattern $\Omega$ is fixed throughout, one can then apply sparse matrix
techniques to save storage. Also, if $|\Omega| = m$, the computational cost of updating $Y^k$ is of order $m$ (see the first sketch below). Moreover, we can call subroutines supporting sparse
matrix computations, which can further reduce computational costs.
One such subroutine is the SVD. However, note that we do not need to compute the entire SVD of $Y^k$ to apply the singular value thresholding operator: only the part corresponding to singular values greater than $\tau$ is needed. Hence, a good strategy is to apply the iterative Lanczos algorithm to compute the first few singular values and singular vectors. Because $Y^k$ is sparse, $Y^k$ can be applied to arbitrary vectors rapidly, and this procedure offers a considerable speedup over naive methods; the second sketch below gives an illustration.
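To make the two properties above concrete, the first sketch shows a single step that stores $Y^k$ only on $\Omega$ and keeps $X^k$ in factored SVD form; the names \texttt{svt\_step}, \texttt{Yvals}, and so on are hypothetical, and the dense SVD call is a placeholder for whatever SVD routine is actually used.
\begin{verbatim}
import numpy as np
import scipy.sparse as sp

def svt_step(Yvals, ii, jj, shape, M_omega, tau, delta):
    # Yvals holds the m = |Omega| entries of Y^{k-1}; (ii, jj) index
    # the sampled set Omega, and M_omega holds P_Omega(M) entry-wise.
    Y = sp.csr_matrix((Yvals, (ii, jj)), shape=shape)
    # Dense SVD as a placeholder; a partial SVD (next sketch) could
    # be substituted here.
    U, s, Vt = np.linalg.svd(Y.toarray(), full_matrices=False)
    keep = s > tau
    U, s, Vt = U[:, keep], s[keep] - tau, Vt[keep, :]
    # X^k is kept in factored form (U, s, Vt); evaluate it on Omega only.
    X_omega = np.einsum('tr,r,rt->t', U[ii], s, Vt[:, jj])
    Yvals = Yvals + delta * (M_omega - X_omega)   # O(m) update
    return (U, s, Vt), Yvals
\end{verbatim}
Storing $X^k$ as the triple $(U, s, V)$ requires $O((n_1 + n_2)r)$ numbers rather than $n_1 n_2$, which is the storage saving described in the first bullet, and the update of \texttt{Yvals} touches only the $m$ sampled entries.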
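In the same hedged spirit, the second sketch applies $\mathcal{D}_\tau$ to a sparse $Y^k$ by computing only the leading singular triplets with SciPy's \texttt{svds}, a Lanczos-type solver; the doubling heuristic used to guess how many triplets are needed is an illustrative choice, not prescribed by the algorithm.
\begin{verbatim}
import scipy.sparse.linalg as spla

def shrink_partial(Y, tau, r0=10):
    # Apply D_tau to a sparse Y by computing only the singular triplets
    # with singular value > tau, using a Lanczos-type solver (svds).
    n = min(Y.shape)
    r = min(r0, n - 1)
    while True:
        U, s, Vt = spla.svds(Y, k=r)   # r largest singular triplets
        if s.min() <= tau or r == n - 1:
            break                      # everything above tau is captured
        r = min(2 * r, n - 1)          # otherwise compute more triplets
    keep = s > tau
    return (U[:, keep] * (s[keep] - tau)) @ Vt[keep, :]
\end{verbatim}
Since \texttt{svds} only needs matrix-vector products with $Y^k$ and its transpose, the sparsity of $Y^k$ makes each Lanczos iteration cheap.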
2.3. Relation with other works. Our algorithm is inspired by recent work in
the area of $\ell_1$ minimization, and especially by the work on linearized Bregman iterations for compressed sensing, see [11–13, 27, 56, 67] for linearized Bregman iterations and [17, 19–21, 30] for some information about the field of compressed sensing. In this