A Fast Proximal Method for Convolutional Sparse Coding
Rakesh Chalasani Jose C. Principe Naveen Ramakrishnan
Abstract—Sparse coding, an unsupervised feature learning
technique, is often used as a basic building block to construct
deep networks. Convolutional sparse coding has been proposed in the
literature to overcome the scalability issues of sparse coding
techniques on large images. In this paper, we propose an efficient
algorithm, based on the fast iterative shrinkage thresholding
algorithm (FISTA), for learning sparse convolutional features.
Through numerical experiments, we show that the proposed
convolutional extension of FISTA can not only lead to faster
convergence compared to existing methods but can also easily
generalize to other cost functions.
Index Terms—Convolution, Sparse Coding, Feature Extraction, Unsupervised Learning.
I. INTRODUCTION
Choosing the appropriate data representation (i.e. feature
space) that has desirable properties for a given task (e.g.
classification, clustering) is a central issue in statistical
machine learning. This issue becomes all the more important
for computer vision tasks (e.g. object detection) due to
the high dimensionality of the input features and the high
variability across instances. Hand-crafted features like SIFT [1],
HoG [2], etc., have been successful in overcoming these issues. Recent
developments in deep architectures, where multiple layers
of feature extractors (e.g. RBM, sparse coding blocks) are
stacked, attempt to learn the features automatically in an
unsupervised manner using large-scale unlabeled data [3],
[4]. Some of the popular deep networks use sparse encoders
as building blocks [5], [6], [7] and hence, in this work we
focus on these sparse coding blocks.
Generally, sparse coding is based on the idea that an
observation, $y \in \mathbb{R}^p$, can be encoded using an over-complete
dictionary of filters, $C \in \mathbb{R}^{p \times k}$ ($k > p$), and a sparse vector
$x \in \mathbb{R}^k$. More formally, this can be written as
$$\hat{x} = \arg\min_x \frac{1}{2}\|y - Cx\|_2^2 + \lambda\|x\|_1 \qquad (1)$$
The $\ell_1$-norm on $x$ ensures that the latent vector is sparse.
Several efficient solvers, such as coordinate descent (CoD) [8],
the fast iterative shrinkage thresholding algorithm (FISTA) [9],
and the feature-sign algorithm [10], can be readily applied to
solve the above optimization problem. The dictionary C can
also be learned from the data [10], [11].
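As an illustration of such a solver, the following is a minimal FISTA sketch for (1) in Python/NumPy; the function names, fixed iteration count, and step-size choice are our own illustrative assumptions, not details from the paper.

```python
# A minimal sketch of solving (1) with FISTA, assuming a fixed dictionary C.
# The step size 1/L uses the Lipschitz constant of the gradient of the
# smooth term, L = ||C||_2^2 (squared spectral norm of C).
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t*||.||_1: element-wise soft-thresholding.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def fista(y, C, lam, n_iters=100):
    # Solve min_x 0.5*||y - Cx||_2^2 + lam*||x||_1.
    L = np.linalg.norm(C, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(C.shape[1])
    z, t = x.copy(), 1.0                   # extrapolation point and momentum
    for _ in range(n_iters):
        grad = C.T @ (C @ z - y)           # gradient of the smooth term at z
        x_new = soft_threshold(z - grad / L, lam / L)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = x_new + ((t - 1.0) / t_new) * (x_new - x)  # Nesterov extrapolation
        x, t = x_new, t_new
    return x
```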
However, in most applications of sparse coding, many overlapping
patches across the image are processed separately.
Rakesh Chalasani and Jose C. Principe are with the Department of Electrical
and Computer Engineering, University of Florida, Gainesville, FL, USA.
Naveen Ramakrishnan is with Robert Bosch LLC, Research and Technology
Center North America, Pittsburgh, PA, USA. (email: rakeshch@ufl.edu,
principe@cnel.ufl.edu, Naveen.Ramakrishnan@us.bosch.com)
This work is partially supported by ONR grant #N000141010375. The
first author performed part of this work while at Bosch RTC, Pittsburgh
during summer 2012.
This is often too slow in practice, making it difficult to scale
to large images. Moreover, sparse coding alone is not capable
of encoding translations in the observations. Learning the
dictionary in this context produces several shifted versions
of the same filter, such that each patch can be reconstructed
individually [12]. During inference, when performed on all
the overlapping patches, this can lead to a very redundant
representation. To overcome these limitations, convolutional
sparse coding has been proposed [5], [12]. Here, sparse coding
is applied over the entire image and the dictionary is a
convolutional filter bank with $M$ kernels, such that
$$x = \arg\min_x \frac{1}{2}\Big\|I - \sum_{m=1}^{M} C_m * x_m\Big\|_2^2 + \lambda \sum_{m=1}^{M} \|x_m\|_1 \qquad (2)$$
where $I$ is an image of size $(w \times h)$, $C_m$ is a filter kernel
of size $(s \times s)$ in the dictionary, $x_m$ is a sparse matrix of size
$(w + s - 1) \times (h + s - 1)$, $\lambda$ is the sparsity parameter and
$*$ represents a 2D convolution operator$^1$.
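To make the dimensions in (2) concrete, the cost can be evaluated with SciPy's 2D convolution in 'valid' mode, so that each reconstruction term matches the image size; this is a hedged sketch with illustrative variable names, not code from the paper.

```python
# A minimal sketch of evaluating the cost in (2): M kernels of size s x s,
# feature maps of size (w+s-1) x (h+s-1), so that the 'valid' convolution of
# each map with its kernel has the same (w, h) size as the image I.
import numpy as np
from scipy.signal import convolve2d

def conv_sc_cost(I, C, X, lam):
    # I: (w, h) image; C: list of M (s, s) kernels; X: list of M feature maps.
    recon = sum(convolve2d(x_m, c_m, mode='valid') for c_m, x_m in zip(C, X))
    residual = I - recon
    return 0.5 * np.sum(residual ** 2) + lam * sum(np.abs(x_m).sum() for x_m in X)
```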
In this work, we propose an algorithm that solves the optimization
problem in (2) efficiently and scales to large images. It is an
extension of the fast iterative shrinkage thresholding algorithm
(FISTA) [9], solved using proximal gradient updates. In addition,
we also extend our method to include a feed-forward predictor
(predictive sparse decomposition [6]) in the cost, for joint
optimization during inference and learning.
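To convey the flavor of such a proximal update before the details in section II, the sketch below performs a single ISTA-style step on (2), without the momentum term that FISTA adds; the fixed step size eta is an assumption, and this simplified step is not the paper's exact algorithm. The key point is that the gradient with respect to each feature map is a correlation of the residual with the corresponding kernel, i.e. a 'full' convolution with the flipped kernel.

```python
# A simplified proximal-gradient (ISTA-style) step on (2); FISTA would add
# Nesterov momentum around this update. Step size `eta` is assumed fixed here.
import numpy as np
from scipy.signal import convolve2d

def ista_step(I, C, X, lam, eta):
    # Residual of the current reconstruction ('valid' mode, size (w, h)).
    residual = I - sum(convolve2d(x, c, mode='valid') for c, x in zip(C, X))
    X_new = []
    for c, x in zip(C, X):
        # Gradient w.r.t. x_m is -1 times the 'full' convolution of the
        # residual with the flipped kernel (the adjoint of 'valid' convolution).
        grad = -convolve2d(residual, c[::-1, ::-1], mode='full')
        v = x - eta * grad
        # Proximal operator of eta*lam*||.||_1 (soft-thresholding).
        X_new.append(np.sign(v) * np.maximum(np.abs(v) - eta * lam, 0.0))
    return X_new
```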
Convolutional sparse coding was also previously studied
in [5], [12]. Zeiler et al. [5] proposed a method to solve
the optimization problem in (2) by introducing an additional
auxiliary variable, which demands solving a large linear
system (whose size is proportional to the size of the image)
at every iteration; although the complexity can be reduced
by using conjugate gradient, their approach does not scale
to large images. On the other hand, the authors in [12]
proposed a convolutional extension to CoD whose computation
per iteration is small compared to the method in [5], but
the number of iterations required for convergence
becomes large for large images. In the following sections, we
compare and contrast convolutional CoD with our method to
show the performance improvements achieved.
The rest of the paper is organized as follows: section II
describes convolutional FISTA and procedures to learn the
parameters of the model. Also, the extension of the method
for predictive sparse decomposition (PSD) is discussed.
Several experiments are described in section III to show the
performance of the proposed method and compare it with existing methods.
$^1$ All the variables henceforth represent matrices, unless otherwise stated.
Also, the convolution operator is applied in 'full' or 'valid' mode, depending
on the context.