ESPACE: Accelerating Convolutional Neural Networks
via Eliminating Spatial and Channel Redundancy
Shaohui Lin,†‡ Rongrong Ji,†‡∗ Chao Chen,†‡ Feiyue Huang
†Fujian Key Laboratory of Sensing and Computing for Smart City, Xiamen University, 361005, China
‡School of Information Science and Engineering, Xiamen University, 361005, China
BestImage Lab, Tencent Technology (Shanghai) Co., Ltd, China
shaohuilin007@gmail.com, rrji@xmu.edu.cn, silentcc@icloud.com, garyhuang@tencent.com
Abstract
Recent years have witnessed the extensive popularity of convolutional neural networks (CNNs) in various computer vision and artificial intelligence applications. However, the performance gains have come at the cost of substantially intensive computational complexity, which prohibits their use in resource-limited applications such as mobile or embedded devices. While increasing attention has been paid to accelerating the internal network structure, the redundancy of the visual input is rarely considered. In this paper, we make the first attempt to reduce spatial and channel redundancy directly from the visual input for CNN acceleration. The proposed method, termed ESPACE (Elimination of SPAtial and Channel rEdundancy), works in the following three steps: First, the 3D channel redundancy of the convolutional layers is reduced by a set of low-rank approximations of the convolutional filters. Second, a novel mask-based selective processing scheme is proposed, which further speeds up the convolution operations by skipping unsalient spatial locations of the visual input. Third, the accelerated network is fine-tuned on the training data via back-propagation. The proposed method is evaluated on ImageNet 2012 with implementations on two widely adopted CNNs, i.e., AlexNet and GoogLeNet. In comparison to several recent methods of CNN acceleration, the proposed scheme demonstrates new state-of-the-art acceleration performance, with speedups of 5.48× and 4.12× on AlexNet and GoogLeNet, respectively, and a minimal decrease in classification accuracy.
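For intuition, the pipeline above can be sketched in a few lines of NumPy: a truncated SVD stands in for the low-rank filter approximation of the first step, and a binary mask restricts convolution to salient spatial locations as in the second step. The function names, shapes, and the random mask are illustrative assumptions rather than the paper's exact formulation; in particular, a real implementation would apply the two low-rank factors as two separate, cheaper convolutions instead of reconstructing the full filter bank, and the mask would be derived from the visual input rather than drawn at random.

import numpy as np

def lowrank_factorize(W, rank):
    # Approximate a flattened filter bank W of shape (out_ch, in_ch*k*k)
    # by a rank-`rank` product A @ B via truncated SVD.
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]            # (out_ch, rank)
    B = Vt[:rank, :]                      # (rank, in_ch*k*k)
    return A, B

def masked_conv2d(x, W, mask, k=3):
    # Valid-mode 2D convolution (cross-correlation, as in CNNs) evaluated
    # only where mask == 1; skipped output locations stay zero.
    out_ch = W.shape[0]
    _, H, Wd = x.shape
    y = np.zeros((out_ch, H - k + 1, Wd - k + 1))
    for i, j in zip(*np.nonzero(mask)):   # visit salient locations only
        patch = x[:, i:i + k, j:j + k].reshape(-1)
        y[:, i, j] = W @ patch
    return y

# Toy usage: 16 filters over 8 input channels (3x3 kernels), an 8x32x32
# input, and a mask keeping about half of the 30x30 output positions.
rng = np.random.default_rng(0)
W = rng.standard_normal((16, 8 * 3 * 3))
A, B = lowrank_factorize(W, rank=4)       # channel redundancy: W ≈ A @ B
x = rng.standard_normal((8, 32, 32))
mask = (rng.random((30, 30)) > 0.5).astype(int)   # spatial redundancy
y = masked_conv2d(x, A @ B, mask)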
Introduction
In recent years, convolutional neural networks (CNNs) have demonstrated impressive performance in various computer vision and artificial intelligence applications, such as object recognition (Krizhevsky, Sutskever, and Hinton 2012; Simonyan and Zisserman 2014; LeCun et al. 1998; Szegedy et al. 2015; He et al. 2015), object detection (Girshick et al. 2014; Girshick 2015; Ren et al. 2015), and image retrieval (Gong et al. 2014b). The cutting-edge CNNs are computationally intensive, with the speed bottleneck lying mainly in the convolution operations of the convolutional layers¹. For example, an 8-layer AlexNet (Krizhevsky, Sutskever, and Hinton 2012) with about 600,000 nodes costs 240MB of storage (including 61M parameters) and requires 729M FLOPs² to classify one image of size 224 × 224. Such cost is further intensified in deeper CNNs, e.g., a 16-layer VGGNet (Simonyan and Zisserman 2014) with 1.5M nodes costs 528MB of storage (including 144M parameters) and requires about 15B FLOPs to classify one image.

¹ In this paper, we focus on accelerating the convolutional layers, as they take up over 80% of the running time in most existing CNNs, i.e., AlexNet, GoogLeNet and VGGNet.
² FLOPs: the number of floating-point operations required to classify one image with a CNN.
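To make these figures concrete: the dominant cost of a convolutional layer is one multiply-accumulate per filter tap per output position, i.e., roughly H_out × W_out × C_out × C_in × k² operations per image. The minimal sketch below applies this standard formula to AlexNet's first convolutional layer (96 filters of size 11 × 11 × 3 applied at stride 4, yielding a 55 × 55 output map); the layer shape is quoted from the AlexNet architecture for illustration, and note that conventions differ on whether a multiply-add counts as one or two FLOPs.

def conv_macs(h_out, w_out, c_out, c_in, k):
    # Multiply-accumulates for one convolutional layer on one image.
    return h_out * w_out * c_out * c_in * k * k

macs = conv_macs(55, 55, 96, 3, 11)
print(f"conv1: {macs / 1e6:.1f}M multiply-accumulates")  # ~105.4M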
Under such circumstances, existing CNNs cannot be directly deployed in scenarios that require fast processing and compact storage, such as streaming or real-time applications. On one hand, CNNs with million-scale parameters tend to be over-parameterized and computationally heavy (Denil et al. 2013); therefore, not all parameters and operations (e.g., convolution or non-linear activation) are essential to producing a discriminative decision. On the other hand, it is quantitatively shown in (Ba and Caruana 2014) that neither shallow nor simplified CNNs can match the performance of deep CNNs with billion-scale online operations. Therefore, to accelerate online CNN prediction without significantly decreasing decision accuracy, a natural thought is to discover and discard the redundant parameters and operations in deep CNNs.
Accelerating CNNs has attracted increasing research attention very recently, with most work focusing on accelerating the convolutional layers, the most time-consuming part of CNNs. In the literature, the related works can be further categorized into four groups, i.e., designing compact convolutional filters, parameter quantization, parameter pruning, and tensor decomposition.
Designing compact convolutional filters. Using a compact filter for convolution can directly reduce the computation cost. The key idea is to replace loose and over-parametric filters with compact blocks to improve speed, which has significantly accelerated CNNs such as GoogLeNet (Szegedy et al. 2015) and ResNet (He et al. 2015) on several benchmarks. Decomposing 3 × 3 convolutions into two 1 × 1 convolutions was used in (Szegedy, Ioffe, and Vanhoucke 2016), which achieved state-of-the-art acceleration performance on object recognition. SqueezeNet (Iandola, Moskewicz, and Ashraf 2016) was proposed to replace 3 × 3 convolution with 1 × 1 convolution, which created a com-