ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile
Devices
Xiangyu Zhang∗  Xinyu Zhou∗  Mengxiao Lin  Jian Sun
Megvii Inc (Face++)
{zhangxiangyu,zxy,linmengxiao,sunjian}@megvii.com
∗ Equal contribution.
Abstract
We introduce an extremely computation-efficient CNN
architecture named ShuffleNet, which is designed specially
for mobile devices with very limited computing power (e.g.,
10-150 MFLOPs). The new architecture utilizes two new
operations, pointwise group convolution and channel shuf-
fle, to greatly reduce computation cost while maintaining
accuracy. Experiments on ImageNet classification and MS
COCO object detection demonstrate the superior perfor-
mance of ShuffleNet over other structures, e.g., lower top-1 error (absolute 7.8%) than the recent MobileNet [12] on the ImageNet classification task, under a computation budget of
40 MFLOPs. On an ARM-based mobile device, ShuffleNet
achieves ∼13× actual speedup over AlexNet while main-
taining comparable accuracy.
1. Introduction
Building deeper and larger convolutional neural net-
works (CNNs) is a primary trend for solving major visual
recognition tasks [21, 9, 33, 5, 28, 24]. The most accu-
rate CNNs usually have hundreds of layers and thousands
of channels [9, 34, 32, 40], thus requiring computation at
billions of FLOPs. This paper examines the opposite extreme: pursuing the best accuracy within very limited computational budgets of tens or hundreds of MFLOPs, focusing
on common mobile platforms such as drones, robots, and
smartphones. Note that many existing works [16, 22, 43, 42,
38, 27] focus on pruning, compressing, or low-bit representation of a “basic” network architecture. Here we aim to explore
a highly efficient basic architecture specially designed for
our desired computing ranges.
We notice that state-of-the-art basic architectures such as
Xception [3] and ResNeXt [40] become less efficient in ex-
tremely small networks because of the costly dense 1 × 1
convolutions. We propose using pointwise group convolutions to reduce the computation complexity of 1 × 1 convolutions. To overcome the side effects brought by group convolutions, we introduce a novel channel shuffle operation that helps information flow across feature channels.
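For a concrete sense of the savings (illustrative numbers of our own, not an experiment from this paper): a dense 1 × 1 convolution mapping c input channels to c output channels over an h × w feature map costs h·w·c² multiply-adds, while splitting the channels into g groups reduces this to g·h·w·(c/g)² = h·w·c²/g; for instance, c = 240 with g = 3 cuts the pointwise cost threefold.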
Based on the two techniques, we build a highly efficient ar-
chitecture called ShuffleNet. Compared with popular struc-
tures like [30, 9, 40], for a given computation complexity
budget, our ShuffleNet allows more feature map channels,
which helps to encode more information and is especially
critical to the performance of very small networks.
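The channel shuffle operation itself reduces to a reshape and a transpose. The following is a minimal sketch in PyTorch-style Python (our own illustration, not the authors' released code): channels are regrouped so that each group fed to the next layer draws from every group of the previous layer.

```python
import torch

def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    """Shuffle the channels of an (N, C, H, W) tensor across groups."""
    n, c, h, w = x.size()
    assert c % groups == 0, "channel count must be divisible by groups"
    # Reshape channels into (groups, channels_per_group), swap those two
    # axes, and flatten back; the result interleaves channels so that
    # every output group contains channels from all input groups.
    x = x.view(n, groups, c // groups, h, w)
    x = x.transpose(1, 2).contiguous()
    return x.view(n, c, h, w)
```

Because the operation is only a memory permutation, it is differentiable and adds negligible FLOPs, so the savings from grouped 1 × 1 convolutions are preserved.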
We evaluate our models on the challenging ImageNet
classification [4, 29] and MS COCO object detection [23]
tasks. A series of controlled experiments shows the effectiveness of our design principles and the resulting performance gains over other structures. Compared with the state-of-the-art
architecture MobileNet [12], ShuffleNet achieves superior
performance by a significant margin, e.g., an absolute 7.8% lower ImageNet top-1 error at the level of 40 MFLOPs.
We also examine the speedup on real hardware, i.e., an
off-the-shelf ARM-based computing core. The ShuffleNet
model achieves ∼13× actual speedup (theoretical speedup
is 18×) over AlexNet [21] while maintaining comparable
accuracy.
2. Related Work
Efficient Model Designs The last few years have seen
the success of deep neural networks in computer vision
tasks [21, 36, 28], in which model designs play an im-
portant role. The increasing needs of running high qual-
ity deep neural networks on embedded devices encour-
age the study on efficient model designs [8]. For ex-
ample, GoogLeNet [33] increases the depth of networks
with much lower complexity compared to simply stack-
ing convolution layers. SqueezeNet [14] reduces parame-
ters and computation significantly while maintaining accu-
racy. ResNet [9, 10] utilizes the efficient bottleneck struc-
ture to achieve impressive performance. SENet [13] in-
troduces an architectural unit that boosts performance at
slight computation cost. Concurrent with us, a very re-