A Survey of Methods for Low-Power Deep
Learning and Computer Vision
Abhinav Goel, Caleb Tung, Yung-Hsiang Lu, and George K. Thiruvathukal†
School of Electrical and Computer Engineering, Purdue University
†Department of Computer Science, Loyola University Chicago
{goel39, tung3, yunglu}@purdue.edu, gkt@cs.luc.edu
Abstract—Deep neural networks (DNNs) are successful in
many computer vision tasks. However, the most accurate DNNs
require millions of parameters and operations, making them
energy-, computation-, and memory-intensive. This impedes the
deployment of large DNNs in low-power devices with limited
compute resources. Recent research improves DNN models by
reducing the memory requirement, energy consumption, and
number of operations without significantly decreasing the
accuracy. This paper surveys the progress of low-power deep
learning and computer vision, specifically with regard to inference,
and discusses the methods for compacting and accelerating
DNN models. The techniques can be divided into four major
categories: (1) parameter quantization and pruning, (2)
compressed convolutional filters and matrix factorization, (3) network
architecture search, and (4) knowledge distillation. We analyze
the accuracy, advantages, disadvantages, and potential solutions
to the problems with the techniques in each category. We also
discuss new evaluation metrics as a guideline for future research.
Index Terms—neural networks, computer vision, low-power
I. INTRODUCTION
Deep Neural Networks (DNNs) are widely used in computer
vision tasks like object detection, classification, and
segmentation [1, 2]. DNNs are designated as “Deep” because they
are made of many layers with a large spread of connections
between layers. This gives DNNs a tremendous range of
variability that can be fine-tuned for accurate inference through
training. Unfortunately, DNNs are also computation-heavy and
energy-expensive as a result. VGG-16 [3] needs 15 billion
operations to classify a single image [4].
Similarly, YOLOv3 performs 39 billion operations to process
one image [5]. So many computations require significant
compute resources and incur a high energy cost [6].
This presents a problem for DNNs: how can they be
meaningfully deployed on low-power embedded systems and
mobile devices? Such machines are often constrained by
battery power or obtain energy through low-current USB
connections [7]. They also do not usually come with GPUs.
Offloading computing to the cloud is a solution [8], but many
DNN applications need to be performed on low-power devices,
e.g., computer vision deployed on drones flying in areas
without reliable network coverage to offload computation, or
in satellites where offloading is too expensive [9].
Some low-power computer vision techniques remove
redundancies from DNNs to reduce the number of operations by
75% and the inference time by 50% with negligible loss in
accuracy. To deploy DNNs on small embedded computers,
more such optimizations are necessary. Therefore, pursuing
low-power improvements in deep learning for efficient
inference is worthwhile and is a growing area of research [10].
This paper surveys the literature and reports state-of-the-art
solutions for low-power computer vision. We focus specifically
on low-power DNN inference, not training, as the goal is
to attain high throughput. The paper classifies the low-power
inference methods into four categories:
1) Parameter Quantization and Pruning: Lowers the memory
and computation costs by reducing the number of bits
used to store the parameters of DNN models.
2) Compressed Convolutional Filters and Matrix
Factorization: Decomposes large DNN layers into smaller layers
to decrease the memory requirement and the number of
redundant matrix operations.
3) Network Architecture Search: Builds DNNs with
different combinations of layers automatically to find a DNN
architecture that achieves the desired performance.
4) Knowledge Distillation: Trains a compact DNN that
mimics the outputs, features, and activations of a more
computation-heavy DNN.
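To make category (4) concrete, the distillation objective is often the cross-entropy between the teacher's and the student's temperature-softened output distributions. The following pure-Python sketch is illustrative only; the function names and this specific loss form are one common choice among many in the literature, not taken from any particular surveyed paper:

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with temperature; a higher temperature yields softer targets."""
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=4.0):
    """Cross-entropy of the student's softened predictions against the
    teacher's softened output distribution (one common distillation loss)."""
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    return -sum(ti * math.log(si) for ti, si in zip(t, s))
```

In practice this term is typically combined with the standard cross-entropy on ground-truth labels, so the compact student learns both the hard labels and the teacher's inter-class similarity structure.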
TABLE I summarizes these methods. This survey focuses
on the above-mentioned software-based low-power computer
vision techniques, without considering low-power hardware
optimizations (e.g., hardware accelerators, spiking DNNs). This
paper uses results reported in the existing literature to analyze
the advantages and disadvantages of the four methods and to
propose potential improvements. We also suggest an additional
set of evaluation metrics to guide future research.
II. PARAMETER QUANTIZATION AND PRUNING
Memory accesses contribute significantly to the energy
consumption of DNNs [4, 11]. To build low-power DNNs,
recent research has looked into the tradeoff between accuracy
and the number of memory accesses.
A. Quantization of Deep Neural Networks
One method to reduce the number of memory accesses is
to decrease the size of DNN parameters. Some methods [12,
13] show that it is possible to have negligible accuracy loss
even when the precision of the DNN parameters is reduced.
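As a minimal illustration of how reduced precision cuts memory traffic, consider symmetric 8-bit linear quantization of a weight tensor. This sketch is an assumption for illustration, not the method of any specific surveyed paper; the function names and per-tensor scaling scheme are hypothetical:

```python
def quantize_int8(weights):
    """Map float weights to int8 levels with a single per-tensor scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights for inference."""
    return [qi * scale for qi in q]

weights = [0.42, -1.3, 0.07, 0.9]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
```

Each weight then occupies 1 byte instead of 4 (float32), a 4x reduction in parameter memory, at the cost of a rounding error bounded by half the scale per weight.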
Courbariaux et al. [13] experiment with parameters stored in