A Survey of Methods for Low-Power Deep
Learning and Computer Vision
Abhinav Goel, Caleb Tung, Yung-Hsiang Lu, and George K. Thiruvathukal†
School of Electrical and Computer Engineering, Purdue University
†Department of Computer Science, Loyola University Chicago
{goel39, tung3, yunglu}@purdue.edu, gkt@cs.luc.edu
Abstract—Deep neural networks (DNNs) are successful in
many computer vision tasks. However, the most accurate DNNs
require millions of parameters and operations, making them
energy-, computation-, and memory-intensive. This impedes the
deployment of large DNNs in low-power devices with limited
compute resources. Recent research improves DNN models by
reducing the memory requirement, energy consumption, and
number of operations without significantly decreasing the
accuracy. This paper surveys the progress of low-power deep
learning and computer vision, specifically with regard to inference,
and discusses the methods for compacting and accelerating
DNN models. The techniques can be divided into four major
categories: (1) parameter quantization and pruning, (2)
compressed convolutional filters and matrix factorization, (3) network
architecture search, and (4) knowledge distillation. We analyze
the accuracy, advantages, disadvantages, and potential solutions
to the problems with the techniques in each category. We also
discuss new evaluation metrics as a guideline for future research.
Index Terms—neural networks, computer vision, low-power
I. INTRODUCTION
Deep Neural Networks (DNNs) are widely used in computer
vision tasks like object detection, classification, and
segmentation [1, 2]. DNNs are designated as “Deep” because they
are made of many layers with a large spread of connections
between layers. This gives DNNs a tremendous range of
variability that can be fine-tuned for accurate inference through
training. Unfortunately, DNNs are also computation-heavy and
energy-expensive as a result. VGG-16 [3] needs 15 billion
operations to classify a single image [4].
Similarly, YOLOv3 performs 39 billion operations to process
one image [5]. So many computations require significant
compute resources and incur a high energy cost [6].
This presents a problem for DNNs: how can they be
meaningfully deployed on low-power embedded systems and
mobile devices? Such machines are often constrained by
battery power or obtain energy through low-current USB
connections [7]. They also do not usually come with GPUs.
Offloading computing to the cloud is a solution [8], but many
DNN applications need to be performed on low-power devices,
e.g., computer vision deployed on drones flying in areas
without reliable network coverage to offload computation, or
in satellites where offloading is too expensive [9].
Some low-power computer vision techniques remove
redundancies from DNNs to reduce the number of operations by
75% and the inference time by 50% with negligible loss in
accuracy. To deploy DNNs on small embedded computers,
more such optimizations are necessary. Therefore, pursuing
low-power improvements in deep learning for efficient
inference is worthwhile and is a growing area of research [10].
This paper surveys the literature and reports state-of-the-art
solutions for low-power computer vision. We focus specifically
on low-power DNN inference, not training, as the goal is
to attain high throughput. The paper classifies the low-power
inference methods into four categories:
1) Parameter Quantization and Pruning: Lowers the memory
and computation costs by reducing the number of bits
used to store the parameters of DNN models.
2) Compressed Convolutional Filters and Matrix
Factorization: Decomposes large DNN layers into smaller layers
to decrease the memory requirement and the number of
redundant matrix operations.
3) Network Architecture Search: Builds DNNs with
different combinations of layers automatically to find a DNN
architecture that achieves the desired performance.
4) Knowledge Distillation: Trains a compact DNN that
mimics the outputs, features, and activations of a more
computation-heavy DNN.
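To make category (4) concrete, the distillation objective is often the cross-entropy between the teacher's and the student's temperature-softened output distributions. The following pure-Python sketch is illustrative only; the function names and this specific loss form are one common choice among many in the literature, not taken from any particular surveyed paper:

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with temperature; a higher temperature yields softer targets."""
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=4.0):
    """Cross-entropy of the student's softened predictions against the
    teacher's softened output distribution (one common distillation loss)."""
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    return -sum(ti * math.log(si) for ti, si in zip(t, s))
```

In practice this term is typically combined with the standard cross-entropy on ground-truth labels, so the compact student learns both the hard labels and the teacher's inter-class similarity structure.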
TABLE I summarizes these methods. This survey focuses
on the above-mentioned software-based low-power computer
vision techniques, without considering low-power hardware
optimizations (e.g., hardware accelerators, spiking DNNs). This
paper uses results reported in the existing literature to analyze
the advantages and disadvantages of the four methods and to
propose potential improvements. We also suggest an additional
set of evaluation metrics to guide future research.
II. PARAMETER QUANTIZATION AND PRUNING
Memory accesses contribute significantly to the energy
consumption of DNNs [4, 11]. To build low-power DNNs,
recent research has looked into the tradeoff between accuracy
and the number of memory accesses.
A. Quantization of Deep Neural Networks
One method to reduce the number of memory accesses is
to decrease the size of DNN parameters. Some methods [12,
13] show that it is possible to have negligible accuracy loss
even when the precision of the DNN parameters is reduced.
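As a minimal illustration of how reduced precision cuts memory traffic, consider symmetric 8-bit linear quantization of a weight tensor. This sketch is an assumption for illustration, not the method of any specific surveyed paper; the function names and per-tensor scaling scheme are hypothetical:

```python
def quantize_int8(weights):
    """Map float weights to int8 levels with a single per-tensor scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights for inference."""
    return [qi * scale for qi in q]

weights = [0.42, -1.3, 0.07, 0.9]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
```

Each weight then occupies 1 byte instead of 4 (float32), a 4x reduction in parameter memory, at the cost of a rounding error bounded by half the scale per weight.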
Courbariaux et al. [13] experiment with parameters stored in