Squeezed Edge YOLO: Onboard Object Detection on
Edge Devices
Edward Humes¹, Mozhgan Navardi², Tinoosh Mohsenin²
¹University of Maryland, Baltimore County
²Johns Hopkins University
ehumes2@umbc.edu, {mnavard1, tinoosh}@jhu.edu
Abstract
Demand for efficient onboard object detection is increasing due to its key role
in autonomous navigation. However, deploying object detection models such as
YOLO on resource-constrained edge devices is challenging due to the high
computational requirements of such models. In this paper, a compressed object
detection model named Squeezed Edge YOLO is examined. This model is compressed
and optimized to kilobytes of parameters in order to fit onboard such edge devices.
To evaluate Squeezed Edge YOLO, two use cases, human and shape detection,
are used to show the model's accuracy and performance. Moreover, the model is
deployed onboard a GAP8 processor with 8 RISC-V cores and an NVIDIA Jetson
Nano with 4GB of memory. Experimental results show that the Squeezed Edge YOLO
model size is reduced by 8x, which leads to a 76% improvement in
energy efficiency and 3.3x higher throughput.
1 Introduction
Interest in Machine Learning (ML) is dramatically increasing as it provides a promising solution for
various applications such as autonomous navigation [1, 2]. Object detection models in particular can
significantly assist in autonomous navigation by detecting obstacles and pre-defined objects of inter-
est in the environment [3]. However, object detectors have high computational requirements due to
the need for accuracy and the ability to detect various object categories. GPUs with significant com-
putational capacity are often mandatory to train such complex models, yet onboard processing and
edge computing necessitate low-power and low-computation algorithms as a result of the limited
power and computational capacity available [4].
Object detectors are trained classifiers that can identify and locate multiple objects within an image.
These detectors are trained on a set of annotated images, and their accuracy is evaluated on unseen
datasets. There are two commonly used object detector paradigms: single-shot and two-shot. Single-
shot-based methods such as You Only Look Once (YOLO) [5], Single Shot Detector (SSD) [6], etc.,
directly predict the class probabilities and Bounding Box (BBox) coordinates for objects in an im-
age. In contrast, two-shot architectures such as R-CNN [7], Faster R-CNN [8], etc., generate a set
of region proposals and then classify and refine them to produce the final detections.
Two-shot object detection methods have several advantages over single-shot methods, including robustness
to scale and size variations, accurate localization, flexibility, and improved object recognition [9].
However, these advantages come at the expense of inference speed, with single-shot object detectors
generally being faster than two-shot object detectors. Despite this, even single-shot object detection models
are difficult to deploy on resource-constrained edge devices due to their high computational com-
plexity. Therefore, it is important to improve object detection models to meet power consumption
and real-time requirements on such devices [10, 11].
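To make the single-shot formulation concrete, the sketch below decodes a YOLO-style output grid into class probabilities and BBox coordinates. It is an illustrative example only: the (S, S, 5 + num_classes) output layout with [tx, ty, tw, th, objectness, class logits] per cell and the function name decode_yolo_grid are assumptions for this sketch, not the actual Squeezed Edge YOLO head.

```python
import numpy as np

def decode_yolo_grid(pred, num_classes, conf_thresh=0.5):
    """Decode a YOLO-style single-shot output grid into detections.

    `pred` is assumed to have shape (S, S, 5 + num_classes): each grid cell
    holds [tx, ty, tw, th, objectness, class logits...]. This layout is an
    illustrative assumption, not the exact Squeezed Edge YOLO head.
    """
    S = pred.shape[0]
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    detections = []
    for row in range(S):
        for col in range(S):
            cell = pred[row, col]
            obj = sigmoid(cell[4])
            if obj < conf_thresh:
                continue
            # Box center is an offset within its grid cell; width and height
            # are predicted relative to the image size.
            cx = (col + sigmoid(cell[0])) / S
            cy = (row + sigmoid(cell[1])) / S
            w, h = np.exp(cell[2]) / S, np.exp(cell[3]) / S
            # Softmax over the class logits gives per-class probabilities.
            logits = cell[5:5 + num_classes]
            class_probs = np.exp(logits) / np.exp(logits).sum()
            detections.append((cx, cy, w, h, obj, int(np.argmax(class_probs))))
    return detections
```

In a full pipeline, the decoded boxes would then be filtered with non-maximum suppression before being reported as final detections; that step is omitted here for brevity.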
In recent years, researchers have presented optimized object detection models [10, 12, 13, 14, 15,
16, 17, 18] to enable onboard object detection on edge devices. Work in [13, 14, 15] proposed an