Partitioning of CNN Models for Execution on Fog Devices
Swarnava Dey
TCS Research and Innovation
Kolkata, West Bengal, India
swarnava.dey@tcs.com
Arijit Mukherjee
TCS Research and Innovation
Kolkata, West Bengal, India
mukherjee.arijit@tcs.com
Arpan Pal
TCS Research and Innovation
Kolkata, West Bengal, India
arpan.pal@tcs.com
Balamuralidhar P
TCS Research and Innovation
Bangalore, Karnataka, India
balamurali.p@tcs.com
ABSTRACT
Fog Computing has in recent times captured the imagination of
industrial and research organizations working on various aspects of
connected livelihood and governance of smart cities. Improvements
in deep neural networks have led to extensive use of such models for
analytics and inferencing on large volumes of data, including sensor
observations, images and speech. There is a growing need to run such
inferencing on devices closer to the data sources, i.e., devices which
reside at the edge of the network, popularly known as fog devices,
in order to reduce upstream network traffic. However, these devices
are computationally constrained in nature, and executing complex
deep inferencing models on them has proved difficult. This has led
to several new approaches that partition/distribute the computation
and/or data over multiple fog devices. In this paper we propose a
novel depth-wise input partitioning scheme for CNN models and
experimentally show that it achieves better performance than
row/column or grid based schemes.
CCS CONCEPTS
• Computing methodologies → MapReduce algorithms; • Computer systems organization → Cloud computing; Neural networks;
KEYWORDS
CNN, distributed, Edge, Fog, Cloud, DCNN, convolution, parallel
ACM Reference Format:
Swarnava Dey, Arijit Mukherjee, Arpan Pal, and Balamuralidhar P. 2018.
Partitioning of CNN Models for Execution on Fog Devices. In The 1st ACM
International Workshop on Smart Cities and Fog Computing (CitiFog’18),
November 4, 2018, Shenzhen, China. ACM, New York, NY, USA, 6 pages.
https://doi.org/10.1145/3277893.3277899
Permission to make digital or hard copies of all or part of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. Copyrights for components of this work owned by others than ACM
must be honored. Abstracting with credit is permitted. To copy otherwise, or republish,
to post on servers or to redistribute to lists, requires prior specific permission and/or a
fee. Request permissions from permissions@acm.org.
CitiFog'18, November 4, 2018, Shenzhen, China
© 2018 Association for Computing Machinery.
ACM ISBN 978-1-4503-6051-7/18/11...$15.00
https://doi.org/10.1145/3277893.3277899
1 INTRODUCTION
In recent years industries and research organizations have heav-
ily invested in Fog Computing where computational methods are
placed closer to the data sources at the edge of the network. Data
analytic applications processing large volume of sensor data, im-
ages, videos, sounds etc. to generate inferences are primary candi-
date applications for such a processing architecture as processing
the data closer to the source ensures less upstream data traffic.
Example implementations of data analytic applications in Smart
City are available in smart city transport systems [11], smart city
healthcare [22, 23], detection of illegal garbage dumping [3] and
several others. We redirect the reader to a recent survey [14] that
highlights challenges and opportunities in Artificial Intelligence (AI)-
based frameworks for smart cities. It is noteworthy that many of the
above-mentioned and several other data analytic applications for
smart cities are adopting Deep Learning (DL)/inferencing techniques
due to the availability of state-of-the-art (SoA) learning models ready
for transfer learning and fine tuning, resulting in faster time to
market. One of the major challenges of running top-of-the-line deep
models such as Inception, ResNet and VGG on common edge/fog devices
is their computational and memory requirements. In our experiments,
we found that the Inception V3 model [28] cannot be loaded into the
available memory of a Raspberry Pi 3 board without allocating a
USB-based swap space, and it takes nearly five seconds to classify a
single image; the issues are similar for most commonly used models.
In this work, we propose a method to run the deep inference operation
of Convolutional Neural Networks (CNNs) [16] on a set of fog devices
to achieve high-speed inferencing. CNNs are the de facto technique
for image classification and have recently been used for speech and
sensor data as well [13]. Though the concept of collaborative edge
execution of CNNs was introduced earlier by Mao et al. [20], our work
extends the SoA through the following major contributions: 1) a novel
depth-wise input partitioning scheme that removes the overhead
associated with earlier row/column and grid partitioning schemes;
2) highlighting the role of the input and output depth of the current
convolutional layers (CLs) in the speedup achieved by distributed
execution; and 3) demonstrating its effect on distributed execution
through extensive simulations with realistic workloads. We also
validate our partitioning scheme on Inception V3 CLs on a real system
based on Raspberry Pi 3 and TensorFlow [8], achieving a 3x speedup.
The rest of the paper is organized as follows: Section 2 gives a brief
overview of the current state of development in Edge Computing and