Implementing Deep Learning and Inferencing on
Fog and Edge Computing Systems
Swarnava Dey (Author)
Embedded Systems and Robotics
TCS Research & Innovation
Kolkata, India
Email: swarnava.dey@tcs.com
Arijit Mukherjee (Author)
Embedded Systems and Robotics
TCS Research & Innovation
Kolkata, India
Email: mukherjee.arijit@tcs.com
Abstract—The case for leveraging the computing resources of
smart devices at the edge of the network was conceptualized almost
nine years ago. Since then, concepts such as Cloudlets and Fog
Computing have been instrumental in realizing computing at the network
edge, in physical proximity to the data sources, for building
more responsive, scalable and available Cloud based services.
An essential requirement in smartphone applications, the Internet of
Things (IoT), field robotics and similar domains is the ability to analyze
large amounts of data with reasonable latency. Deep Learning is fast
becoming the de facto choice for performing this data analytics
owing to its ability to reduce human intervention in such
workflows. The major deterrents to providing Deep Learning based
Cloud services are Cloud outages and relatively high latency.
In this article the role of Fog Computing in addressing
these issues is discussed, the current state of standardization in
Fog/Edge Computing is reviewed, and the importance of
optimum resource provisioning for running Edge Analytics is
highlighted. A detailed design and evaluation of the distribution
and parallelization aspects of an Edge based Deep Learning
framework built from off-the-shelf components is presented, along with
strategies for optimum resource provisioning on constrained edge
devices, based on experiments measuring the system resource (CPU, GPU
and RAM) consumption of a Deep Convolutional Neural Network.
I. INTRODUCTION
In today's digital world, dumb, rule-based network end-devices
are giving way to intelligent, semi-autonomous devices
that provide tailor-made, real-time services. These service
endpoints include smartphones, wearable devices, autonomous
vehicles, robots, drones and other embedded systems used
in domains such as healthcare, city services, engineering,
finance and entertainment. As these sensor-fitted devices,
standalone sensors and human beings generate large amounts
of contextual data, the opportunities to apply artificial
intelligence (AI) for rendering more intelligent, customized and
effective services are increasing. Though these data-driven,
machine-inferred services promise to bring a paradigm
shift to several application areas, the major challenge
remains managing the big data generated by the data sources
and applying AI for learning/inferencing in a distributed
fashion to obtain near-accurate, near real-time responses. Owing
to the fast-changing nature of services and the need for short
time-to-market, the focus is shifting from traditional hand-engineering
of machine learning features towards automated feature generation
via Deep Learning (DL) [3], [5]. In a typical supervised DL application,
input data with the desired output is fed to a complex neural network
(NN); processing and non-linear transformations at several layers
adjust the NN to form a model, which can later be used to predict,
classify or encode new sets of data.
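As a concrete illustration of this supervised workflow, the sketch below trains a small network on labelled images and then reuses the fitted model on unseen inputs. It is a minimal example using the Keras API shipped with TensorFlow; the layer sizes and the MNIST dataset are illustrative assumptions, not the network evaluated later in this paper.

# Minimal supervised DL sketch: labelled data in, trained model out,
# later reused to classify unseen inputs.  Layer sizes and the MNIST
# dataset are illustrative choices only.
import numpy as np
import tensorflow as tf

# Labelled training data (inputs with the desired outputs).
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train[..., np.newaxis] / 255.0
x_test = x_test[..., np.newaxis] / 255.0

# A small deep network: stacked layers with non-linear transformations.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, (5, 5), activation="relu",
                           input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Training adjusts the weights so the network forms a predictive model.
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, batch_size=128, epochs=2)

# The fitted model is then reused on previously unseen data.
predictions = model.predict(x_test[:5])
print(predictions.argmax(axis=1))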
Cloud based sensor data analytics frameworks are often
challenged by the low latency requirements of such applications
and by intermittent network connectivity caused by the high mobility
of the devices. To handle these issues, resource-rich devices
within the local access network can be utilized; this was first
established in [1], where service software was offloaded for
execution on a Cloudlet virtual machine. Fog Computing [2],
proposed in 2012, envisaged deploying services within the local
access network to augment Cloud based deployments. As
an ongoing activity, the Multi-access Edge Computing (MEC) [6]
initiative from ETSI is standardizing the deployment of
applications and services, such as video analytics and the
Internet of Things (IoT), at the Radio Access Network
(RAN) edge. With the availability of state-of-the-art distributed
DL frameworks such as TensorFlow [7] and scalable Cloud/Fog
based infrastructure, it is now possible to rapidly develop intelligent,
data-driven services.
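As a rough sketch of how such a framework lets a single computation span Cloud and edge nodes, the snippet below places TensorFlow variables on a Cloud parameter server while the compute runs on an edge worker. The host names, ports and the tiny graph are hypothetical, and each listed host would need to run a matching process for the cluster to come up; this uses the TensorFlow 1.x distributed runtime, not any mechanism specific to this paper.

# Sketch of distributing a TensorFlow graph across Cloud and edge nodes.
# Host names and ports are hypothetical placeholders.
import tensorflow as tf

cluster = tf.train.ClusterSpec({
    "ps":     ["cloud-node.example.com:2222"],      # parameter server (Cloud)
    "worker": ["edge-node-0.local:2222",            # edge workers
               "edge-node-1.local:2222"],
})

# Each machine runs this script with its own job name and task index;
# the cluster only becomes usable once all listed processes are up.
server = tf.train.Server(cluster, job_name="worker", task_index=0)

# Variables live on the parameter server; compute runs on this worker.
with tf.device("/job:ps/task:0"):
    w = tf.get_variable("w", shape=[784, 10])
with tf.device("/job:worker/task:0"):
    x = tf.placeholder(tf.float32, shape=[None, 784])
    logits = tf.matmul(x, w)

with tf.Session(server.target) as sess:
    sess.run(tf.global_variables_initializer())
    # sess.run(logits, feed_dict={x: batch}) would execute the split graph.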
In this work we address the problem of running DL based analytics
in a Cloud-Edge setup. We focus on two primary requirements for the
successful deployment of such services: a) analyzing the resource
requirements of applications, in terms of processing speed and memory,
for optimum resource utilization, and b) using this workload information
to partition the load optimally between networked resources so as to
minimize deployment cost, response time or some other parameter
important for the problem at hand.
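As a toy illustration of requirement (b), the sketch below chooses the layer after which an NN could be split between an edge device and the Cloud so that an estimated response time is minimized. The per-layer timings, output sizes and uplink bandwidth are invented numbers, and this is not the partitioning scheme evaluated later in the paper.

# Toy illustration of workload-aware partitioning between edge and Cloud.
# All numbers below are invented for illustration only.

# Estimated per-layer compute time (seconds) on the edge device and on
# the Cloud, plus the size (MB) of each layer's output that must be
# shipped upstream if the network is cut after that layer.
edge_time   = [0.020, 0.035, 0.030, 0.015, 0.010]
cloud_time  = [0.002, 0.004, 0.003, 0.002, 0.001]
out_size_mb = [3.1, 1.6, 0.8, 0.1, 0.01]
raw_input_mb = 5.0      # assumed size of the raw input sample
uplink_mb_per_s = 2.0   # assumed edge-to-Cloud bandwidth

def response_time(cut):
    """Cut after layer `cut` (0 = run everything in the Cloud)."""
    t_edge = sum(edge_time[:cut])
    shipped = out_size_mb[cut - 1] if cut > 0 else raw_input_mb
    t_net = shipped / uplink_mb_per_s
    t_cloud = sum(cloud_time[cut:])
    return t_edge + t_net + t_cloud

best = min(range(len(edge_time) + 1), key=response_time)
print("split after layer", best, "->", round(response_time(best), 3), "s")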
We perform a fine-grained analysis of the resource requirements for
offloaded execution of a representative application that implements DL
analytics on streaming data, considering NN size (depth, width, number
of layers), data throughput and NN hyperparameters (feature extraction
filter size, batch size) with respect to execution time and CPU, GPU and
memory requirements, for different levels of prediction/classification
accuracy.
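One simple way to collect such measurements, sketched below for execution time, CPU and RAM only, is to sample the process with psutil while timing repeated forward passes at different batch sizes. The stand-in network and batch sizes are illustrative assumptions, not the benchmarking harness used in our experiments; GPU utilization would need a separate tool such as nvidia-smi.

# Sketch: measure execution time, CPU and RAM of repeated CNN inference
# at different batch sizes.  The network here is a stand-in, not the
# model benchmarked in this paper.
import time
import numpy as np
import psutil
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation="relu",
                           input_shape=(224, 224, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),
])

proc = psutil.Process()
for batch in (1, 4, 16):
    x = np.random.rand(batch, 224, 224, 3).astype(np.float32)
    proc.cpu_percent(None)            # reset the CPU usage counter
    start = time.time()
    for _ in range(10):               # repeated forward passes
        model.predict(x, batch_size=batch)
    elapsed = time.time() - start
    print("batch=%d  time=%.2fs  cpu=%.0f%%  rss=%.0f MB"
          % (batch, elapsed, proc.cpu_percent(None),
             proc.memory_info().rss / 2**20))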
We present a set of benchmarking methods and results, and we hope
that these will help in provisioning optimum resources at the network
edge for designing effective DL analytics frameworks capable of
handling large volumes of streaming data. We also discuss the
distribution and parallelization strategies using