SOLAR: Services-oriented Learning Architectures
Deep Learning as a Service
Chao Wang¹, Xi Li¹, Qi Yu¹, Aili Wang¹, Patrick Hung², and Xuehai Zhou¹
¹School of Computer Science, University of Science and Technology of China
²Business and Information Technology, University of Ontario Institute of Technology
{cswang, llxx, wangal, xhzhou}@ustc.edu.cn, yuiq@mail.ustc.edu.cn, patrick.hung@uoit.ca
Abstract— Deep learning has emerged as a prominent field of machine learning over the past decade. However, the diversity of applications and the scale of their data pose significant challenges to building flexible and highly efficient implementations of deep learning neural networks. To improve performance while maintaining scalability, in this paper we present SOLAR, a services-oriented deep learning architecture that employs various accelerators, including GPU- and FPGA-based approaches. SOLAR provides a uniform programming model to users so that the hardware implementation and scheduling are invisible to programmers. At runtime, services can be executed either on software processors or on hardware accelerators. Experimental results on a state-of-the-art FPGA board demonstrate that SOLAR provides a ubiquitous framework for diverse applications without increasing the programmers' burden. Moreover, the GPU and FPGA accelerators in SOLAR achieve significant speedup over conventional Intel i5 processors, with good scalability.
Keywords— services-oriented architecture; deep learning; neural network; accelerator.
I. INTRODUCTION
In the past few years, machine learning has become pervasive in various research fields and commercial applications and has yielded successful products. In particular, the emergence of deep learning has sped up the development of machine learning and artificial intelligence. Consequently, deep learning has become a research hotspot in both research organizations and companies [1]. In general, deep learning uses a multi-layer neural network model that combines low-level features into progressively higher-level abstractions to discover distributed representations of data, in order to solve complex machine learning problems. Currently the most widely used deep learning models are Deep Neural Networks (DNNs) and Convolutional Neural Networks (CNNs) [2], which excel at image recognition, speech recognition, bioinformatics, and other complex machine learning tasks [3].
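The layer-by-layer composition of features described above can be illustrated with a minimal sketch of a feed-forward pass. This is not SOLAR's implementation; the layer sizes and ReLU activation are assumptions chosen for illustration.

```python
import numpy as np

def relu(x):
    # Rectified linear activation, a common choice in DNNs/CNNs
    return np.maximum(0.0, x)

def forward(x, layers):
    """Propagate input x through a stack of (W, b) layers.

    Each layer combines the previous layer's (lower-level) features
    into a higher-level abstraction, as described in the text.
    """
    for W, b in layers:
        x = relu(W @ x + b)
    return x

# Hypothetical 3-layer network: 8 -> 16 -> 8 -> 4 features
rng = np.random.default_rng(0)
layers = [(rng.standard_normal((16, 8)), np.zeros(16)),
          (rng.standard_normal((8, 16)), np.zeros(8)),
          (rng.standard_normal((4, 8)), np.zeros(4))]
out = forward(rng.standard_normal(8), layers)
print(out.shape)  # (4,)
```

The repeated matrix-vector products in `forward` are exactly the kernel computations that GPU and FPGA accelerators target.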
However, with the increasing accuracy requirements and complexity of practical applications, the size of neural networks has grown explosively, making data centers highly power-hungry. This poses significant challenges for implementing high-performance deep learning networks at low power cost, especially for large-scale deep learning neural network models.
So far, the state-of-the-art means for accelerating deep learning algorithms include Field-Programmable Gate Array (FPGA), Application-Specific Integrated Circuit (ASIC), and Graphics Processing Unit (GPU) based approaches. Of these, the GPU is well recognized for its massive computing capacity. Compared with GPU acceleration, hardware accelerators such as FPGAs and ASICs can achieve at least moderate performance at lower power consumption. However, both FPGAs and ASICs have relatively limited computing resources, memory, and I/O bandwidth, so it is difficult to develop complex, massive deep neural networks on these hardware accelerators. Up to now, the problem of providing efficient middleware support across these different architectures has not been properly solved.
Another challenge lies in the diversity and programmability of deep learning applications. Due to the design complexity of deep learning algorithms, architectures, and accelerators, significant programming effort is required to make effective use of the accelerators across diverse application domains. If the complex computation is scheduled manually, the quality of scheduling depends on the experience of the programmer, who typically has limited knowledge of the hardware. To alleviate the burden on high-level programmers, we demonstrate the effectiveness of services-oriented architecture (SOA) in the deep learning paradigm. Traditionally, SOA provides better flexibility and extensibility at lower cost by adopting reusable modules in software engineering. To the best of our knowledge, no prior research has applied SOA concepts to address the diversity of deep learning applications.
To tackle these problems, in this paper we present SOLAR, a services-oriented deep learning architecture targeting state-of-the-art machine learning applications. Our main contributions are the following:
1. We introduce SOA concepts into a state-of-the-art architecture for deep learning applications. Services are provided by heterogeneous accelerators behind a well-structured interface to improve flexibility and scalability.
2. SOLAR is based on a heterogeneous hybrid system that includes a software processor, a GPU accelerator, and a hardware FPGA accelerator to speed up the kernel computational parts of deep learning algorithms. In particular, we employ efficient middleware support to bridge the gap between high-level neural networks and hardware accelerators.
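The "well-structured interface" over heterogeneous accelerators can be pictured with a small sketch of runtime service dispatch. All names here (`ServiceRegistry`, the backend labels, the priority order) are hypothetical illustrations of the idea, not SOLAR's actual API.

```python
# Hypothetical sketch: callers request a named service; a scheduler
# picks a backend (software CPU, GPU, or FPGA) at runtime, keeping the
# hardware implementation invisible to the programmer.
class ServiceRegistry:
    def __init__(self):
        self._impls = {}  # service name -> {backend name: callable}

    def register(self, service, backend, fn):
        self._impls.setdefault(service, {})[backend] = fn

    def call(self, service, *args, prefer=("fpga", "gpu", "cpu")):
        impls = self._impls[service]
        for backend in prefer:  # simple priority-based scheduling
            if backend in impls:
                return impls[backend](*args)
        raise LookupError(f"no backend registered for {service!r}")

registry = ServiceRegistry()
# Only a software (CPU) matmul is registered here; an FPGA or GPU
# implementation could be registered under the same service name.
registry.register("matmul", "cpu",
                  lambda a, b: [[sum(x * y for x, y in zip(row, col))
                                 for col in zip(*b)] for row in a])
result = registry.call("matmul", [[1, 2]], [[3], [4]])
print(result)  # [[11]]
```

Because every backend is reached through the same `call` interface, adding a new accelerator is a registration step rather than a change to application code, which is the flexibility and scalability benefit claimed above.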
2016 IEEE International Conference on Web Services
978-1-5090-2675-3/16 $31.00 © 2016 IEEE
DOI 10.1109/ICWS.2016.91