Topology-Aware Partial Virtual Cluster Mapping
Algorithm on Shared Distributed Infrastructures
Xiaohui Wei, Member, IEEE, H ongliang Li, Member, IEEE , Kun Yang, Senior Member, IEEE, and Lei Zou
Abstract—Novel virtualized HPC centers provide isolat ed and configur able Virtual Clusters (VC) on shared distributed
infrastructures as execution environments for parallel and distributed applications. These VCs are usually customized and d eployed
per job in runtime. Allocating physical resources for VC is known as V irtual Cluster Mappi ng (VCM) probl em, which is a critical
issue that affects b oth perfor mance of the VC and resource utilization of the system. Most previous works treat all Virtual Machines
(VMs) in a VC request equally. However, because sub-jobs in a parallel job usually perform dif ferent roles, the corr esponding
VMs in a VC that execute these sub- jobs respectively should have different leve ls of impor tance. Based on this a rgument, this pap er
introduces the concept of partial VC mapping in contrast to the full mapping methodology in the current literatu res. To fulfill
partial mapping, the impo rtant backbone com munica tion struc ture of parallel job call ed C ommun ication Skele ton (CS) is derived
based on the network topology among virtual nodes. To generate the CS of a job, mechanisms for evaluati ng the importance of
nodes are proposed. Eventually, a Topolog y-aware Partia l Virtu al Cluster Mappi ng algor ithm (TOP-VCM) is propo sed which is based
on sub-graph isomorphism detec tion. TOP-V CM can fully satisfy the nodes/lin ks requirem ents in CS to ensure the execution
performance with only slight degradation of other trivial nodes/links to significantly reduce the mapping difficulty. Simulati on
results have shown that TOP-VCM has significantly improve d the total revenue, the utilization of physical re sources and the
performance of mapping algorithm while satisfying the VC requirements.
Index Terms—Virtual cluster, parallel job, topology, partial mapping, shared distributed infrastructure
Ç
1INTRODUCTION
N
OVEL virtualized HPC systems are able to facilitate
multiple heterogeneous environments for various
applications simultaneously on shared distributed infra-
structures. Usually these infrastructures consist of one
or multiple data centers and supported by virtualization
technologies. To share physical resources, the execution
environments are u sually implemented by Virtual Machines
(VMs) or sets of related VMs organized as Virtual Clusters
(VCs)[1],[2],[3],[4],[5].TheseVCsareusuallyparticularly
allocated and deployed per job in runtime [6], [7], [8], [9] to
support customized parallel and distributed applicati ons.
Virtualization on high level system c omponents, such
as Virtual Network (VN) [10], [11], [3], [12], [13], [ 14] and
VC, enables HPC system to dynamically manage these
environments by placement [7], [8], [9], [15], [16] in
construction and live migration [17], [18], [19], [20], [21]
in runtime. How to efficiently map VC onto substrate
cluster(s) is called Virtual Cluster Mapping (VCM) problem,
which is a critical issue that affects both performance of
the VC and resource utilization of the system. It is the basis
of both VC placement and VC live migration problems.
This paper focuses on the VCM problem on shared
distributed infrastructures.
A parallel job consists of a set of sub-jobs.Usually,a
parallel job is executed on one customized VC: (1) each
sub-job executes on one VM hosted by a physical machine
(computing host) ( Fig. 1 ); (2) communication links between
parallel jobs are supported by Virtual Links (VLs) between
VMs. Each VL can be mapped onto one or multiple
connected Physical Links (PLs). As a result, the computing
and communication requirements of the job are trans-
formed into a VC request that consists of computing
capacities of VMs and the network bandwidth of each
links between VMs.
Mapping a VC onto physical/Substrate Cluster (SC) consists
of VM ma pping and V L m apping. The VCM problem i s
similar to Virtual Network Embedding (VNE) probl em. They
both need to map nod es and links onto a sub strate env i-
ronment. The terms ‘‘node/link mapping’’ and ‘‘VM/VL
mapping’’ are interchangeable when referring VCM prob-
lem in the rest of the p aper.
As a result, most previous VCM works are based on VNE
algorithms. Since this problem is known to be NP-hard most
existing works use heuristic-based algorith ms [22], [ 23], [24],
[25], [26], [22], [27], [28], [29], [30], [31], [32] , [33], [34] such as
simulated annealing [29], mixed integer programming [25],
linear programmi ng [24 ], and sub-graph isomorphism
search [35].
However, directly applying the VNE solutions to VCM
problem is not efficien t. On one hand, virtual nodes in VN
mayhaveadditionalresourceconstrains[25],[26],[36],
[33], suc h as node location. VC resource allocation has
no such constraints, which increases th e solution space
and consume s m ore execution time. On other hand, the
. X. Wei, H. Li, and L. Zou are with the College of Computer Science and
Technology, Jilin University and the Key Laboratory of Symbolic
Computation and Knowledge Engineering of Ministry of Education,
Changchun, 130012, China. E-mail: {weixh, lihongliang}@jlu.edu.cn.
. K. Yang is wi th the School of Computer S cience and Electronic
Engineering, University of Essex, Colchester, CO4 3SQ, United Kingdom.
E-mail: kunyang@essex.ac.u.
Manuscript received 19 May 2013; revised 17 Aug. 2013; accepted 21 Aug.
2013. Date of publication 5 Sept. 2013; date of current version 17 Sept. 2014.
Recommended for acceptance by Y. Cui.
For information on obtaining reprints of this article, please send e-mail to:
reprints@ieee.org, and reference the Digital Object Identifier below.
Digital Object Identifier no. 10.1109/TPDS.2013.224
1045-9219
Ó
2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 25, NO. 10, OCTOBER 2014 2721