Towards a Lightweight RDMA Para-Virtualization
for HPC
Shiqing Fan
Huawei Technologies
German Research Center
shiqing.fan@huawei.com
Fang Chen
Huawei Technologies
German Research Center
fang.chen1@huawei.com
Holm Rauchfuss
Huawei Technologies
German Research Center
holm.rauchfuss@huawei.com
Nadav Har’El
ScyllaDB
nyh@scylladb.com
Uwe Schilling
University of Stuttgart
HLRS Stuttgart
schilling@hlrs.de
Nico Struckmann
University of Stuttgart
HLRS Stuttgart
struckmann@hlrs.de
ABSTRACT
Virtualization has gained increasing attention in recent
High Performance Computing (HPC) development. While
HPC provides scalability and computing performance, HPC
in the cloud benefits in addition from the agility and flexibil-
ity that virtualization brings. One of the major challenges
of HPC in virtualized environments is RDMA virtualization.
Existing implementations of RDMA virtualization have focused
on supporting VMs running Linux. However, HPC
workloads rarely need a full-blown Linux OS. Compared
to a traditional Linux OS, emerging library OSes, such as
OSv, are becoming popular choices as they provide efficient,
portable and lightweight cloud images. To enable virtualized
RDMA for lightweight library OSes, drivers and interfaces
must be re-designed to accommodate the underlying virtual
devices. In this paper we present a novel design, the virtio-rdma
driver for OSv, which aims to provide RDMA para-virtualization
for lightweight library OSes. We compare this
new design with existing implementations for Linux, and an-
alyze the advantages of virtio-rdma’s architecture, its ease of
migration to different operating systems, and the potential
for performance improvement. We also propose a solution
for integrating this para-virtualized driver into HPC plat-
forms, enabling HPC application users to deploy their use
cases smoothly in a virtualized HPC environment.
Keywords
Virtualization; virtIO; RDMA; HPC; Unikernel
1. INTRODUCTION
Para-virtualization has been commonly used in virtualized
environments to improve system efficiency and to optimize
management workloads. In the era of High Performance
Computing (HPC) and Big Data use cases, cloud
providers and HPC centers focus more on developing
para-virtualization solutions for fast and efficient I/O. Owing to
their high bandwidth, low latency and kernel bypass,
Remote Direct Memory Access (RDMA) [3] interconnects
play an important role in I/O efficiency and have been
widely deployed in HPC and data centers as an I/O performance
booster. To benefit from these RDMA advantages
in a virtualized HPC environment, network communication
supporting InfiniBand and RDMA over Converged Ethernet
(RoCE) [4] must be enabled for the underlying virtualized
devices.
COSH-VisorHPC 2017 Jan 24, 2017, Stockholm, Sweden
© 2017, All rights owned by authors. Published in the TUM library.
ISBN 978-3-00-055564-0
DOI: 10.14459/2017md1344417
There are a few existing solutions for RDMA virtualization,
e.g. vRDMA [14] from VMware and HyV [13], a hybrid
I/O virtualization framework for RDMA-capable network
interfaces, from IBM. However, none of them is applicable
to virtualization in HPC. vRDMA is only available for
VMware ESXi guests, and it is neither open source nor
free. HyV supports only Linux kernel 3.13, and it relies
heavily on Linux kernel drivers, which tightly couples the Linux
host and guest, excluding the use of lightweight library
OSes [8].
To enable such communication for virtualized devices
with OSv, a lightweight, fast and simple library OS for the
Cloud, we designed a new para-virtualized frontend driver,
virtio-rdma, for RDMA-capable fabrics. This solution aims
to break down the overhead barrier preventing HPC Cloud adoption,
enable HPC applications to run in virtual machines
with performance comparable to bare metal, and bring
all the benefits of the Cloud. The virtio-rdma frontend
driver is also designed to support shared memory communication
between virtual machines (VMs) on the same host. Because
virtio-rdma switches protocols automatically, a user
application uses only the standard RDMA API to accomplish
both inter-host and intra-host communication.
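The automatic protocol switch described above can be illustrated with a minimal sketch. It is not taken from the virtio-rdma sources: the names Endpoint, Transport, select_transport and send are hypothetical, and the sketch assumes that co-location is detected by comparing an identifier of the physical host. It only shows the idea that the application calls a single entry point while the transport is chosen internally.

```python
from dataclasses import dataclass
from enum import Enum


class Transport(Enum):
    SHARED_MEMORY = "shared-memory"  # intra-host: both VMs on one physical host
    RDMA = "rdma"                    # inter-host: InfiniBand or RoCE fabric


@dataclass(frozen=True)
class Endpoint:
    """Hypothetical peer descriptor; the field names are illustrative only."""
    host_id: str  # identifier of the physical host the VM runs on
    vm_id: str    # identifier of the VM itself


def select_transport(local: Endpoint, remote: Endpoint) -> Transport:
    """Choose shared memory for co-located VMs and RDMA otherwise."""
    if local.host_id == remote.host_id:
        return Transport.SHARED_MEMORY
    return Transport.RDMA


def send(local: Endpoint, remote: Endpoint, payload: bytes) -> Transport:
    """Single entry point seen by the application (assumed for illustration).

    The caller never names a transport; the choice is made internally.
    The actual data transfer is omitted, so only the chosen transport
    is returned.
    """
    return select_transport(local, remote)


if __name__ == "__main__":
    vm_a = Endpoint(host_id="node01", vm_id="vm-a")
    vm_b = Endpoint(host_id="node01", vm_id="vm-b")
    vm_c = Endpoint(host_id="node02", vm_id="vm-c")
    print(send(vm_a, vm_b, b"ping").value)  # same host
    print(send(vm_a, vm_c, b"ping").value)  # different hosts
```

In this sketch the decision is made per message for simplicity; a real driver would more plausibly make it once, when a connection is established.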
In addition, our design includes a solution for
using OSv and virtio-rdma in an HPC environment. By extending
Torque [5], the necessary environment settings are configured
to launch OSv and its components. The job submission
procedure is the same as on a normal HPC platform, except
that the job command line needs simple adaptations.
This paper is structured as follows: section 2 introduces
the fundamental work for this design; section 3 presents the
details of virtio-rdma, including its components, capabilities
and advantages; section 4 shows the basic integration
solution for running HPC jobs with OSv and virtio-