Multi-Tenant Multi-Objective Bandwidth Allocation
in Datacenters Using Stacked Congestion Control
Chen Tian†, Ali Munir‡, Alex X. Liu†‡, Yingtong Liu, Yanzhao Li†, Jiajun Sun†, Fan Zhang§, Gong Zhang§
†State Key Laboratory for Novel Software Technology, Nanjing University, China
‡Department of Computer Science and Engineering, Michigan State University, USA
Department of Computer Science, University of California, Irvine, USA
§Future Network Theory Lab, Huawei, Hong Kong, China
Abstract—In datacenter networks, flows can have different
performance objectives. We use a tenant-objective division to
denote all flows of a tenant that share the same objective.
Bandwidth allocation in datacenters should support not only
performance isolation among divisions but also objective-oriented
scheduling among flows within the same division. This paper
studies the Multi-Tenant Multi-Objective (MT-MO) bandwidth
allocation problem. To the best of our knowledge, no existing practical
work supports performance isolation and objective scheduling
simultaneously. We propose Stacked Congestion Control (SCC),
a distributed host-based bandwidth allocation design, where an
underlay congestion control (UCC) layer handles contention
among divisions, and a private congestion control (PCC) layer for
each division optimizes its performance objective. Via the tenant-
objective tunnel abstraction, SCC achieves weighted bandwidth
sharing for each division in a distributed and transparent way.
By adding a rate-limiting send queue at the ingress of each
tunnel, the mechanisms for performance isolation and objective
scheduling are completely decoupled. We evaluate SCC both
on a small-scale testbed and with large-scale NS-2 simulations.
Compared to the direct coexistence cases, SCC reduces latency
by up to 40% for Latency-Sensitive flows, deadline miss ratio
by up to 3.2× for Deadline-Sensitive flows, and average flow-
completion-time by up to 53% for Completion-Sensitive flows.
I. INTRODUCTION
Motivation: In datacenter networks, flows can have different
performance objectives. A private datacenter is shared by vari-
ous tenants, such as search engine, advertising and e-Business
applications. Each tenant can run many service entities (e.g.,
Virtual Machines, Containers, Java processes) that communi-
cate over the underlying network. The flows generated by these
services have different performance objectives due to their
service requirements. Some flows are Latency-Sensitive (LS):
a service can enqueue copies of a task on multiple servers to
combat computation time variability [1]; to minimize resource
wastage, a cancellation message should be sent to the counterpart
servers as soon as the first replica is finished. On the other
hand, some flows are Deadline-Sensitive (DS): the partition-aggregate
architecture of Online Data Intensive (OLDI) applications
[2], [3] and real-time analytics [4], [5] enforce deadline
semantics for every leaf-to-parent flow. Furthermore, for many
other applications, minimizing the average flow-completion-time
(AFCT) can significantly improve performance [6], [7],
[8]; we call these Completion-Sensitive (CS)
flows. We use a tenant-objective division to denote all the
flows of a tenant that share the same performance objective.
Bandwidth allocation in datacenters should support not only
performance isolation among divisions but also objective-
oriented scheduling among flows within the same division.
Bandwidth allocation design, in essence, defines how flows
behave when congestion happens. Most datacenter networks
are oversubscribed [9] and congestion is not uncommon:
packet drops due to congestion can be observed even when
overall network utilization is only around 25% [10]. To achieve
performance isolation, administrators can assign weights to
different divisions that share the underlying network [11]. For
example, upon congestion, an administrator may prefer tenant
A’s DS flows over tenant B’s DS flows, or all tenants’ LS
flows over their CS flows. Various techniques can be used to
support objective-oriented flow scheduling: some reduce tail
latency of messages [12], [13], [14], [15], some add deadline
awareness [7], [16], [17], [18], and others focus on reducing
AFCT [7], [19], [20], [6], [8], [21], [22], [23], [24].
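
To make the role of administrator-assigned weights concrete, the following is a minimal sketch of how the capacity of one congested link could be split among divisions. It assumes a weighted max-min (water-filling) interpretation of weighted sharing, which the text above does not prescribe. The function name `weighted_share`, the division labels, and the demand figures are hypothetical, and the computation is centralized arithmetic for illustration only, not the distributed host-based mechanism proposed in this paper.

```python
def weighted_share(capacity, weights, demands):
    """Weighted max-min (water-filling) split of one link's capacity.

    weights and demands are dicts keyed by division name (hypothetical
    labels). Weights are assumed to be strictly positive.
    """
    alloc = {d: 0.0 for d in weights}
    active = {d for d in weights if demands[d] > 0}
    remaining = float(capacity)
    while active and remaining > 1e-9:
        total_w = sum(weights[d] for d in active)
        # Divisions whose demand fits within their current weighted share.
        satisfied = {d for d in active
                     if demands[d] <= remaining * weights[d] / total_w}
        if not satisfied:
            # Every remaining division is bottlenecked: split by weight.
            for d in active:
                alloc[d] = remaining * weights[d] / total_w
            break
        # Grant satisfied divisions their full demand, then redistribute
        # the leftover capacity among the others in the next round.
        for d in satisfied:
            alloc[d] = float(demands[d])
            remaining -= demands[d]
        active -= satisfied
    return alloc


if __name__ == "__main__":
    # Hypothetical 10 Gbps link shared by three tenant-objective divisions.
    weights = {"A-DS": 2, "B-DS": 1, "A-CS": 1}
    demands = {"A-DS": 8, "B-DS": 1, "A-CS": 6}
    print(weighted_share(10, weights, demands))
```

In this hypothetical setting, division B-DS asks for less than its weighted share and is fully served, and the leftover capacity is split between A-DS and A-CS in the ratio 2:1 of their weights.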
This paper studies the Multi-Tenant Multi-Objective (MT-MO)
bandwidth allocation problem in datacenter networks. To
the best of our knowledge, no existing work supports performance
isolation and objective scheduling simultaneously.
Limitations of Prior Art: Many of the existing objective-
oriented approaches [7], [12], [13], [14], [15], [19], [20],
[6], [8], [21], [22], [23], [24] are designed to achieve only
a single performance objective at a time, and severe interference
can arise if approaches for different objectives coexist
without isolation. This happens because these approaches may
detect congestion differently (e.g., packet drop, or ECN) or
react to congestion differently (e.g., the ECN co-existence
problem in production Cloud [25], [26]). pFabric [6] and
Karuna [27] evaluate the coexistence of the DS and CS
flows by setting absolute priority to DS flows over CS flows.
However, performance isolation among flows that share the same
objective but belong to different tenants is not considered. Furthermore,
existing performance isolation approaches cannot optimize
performance objectives for individual tenant-objective
divisions. Neither bandwidth guarantee [28], [29], [30], [31],
[32], [33] nor proportional sharing [11] can perform bandwidth
allocation at flow-level granularity.
Bandwidth allocation design should be practical and readily
deployable. Many works either require non-trivial switch modifications
[13], [7], [19], [6], [34], or assume non-blocking
IEEE INFOCOM 2017 - IEEE Conference on Computer Communications
978-1-5090-5336-0/17/$31.00 ©2017 IEEE