Minimizing Packet Delay via Load Balancing in
Clos Switching Networks for Datacenters
Shu Yang, Shanshan Xin, Zhipeng Zhao and Bin Wu
School of Computer Science and Technology
Tianjin University
Tianjin, China
binwu.tju@gmail.com
Abstract—We consider unbalanced traffic distribution in a
three-stage Clos switching network for datacenters, aiming to
guarantee 100% throughput and minimize packet delay by
balancing bandwidth utilization. Existing traffic scheduling
algorithms with a packet delay bound guarantee neglect the
fact that unbalanced demands may lead to poor bandwidth
utilization and consequently poor switching performance. To
address this deficiency, we propose an efficient multi-hop
strategy (MHS) that exploits the internal bandwidth of the
switch to achieve load balancing. As the first step of MHS, we
propose a naive algorithm, AHF (Average Heavy-Traffic
First), which decomposes the traffic matrix while taking load
balancing into account. In the second step, packets are shifted
into and transmitted over the empty time slots (i.e.,
not-yet-utilized slots) of the configurations generated by AHF,
which further reduces packet delay through better load
balancing. Simulation results confirm that MHS outperforms
the naive AHF algorithm by effectively reducing the packet
delay bound.
Keywords-Clos switching network; data center; load
balancing; matrix decomposition; packet switching.
I. INTRODUCTION
With the rapid development of cloud computing, the data
center, as its basic platform, is evolving from a simple server
hosting facility into a high-performance computing center for
big-data processing and storage. It is common for a data
center network (DCN) to contain thousands of clustered
servers, which are connected via network switches to form a
large distributed system [1].
Because of the large number of servers in a DCN,
scalable multi-stage networks such as Clos and Fat-Tree have
been adopted for data switching among servers. Fig. 1 shows a
three-stage Clos network, denoted by Clos(n, m, r). It has r
input switches (each n×m), m middle switches (each r×r),
and r output switches (each m×n). The k-th switch in a
particular stage is connected to the k-th input port of each
switch in the next stage, which provides extensive path
diversity [2]. As a variation of the Clos architecture, a
Fat-Tree can be obtained by folding the Clos network at the
middle stage [3].
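To make the wiring rule concrete, the following Python sketch (our own illustration, not part of the paper; the function name clos_links and the stage/port encoding are hypothetical choices) enumerates the inter-stage links of a Clos(n, m, r) network under the connection rule stated above.

```python
# Minimal sketch: enumerate the inter-stage links of a Clos(n, m, r) network,
# where the k-th switch of one stage feeds the k-th input port of every
# switch in the next stage.

def clos_links(n: int, m: int, r: int):
    """Map (stage, switch, out_port) -> (next_stage, switch, in_port)."""
    links = {}
    # Input stage -> middle stage: input switch k has m output ports;
    # its j-th output port feeds input port k of middle switch j.
    for k in range(r):
        for j in range(m):
            links[("input", k, j)] = ("middle", j, k)
    # Middle stage -> output stage: middle switch k has r output ports;
    # its j-th output port feeds input port k of output switch j.
    for k in range(m):
        for j in range(r):
            links[("middle", k, j)] = ("output", j, k)
    return links

# Example: Clos(2, 3, 4) has r*m + m*r = 24 inter-stage links.
print(len(clos_links(2, 3, 4)))  # 24
```

Under this wiring, every input switch reaches every output switch through each of the m middle switches, which is the path diversity noted above.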
In a typical DCN, the Top-of-Rack (ToR) switch in the
access layer provides connectivity to the servers mounted in
each rack. In Fig. 1, a pair of input and output switches can
be combined into the ToR switch of the corresponding rack.
For the purpose of topology analysis, we generally treat a
ToR switch as a pair of separate input and output switches.
To support future scalable DCNs, many high-capacity
optical switching technologies have been adopted in different
stages of the Clos network. An additional aggregation stage
can be inserted between the ToR and middle switches (i.e.,
the core switches of a particular DCN) to multiplex traffic
from several racks onto a single fiber. In addition,
high-capacity optical switches can be used as core switches
to connect pairs of racks with high bandwidth demands.
However, the reconfiguration overhead (denoted by δ) of
optical switches, caused by system synchronization
adjustments and the tuning of optical components, cannot be
ignored in current DCNs. The total reconfiguration overhead
can amount to multiple time slots, during which no data can
be transmitted, which inevitably leads to packet delay [6].
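As a simple illustration of why this overhead matters (our own sketch under stated assumptions, not a model taken from [6]; phi, delta, and frame_transmission_time are hypothetical names), suppose a batch of traffic is served by a sequence of switch configurations, each preceded by a reconfiguration of δ slots:

```python
# Illustrative sketch: if a frame of traffic is served by N_s switch
# configurations, the i-th held for phi[i] slots, and each reconfiguration
# stalls the switch for delta slots, then the frame cannot finish before
# sum(phi) + N_s * delta slots have elapsed.

def frame_transmission_time(phi, delta):
    """Lower bound on slots needed to clear one frame with reconfig overhead."""
    return sum(phi) + len(phi) * delta

# Example: 4 configurations held for 10 slots each, delta = 5 slots.
print(frame_transmission_time([10, 10, 10, 10], delta=5))  # 60
```

The δ·N_s term grows with the number of configurations, which is why scheduling algorithms for optical switches try to keep the number of reconfigurations small while still bounding packet delay.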
Existing works such as ADAPT and QLEF [4-5] focus
on traffic scheduling with a packet delay bound guarantee.
Based on these two algorithms, subsequent work [6] on Clos
networks considers batch-scheduling-based packet switching
in DCNs with reconfiguration overhead at each middle
switch, aiming to improve system performance by
minimizing packet delay and switch cost. Although all of
these algorithms achieve 100% throughput while taking the
reconfiguration overhead into account, they neglect the
fact that unbalanced demands may lead to poor bandwidth
utilization and thus result in poor switching performance.
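The following numeric sketch (our own illustration under simplifying assumptions, not an example from [4-6]) shows why unbalanced demands hurt utilization: assuming each port can send or receive at most one packet per slot, any single-path schedule of an r×r traffic matrix T needs at least as many slots as the largest row or column sum, so a few heavy entries stretch the schedule while most ports sit idle.

```python
# Hypothetical illustration: bandwidth utilization of a single-hop schedule
# for an r x r traffic matrix T (entries in slots). Any schedule needs at
# least L = max(row sums, column sums) slots; during that time the r ports
# offer r * L slot-capacities, so utilization <= sum(T) / (r * L).

def single_hop_utilization(T):
    r = len(T)
    row = [sum(row_) for row_ in T]
    col = [sum(T[i][j] for i in range(r)) for j in range(r)]
    L = max(row + col)                    # minimum schedule length in slots
    return sum(row) / (r * L)

balanced   = [[3, 3, 3], [3, 3, 3], [3, 3, 3]]
unbalanced = [[9, 1, 1], [1, 1, 1], [1, 1, 1]]
print(single_hop_utilization(balanced))    # 1.0
print(single_hop_utilization(unbalanced))  # ~0.52
```

Both matrices carry comparable total traffic, yet the unbalanced one roughly halves utilization; spreading the heavy entries over underused paths is the intuition behind the load-balancing approach developed in this paper.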
Figure 1. Clos(n, m, r) network (input stage, middle stage, output stage).