A Multi-Tenant Multi-Objective Bandwidth Allocation Strategy for Datacenters Using Stacked Congestion Control

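This research paper examines multi-tenant, multi-objective bandwidth allocation in datacenter networks using stacked congestion control. Its authors, including Chen Tian, Ali Munir, and Yanzhao Li, are affiliated with Nanjing University, Michigan State University, the University of California, Irvine, and the Future Network Theory Lab at Huawei, Hong Kong. The paper proposes Stacked Congestion Control (SCC), a distributed host-based bandwidth allocation design that aims to provide performance isolation and objective-oriented flow scheduling at the same time, so that different tenants and their internal traffic can meet different performance objectives.

Flows in a datacenter network have a variety of performance objectives. The paper introduces the notion of a "tenant-objective division", which groups all flows of one tenant that share the same objective. Bandwidth allocation must provide performance isolation among divisions and objective-oriented scheduling among the flows within a division. According to the paper, no existing practical work achieves both. The paper focuses on this Multi-Tenant Multi-Objective (MT-MO) bandwidth allocation problem, which is challenging because the demands of multiple tenants must be met while each tenant's flows still receive bandwidth suited to their specific objectives.

To solve this problem, the paper proposes SCC, a stacked congestion control design. SCC runs in a distributed fashion on each host and adjusts bandwidth allocation dynamically to suit different performance objectives and network conditions. Its key feature is its layered structure: an underlay congestion control (UCC) layer handles contention among divisions, while a private congestion control (PCC) layer for each division optimizes that division's performance objective. This stacked design lets SCC handle the complex demands of a multi-tenant environment while keeping the network running efficiently.

The full paper presumably also covers SCC's implementation details, including algorithm design, performance evaluation, and comparison with existing congestion control mechanisms. SCC is evaluated on a small-scale testbed and with large-scale NS-2 simulations, which show that it is effective at providing performance isolation and objective-oriented scheduling. Overall, the paper offers a new approach to datacenter bandwidth management, stacked congestion control, which helps optimize network resource allocation in multi-tenant environments and improve quality of service.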

Multi-Tenant Multi-Objective Bandwidth Allocation in Datacenters Using Stacked Congestion Control

Chen Tian†, Ali Munir‡, Alex X. Liu†‡, Yingtong Liu, Yanzhao Li†, Jiajun Sun†, Fan Zhang, Gong Zhang

† State Key Laboratory for Novel Software Technology, Nanjing University, China
‡ Department of Computer Science and Engineering, Michigan State University, USA
Department of Computer Science, University of California, Irvine, USA
Future Network Theory Lab, Huawei, Hong Kong, China

Abstract—In datacenter networks, flows can have different performance objectives. We use a tenant-objective division to denote all flows of a tenant that share the same objective. Bandwidth allocation in datacenters should support not only performance isolation among divisions but also objective-oriented scheduling among flows within the same division. This paper studies the Multi-Tenant Multi-Objective (MT-MO) bandwidth allocation problem. To the best of our knowledge, no existing practical work supports performance isolation and objective scheduling simultaneously. We propose Stacked Congestion Control (SCC), a distributed host-based bandwidth allocation design, where an underlay congestion control (UCC) layer handles contention among divisions, and a private congestion control (PCC) layer for each division optimizes its performance objective. Via the tenant-objective tunnel abstraction, SCC achieves weighted bandwidth sharing for each division in a distributed and transparent way. By adding a rate-limiting send queue at the ingress of each tunnel, the mechanisms for performance isolation and objective scheduling are completely decoupled. We evaluate SCC both on a small-scale testbed and with large-scale NS-2 simulations. Compared to the direct coexistence cases, SCC reduces latency by up to 40% for Latency-Sensitive flows, deadline miss ratio by up to 3.2× for Deadline-Sensitive flows, and average flow-completion-time by up to 53% for Completion-Sensitive flows.
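To make the idea of a rate-limiting send queue at a tunnel ingress concrete, the following is a minimal Python sketch of a token-bucket paced queue. It is an illustration only, not the paper's implementation: the class name `RateLimitedQueue`, its parameters, and the token-bucket mechanism itself are assumptions introduced here. The sketch assumes that some outer layer decides the tunnel's rate, while the order of packets inside the queue is chosen independently by whatever scheduling policy the division uses.

```python
from collections import deque


class RateLimitedQueue:
    """Hypothetical tunnel-ingress send queue paced by a token bucket."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate_bps = rate_bps        # rate granted to this tunnel (assumed input)
        self.burst_bytes = burst_bytes  # upper bound on accumulated tokens
        self.tokens = burst_bytes       # start with a full bucket
        self.last = 0.0                 # time of the last refill, in seconds
        self.queue = deque()            # packet sizes in bytes, in scheduled order

    def enqueue(self, size_bytes):
        # The caller decides the order; this queue only paces departures.
        self.queue.append(size_bytes)

    def set_rate(self, rate_bps):
        # Called when the tunnel's granted rate changes.
        self.rate_bps = rate_bps

    def dequeue_ready(self, now):
        """Return the packet sizes allowed to leave at time `now` (seconds)."""
        elapsed = now - self.last
        self.last = now
        refill = elapsed * self.rate_bps / 8  # bits per second -> bytes
        self.tokens = min(self.burst_bytes, self.tokens + refill)
        sent = []
        while self.queue and self.queue[0] <= self.tokens:
            size = self.queue.popleft()
            self.tokens -= size
            sent.append(size)
        return sent
```

In this sketch, changing the rate with `set_rate` never touches the packet order, and reordering the queue never changes the rate, which is one simple way to picture how isolation and scheduling can be kept separate.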

I. INTRODUCTION

Motivation: In datacenter networks, flows can have different performance objectives. A private datacenter is shared by various tenants, such as search engine, advertising, and e-Business applications. Each tenant can run many service entities (e.g., Virtual Machines, Containers, Java processes) that communicate over the underlying network. The flows generated by these services have different performance objectives due to their service requirements. Some flows are Latency-Sensitive (LS): a service can enqueue copies of a task on multiple servers to combat computation time variability [1]; to minimize resource wastage, a cancellation message should be sent to the counterpart servers as soon as the first replica finishes. On the other hand, some flows are Deadline-Sensitive (DS): the partition-aggregate architecture of Online Data Intensive (OLDI) applications [2], [3] and real-time analytics [4], [5] enforce deadline semantics for every leaf-to-parent flow. Furthermore, for many other applications, minimizing average flow-completion-time (AFCT) can significantly improve performance [6], [7], [8]; we call these Completion-Sensitive (CS) flows. We use a tenant-objective division to denote all the flows of a tenant that share the same performance objective. Bandwidth allocation in datacenters should support not only performance isolation among divisions but also objective-oriented scheduling among flows within the same division.
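The tenant-objective division can be pictured as a grouping of flows keyed by the pair (tenant, objective). The Python sketch below illustrates that grouping; the names `Objective`, `Flow`, and `group_into_divisions` are hypothetical and do not come from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Objective(Enum):
    LS = "latency-sensitive"
    DS = "deadline-sensitive"
    CS = "completion-sensitive"


@dataclass(frozen=True)
class Flow:
    flow_id: int
    tenant: str
    objective: Objective


def group_into_divisions(flows):
    """Map each (tenant, objective) pair to the IDs of the flows in that division."""
    divisions = defaultdict(list)
    for flow in flows:
        divisions[(flow.tenant, flow.objective)].append(flow.flow_id)
    return dict(divisions)
```

For example, two DS flows and one CS flow of tenant A plus one DS flow of tenant B would form three divisions: (A, DS), (A, CS), and (B, DS).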

Bandwidth allocation design, in essence, defines how flows behave when congestion happens. Most datacenter networks are oversubscribed [9] and congestion is not uncommon: packet drops due to congestion can be observed when overall network utilization is only around 25% [10]. To achieve performance isolation, administrators can assign weights to the different divisions that share the underlying network [11]. For example, upon congestion, an administrator may prefer tenant A's DS flows over tenant B's DS flows, or all tenants' LS flows over their CS flows. Various techniques can be used to support objective-oriented flow scheduling: some reduce the tail latency of messages [12], [13], [14], [15], some add deadline awareness [7], [16], [17], [18], and others focus on reducing AFCT [7], [19], [20], [6], [8], [21], [22], [23], [24].
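As a rough illustration of what assigning weights to divisions can mean on a single congested link, the following Python sketch computes a weighted max-min allocation: each division receives bandwidth in proportion to its weight, and any share a division cannot use is redistributed to the others. This is a generic textbook computation written for this example, not the mechanism proposed in the paper; the function name `weighted_share` and the single-link, known-demand model are assumptions.

```python
def weighted_share(capacity, weights, demands):
    """Weighted max-min allocation of `capacity` on one link.

    weights and demands are dicts keyed by division name; weights must be positive.
    """
    alloc = {d: 0.0 for d in weights}
    active = {d for d in weights if demands[d] > 0}
    remaining = float(capacity)
    while active and remaining > 1e-9:
        total_w = sum(weights[d] for d in active)
        # Divisions whose whole demand fits inside their current weighted share.
        satisfied = {d for d in active
                     if demands[d] <= remaining * weights[d] / total_w}
        if not satisfied:
            # Everyone is bottlenecked: split what is left by weight and stop.
            for d in active:
                alloc[d] = remaining * weights[d] / total_w
            break
        for d in satisfied:
            alloc[d] = float(demands[d])
            remaining -= demands[d]
        active -= satisfied
    return alloc
```

With a capacity of 10, weights of 2, 1, and 1 for divisions A, B, and C, and demands of 10, 1, and 10, division B is fully served with 1 unit, and the remaining 9 units are split 2:1 between A and C.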

This paper studies the Multi-Tenant Multi-Objective (MT-MO) bandwidth allocation problem in datacenter networks. To the best of our knowledge, no existing work supports performance isolation and objective scheduling simultaneously.

Limitations of Prior Art: Many of the existing objective-oriented approaches [7], [12], [13], [14], [15], [19], [20], [6], [8], [21], [22], [23], [24] are designed to achieve only a single performance objective at a time, and there can be severe interference if approaches with different objectives coexist without isolation. This happens because these approaches may detect congestion differently (e.g., packet drop or ECN) or react to congestion differently (e.g., the ECN co-existence problem in production Cloud [25], [26]). pFabric [6] and Karuna [27] evaluate the coexistence of DS and CS flows by giving DS flows absolute priority over CS flows. However, performance isolation among flows with the same objective but of different tenants is not considered. Furthermore, existing performance isolation approaches cannot optimize performance objectives for individual tenant-objective divisions. Neither bandwidth guarantee [28], [29], [30], [31], [32], [33] nor proportional sharing [11] can perform bandwidth allocation at flow-level granularity.

Bandwidth allocation design should be practical and readily deployable. Many works either require non-trivial switch modifications [13], [7], [19], [6], [34], or assume non-blocking

IEEE INFOCOM 2017 - IEEE Conference on Computer Communications
