3 Control Plane
The Andromeda control plane consists of three layers:
Cluster Management (CM) Layer:
The CM layer provisions networking, storage, and compute resources on behalf of users. This layer is not networking-specific and is beyond the scope of this paper.
Fabric Management (FM) Layer:
The FM layer exposes a high-level API for the CM Layer to configure virtual networks. The API expresses user intent and abstracts implementation details, such as the mechanism for programming switches, the encapsulation format, and the network elements responsible for specific functions.
Switch Layer:
In this layer, two types of software switches support primitives such as encapsulation, forwarding, firewall, and load balancing. Each VM host has a virtual switch based on Open vSwitch [33], which handles traffic for all VMs on the host. Hoverboards are standalone switches, which act as default routers for some flows.
3.1 FM Layer
When the CM layer connects to an FM controller, it sends a full update containing the complete FM configuration for the cluster. Subsequent updates are diffs against the previously sent configuration. The FM configuration consists of a set of entities, each with a known type, a unique name, and parameters defining the entity's properties. Figure 2 lists some examples of FM entities.
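As a rough illustration of this entity model, a minimal sketch follows; the struct and field names are assumptions for exposition, not the actual FM schema.

#include <map>
#include <string>
#include <vector>

// Hypothetical representation of an FM entity: a known type, a unique name,
// and parameters defining the entity's properties.
struct FmEntity {
  std::string type;                           // e.g. "Network", "VM", "Route"
  std::string name;                           // unique within the cluster
  std::map<std::string, std::string> params;  // e.g. {"ip_prefix", "10.0.0.0/16"}
};

// An update is either the full cluster configuration (sent when the CM layer
// first connects) or a diff against the previously sent configuration.
struct FmUpdate {
  bool full = false;                        // true only for the initial update
  std::vector<FmEntity> added_or_changed;   // upserted entities
  std::vector<std::string> removed;         // names of deleted entities (diffs only)
};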
The FM API is implemented by multiple types of controllers, each responsible for different sets of network devices. Presently, VM Controllers (VMCs) program VM hosts and Hoverboards, while Load-Balancing Controllers [12] program load balancers. This paper focuses on VMCs.
VMCs program VM host switches using a combination of OpenFlow [3, 28] and proprietary extensions. VMCs send OpenFlow requests to proxies called OpenFlow Front Ends (OFEs) via RPC, an architecture inspired by Onix [25]. OFEs translate those requests to OpenFlow, decoupling the controller architecture from the OpenFlow protocol. Since OFEs maintain little internal state, they also serve as a stable control point for VM host switches: each switch keeps a stable OFE connection regardless of controller upgrades or repartitioning.
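The division of labor between VMCs and OFEs might look roughly like the sketch below; the RPC message and method names are invented for illustration and are not the actual interface.

#include <iostream>
#include <string>
#include <vector>

// Hypothetical controller-side request, expressed independently of OpenFlow.
struct FlowProgramRequest {
  std::string switch_id;
  std::string match;                 // e.g. "dst_ip=10.0.0.5"
  std::vector<std::string> actions;  // e.g. {"encap(host=...)", "output:1"}
};

class OpenFlowFrontEnd {
 public:
  // Receive a request over RPC, translate it to OpenFlow, and forward it to
  // the switch. The OFE keeps little state of its own, so VMCs can be
  // upgraded or repartitioned without disturbing the switch connection.
  void HandleRpc(const FlowProgramRequest& req) {
    const std::string ofp_msg = TranslateToOpenFlow(req);
    SendToSwitch(req.switch_id, ofp_msg);
  }

 private:
  std::string TranslateToOpenFlow(const FlowProgramRequest& req) {
    std::string msg = "flow_mod match=" + req.match + " actions=";
    for (const auto& a : req.actions) msg += a + ",";
    return msg;
  }
  void SendToSwitch(const std::string& switch_id, const std::string& msg) {
    std::cout << switch_id << " <- " << msg << "\n";  // stand-in for the OpenFlow channel
  }
};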
OFEs send switch events to VMCs, such as when a switch connects to the OFE or when virtual ports are added for new VMs. VMCs generate OpenFlow programming for switches by synthesizing the abstract FM programming with the physical information reported in switch events. When a VMC is notified that a switch has connected, it reconciles the switch's OpenFlow state by reading that state via the OFE, comparing it to the state expected by the VMC, and issuing update operations to resolve any differences.
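A minimal sketch of that reconciliation step is shown below, assuming hypothetical helpers for installing and removing flows through the OFE; the real VMC logic covers far more state.

#include <iostream>
#include <map>
#include <string>

using FlowTable = std::map<std::string, std::string>;  // match -> actions

// Stand-ins for issuing OpenFlow updates through the OFE.
void InstallFlow(const std::string& match, const std::string& actions) {
  std::cout << "install " << match << " -> " << actions << "\n";
}
void RemoveFlow(const std::string& match) {
  std::cout << "remove " << match << "\n";
}

// Bring the switch's actual OpenFlow state in line with the state the VMC
// expects, issuing only the updates needed to resolve the differences.
void ReconcileSwitch(const FlowTable& actual, const FlowTable& expected) {
  for (const auto& [match, actions] : expected) {
    auto it = actual.find(match);
    if (it == actual.end() || it->second != actions) {
      InstallFlow(match, actions);  // missing or stale entry
    }
  }
  for (const auto& kv : actual) {
    if (expected.count(kv.first) == 0) {
      RemoveFlow(kv.first);  // entry the VMC no longer expects
    }
  }
}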
Network: QoS, firewall rules, ...
VM: Private IP, external IPs, tags, ...
Subnetwork: IP prefix
Route: IP prefix, priority, next hop, ...

Figure 2: Examples of FM Entities
Multiple VMC partitions are deployed in every cluster. Each partition is responsible for a fraction of the cluster hosts, determined by consistent hashing [20]. The OFEs broadcast some events, such as switch-connected events, to all VMC partitions. The VMC partition responsible for the host switch that generated the event will then subscribe to other events from that host.
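A conventional consistent-hashing ring, sketched below, illustrates how a host could be mapped to its responsible VMC partition; the hash function and virtual-node layout here are assumptions, not the production scheme.

#include <cstdint>
#include <functional>
#include <map>
#include <string>

class PartitionRing {
 public:
  // Place each partition at several points on the ring to even out load.
  void AddPartition(const std::string& partition, int virtual_nodes = 64) {
    for (int i = 0; i < virtual_nodes; ++i) {
      ring_[Hash(partition + "#" + std::to_string(i))] = partition;
    }
  }

  // The partition responsible for a host is the first ring point at or after
  // the host's hash, wrapping around at the end of the ring.
  std::string PartitionForHost(const std::string& host) const {
    if (ring_.empty()) return "";
    auto it = ring_.lower_bound(Hash(host));
    if (it == ring_.end()) it = ring_.begin();
    return it->second;
  }

 private:
  static uint64_t Hash(const std::string& s) {
    return std::hash<std::string>{}(s);
  }
  std::map<uint64_t, std::string> ring_;
};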
3.2 Switch Layer
The switch layer has a programmable software switch on each VM host, as well as software switches called Hoverboards, which run on dedicated machines. Hoverboards and host switches run a user-space dataplane and share a common framework for constructing high-performance packet processors. These dataplanes bypass the host kernel network stack and achieve high performance through a variety of techniques. Section 4 discusses the VM host dataplane architecture.
We employ a modified Open vSwitch [33] for the control portion of Andromeda's VM host switches. A user-space process called vswitchd receives OpenFlow programming from the OFE and programs the datapath. The dataplane contains a flow cache and sends packets that miss in the cache to vswitchd. vswitchd looks up the flow in its OpenFlow tables and inserts a cache entry.
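The miss path can be pictured roughly as follows; the types and the vswitchd resolution step are placeholders for illustration, not the actual datapath interface.

#include <optional>
#include <string>
#include <unordered_map>

struct Actions { std::string encoded; };  // placeholder for compiled actions

class FlowCache {
 public:
  std::optional<Actions> Lookup(const std::string& flow_key) const {
    auto it = cache_.find(flow_key);
    if (it == cache_.end()) return std::nullopt;
    return it->second;
  }
  void Insert(const std::string& flow_key, const Actions& actions) {
    cache_[flow_key] = actions;
  }

 private:
  std::unordered_map<std::string, Actions> cache_;
};

// Stand-in for vswitchd resolving a missed flow against its OpenFlow tables.
Actions ResolveViaVswitchd(const std::string& flow_key) {
  return Actions{"output:vm_port_for(" + flow_key + ")"};
}

// Per-packet fast path: hit the cache if possible, otherwise punt to vswitchd
// and install the resulting entry so later packets stay on the fast path.
Actions HandlePacket(FlowCache& cache, const std::string& flow_key) {
  if (auto hit = cache.Lookup(flow_key)) return *hit;
  Actions actions = ResolveViaVswitchd(flow_key);
  cache.Insert(flow_key, actions);
  return actions;
}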
We have modified the switch in a number of substantial ways. We added a C++ wrapper to the C-based vswitchd that provides a configuration mechanism, debugging hooks, and remote health checks. A management plane process called the host agent supports VM lifecycle events, such as creation, deletion, and migration. For example, when a VM is created, the host agent connects it to the switch by configuring a virtual port in Open vSwitch for each VM network interface. The host agent also updates the VM-to-virtual-port mapping in the FM.
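A VM-creation event might be handled along the lines of the sketch below; the helper names (AddOvsPort, ReportPortMappingToFm) are hypothetical stand-ins for the real host agent and FM interfaces.

#include <iostream>
#include <string>
#include <vector>

struct VmSpec {
  std::string vm_name;
  std::vector<std::string> interfaces;  // VM network interface names
};

// Stand-in for creating a virtual port in Open vSwitch.
void AddOvsPort(const std::string& port) {
  std::cout << "ovs: add port " << port << "\n";
}
// Stand-in for reporting the VM-to-virtual-port mapping to the FM.
void ReportPortMappingToFm(const std::string& vm, const std::string& nic,
                           const std::string& port) {
  std::cout << "fm: " << vm << "/" << nic << " -> " << port << "\n";
}

// On VM creation, connect each VM network interface to the switch via its
// own virtual port, then update the mapping in the FM.
void OnVmCreated(const VmSpec& vm) {
  for (const auto& nic : vm.interfaces) {
    const std::string port = vm.vm_name + "-" + nic;
    AddOvsPort(port);
    ReportPortMappingToFm(vm.vm_name, nic, port);
  }
}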
Extension modules add functionality not readily expressed in OpenFlow. Such extensions include a connection-tracking firewall, billing, sticky load balancing, security token validation, and WAN bandwidth enforcement [26]. The extension framework consists of upcall handlers that run during flow lookup. For example, pre-lookup handlers manage flow cache misses prior to OpenFlow lookup. One such handler validates security tokens, which are cryptographic IDs inserted into packet headers to prevent spoofing in the fabric. Another type, group lookup handlers, overrides the behavior of specific OpenFlow groups, e.g., to provide sticky load balancing.
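These extension points can be pictured roughly as the interfaces below; the class and method names are hypothetical, since the framework is internal to Andromeda's vswitchd wrapper.

#include <string>

struct Packet {
  std::string headers;  // placeholder for parsed packet headers
};

// Pre-lookup handlers see flow cache misses before the OpenFlow lookup,
// e.g. to validate the security token carried in the packet header.
class PreLookupHandler {
 public:
  virtual ~PreLookupHandler() = default;
  // Returns false to drop the packet (for instance, an invalid token).
  virtual bool HandleMiss(Packet& pkt) = 0;
};

// Group lookup handlers override the behavior of specific OpenFlow groups,
// e.g. choosing a backend for sticky load balancing.
class GroupLookupHandler {
 public:
  virtual ~GroupLookupHandler() = default;
  virtual int SelectBucket(const Packet& pkt, int group_id) = 0;
};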