Efficient Communications Subsystems for a Cost-Effective, High-Performance Beowulf Cluster

By Jenwei Hsieh, Ph.D.; Tau Leng; and YungChin Fang

This article continues the discussion of design choices for building cost-effective, high-performance Beowulf clusters. After selecting the appropriate compute node (as discussed in “Design Choices for a Cost-Effective, High-Performance Beowulf Cluster,” Issue 3, 2000), the next design step is to choose the communications subsystem. This is the “gluing technology” that turns a group of autonomous compute nodes into a high-performance Beowulf cluster.

Beowulf is a concept of clustering commodity computers to form a parallel, virtual supercomputer. The communications subsystem is the crucial “clustering technology” that harnesses the computing power of a collection of computer systems and transforms them into a high-performance cluster. The communications subsystem comprises the physical interconnection, the communications protocol, and the message-passing interface; it allows the processes of a parallel application to exchange messages during their collaborative execution.
Figure 1 shows the architecture of a typical communications subsystem for a Beowulf cluster. At the lowest layer, the cluster interconnect consists of host interfaces (network interface cards) and switches, which physically interconnect compute nodes to form a cluster. Communications protocols help compute nodes exchange data packets through the physical layer by determining how packets are formed and routed to their destinations.
Applications for a Beowulf cluster must be written in a parallel style using the message-passing programming model. The jobs of a parallel application are spawned to compute nodes, which work collaboratively until the application finishes. During execution, compute nodes exchange information through a standard message-passing library such as the Message Passing Interface (MPI) or Parallel Virtual Machine (PVM).
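
To make the message-passing model concrete, the following listing is a minimal MPI sketch in C in which one process sends a value to another. It is an illustration written for this discussion, not code from a particular cluster; the compile and launch commands (mpicc, mpirun) and the two-process layout are assumptions about a typical MPI installation.

/* Minimal MPI message-passing sketch: rank 0 sends an integer to rank 1.
 * Compile with an MPI wrapper compiler, for example: mpicc -o sendrecv sendrecv.c
 * Launch on the cluster, for example:                mpirun -np 2 ./sendrecv
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);                  /* join the parallel job         */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* rank of this process          */
    MPI_Comm_size(MPI_COMM_WORLD, &size);    /* total number of processes     */

    if (size < 2) {
        if (rank == 0)
            fprintf(stderr, "This example needs at least two processes.\n");
    } else if (rank == 0) {
        int payload = 42;                    /* illustrative data             */
        MPI_Send(&payload, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        printf("rank 0 sent %d to rank 1\n", payload);
    } else if (rank == 1) {
        int payload;
        MPI_Status status;
        MPI_Recv(&payload, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
        printf("rank 1 received %d from rank 0\n", payload);
    }

    MPI_Finalize();                          /* leave the parallel job        */
    return 0;
}

The same source runs on every compute node; the MPI launcher starts one process per node (or per processor), and each process decides what to do based on its rank.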
Within the communications subsystem, factors that influence its quality include:
■ Efficient implementation of the host interface, such as hardware assistance for packet formation and direct memory access (DMA)
■ Ultra-low latency and high link speed (a simple way to gauge both is sketched after this list)
■ Supported connectivities and topologies
■ Non-blocking routing capability of switches that support simultaneous communications
■ Cost of the interconnect per compute node, including the host interface(s) and all switches and cables
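
Latency and link speed are the two factors that are easiest to measure directly. The listing below is a rough, illustrative ping-pong sketch rather than a benchmark from this article: two MPI processes on different compute nodes bounce a buffer back and forth, and rank 0 reports the average round-trip time and effective throughput. The message size and repetition count are arbitrary choices; a careful measurement would sweep a range of message sizes.

/* Ping-pong sketch: estimate round-trip latency and effective throughput
 * of the interconnect between two compute nodes.
 * Run with two processes, one per node, for example: mpirun -np 2 ./pingpong
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define REPS  1000               /* illustrative repetition count */
#define BYTES (64 * 1024)        /* illustrative message size     */

int main(int argc, char **argv)
{
    int rank, i;
    char *buf = malloc(BYTES);
    double start, elapsed;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);             /* synchronize before timing */
    start = MPI_Wtime();

    for (i = 0; i < REPS; i++) {
        if (rank == 0) {
            MPI_Send(buf, BYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, BYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &status);
        } else if (rank == 1) {
            MPI_Recv(buf, BYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &status);
            MPI_Send(buf, BYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }

    elapsed = MPI_Wtime() - start;
    if (rank == 0) {
        double rtt_us = (elapsed / REPS) * 1e6;                 /* average round trip    */
        double mbytes = 2.0 * REPS * BYTES / (1024.0 * 1024.0); /* data moved both ways  */
        printf("average round trip: %.1f us, throughput: %.1f MB/s\n",
               rtt_us, mbytes / elapsed);
    }

    free(buf);
    MPI_Finalize();
    return 0;
}

Comparing the measured throughput with the nominal link speed indicates how much of the raw capacity the protocol stack actually delivers to applications.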
Communications protocols used by the high-speed interconnect must provide reliable data transport. They must also provide simultaneous, memory-protected, user-level host interface access that bypasses the OS, and low CPU overhead for protocol processing.