Chapter 1: So You Want to Use a Cluster
Overview
William Gropp
What is a "Beowulf Cluster" and what is it good for? Simply put, a Beowulf Cluster is a supercomputer that anyone can
build and use. More specifically, a Beowulf Cluster is a parallel computer built from commodity components. This
approach takes advantage of the astounding performance now available in commodity personal computers. By many
measures, including computational speed, size of main memory, available disk space and bandwidth, a single PC of
today is more powerful than the supercomputers of the past. By harnessing the power of tens to thousands of such low-
cost but powerful processing elements, you can create a powerful supercomputer. In fact, the number 5 machine on
the "Top500" list of the world's most powerful supercomputers is a Beowulf Cluster.
A Beowulf cluster is a form of parallel computer, which is nothing more than a computer that uses more than one
processor. There are many different kinds of parallel computer, distinguished by the kinds of processors they use and
the way in which those processors exchange data. A Beowulf cluster takes advantage of two commodity components:
fast CPUs designed primarily for the personal computer market and networks designed to connect personal computers
together (in what is called a local area network or LAN). Because these are commodity components, their cost is
relatively low. As we will see later in this chapter, there are some performance consequences, and Beowulf clusters are
not suitable for all problems. However, for the many problems for which they do work well, Beowulf clusters provide an
effective and low-cost solution for delivering enormous computational power to applications and are now used virtually
everywhere. This raises the following question: If Beowulf clusters are so great, why didn't they appear earlier?
Many early efforts used clusters of smaller machines, typically workstations, as building blocks in creating low-cost
parallel computers. In addition, many software projects developed the basic software for programming parallel
machines. Some of these projects made their software available to all users and emphasized portability, so that
the tools could be moved easily to new machines. But the project that truly launched clusters was the Beowulf project at the
NASA Goddard Space Flight Center. In 1994, Thomas Sterling, Donald Becker, and others took an early version of the
Linux operating system, developed Ethernet driver software for Linux, and installed PVM (a software package for
programming parallel computers) on sixteen 100MHz Intel 80486-based PCs. This cluster used dual 10-Mbit Ethernet to
provide improved communication bandwidth between processors, but was otherwise very simple, and very low
cost.
Why did the Beowulf project succeed? Part of the answer is that it was the right solution at the right time. PCs were
beginning to become competent computational platforms (a 100MHz 80486 has a faster clock than the original 80MHz Cray-1,
a machine considered one of the most important early supercomputers). The explosion in the size of the PC market
was reducing the cost of the hardware through economies of scale. Equally important, however, was a commitment by
the Beowulf project to deliver a working solution, not just a research testbed. The Beowulf project worked hard to "dot
the i's and cross the t's," addressing many of the real issues standing in the way of widespread adoption of cluster
technology for commodity components. This was a critical contribution: making a cluster solid and reliable often
requires solving new and even harder problems; it isn't just hacking. The community's involvement in this effort,
through contributions of software and general help to others building clusters, made Beowulf clustering exciting.
Since the early Beowulf clusters, the use of commodity off-the-shelf (COTS) components for building clusters has
mushroomed. Clusters are found everywhere, from schools to dorm rooms to the largest machine rooms. Large
clusters make up an increasing share of the Top500 list. You can still build your own cluster by buying individual
components, but you can also buy a preassembled and tested cluster from many vendors, including both large and
well-established computer companies and companies formed just to sell clusters.
This book will give you an understanding of what Beowulfs are, where they can be used (and where they can't), and
how they work. To illustrate the issues, specific operations, such as installation of a software package, are described.
However, this book is not a cookbook; software and even hardware change too fast for that to be practical. The best
use of this book is to read it for understanding; to build a cluster, then go out and find the most up-to-date information